The Shared Mind

Converging Evidence in Language and Communication Research (CELCR)
Over the past decades, linguists have taken a broader view of language and are
borrowing methods and findings from other disciplines such as cognition and
computer sciences, neurology, biology, sociology, psychology, and anthropology.
This development has enriched our knowledge of language and communication,
but at the same time it has made it difficult for researchers in a particular field of
language studies to be aware of how their findings might relate to those in other
(sub-)disciplines.
CELCR seeks to address this problem by taking a cross-disciplinary approach
to the study of language and communication. The books in the series focus on a
specific linguistic topic and offer studies pertaining to this topic from different
disciplinary angles, thus taking converging evidence in language and communi-
cation research as its basic methodology.

Editors
Marjolijn H. Verspoor, University of Groningen
Wilbert Spooren, Vrije Universiteit Amsterdam

Advisory Board
Walter Daelemans, University of Antwerp
Cliff Goddard, University of New England
Roeland van Hout, Radboud University Nijmegen
Leo Noordman, Tilburg University
Martin Pütz, University of Koblenz-Landau

Volume 12
The Shared Mind. Perspectives on intersubjectivity
Edited by Jordan Zlatev, Timothy P. Racine, Chris Sinha and Esa Itkonen
The Shared Mind
Perspectives on intersubjectivity

Edited by

Jordan Zlatev
Lund University, Copenhagen Business School

Timothy P. Racine
Simon Fraser University

Chris Sinha
University of Portsmouth

Esa Itkonen
University of Turku

John Benjamins Publishing Company


Amsterdam/Philadelphia
The paper used in this publication meets the minimum requirements of
American National Standard for Information Sciences: Permanence of
Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data

The shared mind : perspectives on intersubjectivity / edited by Jordan Zlatev ... [et al.].
p. cm. (Converging Evidence in Language and Communication Research, issn
1566-7774 ; v. 12)
Includes bibliographical references and index.
1. Intersubjectivity. 2. Language and languages. 3. Communication. 4. Evolution. I.
Zlatev, Jordan.
P107.S535 2008
401--dc22 2008015388
ISBN 978 90 272 3900 6 (Hb; alk. paper)

© 2008 John Benjamins B.V.


No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any
other means, without written permission from the publisher.
John Benjamins Publishing Co., P.O. Box 36224, 1020 ME Amsterdam, The Netherlands
John Benjamins North America, P.O. Box 27519, Philadelphia PA 19118-0519, USA
Table of contents

Foreword: Shared minds and the science of fiction: Why theories will differ
Colwyn Trevarthen

1. Intersubjectivity: What makes us human?
Jordan Zlatev, Timothy P. Racine, Chris Sinha and Esa Itkonen

Part I. Development

2. Understanding others through primary interaction and narrative practice
Shaun Gallagher and Daniel D. Hutto
3. The neuroscience of social understanding
John Barresi and Chris Moore
4. Engaging, sharing, knowing: Some lessons from research in autism
Peter Hobson and Jessica A. Hobson
5. Coming to agreement: Object use by infants and adults
Cintia Rodríguez and Christiane Moro
6. The role of intersubjectivity in the development of intentional communication
Ingar Brinck
7. Sharing mental states: Causal and definitional issues in intersubjectivity
Noah Susswein and Timothy P. Racine

Part II. Evolution

8. What is the nature of the gestural communication of great apes?
Simone Pika
9. The heterochronic origins of explicit reference
David A. Leavens, William D. Hopkins and Kim A. Bard
10. The co-evolution of intersubjectivity and bodily mimesis
Jordan Zlatev
11. First communions: Mimetic sharing without theory of mind
Daniel D. Hutto

Part III. Language

12. The central role of normativity in language and linguistics
Esa Itkonen
13. Intersubjectivity and the architecture of the language system
Arie Verhagen
14. Intersubjectivity in interpreted interactions: The interpreter's role in co-constructing meaning
Terry Janzen and Barbara Shaffer
15. Language and the signifying object: From convention to imagination
Chris Sinha and Cintia Rodríguez

Author index

Subject index
Foreword

Shared minds and the science of fiction


Why theories will differ

Colwyn Trevarthen

It is a pleasure to respond to these essays on the collective story-making of culture:
the experience of reality that human beings create together because they are
motivated from birth to experiment with the exchange of fantasies and to find
meaning in them. Human consciousness has the special gift of imaginative travel
through times and places, and it grows through communication of intentions
and interests. Language and the practical tools of our society enrich the products
of the game, but its causes are in the movements and preferences of embodied
minds, minds that have evolved to act in sympathy and to share history and in-
vention, whatever may turn out to be the topic or task. Our common knowledge
and perception of ourselves as knowers of meaningful facts depends upon, and
grows from, our innate capacity for intersubjectivity.
We know, of course, how involved we are with one another's intentions,
thoughts and feelings, and that much of this intimacy in experience cannot be
carried in words. The mental life of others is, as Stein Bråten says, felt immediately
(Bråten 1998). And yet mind science and its ambitious extension in brain
science have, and still mainly do, regard us as single heads processing informa-
tion, storing it up in memory for reprocessing, and transferring it symbolically.
Even when we are granted a body that moves, it is a robot that struggles to know
other minds by a hopeless effort of theorizing or simulation. Such unsympa-
thetic entities are science fictions. We need a science of the imaginative fictions
persons so easily share.
The authors of this book accept that human life and culture is incomprehensible
without intersubjective processes, so the question mark of the title of the
editors' introduction in Chapter 1 must be rhetorical, and ironic. It is added, perhaps,
because our experimental psychology has inherited and largely still pays
homage to a scholastic philosophy of minds as separate experience-registering
systems that act, and think, on what they alone perceive. But in the real sociable
world every act we make, every feeling, has as much power to move others as it
has to move our self. With compassion we can see causes of actions in another,
even causes that they themselves fail to comprehend or control. All teaching and
therapy, indeed all cooperative activities, depend on this sympathetic insight into
motive impulses and emotions in human moving. I believe that all the inventions
of culture, including the evolving languages that distinguish our different societ-
ies, and the arts and technologies that are necessary instruments of communal life
and treasures of our history, grow from the ability that every young infant has to
enter into the co-creation of a proto-conversational narrative with an entranced
parent. Our stories of meaning are built on mimetic skills we have inherited from
highly sociable animal ancestors, but we are born with new motives for fictional
elaboration of rituals. Even our personality, the who we are and the narrative
of what we have done and known, grants us the role of one protagonist in a so-
cial drama where significant others live as supportive allies or contentious rivals
(Trevarthen 1993, 1998a, 1998b). Thus we become companions or aliens in rela-
tion to a meaningful world. Shared minds create all we know.
Reading this book we sense the authors are glad to be free of a prison built of
ideas that are unaware and unsympathetic of how we really live. They present an an-
tithesis to the computational or representational mind, and seek to define the special
human mind, which is not just conscious and rational, but has a unique intersubjective
awareness that makes up explanations of a shared and artificial world: a mind
that builds cultures with power to change nature. Given the exploratory nature of
the topic, inevitably, they come to somewhat different conclusions.
Because several authors make generous reference to my research on commu-
nication in infancy, and the theory of Innate Intersubjectivity I was rash enough
to propose 30 years ago, I feel I should explain the particular scientific experience
that supported the project, and the influence of teachers and colleagues who were
ahead of me in the story. I was trained as a student of biology to master ways of
observing in detail how plants grow and how animals move in adaptive ways. My
undergraduate teachers were plant ecologists, physiologists and ethologists. My
PhD research was on the experimental neuropsychology of visual consciousness
in monkeys with Roger Sperry, who had proposed in 1952 that perceiving must be
understood as information picked up to guide moving, that is, that the science of
consciousness or mind in the brain should begin by asking how the brain moves the
body in intelligent ways (Sperry 1952). My experiments with split-brain monkeys
proved that willing to do something can indeed determine what a brain sees.
I began work with infants in 1967, in collaboration with Jerome Bruner,
who wished to examine infant cognition and learning in a different way from
Piaget's experiments on infants' object concepts; Berry Brazelton, who was pioneering
more sensitive and responsive paediatric care for newborns and their
mothers (Brazelton 1979); and Martin Richards, an ethologist of mammalian
maternal behaviour. Our aim was to observe what came about, rather than experiment
with a priori hypotheses about infant perception or cognition, that is, to record
in as complete detail as possible what could be seen and heard when a mother and
infant were communicating, and to compare it with what the baby would do when
oriented to an inanimate object. We saw complex conversation-like engagements
in which both infant and mother exhibited intuitive competence for sharing their
impulses, and we realised that there was no science to explain this. At about the
same time two other persons, anthropologist and linguist Mary Catherine Bateson
and developmental psychiatrist Daniel Stern, were discovering the same
phenomena and attributing them to innate motivations of an intersubjective kind
(Bateson 1979; Stern 1971). We were, without knowing one another, exploring
out of the psychological box, free to observe the cleverness in infants and their
companions and free to speculate about their significance for human relationships
and for cultural learning. All of us were entranced by the infant's rhythmic
sympathy with a parent's attempts to communicate, and their joint inventiveness.
Through the 1970s, using film and television to record and patiently micro-
analyse, I charted age-related changes in the play and attributed them to innate
motives, the development of new sensory and motor competences in the infant,
and sensitive intuitive support from the mother (Trevarthen 1974). I took the
term intersubjectivity from an inspiring article Joanna Ryan wrote on the development
of communicative competence before language (Ryan 1974), and her
comparison of the infant's tactics with those Jürgen Habermas had defined as the
intersubjective functions or dialogic universals through which conversational
exchanges and cooperative meaning-making are regulated in society (Habermas
1970). At the same time Jerome Bruner led a neo-Vygotskian transformation of
developmental and educational theory that gave primacy to collaborative learning
in meaningful tasks (Bruner 1968, 1990). Children gain the skills and language of
their culture, and learn how to manage the material world in cooperative ways,
by way of their will to share purposes, interests and objects (Sinha and Rodríguez
this volume).
One of my young colleagues, Penelope Hubley, making a careful longitudi-
nal study of mother-infant companionship in the early 1970s, observed changes
from proto-conversations of two-month-olds, through play in games, first of the
body, then with shared interest in objects, to the remarkable transformation of
the infant's motives at 9 months when the baby became a different kind of partner
in intent participation (Hubley and Trevarthen 1979). The enjoyment of playful
rituals by six-month-olds in games with expressive gestures or toys, enjoying
at a new ritualised level the teasing meta-communication that Gregory Bateson
had identified as the critical element in animal play (Bateson 1955), was replaced
by a more serious intent to do work with objects that had some potential for
practical use, which others would acknowledge. Guided by a companion's shifting
focus of interest, and by exhortations to complete a little project set by movements
of intention, the baby became a self-confident partner, a co-worker. At
the same age, about 40 weeks after a full-term birth, the baby was a self-possessed
and self-conscious performer of many new rituals of social expression. The mu-
tual understanding established in previous months and practiced in games was
transformed into what Michael Halliday called proto-linguistic acts of mean-
ing: vocal and gestural signals of things that might be named (Halliday 1975).
We called it Secondary Intersubjectivity. The relevance to cultural learning of this
trajectory in growth of the infant's mind was clear, as was the primary importance
of the mutual attention with a familiar companion. Strangers were too uncom-
prehending to be trusted in such first steps to a conventional world. And sensitive
experimental studies by Lynne Murray proved how important contingent and re-
spectful attention and sympathy of feelings was for the infant to build meaning in
another's company (Murray and Trevarthen 1985).
True, as Susswein and Racine (this volume) say, my account of the develop-
ments, from Primary Intersubjectivity, through Games of the Person and Games
with Objects to Secondary Intersubjectivity, which we continued beyond the first
use of words to the Imaginative Play of 2- and 3-year-olds, was descriptive. Yes, it
was a taxonomy of stages in behaviour, but it was meant to be more than that.
It implied and explored a theory, and I sought many kinds of evidence for the
causes of change, especially causes in the infant's growing mind. I was convinced
that the only useful explanation was one that assumed that the fundamental adap-
tations of body and brain for intersubjectivity were innate, as were the direction
and stages of developmental change through the early years, and the learning that
was so obviously assisted by the companionship of the parent (Trevarthen 1979,
1989). True to my biological principles, and starting with a theory of how neural
systems could generate motives, I looked for explanations in the ontogeny of the
brain and body of the child, and for correlations with known age-related changes
in brain anatomy and function. It was not difficult to collect evidence that an
embryogenetic specification of the motor and sensory functions in somatotopic
(body mapping) arrays was essential, as were the theories of experience antici-
pating motor images of Sperry (1952) and Bernstein (1967). At first we did not
have a clue how the intersubjective transfer of these intentional images could be
mediated, or what the evident emotional regulations were, but in the last two
decades brain science has come some way to closing that gap (cf. Gallese 2005;
Panksepp 2005; Barresi and Moore this volume).
One discovery of major significance for any theory of the causal factors or
processes of intersubjectivity, whether of humans or animals, is that the rhythmic
timing and modulation of energy in moving is a code or principle of conduct
that makes motives share-able. A breakthrough in the exploration of human com-
munication before language has come from the demonstration of its special poly-
rhythmic musicality (Trevarthen 1999, 2008). The science of time in the mind
or biochronology is discovering how impulses regulating the pace and harmony
of moving pass from actor to perceiver of action. How, as Ellen Dissanayake claims
(Dissanayake 2000), the temporal arts originate in intrinsic dynamic processes, already
exquisitely present in a newborn baby, that keep the voluntary and conscious
self whole and in well-being, and make them public for sharing in intimacy. Mi-
mesis, which I see as richer already in the neonate than Zlatev (this volume) does
(because I am sure that self and other are distinct negotiants in the newborn baby's
mind) is the parent of linguistic narrative, as Merlin Donald (2001) proposes, and
musical semantics as defined by Ole Kühl sets the stage for reference with symbols
(Kühl 2007). I think these are the natural foundations for Itkonen's normative
practices that keep languages, and other cultural creations, coherent, productive
and changing (Itkonen this volume). The story is new and there is plenty of room
for different plots, but we have an open prospect and a sense of adventure. The sci-
ence of the shared mind looks like the best game in town.

References

Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Bateson, G. 1955. A theory of play and fantasy. Psychiatric Research Reports, Series A 2: 39–51.
Bateson, M.C. 1979. The epigenesis of conversational interaction: A personal account of research
development. In Before Speech: The Beginning of Human Communication, M. Bullowa (ed.),
63–77. London: Cambridge University Press.
Bernstein, N. 1967. Coordination and Regulation of Movements. New York: Pergamon.
Bråten, S. 1998. Intersubjective communion and understanding: Development and perturbation.
In Intersubjective Communication and Emotion in Early Ontogeny, S. Bråten (ed.),
372–382. Cambridge: Cambridge University Press.
Brazelton, T.B. 1979. Evidence of communication during neonatal behavioural assessment. In
Before Speech: The Beginning of Human Communication, M. Bullowa (ed.), 79–88. London:
Cambridge University Press.
Bruner, J.S. 1968. Processes of Cognitive Growth: Infancy. (Heinz Werner Lectures, 1968)
Worcester, Mass: Clark University Press with Barri Publishers.
Bruner, J.S. 1990. Acts of Meaning. Cambridge, Mass.: Harvard University Press.
Dissanayake, E. 2000. Art and Intimacy: How the Arts Began. University of Washington Press,
Seattle and London.
Donald, M. 2001. A Mind So Rare: The Evolution of Human Consciousness. New York, NY and
London, England: Norton.
Gallese, V. 2005. Embodied simulation: From neurons to phenomenal experience. Phenomenology
and the Cognitive Sciences 4: 23–48.
Habermas, J. 1970. Towards a theory of communicative competence. Recent Sociology Vol. 12:
115–148, London: Macmillan.
Halliday, M.A.K. 1975. Learning How to Mean: Explorations in the Development of Language.
London: Edward Arnold.
Hubley, P. and Trevarthen, C. 1979. Sharing a task in infancy. In Social Interaction During
Infancy: New Directions for Child Development, 4, I. Uzgiris (ed.), 57–80. San Francisco:
Jossey-Bass.
Itkonen, E. this volume. The central role of normativity in language and linguistics.
Kühl, O. 2007. Musical Semantics. (European Semiotics: Language, Cognition and Culture.
No. 7). Bern: Peter Lang.
Murray, L. and Trevarthen, C. 1985. Emotional regulation of interactions between two-month-olds
and their mothers. In Social Perception in Infants, T.M. Field and N.A. Fox (eds),
177–197. Norwood, NJ: Ablex.
Panksepp, J. 2005. On the embodied neural nature of core emotional affects. Journal of
Consciousness Studies 12: 158–84.
Ryan, J. 1974. Early language development: Towards a communicational analysis. In The
Integration of a Child into a Social World, M.P.M. Richards (ed.), 185–213. London: Cambridge
University Press.
Sinha, C. and Rodríguez, C. this volume. Language and the signifying object: From convention
to imagination.
Sperry, R. W. 1952. Neurology and the mind-brain problem. American Scientist 40: 291–312.
Stern, D.N. 1971. A micro-analysis of mother-infant interaction: Behaviors regulating social
contact between a mother and her three-and-a-half-month-old twins. Journal of American
Academy of Child Psychiatry 10: 501–517.
Susswein, N. and Racine, T.P. this volume. Sharing mental states: Causal and definitional issues
in intersubjectivity.
Trevarthen, C. 1974. Conversations with a two-month-old. New Scientist 2 May: 230–235.
Trevarthen, C. 1979. Instincts for human understanding and for cultural cooperation: Their
development in infancy. In Human Ethology, M. von Cranach, K. Foppa, W. Lepenies and
D. Ploog (eds), 530–571. Cambridge: Cambridge University Press.
Trevarthen, C. 1989. Motives for culture in young children: their natural development through
communication. In The Nature of Culture (Proceedings of the International and Interdisciplinary
Symposium, Ruhr Universität, Bochum, October 7–11, 1986), W. Koch (ed.), 80–119.
Bochum: Brockmeyer.
Trevarthen, C. 1993. The self born in intersubjectivity: An infant communicating. In The Perceived
Self: Ecological and Interpersonal Sources of Self-Knowledge, U. Neisser (ed.),
121–173. New York: Cambridge University Press.
Trevarthen, C. 1998a. The concept and foundations of infant intersubjectivity. In Intersubjective
Communication and Emotion in Early Ontogeny, S. Bråten (ed.), 15–46. Cambridge:
Cambridge University Press.
Trevarthen, C. 1998b. The nature of motives for human consciousness. Psychology: The Journal
of the Hellenic Psychological Society (Special Issue: The Place of Psychology in Contemporary
Sciences, Part 2. Guest Editor, T. Velli) 4(3): 187–221.
Trevarthen, C. 1999. Musicality and the Intrinsic Motive Pulse: Evidence from human psychobiology
and infant communication. Musicae Scientiae, Special Issue, 1999–2000,
Rhythms, musical narrative, and the origins of human communication, 157–213. Liège:
European Society for the Cognitive Sciences of Music.
Trevarthen, C. 2008. The musical art of infant conversation: Narrating in the time of sympathetic
experience, without rational interpretation, before words. In Musicae Scientiae, Special
Issue Narrative in music and interaction. (in press), M. Imberty & M. Gratier (eds).
Liège: European Society for the Cognitive Sciences of Music.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 1

Intersubjectivity
What makes us human?

Jordan Zlatev, Timothy P. Racine, Chris Sinha and Esa Itkonen

1. Introduction

The title of this book, The Shared Mind, conforms to a linguistic schema, "The
X Mind", that has become common within the interdisciplinary fields of cognitive
science and consciousness studies. The current volume thus stands in a line
of succession from The Embodied Mind (Varela, Thompson and Rosch 1992), The
Discursive Mind (Harré and Gillet 1994), The Conscious Mind (Chalmers 1996),
The Extended Mind (Clark and Chalmers 1998) and The Social Mind (Valsiner
and van der Veer 2000). Like some of its predecessors, The Shared Mind advances
an anti-thesis to the classical Computational (Jackendoff 1987) or Representa-
tional Mind (Fodor 1987), with its oft-criticised neglect of the role of the body,
phenomenal experience, social interaction and culture.
At the same time, the present volume advances a position (or rather, a set of re-
lated positions) that has not been sufficiently explored by its predecessors. Non-hu-
man animals also have an embodied mind, and there are no good reasons to deny
that at least birds and mammals also have a conscious mind (Edelman 1992).
However, although other species may have varying degrees of awareness, they do
not seem to be fully aware of the subjectivity of others. And whereas human beings
go on to engage in discursive practices and rely on material and symbolic culture,
both of which have powerful formative effects on the human mind, something
more ontogenetically and phylogenetically basic seems required to be able to ben-
efit from these central aspects of human social life. This foundation seems to be
provided by a uniquely human capacity for intersubjectivity.
In the simplest terms, intersubjectivity is understood by the authors repre-
sented in this book as the sharing of experiential content (e.g., feelings, perceptions,
thoughts, and linguistic meanings) among a plurality of subjects. Although some
non-human species manifest some aspects of the capacity or capacities that make
up intersubjectivity, they appear to lack others. On the other hand, no human being
is entirely devoid of the human intersubjective potential, even though they
may be delayed or challenged in the expression of some of its manifestations,
such as is the case for people with autism. These considerations underlie our
bold contentions that the human mind is quintessentially a shared mind and that
intersubjectivity is at the heart of what makes us human.

2. Intersubjectivity vs. Theory of mind

The hitherto dominant approach in psychology, cognitive science and philosophy
has been to analyze what has come to be known as social cognition in terms of
a theory of mind (or mentalizing) that purportedly solves the philosophical
and developmental problem of other minds. Consider, for example, the title of a
recent volume with an apparently similar theme to the present one: Other minds:
How humans bridge the divide between self and others (Malle and Hodges 2005).
Despite the important empirical findings and hypotheses generated by the Theory
of Mind (ToM) approach, it is our contention that its framing of the research ques-
tion has significantly obscured rather than clarified what needs to be explained.
The basic assumptions of the ToM approach can be formulated as follows:

• There is a primary separation between the self and (the minds of) others.
• The individual must bridge this separation either by some form of theory or
  simulation of the other's mind, a process that is more or less fallible.
• The main bodily structures that are directly relevant for the process are
  those innate or acquired modules engaged in the inferential or simulation
  processes.
• Cognition develops essentially from the inside out, with innate or acquired
  cognitive skills being eventually transferred or projected onto others for the
  purpose of explaining and predicting their behaviour.

From such a point of departure, it is unsurprising that there appears to be not only
a divide, but a veritable gulf between self and others, one that is so wide that it is
doubtful whether it could ever truly be bridged. Such a pessimistic assessment of
the human condition is hard to justify: how, if it were so, would young children
be able to coordinate their basic activities with others, and eventually acquire a
shared public language? How could we account for such universal forms of human
experience as mutual affection and sympathy? In contrast to the four claims
listed above, the contributors to the present volume broadly agree on the following
propositions:

• Human beings are primordially connected in their subjectivity, rather than
  functioning as monads who need to infer that others are also endowed with
  experiences and mentalities that are similar to their own.
• The sharing of experiences is not only, not even primarily, on a cognitive level,
  but also (and more basically) on the level of affect, perceptual processes and
  conative (action-oriented) engagements.
• Such sharing and understanding is based on embodied interaction (e.g., empathic
  perception, imitation, gesture and practical collaboration).
• Crucial cognitive capacities are initially social and interactional and are only
  later understood in private or representational terms.

Note: In stating that no human being is entirely devoid of the human intersubjective
potential, we are aware that profound and multiple intellectual impairments may raise
empirical questions about this claim, but we make it as a generalization with a fundamental
theoretical status. We also stress that, even given cases of empirical doubt, our claim does not
imply that such individuals should be thought of as not having the status of human beings.

The main precursors and originators of these views in the last century were
Husserl, Vygotsky and Wittgenstein. Husserl, the founder of phenomenology,
has only recently been properly understood in the Anglo-Saxon world to be con-
cerned not with the nature of private experience, but with structures of experience
which give us a common life-world, serving as the pre-condition of any objectivity
(Zahavi 2003; Moran 2005). Furthermore, he was the first modern thinker to emphasize
the role of the body for the emotional tone and the perceptual richness of
the life-world, and for our transparent relations with others (cf. Gallagher 2005).
For example, he stated:
I do not first constitute my things and my world solipsistically, then grasp by em-
pathy the other I which too grasps itself solipsistically as constituting its world,
and then and only then, the constituted unity of both are to be identified; my self
unity (Sinneinheit) exists because of the facts that the foreign multiplicity is not
different from mine, it is eo ipso the same
(Husserliana 14:10, translated and quoted by Moran 2005:225)

Other scholars such as Merleau-Ponty (1962), Scheler (1954) and Schutz (1966)
continued this tradition and developed complementary accounts of intersubjec-
tivity (cf. Zahavi 2001) whose common theme is that the basic forms of under-
standing others are not inferential, but rather direct (cf. the chapter by Gallagher
and Hutto). Scheler stresses the implications of this for accounts of perception in
a way that is reminiscent of J. J. Gibson's (1979) ecological psychology:
For we certainly believe ourselves to be directly acquainted with another person's
joy in his laughter, with his sorrow and pain in tears, with his shame in blushing,
with his entreaty in his outstretched hands ... And with the tenor of his thoughts
in the sound of his words. If anyone tells me that this is not perception, for it
cannot be so, in view of the fact that a perception is simply a complex of physical
sensations ... I would beg him to turn aside from such questionable theories and
address himself to the phenomenological facts.
(Scheler 1954, cited in Gallagher 2005:228)

Compare Scheler also to the later Wittgenstein (1980:570), who similarly at-
tempted to dispel the myth of the isolated subject:
We see emotion. As opposed to what? We do not see facial contortions
and make the inference that he is feeling joy, grief, boredom. We describe a face
immediately as sad, radiant, bored, even when we are unable to give any other
description of the features.

Through enigmatic aphorisms such as "Nothing is hidden" and "Understanding is
not a mental phenomenon", Wittgenstein highlighted the essential dependence of
thinking on public criteria, and concentrating on the linguistic aspect of this issue,
he rendered the notion of a private language self-contradictory (cf. the chapter
by Itkonen). Wittgenstein also drew attention to the fact that body, mind and
behaviour are different aspects of the unity that we call persons. Thus, although
such aspects of persons are non-identical and we therefore cannot reduce one to
the other, Wittgenstein argued that they are necessarily and, hence conceptually,
related and that we typically talk about one via the other.
Vygotsky was a more multi-faceted thinker, creatively combining philosophy,
psychology, literature, primatology and education. From a broadly Marxist per-
spective (though he was accused of idealism by the guardians of Soviet ideologi-
cal orthodoxy), he famously asserted the general principle of the primacy of social
interaction in the development of what he considered to be the specifically human
higher mental functions, such as memory, reasoning and language:
Every function in the child's cultural development appears twice: first, on the
social level, and later, on the individual level; first between people (interpsycho-
logical), and then inside the child (intrapsychological). This applies equally to
voluntary attention, to logical memory, and to the formation of concepts. All the
higher functions originate as actual relations between human individuals. 
 (Vygotsky 1978:57)

For Vygotsky, as for Wittgenstein, the crucial social semiotic mediational tool was
language, but he also considered the role of other semiotic resources such as artifacts
and gestures (cf. the chapters by Rodríguez and Moro; Sinha and Rodríguez)
in the child's cultural development.

3. Perspectives

Although there is a good deal of coherence between the positions of the phenomenologists,
Wittgenstein and Vygotsky (as well as those of other classic theorists
who feature in the discussions in the following chapters, such as Durkheim, Mead
and Bateson) with respect to what they reject, that is, the notion of a monadic,
individual mind, ultimately incapable of reaching out beyond its confines to the
world and others, there are important differences between the positions that
they advocate. In a similar vein, while all the authors represented here agree on
the crucial role of intersubjectivity in human communication and consciousness
of self and other, they offer (as the subtitle of this volume suggests) different answers
to questions such as the following:

- What is or are the precise sense or senses of the term "intersubjectivity", at
  what level of organization does it exist, and how does it relate to other notions
  of the shared mind such as common knowledge (cf. the chapters by Itkonen
  and Sinha and Rodríguez)?
- More specifically, should we understand the term "intersubjectivity" as pertaining
  primarily to a mental or inter-mental capacity, or to the actual instances
  of participatory practice that both depend upon, and are instrumental
  in developing, this human capacity? Are these merely different aspects or
  emphases, or do they constitute fundamentally different perspectives (cf. the
  chapter by Susswein and Racine)?
- To what extent is there a species-specific, biological basis for the human capacity
  for intersubjectivity per se, and to what extent is it the consequence of
  social, ecological and cultural factors (cf. the chapters by Leavens, Hopkins
  and Bard, Rodríguez and Moro and Sinha and Rodríguez)?
- To what extent is human intersubjectivity brought about by language, and
  what might be the prerequisite conditions for developing or evolving a capacity
  for language (cf. the chapters by Gallagher and Hutto, Hutto and Zlatev)?
- Does intersubjectivity involve an irreducibly mental aspect that is accessible
  to consciousness, or is this an (over-) attribution based on manifest behaviour
  (cf. the chapters by Brinck and Leavens, Hopkins and Bard)?
- What aspects of human intersubjectivity (e.g., the mutual understanding between
  two subjects that they are attending to the same object) might play a
  causal role in guiding action and which are definitional rather than causal (cf.
  the chapter by Susswein and Racine)?
Because they are addressed within the chapters that follow, we will not attempt
to answer these questions here. But we wish to highlight the following points of
(possible) disagreement between some of the authors, to which the reader may
wish to pay special attention:

- The chapters by Gallagher and Hutto, Barresi and Moore and Zlatev adopt
  stage-based accounts of the development (and evolution) of intersubjectivity,
  while Brinck argues for more continuous development, involving partially
  independent capacities.
- The chapters by Pika, Zlatev and Verhagen focus on both continuities and
  discontinuities between animal and human intersubjectivity, while Leavens,
  Hopkins and Bard find support for a strong form of continuity, prior to the
  emergence of language.
- Hobson and Hobson base their account of autism on an impairment of a specifically
  human capacity for identification with others, while Brinck seeks an
  account in which more simple skills and developmental patterns combine
  and interact in order to yield more general cognitive and emotional endowments.
- Susswein and Racine argue that intersubjectivity is primarily a taxonomic
  term, used to group together certain kinds of social interactions which we by
  definition take to involve one or another form of experiential sharing, rather
  than a term denoting hidden mental or neurological processes (in contradistinction
  to e.g., Barresi and Moore).
- Finally, while most authors adopt a definition of intersubjectivity such as the
  sharing and understanding of experiential content, Sinha and Rodríguez conclude
  the volume by stressing the primacy of participatory engagement, and
  the need to extend inter-mentality to encompass inter-corporeality and
  inter-objectivity.

In another sense of the word "perspectives", this volume brings together approaches
and insights from a variety of disciplines: philosophy, linguistics, primatology,
evolutionary theory, neuroscience and typical and atypical human development.
The authors do not limit themselves to their disciplinary confines, a fact that
strengthens the dialogic aspect of the book. Nevertheless, for the sake of perspicuity,
we have organised the contributions thematically into three parts dealing
primarily with Development, Evolution and Language respectively.

4. Overview of the chapters

It is appropriate that Part I should focus on ontogenetic development, since it
is largely through the path-breaking work of Colwyn Trevarthen (1979) on primary
(dyadic) and secondary (triadic) intersubjectivity in the first year of life, and
Daniel Stern (1985) on the interpersonal world of the infant, that the concept of
intersubjectivity has emerged as a key issue for the contemporary sciences of the
human mind.
Starting from Trevarthen's seminal work, Gallagher and Hutto present an
account of the progressive emergence of intersubjective skills in childhood, arguing
that understanding others requires neither a "theory theory" nor a "simulation
theory" of mind, but is made possible by a sensitivity to bodily movements,
gazes, facial expressions and, in secondary intersubjectivity, interactions in pragmatic
contexts. In reply to claims that a theory of mind is necessary to account
for the more sophisticated interactions of older children and adults involving the
concepts of folk psychology, Gallagher and Hutto propose the Narrative Practice
Hypothesis, according to which it is through direct encounters with stories about
reasons for acting in interactive contexts with caregivers that children become
familiar with the core structure of folk psychology. Gallagher and Hutto demonstrate
how combining insights from the phenomenological and the Wittgensteinian
traditions can yield a productive and novel account of a range of empirical
findings, consistent with recent developments in neuroscience.
Barresi and Moore elaborate on this neuroscientific theme, going much fur-
ther than the now familiar (but rarely explanatory) references to mirror neurons.
They present an updated version of their theory of social understanding called
Intentional Relations Theory, consisting of four levels, through which first-person
and third-person information is matched, yielding both self-other equivalence
and differentiation. Interestingly, the two models of the development of intersub-
jectivity presented in these two chapters, while formulated independently, appear
to be largely compatible. At the same time, the relationship between intentional
relations and narrative would merit further consideration, since the latter does
not seem to play a crucial role in the model offered by Barresi and Moore. Finally,
Barresi and Moore outline an application of their stage or level-based model to
autism, suggesting that it is precisely the inability to combine proprioceptive and
sensorimotor information about the self with exteroceptive information about
others that is at the core of autism spectrum disorders.
Autism is further discussed by Hobson and Hobson, who highlight the piv-
otal significance of the human propensity to identify with other persons, which
they suggest is compromised in children with autism. In reviewing a number of
their recent studies, the authors argue that in order to share experiences, typi-
cally developing children are psychologically linked to the other person and at
the same time differentiated. Hobson and Hobson argue further that these early
forms of sharing, and the varieties of communication that they support, provide
the foundations for the conceptual understanding of self and other that emerges
around the middle of the second year of life. Methodologically, the authors show
that our human capacity for reflective intersubjectivity is necessary for analysts to
be able to make their judgments in the coding of overt behaviour, and that such
rating is itself a second-person phenomenon, rather than a matter of detached
observation.
Rodríguez and Moro also adopt a qualitative, clinical method in
analyzing parent-infant triadic interactions. They draw on Vygotsky and Wittgen-
stein to show that long before a child is able to produce his or her first symbolic
and ostensive uses of objects, the first pointing gestures and the first words, the
adult acts with the child as a symbol maker. That is, the adult produces ostensive
actions with objects, points to them to make clear his or her intentions, using
them in a canonical manner and talking to the child almost constantly. Rodríguez
and Moro argue that this implies a long process and a variety of levels of intersub-
jective adult-baby agreement on the use of objects, which serves as a precondi-
tion for subsequent social and cognitive development. On a more general level
they argue against the assumption (basic to ToM approaches) of the opacity of
social reality.
Brinck proposes a conceptual clarification of intersubjectivity as "the sharing
of experiences", suggesting that it divides into a multitude of sub-concepts
depending on the understanding of the terms "sharing" and "experiences". She
distinguishes, following Stern (1985), three kinds of shared experiences: emotion,
attention, and intention. However, Brinck argues for a more external, behaviour-
based way of defining them (thus implicitly disagreeing methodologically with
Hobson and Hobson). Brinck argues that different combinations of these forms of
intersubjectivity enable corresponding intentionally communicative behaviours,
providing a novel explanation of why intentional communication first appears at
the end of the first year, despite the fact that its ingredients are manifest much
earlier. It is, she claims, the ability to decontextualize and combine the various
intersubjective skills in a flexible manner that underlies the emergence of inten-
tional communication, rather than this being the culmination of a stage model.
Susswein and Racine take a more deflationary approach than the preceding
chapters, exploring the distinction between causal and definitional issues in order
to distinguish between explanations of what an organism is doing and how it
is doing it, as well as between different types of causal explanation. They argue
that intersubjectivity is a taxonomic rather than a causal explanatory concept,
i.e., a technical concept used to classify interactive behaviours and abilities rather
than to denote vehicles or causes of those behaviours and abilities. They critically
examine the idea that intersubjective engagement involves the sharing of mental
states, and argue that the role of mental states and experience in intersubjective
engagement is often misconstrued. Finally they apply this approach to human
activity to distinguish reflective and practical understanding and consider the
meaning of declarative pointing in early childhood.
Part II of the volume deals with the capacities for intentional communica-
tion and intersubjectivity of non-human primates, and especially of our closest
relatives in the animal kingdom, the Great apes; and more generally, with the
question of the evolution of human intersubjectivity. Comparative psychology is
a field rife with controversies, with persistent disagreement between those who
emphasize discontinuities and those who argue for (strong) continuity between
non-human and human capacities. This debate is reflected by the diverging posi-
tions taken by the first two chapters in this section.
Pika compares the communicative gestures of bonobos, chimpanzees, gorillas
and orangutans and shows that Great apes have multifaceted gestural repertoires.
She demonstrates that these gestures are performed in multiple contexts and are
used flexibly, but unlike those of human children seem to be learned mainly via
an individual learning process. Being intentional and referential acts, the gestures
of the Great apes display at least a nascent understanding of intentionality, and
can plausibly have served as a stepping stone in the evolution of language. On
the other hand, Pika also emphasizes the differences between human and ape
gestures, suggesting an innate bias for human cultural learning.
In contrast, in analyzing the pointing gesture, Leavens, Hopkins and Bard ar-
gue against such an innate bias on the basis of the fact that although both captive
and wild apes are sampled from the same gene pool, captive apes spontaneously
point without overt training whereas wild apes almost never point. The authors
review empirical evidence of the development of pointing in apes and in human
children, highlighting the significance of species-typical motor development, cul-
turally-specific patterns of child restraint, and stages of cognitive development
for the development of what they call "explicit reference". Their major claim is that
a capacity for explicit reference emerges in our nearest living relatives when they
experience similar circumstances to those of human children, involving the rich
emotional contours of human affectivity.
Zlatev distinguishes between five evolutionary (and developmental) levels of
intersubjectivity and suggests that the latter has co-evolved with bodily mimesis:
the use of the body for communicative and representational purposes. He reviews
evidence from primatology to suggest that feral and captive apes are at least to a
degree capable of the first two levels (involving e.g., empathy, shared attention
and imitation), but not of the third level, triadic mimesis, which involves an un-
derstanding of communicative intentions. Enculturated, language-trained apes,
on the other hand, show some aspects of triadic mimesis, suggesting how our
predecessors could have bootstrapped themselves to this level without language
and may have inherited a biological bias for it. The emergence of language, on
the other hand, opened the way to higher levels, allowing (consistently with the
proposal of Gallagher and Hutto in Part I) the understanding of beliefs and the
use of folk psychology.
Hutto develops a similar evolutionary scenario in more detail, arguing that
an innate theory of mind mechanism is neither necessary nor sufficient to ac-
count for the evolutionary course of hominid social interaction. Such a purported
module, he argues, fails to explain the remarkable technical advances and the
imitative capacities of Homo ergaster/erectus lying somewhere between those of
apes and modern humans. Hutto considers the evidence instead for a Mimetic
Ability Hypothesis and invokes the notion of re-enactive imagination in deter-
mining what this ancient adaptation was likely to have involved. He argues that
this hypothesis provides a better explanation of the kinds of intersubjectivity that
would have been necessary for the development of language, thereby undercut-
ting the strongest argument for positing the existence of innate theory of mind
modules, namely the support they lend to intention-based semantics.
Part III takes up and develops the theme of the relationship between inter-
subjectivity and language. It is widely acknowledged that language requires inter-
subjectivity, most straightforwardly because one needs to know what another is
referring to in order to learn a language. But this consensus conceals a good deal
of debate concerning the nature of language and its relation to thought and con-
sciousness, and the extent to which language might stretch and reshape our basic
intersubjective capacities.
Itkonen opens this section by focusing on the central role of normativity for
language and linguistics, an issue discussed in brief by Zlatev in Part II, as well as
by the other contributors to this section. He supports his argument with Wittgenstein's
private language argument, showing that language is impossible without public
criteria of correctness. Itkonen further presents and defends the ontology of
the social as third-level common knowledge, and suggests a dialectical synthesis
between collectivism and individualism: common knowledge consists of mental
states in particular configurations, being therefore both based on and irreducible
to (individual) consciousness. Finally, he spells out a number of ramifications
of his arguments for theoretical and empirical linguistics and concludes by
pinpointing a priori (cf. "devout": Honderich 2006) physicalism (there is nothing
but matter and energy fields) as the root of the anti-normative bias in linguistics
and related fields.
Verhagen argues that intersubjectivity is systemically encoded in natural lan-
guages, contrasting this view with the standard informational conception of
linguistic communication. He reviews constructions at different levels of gram-
matical organization to show that linguistically coded relations of intersubjective
coordination exhibit the specific character of being rhetorical or argumenta-
tive. He argues that although these forms may result in cooperation, their immediate
function is to influence another person's mind and to make discourse
proceed in a particular direction. Because of this, he suggests that language is to
some extent analogous with other animal communication systems, which also
involve the management and assessment of other organisms, notably conspecif-
ics. On the other hand, he argues for an important discontinuity, in that such
management and assessment is oriented in human beings to other minds, rather
than to immediate behavior.
Janzen and Shaffer take this theme into a different context, noting that be-
cause interlocutors make constant assumptions about what information is active
within each other's consciousness, this process creates an interesting challenge in
third-party interpretation from one language to another. Their major point is that
when an interpreter is introduced into a discourse event this affects the nature of
the interchange, since the interpreter will also make assumptions about each of
the interlocutors' knowledge states. They use examples from ASL-English interpretation
and discuss the notion of "expansions", which have been claimed to be
grammatically required in ASL. The authors persuasively argue against this claim.
Rather they suggest that understanding contextualization as a successful discourse
strategy provides a more appropriate approach to the expression of shared and
non-shared knowledge and discuss the implications of contextualization for the
notion of intersubjectivity in general.
Sinha and Rodríguez end the book where this introductory chapter began,
by contrasting the approach to the human mind (and social cognition) based on
intersubjectivity with that of theory of mind, arguing for the advantages of the
former. By combining Durkheim's notion of social facts with that of Searle (1995)
they argue for the irreducibility of the social to the individual, in a way that converges
with the chapter by Itkonen. They also argue against a purely mentalistic
conception of intersubjectivity, as a property of the unmediated mind. Sinha
and Rodríguez point out the need to consider inter-corporeality, which extends
beyond the body to encompass inter-objectivity. Objects, they claim, are not
only referents of language, but signifiers in their own right, and it is through participatory
engagement with the social and material world that children enter the
realm of language. Their chapter concludes by connecting with Gallagher and
Hutto's Narrative Practice Hypothesis, stressing the importance of narrativity in
the construction of both complex human cognition and shared social cultural
identity.

5. Conclusion

As is obvious from these summaries, the perspectives expressed in these chapters
do not converge on a single, univocal notion of intersubjectivity, but rather
point to a complex phenomenon, or a set of related phenomena, in which expe-
riential, behavioural, genetic and neural processes and levels are interwoven in
both potentiating and actualizing what it means to be human. We hope that
this introductory chapter has conveyed to the reader our enthusiasm in work-
ing on this interdisciplinary and intersubjective project. The increasing body of
research on social cognition in developmental and comparative psychology, and
the prefixing of the term "social" to previously rather individual-oriented fields
such as cognitive linguistics and cognitive neuroscience, reflect a changing intel-
lectual context in which we hope that The Shared Mind will make a significant
contribution to rethinking some of the fundamental questions of our fields. Such
a rethinking is an essential, but radically challenging enterprise. The conceptual
difficulties encountered by the dominant tradition in the cognitive sciences in at-
tempting to explain the nature of social cognition, language and communication
are not accidental. They stem from the epistemological and methodological indi-
vidualism inherited from the possessive individualist cast of Western culture
(and capitalism), and the dominant position accorded in this tradition to natural
science and technology vis-à-vis the humanities and social sciences (MacIntyre
1997; Taylor 1989).
Three of us have collaborated for over 15 years, and our conversations have
often revolved around the need for a coherent theoretical statement addressing
the centrality of intersubjectivity and normativity for linguistics, psychology and
cognitive science. The press of other work, and the difficulty of integrating the
various perspectives, has led to the repeated postponement of that venture. In the
meantime, however, both the need for, and the timeliness of, a book like the present
volume have become increasingly evident.
Tim Racine joined the group at the Jean Piaget Society meeting in Vancouver
in the summer of 2005, where the idea for the book emerged from two sympo-
sia in which half of the authors represented in the volume participated. Reading
(and writing) a large number of drafts of the chapters of this book, and actively
comparing the points of view of the entire volume, helped us, as editors, realize
how much our agreement outweighs whatever differences remain. Although
this book is a polyphonic enterprise, the voices of the different chapters do not
always blend harmonically. This is only to be expected in addressing such a quintessentially
interdisciplinary topic as intersubjectivity. In this introductory chapter,
we have attempted both to show the reader how a focus on intersubjectivity offers
a different (and we believe more productive) perspective on social cognition than
the theory of mind approach, and to highlight some of the controversies within
this approach, thereby contributing to defining prospects for further empirical
and conceptual investigations. Without meaning to seem unduly naïve, we offer
this volume to the reader as a source for reflection on human nature, and on the
possibilities for good and ill that are potentiated by the Shared Mind in our shared
world.

References

Chalmers, D. 1996. The Conscious Mind. Oxford: Oxford University Press.
Clark, A. and Chalmers, D.J. 1998. The extended mind. Analysis 58: 10–23.
Edelman, G. 1992. Bright Air, Brilliant Fire: On the Matter of the Mind. New York: Basic Books.
Fodor, J.A. 1987. Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
Gallagher, S. 2005. How the Body Shapes the Mind. Oxford: Oxford University Press.
Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Harré, R. and Gillett, G. 1994. The Discursive Mind. London: Sage Publications.
Honderich, T. 2006. Radical externalism. Journal of Consciousness Studies 13 (7–8): 2–13.
Jackendoff, R. 1987. Consciousness and the Computational Mind. Cambridge, MA: MIT Press.
Merleau-Ponty, M. 1962 [1945]. Phenomenology of Perception. London: Routledge and Kegan Paul.
MacIntyre, A. 1997. After Virtue. London: Duckworth.
Malle, B.F. and Hodges, S.D. (eds.). 2005. Other Minds. New York: Guilford Press.
Moran, D. 2005. Edmund Husserl: Founder of Phenomenology. Cambridge: Polity Press.
Searle, J.R. 1995. The Construction of Social Reality. New York, NY: The Free Press.
Scheler, M. 1954 [1913]. The Nature of Sympathy (translated by P. Heath). Hamden, CT: Archon Books.
Schutz, A. 1966. Collected Papers III: Studies in Phenomenological Philosophy. The Hague: Martinus Nijhoff.
Stern, D. 1985. The Interpersonal World of the Infant. New York, NY: Basic Books.
Taylor, C. 1989. Sources of the Self: The Making of Modern Identity. Cambridge: Cambridge University Press.
Trevarthen, C. 1979. Communication and cooperation in early infancy. A description of primary intersubjectivity. In Before Speech: The Beginning of Human Communication, M. Bullowa (ed.), 99–136. London: Cambridge University Press.
Valsiner, J. and van der Veer, R. 2000. The Social Mind: Construction of the Idea. Cambridge, UK: Cambridge University Press.
Varela, F., Thompson, E. and Rosch, E. 1992. The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.
Vygotsky, L. 1978. Mind in Society. The Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press.
Wittgenstein, L. 1980. Remarks on the Philosophy of Psychology, Volume 2 (translated by C.G. Luckhardt and M.A.E. Aue). Oxford: Blackwell.
Zahavi, D. 2001. Beyond empathy. Phenomenological approaches to intersubjectivity. Journal of Consciousness Studies 8: 151–167.
Zahavi, D. 2003. Husserl's Phenomenology. Stanford, CA: Stanford University Press.
Part I

Development

Chapter 2

Understanding others through primary interaction and narrative practice

Shaun Gallagher and Daniel D. Hutto

We argue that theory-of-mind (ToM) approaches, such as "theory theory" and
"simulation theory", are both problematic and not needed. They account for
neither our primary and pervasive way of engaging with others nor the true
basis of our folk psychological understanding, even when narrowly construed.
Developmental evidence shows that young infants are capable of grasping the
purposeful intentions of others through the perception of bodily movements,
gestures, facial expressions etc. Trevarthen's notion of primary intersubjectivity
can provide a theoretical framework for understanding these capabilities and
his notion of secondary intersubjectivity shows the importance of pragmatic
contexts for infants starting around one year of age. The recent neuroscience of
resonance systems (i.e., mirror neurons, shared representations) also supports
this view. These ideas are worked out in the context of an embodied "Interaction
Theory" of social cognition. Still, for more sophisticated intersubjective interactions
in older children and adults, one might argue that some form of ToM is
required. This thought is defused by appeal to narrative competency and the
Narrative Practice Hypothesis (or NPH). We propose that repeated encounters
with narratives of a distinctive kind are the normal route through which children
acquire an understanding of the forms and norms that enable them to make
sense of actions in terms of reasons. A potential objection to this hypothesis is
that it presupposes ToM abilities. Interaction Theory is deployed once again to
answer this by providing an alternative approach to understanding basic narrative
competency and its development.

1. Introduction

Our intention in this chapter is to explicate an account of how we come to understand
others, without appealing to the dominant theory-of-mind (ToM) approaches
of "theory theory" (e.g., Leslie 1987; Gopnik 1993) or "simulation theory"
(e.g., Gordon 1986; Goldman 2002). We have elsewhere provided good reasons to
doubt that either of these theories can give an accurate or adequate account of our
everyday intersubjective abilities for understanding the intentions and the behav-
iours of other persons (see Gallagher 2001, 2004, 2006, 2007a, 2007b; Hutto 2004,
2005, 2006a, 2007a, 2007b, 2008). We will briefly summarize that critique here,
but our main purpose is to set out a more positive account of just these everyday
intersubjective abilities and show that they are not reducible (or inflatable) to the
mind-reading or mentalizing described by approaches to social cognition which
presume a theory-of-mind.
This positive account involves three kinds of processes which together are
sufficient to deliver the nuanced adult capacity for understanding (as well as for
mis-understanding) others. These processes include (1) intersubjective percep-
tual processes, (2) pragmatically contextualized comprehension, and (3) narrative
competence. We argue on the basis of evidence from developmental psychology
that the capacity for understanding others is, on average, well established by the
time the child reaches four or five years of age, and that it continues to be enriched
on the basis of further experience as we become mature adults.

2. A brief critique of the dominant approaches to social cognition

"Theory theory" (TT) and "simulation theory" (ST), the standard and dominant
approaches to social cognition, share the important supposition that when we
attempt to understand the actions of others, we do so by making sense of them
in terms of their mental processes, to which we have no direct access. That is,
we attempt to "mind read" their beliefs, desires, and intentions, and such mind
reading or mentalizing is our primary and pervasive way of understanding their
behaviour. Furthermore, both TT and ST characterize social cognition as a process
of explaining or predicting what another person has done or will do. TT
claims that we explain another person's behaviour by appealing to either an
innate or acquired theory of how people behave in general, a theory that is
framed in terms of mental states (e.g., beliefs and desires) causing or motivating
behaviour. ST claims that we have no need for a theory like this, because
we have a model, namely, our own mind, that we can use to simulate the other
person's mental states. We model others' beliefs and desires as if we were in
their situation.
Claims that such theory or simulation processes are explicit (conscious) are
dubious from a phenomenological point of view. That is, if in fact such processes
are primary, pervasive, and explicit, they should show up in our experience in
the way that we experience others, and they rarely do. The phenomenological
critique also rejects the idea, clearly found in TT, that our everyday dealings with
others involve an observational, third-person stance toward them: observing
them and trying to come up with explanations of their behaviour. Rather, our
everyday encounters with others tend to be second-person and interactive.
Claims that the processes described by TT or ST are implicit (or not explic-
itly conscious) run into a different set of objections. In the case of TT, there is no
evidence that such processes are implicit, or even clarity about what precisely that
means. Moreover, although TT appeals to false-belief experiments, such experi-
ments are set up to test for explicit rather than implicit theory-of-mind processes
(Gallagher 2001): subjects are asked to explicitly consider the meanings of an
observed third party's behaviour. Implicit approaches to ST appeal to the neuroscience
of mirror neurons and shared representations (cf. Barresi and Moore
this volume), but there is no justification for calling these subpersonal processes
"simulation", since according to ST, simulation involves the instrumental use of
a first-person model to form third-person "as if" or "pretend" mental states. In
subpersonal processes, (1) there is no first- or third-person (activation of mirror
neurons, for example, is considered to be neutral in regard to who the
agent is) (see e.g., de Vignemont 2004; Gallese 2005; Hurley 2005; Jeannerod and
Pacherie 2004); (2) nothing (or no one) is using a model; and (3) neuronal processes
cannot pretend. As vehicles, neurons either fire or they don't. More importantly,
in terms of relevant content, if they are neutral with respect to first- and
third-person, pretence in just these terms (I pretend to be you) is not possible.
In effect, simulation, as defined by ST, is a personal-level concept that cannot be
legitimately applied to subpersonal processes.

. This is not to deny that in some circumstances, for example, in observing puzzling cases
of another person's behaviour, we may in fact explicitly appeal to theory or employ simulation.
The claim here is simply that most of our everyday interactions are not of this sort. Puzzling
cases are the exception.
. Goldman and Sripada (2005:208), acknowledging the discrepancy between the ST definition
of simulation and the working of subpersonal mirror processes, propose a minimal definition
of simulation: "Applied to mindreading, a minimally necessary condition [for simulation]
is that the state ascribed to the target is ascribed as a result of the attributor's instantiating,
undergoing, or experiencing, that very state. In the case of successful simulation, the experienced
state matches that of the target." If this is a necessary condition, it cannot be a sufficient one,
because on this minimal definition and without something further, it's not clear what would
motivate me to ascribe the state that I was undergoing to someone else. Furthermore, if this
were as automatic as mirror neurons firing, then it would seem that we would not be able to
attribute a state different from our own to someone else. But we do this all the time. Practically
speaking, this proposal also raises puzzles about interacting with more than one other person.

In addition to these phenomenological and logical objections to TT and ST,
there is good evidence from developmental psychology that our ability to understand
others emerges much earlier than TT or ST would predict. An objection
can also be raised against the idea that a general theory (folk psychology) would
have sufficient explanatory power to explain the particularities of a large diversity
of behaviours found in everyday life, or that it could be very reliable in the face of
multiple possibilities for motivation. Similarly it has been objected that running
a first-person simulation routine, that is, a process that is based on one's own
mental states, seems inadequate to explain the diversity of behaviours found in
the world.
These objections throw doubt on TT and ST approaches. The question, how-
ever, is whether there is a positive account that can avoid these objections. We
turn now to the construction of that alternative account, in three parts: intersub-
jective perception, pragmatically contextualized comprehension, and narrative
competency.

3. Intersubjective perception and interaction

Long before the child reaches the age of four, the capacities for human inter-
action and intersubjective understanding are already accomplished in certain
embodied practices: practices that are emotional, sensory-motor, perceptual,
and nonconceptual. These practices include proto-mimesis (Zlatev, this vol-
ume), imitation, the parsing of perceived intentions (Baldwin, Baird, Saylor and
Clark 2001), emotional interchange (Hobson 2004), and generally the processes
that fall under the heading of primary intersubjectivity (Trevarthen 1979). These
embodied practices constitute our primary access for understanding others, and
they continue to do so even after we attain our more sophisticated abilities in this
regard (Gallagher 2001).
In most intersubjective situations, that is, in situations of social interaction,
we have a direct perceptual understanding of another person's intentions because
their intentions are explicitly expressed in their embodied actions and their expressive
behaviors. This understanding does not require us to postulate or infer
a belief or a desire hidden away in the other person's mind. What we might

Is it possible to simulate the neural/mental/emotional states of two other people at the same
time if in fact our simulations must be such that we instantiate, undergo, or experience, those
two (possibly very different) states? (see Gallagher 2007b). We suggest that these issues would
also have to be addressed by Barresi and Moore (this volume) in order to clarify their proposal
for a matching system.

reflectively or abstractly call their belief or desire is expressed directly in their
actions and behaviours. This phenomenologically direct understanding is likely
made possible by the above-mentioned complex neuronal processes described
as the mirror neuron system(s) and shared representations. In contrast to inter-
preting these neuronal resonance processes as implicit simulations, which on the
functional level would involve cognitive processes over and above the perception
of action, Gallagher (2005, in press) has argued that they in fact instantiate a form
of enactive social perception.
A primary, perceptual sense of others is already implicit in the behaviour
of the newborn. In neonate imitation, which depends not only on a contrast, in
some sense, between self and non-self, and a proprioceptive sense of one's own
body, but also on a responsiveness to the fact that the other is of the same sort as
oneself (Bermúdez 1996; Gallagher 1996; Gallagher and Meltzoff 1996), infants
are able to distinguish between inanimate objects and people. The fact that they
imitate only human faces (see Legerstee 1991; Johnson 2000; Johnson et al. 1998)
suggests that infants are able to parse the surrounding environment into those
entities that perform human actions (people) and those that do not (things)
(Meltzoff and Brooks 2001). An intermodal tie between a proprioceptive sense
of one's body and the face that one sees is already functioning at birth. For the
infant, the other person's body presents opportunities for action and expressive
behaviour, opportunities that it can pursue through imitation. There is, in this
case, a common bodily intentionality that is shared by the perceiving subject
and the perceived other. From early infancy humans, and perhaps some animals
(see e.g., the studies by Myowa-Yamakoshi 2001; Myowa-Yamakoshi et al. 2004;
also cited by Zlatev this volume) have capabilities for primary-intersubjective
interaction with others.
The early capabilities that contribute to primary intersubjectivity constitute
an immediate, non-mentalizing mode of interaction. Infants, notably without the
intervention of theory or simulation, are able to see bodily movement as goal-di-
rected intentional movement, and to perceive other persons as agents. This does
not require advanced cognitive abilities; rather, it is a perceptual capacity that is
"fast, automatic, irresistible and highly stimulus-driven" (Scholl and Tremoulet
2000:299). Evidence for this early, non-mentalizing interpretation of the intentional
actions of others can be found in numerous studies. Baldwin and colleagues,
for example, have shown that infants at 10 to 11 months are able to parse
some kinds of continuous action according to intentional boundaries (Baldwin
and Baird 2001; Baldwin et al. 2001). The infant follows the other person's eyes,
and perceives various movements of the head, the mouth, the hands, and more
general body movements as meaningful, goal-directed movements. Such per-
ceptions give the infant, by the end of the first year of life, a non-conceptual,
action-based understanding of the intentions and dispositions of other persons
which does not involve inferences about beliefs or desires understood as mental
states (Allison, Puce and McCarthy 2000; Baldwin 1993; Johnson 2000; Johnson,
Slaughter and Carey 1998).
Primary intersubjectivity also includes affective coordination between the
gestures and expressions of the infant and those of caregivers with whom they
interact. Infants vocalize and gesture in a way that seems tuned [affectively
and temporally] to the vocalizations and gestures of the other person (Gopnik
and Meltzoff 1997:131). Infants at 5 to 7 months detect correspondences be-
tween visual and auditory information that specify the expression of emotions
(Walker 1982). The perception of emotion in the movement of others, however,
does not involve taking a theoretical stance or creating a simulation of some inner
state. It is a perceptual experience of embodied comportment (Bertenthal, Proffitt
and Cutting 1984; Moore, Hobson and Lee 1997). This kind of perception-based
understanding, therefore, is not a form of mind-reading. In seeing the actions
and expressive movements of the other person one already sees their meaning; no
inference to a hidden set of mental states (beliefs, desires, etc.) is necessary.
The capabilities involved in primary intersubjectivity suggest that before
we are in a position to wonder what the other person believes or desires, we al-
ready have specific perceptual understanding about what they feel, whether they
are attending to us or not, whether their intentions are friendly or not, and so
forth. There is, in primary intersubjectivity, a common bodily intentionality that
is shared across the perceiving subject and the perceived other. As Gopnik and
Meltzoff indicate, we "innately map the visually perceived motions of others onto
our own kinesthetic sensations" (1997:129), and the evidence from recent research
on mirror neurons and resonance systems in social neuroscience supports
this. Thus, before we are in a position to theorize, simulate, explain or predict
mental states in others, we are already in a position to interact with and to under-
stand others in terms of their expressions, gestures, intentions, and emotions, and
how they act toward ourselves and others. Furthermore, primary intersubjectivity
is not primary simply in developmental terms. Rather it remains primary across
all face-to-face intersubjective experiences, and it underpins those developmen-
tally later, and occasional, practices that may involve explaining or predicting
mental states in others (see e.g., Stern's (1985) idea of a layered model in which

. In citing Gopnik and Meltzoff's claim about the necessity for innate mappings we are not
thereby endorsing their theory-theoretic construal of what this involves. Indeed, much of the
evidence developed by Meltzoff and cited by Gopnik and Meltzoff supports the idea of a strong
intersubjective perceptual capacity in the infant.

developmentally primary understandings are not superseded but remain and
operate in parallel to more advanced ones).

4. Pragmatic intersubjectivity

If human faces are especially salient, even for the youngest infants, or if we continue
to be capable of perceptually grasping the meaning of the other's expressions
and intentional movements, such face-to-face interaction does not exhaust the
possibilities of intersubjective understanding. Expressions, intonations, gestures,
and movements, along with the bodies that manifest them, do not float freely in
the air; we find them in the world, and infants soon start to notice how others
interact with the world. When infants begin to tie actions to pragmatic contexts,
they enter into what Trevarthen calls secondary intersubjectivity. Around the age
of 1 year, infants go beyond the person-to-person immediacy of primary intersubjectivity,
and enter into contexts of shared attention: shared situations in which
they learn what things mean and what they are for (see Trevarthen and Hubley
1978). Behaviour representative of joint attention begins to develop around 9 to 14
months (Phillips, Baron-Cohen and Rutter 1992). In such interactions the child
looks to the body and the expressive movement of the other to discern the inten-
tion of the person or to find the meaning of some object. The child can under-
stand that the other person wants food or intends to open the door; that the other
can see him (the child) or is looking at the door. This is not taking an intentional
stance, i.e., treating the other as if they had desires or beliefs hidden away in their
minds; rather, the intentionality is perceived in the embodied actions of others.
They begin to see that another's movements and expressions often depend on
meaningful and pragmatic contexts and are mediated by the surrounding world.
Others are not given (and never were given) primarily as objects that we encoun-
ter cognitively, or in need of explanation. We perceive them as agents whose ac-
tions are framed in pragmatic contexts. It follows that there is not one uniform
way in which we relate to others, but that our relations are mediated through the
various pragmatic circumstances of our encounters. Indeed, we are caught up in
such pragmatic circumstances, and are already existing in reference to others,
from the very beginning (consider for example the infant's dependency on others

. Of course, the fact that another's feelings can be hidden is completely consistent with expressivism
of this sort. As Wittgenstein says, "One can say 'He is hiding his feelings'. But that
means that it is not a priori they are always hidden" (Wittgenstein 1992:35e). The point is that
our initial, basic engagements with others are not estranged, even if sophisticated creatures like
us are capable of hiding or faking their emotions.

for nourishment), even if it takes some time to sort out which agents provide sus-
tenance, and which ones are engaged in other kinds of activities.
As we noted, children do not simply observe others; they are not passive ob-
servers. Rather they interact with others and in doing so they develop further
capabilities in the contexts of those interactions. If the capacities of primary in-
tersubjectivity, like the detection of intentions in expressive movement and eye
direction, are sufficient to enable the child to recognize dyadic relations between
the other and the self, or between the other and the world, something more is
added to this in secondary intersubjectivity. As noted, in joint attention, beginning
around 9 to 14 months, the child alternates between monitoring the gaze of the
other and what the other is gazing at, checking to verify that they are continuing
to look at the same thing. Indeed, the child also learns to point at approximately
this same time. Eighteen-month-old children comprehend what another person
intends to do with an instrument in a specific context. They are able to re-enact
to completion the goal-directed behaviour that someone else fails to complete.
Thus, the child, on seeing an adult who tries to manipulate a toy and who appears
frustrated about being unable to do so, quite readily picks up the toy and shows
the adult how to do it (Meltzoff 1995; Meltzoff and Brooks 2001).
Our understanding of the actions of others occurs on the highest, most ap-
propriate pragmatic level possible. That is, we understand actions at the most rel-
evant pragmatic (intentional, goal-oriented) level, ignoring possible subpersonal
or lower-level descriptions, and also ignoring interpretations in terms of beliefs,
desires, or hidden mental states. Rather than making an inference to what the
other person is intending by starting with bodily movements, and moving thence
to the level of mental events, we see actions as meaningful in the context of the
physical and intersubjective environment. If, in the vicinity of a loose board, I
see you reach for a hammer and nail, I know what your intentions are as much
from the hammer, nail, and loose board as from anything that I observe about
your bodily expression or postulate in your mind. We interpret the actions of oth-
ers in terms of their goals and intentions set in contextualized situations, rather
than abstractly in terms of either their muscular performance or their beliefs.
The environment, the situation, or the pragmatic context is never perceived neutrally
(without meaning), either in regard to our own possible actions, or in regard
to the actions and possibilities of others. As Gibson's theory of affordances

. Our understanding of the performance of mimes who work without props depends on
their excellent ability to express intentions in their movements, but also on our familiarity with
contexts. The mime's talent for expressive movements is clearly demonstrated in contrast to
what we often experience in the game of charades or pantomime when we haven't a clue about
what the player is trying to represent.

(e.g., Gibson 1979) suggests, we see things in relation to their possible uses, and
therefore never as a disembodied observer. Likewise, our perception of the other
person, as another agent, is never of an entity existing outside of a situation, but
rather of an agent in a pragmatic context that throws light on the intentions (or
possible intentions) of that agent.
Theory-of-mind approaches, which involve theory (as an application of folk
psychology) or simulation, and which focus on the acquisition of the concept of
mental states (like belief) around age 3 or 4 years, miss some basic and impor-
tant capacities for social cognition. Yet, the acknowledgement of capabilities for
understanding others that define primary and secondary intersubjectivity, namely the
embodied, sensory-motor (emotion-informed) capabilities that enable us to perceive
the intentions of others (from birth onward), and the perceptual and action
capabilities that enable us to understand others in the pragmatically contextualized
situations of everyday life (from 12 to 18 months onward), is not sufficient to
address what are clearly new developments around the ages of 2, 3 and 4 years.
The elephant in the room around the age of 2 years is, of course, language. But
if language development itself is something that depends on some of the capabili-
ties of primary and secondary intersubjectivity, language also carries these capa-
bilities forward and puts them into service in much more sophisticated social
contexts (on this point, from a different perspective, also see Zlatev this volume).
Do children, upon passing explicit false-belief tests, acquire the final con-
ceptual component needed for their mature understanding of reasons, as is the
pervasive claim in the theory-of-mind literature? Or does their newfound un-
derstanding of false belief simply equate to a capacity to recognize that the other
(whether Maxie, or Sally-Ann, or Snoopy, etc.) has a divergent point of view from
their own, and no more? And, what lies at the root of this sort of understanding?
Is this sort of mastery of the concept of belief a natural consequence of the matu-
ration of theory-of-mind modules, grounded in introspective acts of ostensive
denotation or the product of extensive, evidence-based theorizing on their part?
We propose that none of these proposals hold up well under close scrutiny (see
Hutto 2008:Chs. 9 and 10). If so, it is more plausible to think that an understanding
of divergent cognitive perspectives is the result of children beginning to participate
in conversations of the kind that require recognition of conflicting points
of view. This sort of activity can be seen as a natural extension of those forms of
imaginative pretend play that require children to occupy different character roles
and adopt personas that are different to their own (Hutto 2008:Ch. 7).
A child's initial understanding of the concept of belief is likely to depend on
many things but it is notable that many false-belief tests are presented in the form
of a narrative and could be interpreted as tests for a certain level of narrative
competency. It is also worth observing that the strongest data concerning successful
false-belief performance stems from experiments conducted almost entirely on
European and American subjects, whose early lives are awash with folk psychological
narratives encountered in fairy tales, children's books, comic books, television
and films (Richner and Nicolopoulou 2001:408; Nelson 2003:22). The form,
content and focus of the stories and storytelling practices are much the same in
these cultures. Indeed, they even share many of the same canonical texts.
Even more important, we must ask, what role does this mature understanding
of false-belief play in the lives of children? And, what drives its development and
facilitates its incorporation into larger explanatory schemas of explicitly making
sense of actions in terms of reasons (in which attributions of belief play an important
but nevertheless limited part)? In addressing these questions it is vital to
be aware, as Carpendale and Lewis (2004:91) stress, that:
Proponents of the dominant theories have been notably quiet about what happens
in development after the child's fifth birthday. However research that explores
whether 5-year-olds can use simple false belief knowledge to make inferences
about their own and others' perspectives finds that they singularly fail to do so.

5. Making sense of reasons

The ability and motivation to use one's knowledge of false belief in a wider explanatory
context, it seems, is late-developing. It comes into play only after children
gain an explicit, practical mastery of the concept of belief. This suggests that false
belief understanding is not the crowning moment in their early understanding of
other minds; children must develop further still if they are to make sense of ac-
tions in terms of reasons. What does this involve?
Let's focus on an example. Someone might ask: "Why is Laura going to India?"
If I don't really know Laura, and if I've never heard her say why she is going to
India, then I may attempt to get at her reasons in the third-person. This is surely
something we do regularly. This sort of speculative attempt at folk-psychological
explanation might run as follows. Laura is a young, American college student.
Why do young American college students travel to India? Laura, like many young
American college students, may believe that India is a romantic place and that
she can learn about Eastern meditation practices there and have an adventure. So
Laura might desire to go to India for such reasons. One reaches this conclusion
by calling on background knowledge: general knowledge or beliefs about what
American college students tend to think and value as well as one's knowledge and
beliefs about widely held beliefs about India. The attributed reason may be correct
or incorrect in Laura's case, but lacking detailed information about Laura, one is
forced to appeal to generalizations informed by knowledge of an impersonal sort.
Two things are worthy of note. First, this kind of speculation is not likely to be
very reliable in most interesting cases. Second, there is no obvious reason to think
that the background knowledge or beliefs in question is theoretical. To say that
one is operating with theories about India and theories about the belief-forming
tendencies of American students in such cases is surely to stretch the notion of
theory beyond reasonable limits.
Let's modify the example slightly. If I know Laura, but do not know precisely
why she is going to India, I will be able to make a more informed guess about her
reasons. Laura is the kind of person who really wants to help children in the third
world, so that is probably why she is going to India. I will have learnt this about her
from my previous exchanges with her or on the basis of what others have told me
about her. In this case too, my attribution is knowledge-based but the knowledge
in question this time is particular and personal. Although, again, hardly theoretical,
my attribution remains speculative and suppositional.
Here's a third case. Knowing Laura, I may already know her reason for going
to India or I might get at it by a much more reliable means. I may know why she is
going because she may have already told me so. If not, I could always ask her. Of
course, she may be lying or self-deceived, but even acknowledging those possibilities,
direct conversation is undeniably the most secure route to her reasons.
It is important to stress that in each of these cases the capacity to understand
why Laura acted (or might have acted), and our ability to digest these answers
is framed by the activity of checking to see if her reason, as it were, makes sense.
Guessing at or learning of a person's reason is only a small part of the story of
our everyday understanding of why others act. It is also necessary to situate and
evaluate reasons in wider contexts and against certain normative assumptions.
Would it make sense for anyone to go to India for that sort of reason? In particular,
does it make sense for Laura to go? Is doing so in line with her character, her
larger ambitions, her existing projects, or her history? What does it say about her?
Does it make her a generous person, an idealist or merely naïve? Understanding
reasons for action demands more than simply knowing which beliefs and desires
have moved a person to act. To understand intentional action requires contextu-
alizing these, both in terms of cultural norms and the peculiarities of a particular
person's history or values.
In this light, reasons for acting are best thought of as "the elements of a possible
storyline" (Velleman 2000:28). As such, making explicit a person's narrative
is the medium for understanding and evaluating reasons and making sense of
actions. Such narratives allow us to understand a person's rationale when this is
not immediately obvious.
Sometimes there is a need to frame and justify our reasons but more often
than not, when all proceeds normally, there is simply no need. This does not imply
that in such cases we quietly grasp and deploy a set of explicit generalizations
about how others will act. Rather, it is through shared training about the roles and
rules of our common world that I learn how I ought to behave in various circum-
stances, and at the same time I learn how you ought to behave as well, ceteris pa-
ribus. Knowledge of what I ought to do in certain circumstances supplies a handy
guide to the likely behaviour of others, in so far as they do not step out of line.
Such learning does not take the form of internalizing explicit rules (at least not
as a set of theoretical propositions), nor does it depend on our applying ones that
are somehow already built-in subpersonally. Rather, our expectations of others
result from our becoming accustomed to local norms, coming to embody them,
as it were, through habit and practice. This, we suggest, and not the wielding of
theoretical generalizations, is the crucial backdrop against which we make sense
of reasons for action via narratives of the folk psychological variety.

6. The narrative practice hypothesis

How do we get this sort of complex and nuanced understanding of why people
do what they do? People do not wear their reasons for action on their sleeves
and they cannot be readily discerned or understood by deploying the kind of
embodied heuristics described earlier in this paper. We suggest that the pervasive
presence of narrative in our daily lives, and the development of specific kinds of
narrative competency, can provide a more parsimonious alternative to theory or
simulation approaches, and a better way to account for the more nuanced un-
derstandings (and mis-understandings) we have of others. Competency with dif-
ferent kinds of narratives enables us to understand others in a variety of ways.
Distinctive kinds of narrative encounters are what first allow us to develop our
folk psychological competence. Hutto calls this the narrative practice hypothesis.
It claims that "children normally achieve [folk psychological] understanding
by engaging in story-telling practices, with the support of others. The stories
about those who act for reasons, i.e., folk psychological narratives, are the foci
of this practice. Stories of this special kind provide the crucial training set needed
for understanding reasons" (Hutto 2007b:53).
Accordingly, children acquire their skilled competence in understanding rea-
sons by being exposed to and by engaging with narratives when appropriately and
actively supported by their caregivers. For example, in acts of storytelling, such
active support takes the form of children being prompted to answer certain ques-
tions and by having their attention directed at particular events. In the case of folk
psychological narratives this will normally involve jointly attending to mentalistic
terms such as "wish", "believe" and "know" and discussing what the story characters
know, feel and want. During this process children learn how these states of
mind behave in relation to each other and other terms in the psychological fam-
ily. Importantly, these attitudes exist in a wider context such that children learn
how and why these attitudes matter to the protagonists of such stories. Time and
time again reasons for acting, of different types and complexity, are put on show
in this way.
By attending to enough of these exemplars, it is possible for children to de-
velop an implicit practical understanding of how to make sense of persons as
those who act for reasons. This is nothing like fashioning the concepts of the
attitudes by means of theorizing or having a core theory about how they interrelate.
Coming to understand what it is to act for a reason, that is, to understand folk
psychologically, requires being trained by means of a specific kind of narrative
practice. Children can achieve this because even simple folk psychological narratives,
like their more sophisticated cousins, represent the moment by moment experiences
of fictional minds, as well as the coloration that those experiences acquire
from the characters' broader cognitive and emotional stances towards situations
and events (Herman 2007:147).
This proposal is consistent with a number of recent empirical studies that
have established that there are important links between narrative abilities and
our capacity to understand others (Astington 1990; Dunn 1991; Feldman, Bruner,
Renderer and Spitzer 1990; Lewis 1994; Lewis, Freeman, Hagestadt and Douglas
1994; Nelson 2007; Peterson and McCabe 1994). Exposure to stories is a critical
determiner of folk-psychological abilities and it has been shown that this relation
is stronger than mere correlation. Apparently narrative training causally influ-
ences what are considered to be basic theory-of-mind skills for the better (Gua-
jardo and Watson 2002). Controlled studies have shown that narrative training
is responsible for improving performances on false belief tasks. Thus, it has been
concluded that narrative is an effective tool for at least modest improvements in
children's theory of mind development (Guajardo and Watson 2002:320). Similarly,
it has been observed that frequent conversations about the mind can accelerate
growth of a ToM (Garfield, Peterson and Perry 2001:513).
A complementary idea is that other kinds of narrative competencies enable
a less mediated interpretation of the other's actions and intentions, that is, without
the mediation of folk psychology. After all, folk psychological explanation
is just one kind of narrative practice. We argue here that how we go about de-
veloping a nuanced understanding of others may involve one or both of these
paths (employing a narrative-informed folk psychology, and/or a less mediated
narrative practice), and which one is appropriate will depend on the context.

7. Folk psychological and other kinds of narratives

What are narratives? This is a tricky question and providing a good answer to it is
beyond the scope of this paper. A very minimal definition will suffice for our purposes.
Lamarque tells us that for something to be a narrative at least two events
must be depicted in a narrative and there must be some more or less loose, albeit
non-logical relation between the events. Crucially, there is a temporal dimension
in narrative (Lamarque 2004:394; see also Lamarque and Olsen 1994:225). This
neutral characterisation easily lends itself to the idea that there are different types
of narratives and that these can be classified by such common features as their
constituents and subject matter. Folk psychological narratives, as exemplified
by Little Red Riding Hood, are distinguished by being about agents who act for
reasons. Importantly, narratives of this kind can play their special role in devel-
opment by being the objects of joint attention in early learning. That is the core
claim of the NPH.
In this light it should be emphasised that, as social cognizers, we do not use
folk psychological narratives nearly as often as the tradition supposes. They are
not, for example, the basis of all interpersonal interaction. On the contrary, they
generally only come into play in those cases in which the actions of others deviate
from what is normally expected in such a way that we encounter difficulty understanding
them. In such cases the other's actions become noticeable, falling into
the spotlight for special attention and explanation and, potentially, explanations
of a specific sort that involve understanding the other's reasons for taking the
particular action where this is not in some way obvious or already known. Folk
psychology is needed only in rare cases where we are not already familiar with
the other person's story, or are perplexed by another's actions. For "When things
are as they should be, the narratives of folk psychology are unnecessary" (Bruner
1990:40). Appeal to folk psychology may come into play when culturally-based
expectations are violated. For the most part, well-rehearsed patterns of behaviour
and coordination dominate. By and large, we get by without having to make any
folk psychological attributions at all and without seeking explications from others
because most everyday social interaction takes place in normal (and normalized)
environments.
Again, we can learn a great deal from developmental psychology. Around the
age of two, children are in secure possession of an early intentional understand-
ing of persons having internal goals and wants that differ from person to person
(Wellman and Phillips 2001:130; Bartsch and Wellman 1995). Young children are
somewhat practiced in understanding things as other people understand them in
pragmatic contexts, and when the capacities associated with primary and second-
ary intersubjectivity are combined with several other newly acquired capacities,
young children are ready to understand things and people in emerging narrative
structures. And in this context it must be acknowledged that many other kinds of
narratives (those of the non-folk psychological variety) can take us a long way
to the understanding we seek, without resorting to the folk psychological framework
per se (or at least without always having to do so).
We learn to make sense of persons (others as well as ourselves) in dramatic
and narrative ways as young children. When children listen to stories, or play-act
(and the same applies to adults who are exposed to parables, plays, myths, novels,
etc.) they become familiarized with sets of characters and with a range of ordi-
nary or extra-ordinary situations, and the sorts of actions appropriate to them, all
of which helps to shape their expectations. An education in narratives of many
sorts, even of the more general and less personal variety, provides knowledge
of what actions are acceptable and in what circumstances, what sort of events are
important and noteworthy, what can account for action, and what kind of expla-
nations constitute the giving of good reasons.
Moreover, children are well supported in this process. Typically, they are
provided with running commentaries on stories that teach them not only which
actions are suited to particular situations but also which reasons for acting are
acceptable and which are not. It is by absorbing such standards that we first learn
how to judge an action's appropriateness (though, of course, in time such standards
are sometimes questioned and overturned). Quite generally, stories, real
or fictional, teach us what others can expect from us, but just as importantly,
what we can expect from others in certain situations. This is not just coming to
know what others ought to (and thus are likely to) do, but what they ought to (and
thus are likely to) think and feel, as indexed to the sort of people they are. Narra-
tives provide an important source of guidance for staking out the boundaries of
what is acceptable and what is not. Through them we learn the norms associated
with social roles that pervade our everyday environments: shops, restaurants,
homes and theatres.

. There are two aspects of children's narrative activity which are too often treated in mutual
isolation: the discursive exposition of narratives in storytelling and their enactments in
pretend play (see Richner & Nicolopoulou 2001:408). Children's first narrative productions
occur in action, in episodes of symbolic play by groups of peers, accompanied by, rather than
solely through, language. Play is an important developmental source of narrative (Nelson
2003:28).
Engaging with narratives is not a passive affair: it presupposes a wide range of
emotive and interactive abilities. To appreciate such stories children must initially
be capable, at least to some degree, of imaginative identification and of responding
emotively, just as they do in basic social engagements. In this respect conversations
about written and oral stories are natural extensions of children's earlier experiences
with the sharing of event structures (Guajardo and Watson 2002:307).
Through them children discover why characters act as they do in particular cases,
becoming accustomed to standard scripts: scenarios, characters, plots, etc.
The kind of emotional resonance that one finds already in infancy, in pri-
mary intersubjectivity, seems to play an important role in gaining narrative com-
petency. Decety and Chaminade (2003) have shown this connection as it plays
out in the brain. In their fMRI study, subjects were presented with a series of
video clips showing actors telling sad and neutral stories, as if they had personally
experienced them. The stories were told with either congruent or incongruent
motor expression of emotion. Subjects were then asked to rate the mood of the
actor and how likeable they found that person. Watching sad stories versus neutral
stories was associated with increased processing activity in emotion-related
structures (including the amygdala and parieto-frontal areas, predominantly in
the right hemisphere). These areas were not activated when the narrator showed
incongruent facial expressions. The reasonable hypothesis is that conflict between
what we sense as the emotional state of the other person, simply on the basis of
seeing their faces and actions, and the narrative content they present, is disrup-
tive to understanding. Whatever is going on in the brain correlates not simply
to features of action and expression (and the subjectivity of the other person)
but to the larger story, the scene, the circumstance of the other person, and how
features of action and expression match or fail to match those circumstances. If
the emotional character of the other person is not in character with the narrative
framework (with the story that I could tell about her and her circumstances), it
is difficult to understand that person, the story, or both.

8. Narrative competency and landscape of consciousness

We have argued that the abilities for intersubjective interaction and understanding
that start with primary and secondary intersubjectivity develop along a route that
in most ordinary cases exploits narrative competency rather than the procedures,
subpersonal or explicit, associated with traditional theory-of-mind accounts.
This should provide the means of staving off a common worry about the
NPH. Janet Astington (1990) has argued that acquiring narrative competency
requires having a theory of mind. Citing Bruner's concept of the landscape of
consciousness (what those involved in the action know, think, or feel, or do not
know, think, or feel [Bruner 1986:14]), she suggests that to understand narrative
we need access to the characters' minds, and to have the latter requires us to
have a theory-of-mind. But Bruner himself offers good experimental evidence
against the necessity of the landscape of consciousness (LC) for understanding
narratives. Feldman, Bruner et al. (1990), in a study of narrative comprehension
in adults, presented two different versions of the same story to two groups, re-
spectively. The first and original story mentioned the mental states of the charac-
ters as the story develops, and so was rich in LC. The second story was the very
same story stripped of mental terms, leaving only the landscape of actions (LA).
The results showed no significant differences: (1) in subjects using reader-related
mental verbs when they recount the LC narrative; (2) in recounting the facts of
the stories (the retellings were virtually indistinguishable); (3) in recounting the
order of events; and (4) in providing a meaning summary (gist) for the story
(there is no version difference in the kind of gist given). A likely explanation of
these results is that the structure of these person-narratives, as revealed explicitly
in basic plots, can be identified, responded to and described on several levels and
ways. Often this happens all at once. But not everyone is equally proficient at this.
It is possible to be alive to the major events in a drama without always being able
to decipher, with full clarity or perhaps not at all, the reasons why a protagonist
will have acted. It is thus possible to have some sense of what is going on in an
unfolding drama without understanding it in toto (this is apparently a common
experience for those first encountering Shakespearean plays).
What is important is that seeking a narrative understanding of the other's
reasons is not a matter of characterizing the other's inner life if this is understood
as a series of causally efficacious mental states. What we are attempting to
understand is much richer: it is the other's reasons as they figure against the larger
history and set of projects, and that is best captured in a narrative form. Coming
to understand another's reasons should not be understood as designating their
discrete mental states but their attitudes and responses as whole situated persons.
I encounter the other person, not abstracted from their circumstances, but in
the middle of something that has a beginning and that is going somewhere. I see
them in the framework of a story in which either I have a part to play or I don't.
The narrative is not primarily about what is going on inside their heads; it's about
the events going on in the world around them, the world that we share with them,

. For further discussion of the distinction between properly folk psychological narratives
and those dramatic re-enactments which only involve intentional attitudes, yet which share the
same basic formats, see Hutto (2006).
and in their lives and the way they understand and respond to such events. Crucially,
coming to appreciate the other's story, to see why they are doing what they
are doing, does not require a capacity for mentalizing inferences or simulations.
Our understanding of others is ordinarily not based on attempts to get into their
heads; typically we do not need to access a landscape of consciousness since
we already have access to a landscape of action which is constituted by their
embodied actions and the rich worldly contexts within which they act, contexts
that operate as scaffolds for the meaning and significance of actions and expressive
movements.

9. Conclusions

In this chapter we have argued that there is no need to appeal to standard theory-
of-mind and simulative explanations of how we understand others as the basis
for making sense of them folk psychologically. What begins as perceptual and
emotional resonance processes in early infancy, which allow us to pick up the
feelings and intentions of others from their movements, gestures, and facial ex-
pressions, feeds into the development of a more nuanced understanding of how
and why people act as they do, found in our ability to frame their actions, and our
own, in narrative ways. Our everyday abilities for intersubjective engagement and

. This is not to deny that some narratives are more psychological than others, for instance those
of James Joyce or Dostoyevsky, as Jordan Zlatev suggests (private correspondence). Luckily Joyce,
Dostoyevsky and other novelists put us in the heads of their characters and we do not have to
theorize or simulate our way in there. The NPH does not deny that human beings are complicated
psychological creatures, or that the psychological lives of Stephen Dedalus or Raskolnikov are
fascinating in ways that outstrip an understanding in folk psychological terms. The issue is
how we come to understand people in our everyday interactions with them.
. The idea that narrative understanding does not rest on or presuppose ToM abilities per
se (including simulation capacities that involve making belief/desire predictions and explanations)
is in line with Greg Currie's (2007) recent claim that our skills in comprehending narratives
involve the adoption of frameworks through which we identify with (and are effectively
asked to take on) certain personas, which can be understood as embodied stances that particular
narratives invite us to adopt. The activity of framework adoption is quite distinct from
understanding a story's content as detailed in its plot or fabula. As Currie characterizes it,
adoption or attention to a narrative framework activates our subpersonal mechanisms for imitative
and emotional responding; thus it is something that engages us viscerally. He contrasts
this with the idea that attention to narrative framework involves developing a theory (even if
a not very explicit one) about the persona embedded in narrative, although he does not wholly
reject the latter proposal since he acknowledges it may have a role when it comes to communicating
about narratives.
interaction are, in the later stages of childhood, transformed by encounters with
narratives. It is exposure to these complex objects of joint attention, and not
facility with theoretical knowledge or simulative routines, that is responsible for
the development of sophisticated folk psychological abilities and understanding,
abilities which remain importantly in play in our adult life.

References

Allison, T., Puce, Q. and McCarthy, G. 2000. Social perception from visual cues: role of the STS
region. Trends in Cognitive Sciences 4 (7): 267–278.
Astington, J. 1990. Narrative and the child's theory of mind. In Narrative Thought and Narrative
Language, B.K. Britton and A.D. Pellegrini (eds.), 151–71. Hillsdale, New Jersey:
Erlbaum.
Baldwin, D.A. 1993. Infants' ability to consult the speaker for clues to word reference. Journal
of Child Language 20: 395–418.
Baldwin, D.A. and Baird, J.A. 2001. Discerning intentions in dynamic human action. Trends
in Cognitive Sciences 5 (4): 171–78.
Baldwin, D.A., Baird, J.A., Saylor, M.M. and Clark, M.A. 2001. Infants parse dynamic action.
Child Development 72 (3): 708–17.
Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Bartsch, K. and Wellman, H. 1995. Children Talk About the Mind. New York: Oxford University
Press.
Bermúdez, J.L. 1996. The moral significance of birth. Ethics 106: 378–403.
Bertenthal, B.I., Proffitt, D.R. and Cutting, J.E. 1984. Infant sensitivity to figural coherence in
biomechanical motions. Journal of Experimental Child Psychology 37: 213–30.
Bruner, J. 1986. Actual Minds, Possible Worlds. Cambridge, MA: Harvard University Press.
Bruner, J. 1990. Acts of Meaning. Cambridge, MA: Harvard University Press.
Carpendale, J.I.M. and Lewis, C. 2004. Constructing an understanding of the mind: The development
of children's social understanding within social interaction. Behavioural and
Brain Sciences 27: 79–151.
Currie, G. 2007. Framing narratives. In Narrative and Understanding Persons, D.D. Hutto
(ed.), 17–42. Cambridge: Cambridge University Press.
deVignemont, F. 2004. The co-consciousness hypothesis. Phenomenology and the Cognitive
Sciences 3 (1): 97–114.
Decety, J. and Chaminade, T. 2003. Neural correlates of feeling sympathy. Neuropsychologia
41: 127–128.
Dunn, J. 1991. Understanding others: Evidence from naturalistic studies of children. In Natural
Theories of Mind, A. Whiten (ed.), 51–61. Oxford: Blackwell.
Feldman, C.F., Bruner, J., Renderer, B. and Spitzer, S. 1990. Narrative comprehension. In Narrative
Thought and Narrative Language, B.K. Britton and A.D. Pellegrini (eds.), 1–78. Hillsdale,
NJ: Lawrence Erlbaum Associates.
Gallagher, S. 1996. The moral significance of primitive self-consciousness. Ethics 107: 129–140.
Gallagher, S. 2001. The practice of mind: Theory, simulation or primary interaction? Journal
of Consciousness Studies 8 (5–7): 83–108.
Gallagher, S. 2004. Understanding interpersonal problems in autism: Interaction theory as an
alternative to theory of mind. Philosophy, Psychiatry, and Psychology 11 (3): 199–217.
Gallagher, S. 2005. Phenomenological contributions to a theory of social cognition. Husserl
Studies 21: 95–110.
Gallagher, S. 2006. The narrative alternative to theory of mind. In Radical Enactivism: Intentionality,
Phenomenology, and Narrative, R. Menary (ed.), 223–29. Amsterdam: John
Benjamins.
Gallagher, S. 2007a. Logical and phenomenological arguments against simulation theory.
In Folk Psychology Re-Assessed, D.D. Hutto and M. Ratcliffe (eds.), 63–78. Dordrecht:
Springer.
Gallagher, S. 2007b. Simulation trouble. Social Neuroscience 2 (3–4): 353–65.
Gallagher, S. in press. Direct perception in the intersubjective context. Consciousness and Cognition.
Gallagher, S. and Meltzoff, A. 1996. The earliest sense of self and others: Merleau-Ponty and
recent developmental studies. Philosophical Psychology 9: 213–236.
Garfield, J.L., Peterson, C.C. and Perry, T. 2001. Social cognition, language acquisition and the
development of the theory of mind. Mind and Language 16: 494–541.
Gallese, V. 2005. Being like me: Self-other identity, mirror neurons and empathy. In Perspectives
on Imitation, S.L. Hurley and N. Chater (eds.), 101–118. Cambridge, MA: MIT
Press.
Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Goldman, A.I. 2002. Simulation theory and mental concepts. In Simulation and Knowledge of
Action, J. Dokic and J. Proust (eds.), 1–19. Amsterdam: John Benjamins.
Goldman, A. and Sripada, C.S. 2005. Simulationist models of face-based emotion recognition.
Cognition 94: 193–213.
Gopnik, A. 1993. How we know our minds: The illusion of first-person knowledge of intentionality.
Behavioral and Brain Sciences 16: 1–14.
Gopnik, A. and Meltzoff, A. 1997. Words, Thoughts, and Theories. Cambridge, MA: MIT Press.
Gordon, R. 1986. Folk psychology as simulation. Mind and Language 1: 158–171.
Guajardo, N.R. and Watson, A. 2002. Narrative discourse and theory of mind development.
The Journal of Genetic Psychology 163: 305–25.
Herman, D. 2007. Cognition, emotion and consciousness. In The Cambridge Companion to
Narrative, D. Herman (ed.), 245–259. Cambridge: Cambridge University Press.
Hurley, S.L. 2005. Active perception and perceiving action: The shared circuits model. In
Perceptual Experience, T. Gendler and J. Hawthorne (eds.). New York: Oxford University
Press.
Hutto, D.D. 2004. The limits of spectatorial folk psychology. Mind and Language 19: 548–73.
Hutto, D.D. 2005. Knowing what?: Radical versus conservative enactivism. Phenomenology
and the Cognitive Sciences 4 (4): 389–405.
Hutto, D.D. 2006. Narrative practice and understanding reasons: Reply to Gallagher. In Radical
Enactivism: Intentionality, Phenomenology, and Narrative, R. Menary (ed.), 231–247.
Amsterdam: John Benjamins.
Hutto, D.D. 2007a. Folk psychology without theory or simulation. In Folk Psychology Re-Assessed,
D.D. Hutto and M. Ratcliffe (eds.), 115–135. Dordrecht: Springer.
Hutto, D.D. 2007b. The narrative practice hypothesis. In Narrative and Understanding Persons,
D.D. Hutto (ed.), 43–68. Cambridge: Cambridge University Press.
Hutto, D.D. 2008. Folk Psychological Narratives: The Sociocultural Basis of Understanding Reasons.
Cambridge, MA: MIT Press.
Jeannerod, M. and Pacherie, E. 2004. Agency, simulation, and self-identification. Mind and
Language 19 (2): 113–46.
Johnson, S.C. 2000. The recognition of mentalistic agents in infancy. Trends in Cognitive Sciences
4: 22–28.
Johnson, S., Slaughter, V. and Carey, S. 1998. Whose gaze will infants follow? The elicitation of
gaze-following in 12-month-old infants. Developmental Science 1: 233–38.
Lamarque, P. 2004. On not expecting too much from narrative. Mind and Language 19:
393–408.
Lamarque, P. and Olsen, S. 1994. Truth, Fiction and Literature. Oxford: Oxford University
Press.
Legerstee, M. 1991. The role of person and object in eliciting early imitation. Journal of Experimental
Child Psychology 51: 423–33.
Leslie, A.M. 1987. Pretense and representation: The origins of theory of mind. Psychological
Review 94: 412–426.
Lewis, C. 1994. Episodes, events and narratives in the child's understanding of mind. In Children's
Early Understanding of the Mind, C. Lewis and P. Mitchell (eds.), 457–478. Hillsdale,
New Jersey: Erlbaum.
Lewis, C., Freeman, N.H., Hagestadt, C. and Douglas, H. 1994. Narrative access and production
in preschoolers' false belief reasoning. Cognitive Development 9: 397–424.
Meltzoff, A.N. 1995. Understanding the intentions of others: Re-enactment of intended acts by
18-month-old children. Developmental Psychology 31: 838–50.
Meltzoff, A.N. and Brooks, R. 2001. "Like Me" as a building block for understanding other
minds: Bodily acts, attention, and intention. In Intentions and Intentionality: Foundations
of Social Cognition, B. Malle, L.J. Moses and D.A. Baldwin (eds.), 171–191. Cambridge,
MA: MIT Press.
Moore, D.G., Hobson, R.P. and Lee, A. 1997. Components of person perception: An investigation
with autistic, non-autistic retarded and typically developing children and adolescents.
British Journal of Developmental Psychology 15: 401–423.
Myowa-Yamakoshi, M. 2001. Evolutionary foundation and development of imitation. In Primate
Origins of Human Cognition and Behavior, T. Matsuzawa (ed.), 349–367. Dordrecht:
Springer.
Myowa-Yamakoshi, M., Tomonaga, M., Tanaka, M. and Matsuzawa, T. 2004. Imitation in neonatal
chimpanzees (Pan troglodytes). Developmental Science 7 (4): 437–42.
Nelson, K. 2003. Narrative and the emergence of a consciousness of Self. In Narrative and
Consciousness, G.D. Fireman, T.E.J. McVay and O. Flanagan (eds.), 17–36. Oxford: Oxford
University Press.
Nelson, K. 2007. Young Minds in Social Worlds. Cambridge, MA: Harvard University Press.
Peterson, C. and McCabe, A. 1994. A social interactionist account of developing decontextualised
narrative skill. Developmental Psychology 30: 937–48.
Phillips, W., Baron-Cohen, S. and Rutter, M. 1992. The role of eye-contact in the detection
of goals: Evidence from normal toddlers, and children with autism or mental handicap.
Development and Psychopathology 4: 375–83.
Richner, E.S. and Nicolopoulou, A. 2001. The narrative construction of differing conceptions of
the person in the development of young children's social understanding. Early Education
and Development 12: 393–432.
Scholl, B.J. and Tremoulet, P.D. 2000. Perceptual causality and animacy. Trends in Cognitive
Sciences 4 (8): 299–309.
Stern, D. 1985. The Interpersonal World of the Infant: A View from Psychoanalysis and Developmental
Psychology. New York: Basic Books.
Trevarthen, C. 1979. Communication and cooperation in early infancy. A description of
primary intersubjectivity. In Before Speech: The Beginning of Human Communication,
M. Bullowa (ed.), 99–136. London: Cambridge University Press.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding and
acts of meaning in the first year. In Action, Gesture and Symbol: The Emergence of Language,
A. Lock (ed.), 183–229. London: Academic Press.
Velleman, J.D. 2000. The Possibility of Practical Reason. Oxford: Oxford University Press.
Walker, A.S. 1982. Intermodal perception of expressive behaviors by human infants. Journal
of Experimental Child Psychology 33: 514–35.
Wellman, H. and Phillips, A. 2001. Developing intentional understandings. In Intentions and
Intentionality, B. Malle, L.J. Moses and D.A. Baldwin (eds.), 125–148. Cambridge, MA:
MIT Press.
Wittgenstein, L. 1992. Last Writings on the Philosophy of Psychology Volume 2: The Inner and the
Outer. Cambridge: Blackwell.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 3

The neuroscience of social understanding

John Barresi and Chris Moore

How do we understand and engage with the purposeful, emotional and mental
activities of other people and how does this knowledge develop? What can
recent work on mirror neurons in monkeys and human beings teach us about
how the brain supports social understanding? According to Intentional Relations
Theory (Barresi and Moore 1996), the understanding of the self-other
equivalence requires concurrent knowledge of mind from both a first- and a
third-person point of view, and any mental concept must directly match
and link these two ways of knowing it. In this chapter we will argue that Intentional
Relations Theory is consistent with and can help interpret recent neurophysiological
findings on mirror neurons that fire equivalently for intentional
relations (i.e., object-directed actions, emotions, and mental activities) of self
and other.

1. Introduction

Human beings, like many other social animals, spend an enormous amount of
time engaged in activities that require quick adjustments to socially transmitted
information. By observing others we learn to adapt effectively to changes in the
environment as well as to the actions and reactions of our social peers. How do
we do it? To what extent do we need to understand the mental processes governing
our own and others' actions, or can we function socially based on simple
mechanisms by which we come to share psychological states with others, without
understanding them? In other words, to what extent does a skill at mind-sharing
function as a form of social understanding well before we come to a level of
mind-understanding? Furthermore, how do these two capacities, mind-sharing and
mind-understanding, relate to each other?
In the Theory of Mind (ToM) approach to social understanding emphasis is
placed on sophisticated abilities to understand mental states, in particular the
ability to attribute representational mental states such as beliefs to self and other.
It is the ability to attribute false beliefs that is taken as a hallmark of the specifically
human form of mentalistic social understanding that characterizes a theory of
mind. However, social understanding is a more general phenomenon that occurs
both in many social species that seem to have no ToM of this kind, and in children
well before the late preschool period when the understanding of false beliefs
develops. Indeed, an early form of social understanding is evidenced essentially
from birth as neonates show a particular sensitivity to human social stimuli. We
suggest that the kinds of social sensitivity observed in infants as well as in many
social animals should be seen as forms of non-reflective social understanding, de-
pendent on an array of mechanisms that yield an ability to share mental states
with others without necessarily recognizing that those shared mental states are in
fact attributable to individual agents. A satisfactory account of the development
of social understanding will require an explanation of how these original mecha-
nisms that enable early social responsiveness combine with later developing skills
to yield more sophisticated forms of intersubjectivity. In parallel, such an account
must specify how engaging in shared understanding or shared mental activities
with others facilitates the later more individualistic understanding of mind.
In the present chapter we will approach these problems with a focus on recent
findings in the neuroscience of social understanding. With the discovery in mon-
keys of pre-motor mirror neurons that respond to the actions of others as well as
to their own motor plans, there is reason to believe that even monkeys somehow
understand actions of both self and others in a similar object-directed way. But
should such a common code between perception and action be treated merely
as an instance supporting the common coding hypothesis (Prinz 1997; Knoblich
and Jordan 2002) or as a more elaborate understanding of what we have called action
intentional relations (Barresi and Moore 1996)? Even if it seems unlikely that
monkeys represent these actions as full-blown mental events involving conscious
intentions of the other, distinct from their own, it is still a question of how simple
their understanding is here and how it connects to more elaborate forms of social
understanding. Perhaps their understanding occurs more simply as sharing in
the goal-directed nature of the activity of the other by entering into a comparable
goal-directed pre-motor state, while not themselves engaging in the activity. Such
a sub-personal level of understanding of the action of another would in effect
convert it into a first-person representation of one's own actions, but it would
not yet represent that action as what we call an intentional relation, involving a
representation of an agent as well as the object-directed action. Nevertheless, such
sub-personal matching between goal-directed actions of self and other provides
a basis for eventual understanding of full-blown intentional relations that can be
applied to agent-oriented actions directed at objects at a personal level, whether
of self or other. We believe that this is the way that these phenomena should be
understood and that this matching between aspects of the observing monkey's
intentional relations (IRs) and the IRs of others provides evidence for the matching
hypothesis that we have previously proposed as the basis of social understanding.
We believe that our general account, which we have named Intentional Relations
Theory (IRT), is superior to alternative accounts of the origins and development
of intersubjectivity, and in the present chapter we will bring neuroscientific evidence
involving humans as well as monkeys to support our position.

2. The matching problem of social understanding and three approaches to intersubjectivity

A fundamental aspect of human social understanding is what we have previously
referred to as self-other equivalence. Human beings understand self and other to
be essentially the same kind of thing, namely a human agent or person that can
engage in a variety of intentional relations with objects or states of affairs. This
aspect of human social understanding is quite obvious and passes unnoticed in
commonsense psychology, and yet it hides a significant epistemic problem. How
can we attribute the same meaning to actions of other individuals that we attribute
to our own actions when the third-person information that we have of the actions
of others is radically different from the first-person information that we have of
our own actions? The information we get about others' actions is apparently
information about the overt aspects of behavior, while the object towards which
the action is directed is often not obvious (or even opaque in the case of mental
states such as beliefs). In contrast, the information we get about our own actions
is apparently information about our orientations towards the objects and events
we witness or imagine, but does not typically include information about ourselves
as the actor or agent being so oriented. So how are these qualitatively different
forms of information recognized to be tokens of the same type, namely expressions of
intentional relations between an agent (self or other) and some object or state of
affairs? In the recent history of research on social understanding, there are three
fundamentally different answers to this question.
According to the theory theory (TT) approach, humans have innately, or acquire
early in development, a ToM mechanism that can be applied uniformly to
self and other based purely on inference from behavior (e.g., Gopnik 1993; Leslie
1987). Self-other equivalence in this account is based on the fact that one can
interpret one's own behavior in the same way that one can interpret the behavior
of others. For instance, consider an example of what we have called an emotional
intentional relation: the case of love. Since love is a public concept, whose main
criterion of application is supposedly based on behavior, a person can know when
she or another person is in love by noticing the same kinds of behavior of self and
other directed toward the object of love.
In contrast, simulation theorists (ST) would take a different view from TT
on how a person knows about her own love versus another person's love (e.g.,
Goldman 1992; Gordon 1986; Humphrey 1984; Harris 1989). On their view, love
may have some behavioral consequences that can be used to identify it in another
person, but it is fundamentally a subjective mental state, and without a personal
appreciation of the feeling state that usually goes with the overt behavior, we
cannot truly understand love as a psychological state. We understand love direct-
ly in our own case, but only indirectly and by simulation in the case of another
person. We must imagine what someone else feels when we observe their behavior
in context (e.g., around the object of love), in order to understand the psychologi-
cal, intentional, and subjective meaning of their behavior. In our own case, our
behavior is a consequence of this subjective state, so no inference is necessary
from our own behavior to the mental state that we are in. Although we need to re-
flect on these states to categorize them, we do not need knowledge of comparable
states in other people to form these categories and concepts.
A third kind of theory invokes the notion of matching or sharing attitudes
or psychological states between self and other and is represented in a range of
different accounts (e.g., Gallagher and Hutto this volume; Gallese, Keysers and
Rizzolatti 2004; Hobson 1991, 1998; Hobson and Hobson this volume; Wilson
and Knoblich 2005; Zlatev this volume). Although the various theories in this
third group can all be considered to invoke some form of intersubjectivity
(understood widely as involving matched or shared mental states between or among
individuals), they vary on the extent to which they provide an account of the
foundations or the origins of intersubjectivity and on the processes by which in-
fants are hypothesized to move from forms of intersubjective sharing of mental
states to understanding that self and other are persons or selves that might have
distinct mental states. Several of these theories (e.g., Gallagher and Hutto this
volume; Hobson and Hobson this volume) invoke Trevarthen's (e.g., Trevarthen
and Hubley 1978) concepts of primary intersubjectivity and secondary
intersubjectivity to describe early phases of development. However, while the capacity
for mind-sharing is evident in these forms of intersubjectivity, what is not clear
is how the infant moves from sharing mental states with others to understanding
mental phenomena as distinct and possibly different in self and other. In the case
of Gallagher and Hutto, this latter form of understanding is thought to rely on the
acquisition of language and of the differentiating roles of self and other in situated
narratives, some of which involve folk psychological terms.
Our own Intentional Relations Theory (Barresi 2001, 2004; Barresi and Moore
1996; Moore 1999, 2006, 2007) does not differ substantially from these accounts
in its interpretation of the early phases of development of social understanding
that involve mind-sharing through processes that produce interpersonal matching
of self and other. However, it differs from these other accounts, as well as from
ToM accounts, in explicitly addressing the genesis of the recognition of self-other
equivalence and difference, as involving a developmental shift from mind-sharing
to mind-understanding. The key notion in IRT is that the first-person informa-
tion that we have about our own IRs (e.g., the feeling of love for someone) is
distinctly different from the third-person information that we have about the IRs
of others (e.g., another's behavior toward the object of love), and that in order
to develop uniform concepts or representations of IRs that can be applied equally,
but distinctly, to self and other, we need to match these two types of information
in a single concept or form of knowledge that contains both types of information.
In Barresi and Moore (1996) we posited an intentional schema to integrate this
multimodal combination of first- and third-person information initially derived
from self and other. On this view, being in love should not be defined primarily
as a private, subjective experience, as in the ST view, nor as a mental intentional
state that can be inferred from behavior, as in the TT view, but as an embodied IR
between the agent and object, that, in the case of love, involves both feelings and
concomitant behavioral expressions. Moreover, in learning the concept of love
or any other IR, it is supposed that we must learn both the first-person, inner
aspect, of the IR, as well as the third-person, outer aspect; otherwise, we fail to
have the concept. For instance, one can be in love, say for the first time, without
knowing it, because all one knows about love is the outer aspect, and one does not
recognize this outer aspect in one's feelings for another until one's concomitant
behavior is pointed out to one. Of course, love in our culture is primarily a social
concept and learned to a large extent through language. But other more basic IRs,
like fearing, seeing, or picking up, are more fundamental, and may be understood
to some extent by an organism without the mediation of language.
In the rest of the chapter, we consider in more detail Intentional Relations
Theory and specifically the issue of how 1st and 3rd person information about
intentional relations is integrated. We go on to review the neuroscientific findings
that support this approach to social understanding. We then consider autism as a
case of failure to integrate 1st and 3rd person information in the understanding
of self and other.

3. Matching of 1st and 3rd person information and their integration

In Barresi and Moore (1996) we developed a model of social understanding
that focused on the origins of understanding of IRs. We distinguished 4 levels
of understanding IRs and used these levels to interpret both developmental and
phylogenetic differences in social understanding (cf. Zlatev this volume, for a
similar multilevel model). At level 1, the organism represents the activities of self
and other in distinctly different ways and neither in terms of IRs. We suggested
that most animals typically operate at this level and it may also characterize so-
cial understanding in certain forms of psychopathology such as autism. We will
return to consider this level and the case of autism in Section 5. In the rest of this
section we review 3 levels of social understanding in which first- and third-person
information about IRs is integrated. We devote most attention to how such
integration is possible in the first place.

3.1 Interactive routes to matching

In order to understand IRs at all, the organism must be able to combine
first-person information about IRs with third-person information about IRs into
integrated representations involving an agent, an intentional relation and an object
that can be equally applied both to the IRs of self and the IRs of others or to the
joint activity of self and other. This combination occurs at level 2 of our model
when there is matched first- and third-person information about intentional ac-
tion available to the organism. There are various ways in which such matching can
come about. Our suggestion is that matching occurs normally in human develop-
ment when infant and mother engage in interactions, initially dyadic and later
triadic. These interactions are typically patterned in such a way that the infant
and mother both express and experience similar psychological activity. For ex-
ample, in dyadic interactions, infant and mother may smile and vocalize in close
synchrony. Whether the synchrony between an infant and adult in interactions
of this sort is based on innate contagious mechanisms, or occurs through a form
of mimicry initiated at first by the adult, it seems clear that there is a matching in
such cases, where first-person information about self can be experienced concur-
rently with matched third-person information about the other. We believe that
in such early dyadic communicative interactions the infant acquires integrated
knowledge of first- and third-person aspects of emotional expressions, though
not yet of intentional relations involving those expressions directed at objects.
Dyadic interactions do not revolve around objects, so the intentionality of
the shared psychological activity is at best implicit. However, in the triadic
interactions that develop at about 9 months of age, the patterned interaction is now
object-focused, so that both infant and mother may share psychological activity
directed at a particular object: they may look at the same object or produce similar
object-directed actions through imitation. We have argued that such interactive
experiences are crucial for the development of understanding IRs because it is
in these interactions that the infant's first-person experience of her own
object-directed psychological activity is coordinated reliably with her corresponding
third-person experience of the mother's object-directed psychological activity.
Reliable coordination of the available first- and third-person information allows
the construction of representations of intentional activity that integrate both
forms of information and are thereby applicable to the joint activity of self and
other, and subsequently, with further development, to individual activities of
either self or other.

3.2 Noninteractive routes to matching

Although dyadic and triadic interactions provide the normal context for the shar-
ing of psychological activity in human development, it is probably not necessary
for there to be joint engagement of either dyadic or triadic kinds for a degree
of matching of intentional relations to occur. For instance, as indicated earlier,
research on monkeys seems to show that they can represent the goal-directed ac-
tions of another organism in the same manner as they represent their own actions
(Gallese et al. 1996; Rizzolatti et al. 1996). The pre-motor mirror neurons
mediating these representations fire in the planning and execution of the monkey's
own actions, but also in perceiving comparable goal-directed actions in another
animate being. While we do not wish to exclude the possibility of innate forms of
matching between self and other, for instance in emotional expressive domains
where unlearned forms of mimicry may be the basic mechanism for matching,
in the case of action understanding, a learning mechanism needs to be involved.
Matching between perception and action may come about because, for certain
forms of psychological activity such as object-directed reaching, the organism
gains information about its own action via more than one perceptual modality
(Keysers and Perrett 2004). When a monkey reaches for objects, it is reliably pro-
vided with both visual and proprioceptive information about its own reaching,
and an integrated multimodal representation of the action will result. Then vi-
sion may mediate the connection to the action of others. The same multimodal
representation will later be activated by only the relevant visual information and
thereby can be applied to the experience of seeing another organism perform
the action. Vision here serves as a third-person bridging modality that can be
applied to both self and other, thus linking the strictly first-person information
of proprioception to the available third-person information about goal-direct-
edness. In this way a representation of action that is similarly applicable to the
actions of both self and other may be achieved. However, it should be noted that
all that is involved here is the understanding of the action, per se, not of an agent
performing the action. Thus the representation is at a sub-personal rather than
at a personal, or agent, level of representation. Hence, an organism does not here
understand intentional relations involving agents, but only sub-personal actions
directed at objects. There is evidence that such a process may also operate in early
human development. Woodward (1998) has shown that infants are able to rec-
ognize the goal-directed reaches of others at about the same time as they them-
selves engage in visually guided reaching. Importantly, teaching infants to make
object-directed reaches at an early age is correlated with their representation of
similar reaching actions of another person (Sommerville, Woodward, and Need-
ham 2005). Thus, at least for simple actions, it seems that learning to succeed at
an action, which involves coordination of first-person (e.g., proprioceptive) and
typically third-person (e.g., visual) information of one's own action, may be
correlated with representing the similar actions of others.

3.3 Sub-personal and interpersonal forms of understanding IRs

It will be recognized that the latter route to representations of actions that are
equally applicable to self and other will only serve for those actions, such as man-
ual reaching, for which the same perceptual information is available for both self
and other. It is in such circumstances that a common code for the perception and
production of action can bear fruit both in monkeys and humans, with a sub-per-
sonal level of understanding of goal-directed actions. However, in the understand-
ing of intentional relations more is required. The difference between the human
case and the cases of monkeys is that the dyadic and triadic interactive contexts
of early human development provide multiple instances in which there are richly
elaborated structures of shared intentional relations. For example, in a typical
episode of a joint attentional (triadic) interaction, there may be shared emotional
experience (e.g., smiling), shared object-directed action (e.g., object exchange)
and shared epistemic activity (e.g., gaze following). These interactive structures
therefore provide not just experiences in which a particular, simply observable,
action intentional relation is shared but experiences in which a variety of different
yet complementary intentional relations of various types are shared. As a result,
there is the opportunity for infants to acquire complex representations of inten-
tional activity that combine and integrate the first-person information pertaining
to their own activity and the third-person information pertaining to the activity
of others across a range of intentional relations. This difference between humans
and other animals, such as monkeys, is important because it may explain why
humans step onto the path of development that leads ultimately to an agent-level
form of social understanding, whereas monkeys appear not to. To see why, it is
important to examine whether the earliest forms of integrated representations of
intentional relations are recognized to be at a personal or at a sub-personal level.
Some authors (e.g., Tomasello 1999) have argued that the phenomena of triadic
interactions arising at about 9 months signal the development of a concept of an
intentional agent that can be applied equally to self and other agents. However,
a plausible alternative is that concepts of intentionality are initially acquired in a
more piecemeal way. For example, Woodward and her colleagues' research (for a
review see Woodward 2005) has shown that infants represent the object-direct-
edness of different actions at different points in development. Whereas reaching
is represented as object-directed before 6 months, gaze is not represented as ob-
ject-directed until the end of the first year. Furthermore, when such intentional
relations are first being acquired, the acquisition does not appear to be correlated
across actions, so that infants who represent gaze as object-directed may not represent pointing
as object-directed and vice versa. To explain this pattern of results, Moore (2006)
proposed the notion of intentional islands (cf. Tomasello 1992, on verb islands
in language acquisition), whereby intentional representations start out as separate
sub-personal islands relevant to particular object-directed actions and are only
gradually integrated into more complex concepts at a personal level relevant to
goal-directed agents. We suggest that it is the richly structured patterns of
intentional relations that occur in triadic interactions that allow the generation
of the more complex representations of goal-directed agents. In contrast, while
other animals such as monkeys may acquire sub-personal integrated represen-
tations of object-directed actions, such as reaching, without experience of rich
combinations of shared intentional relations, they do not proceed to construct
representations of goal-directed agents.

Great apes provide evidence that they stepped onto a new path similar to, but not the same
as, our own. Chimpanzees, and probably other apes, engage in intense social interactions that
promote an understanding of others' actions on an individual level, through what Zlatev (this
volume) ascribes to dyadic mimesis and which we (Barresi and Moore 1996) originally
hypothesized was associated with their general imitative ability. Recent research suggests that the
evolutionary path taken here may be different from our own in that, while learning in dyadic
interactions between infant and mother chimpanzees involves an apprenticeship relationship
(Matsuzawa 2007), human dyadic and triadic relationships between infants and adults are
much more intensely communicative and collaborative (Tomasello et al. 2005). A consequence
of this latter form of interaction is what Zlatev calls triadic mimesis, which is roughly
similar to level 2 interactions transforming to level 3 interactions in our own model.
3.4 Individualistic understanding of IRs

So far we have advanced from a sub-personal understanding of the simple actions
of self and other that do not explicitly code for an agent to the capacity for
understanding shared IRs evident at level 2 of our model. This sharing entails the
existence of representations of IRs that are interpersonal, though probably not
explicitly represented as interpersonal. Rather, the interrelated and similar IRs of self
and other are understood using a uniform representational form that codes for
the concurrent identity between first-person information of self and third-person
information of the other. But it is not yet the case that agents are recognized to
be individual centres of intentional activity. The next level of understanding IRs
(Barresi and Moore 1996) requires the ability to reflect on, or imagine, IRs as
properties of individual agents. According to IRT, this requires the use of imagination to
fill in the third-person information for IRs of self and first-person information for
IRs of others. Without this ability it would not be possible to represent diversity of
intentional relations across self and other when the same object is involved.
In the developmental account given in Barresi and Moore (1996), children
attain level 3 of understanding IRs during the second year of life. A variety of
phenomena evidence this change (see Moore 2007). On the one hand, the child
becomes capable of recognizing the self as an individual agent, as seen in
phenomena such as mirror self-recognition. On the other hand, children become
able to appreciate that others may have a different intentional orientation to an
object from the self. For example, 18-month-olds understand that someone else
may like something that they do not and vice versa (Repacholi and Gopnik 1997)
and they understand that they may see something that someone else does not and
vice versa (Moll and Tomasello 2005). At this point in development, therefore,
children are able to attribute some forms of mental states, those exhibited in pres-
ent activities, to individual agents, both self and other.
This level of understanding goes beyond mind-sharing toward a conceptual
understanding of individuals as embodied agents with points of view that may
differ from each other. In some respects our account here is similar to the simula-
tion account. However, whereas ST proposes that we simulate the mental state
of the other through imaginative substitution of our own mental states, we here
suggest that only the first-person aspect of the intentional relation of the other
requires imaginative construction, as the third-person aspect is pragmatically
available in the situation. Moreover, we suggest that at this same time the infant
acquires the skill to understand its own intentional relations by imagining the
third-person aspect that goes with the currently first-person experience of the
intentional relation, something that ST does not even attempt to explain. Our
account also differs from that of Gallagher and Hutto, since we do not think that language
alone mediates the conceptual development that occurs at this time, which allows
one to distinguish one's own from the other's embodied mental states. Indeed,
their narrative interpretation of how children distinguish mental states of self and
other seems to focus only on representational mental states such as false beliefs, a
capacity for which we provide a separate account in the next section.

3.5 Representation of mental agents

In the fourth year, pre-school children achieve yet another level of social under-
standing, when they can imagine both first- and third-person properties of a
mental state. This results in children developing knowledge of mental representa-
tion as such, which allows them to show evidence of the conceptual understand-
ing of mind seen in traditional ToM tasks. However, according to IRT, the levels of
intentional understanding at which there is an understanding of individual minds
derive from previous shared intentional activities where first- and third-person
information originally became associated. It is the derivation from shared psy-
chological activity that enables the concepts of mind that humans have, yielding
notions like love having both internal bases involving feelings and external bases
involving behavior. All levels of social understanding that depend originally on
the integration of first- and third-person information are held to be different from
level 1 forms of understanding of self and other, which rely separately on
first-person information alone to understand self and third-person information alone
to understand others. Consideration of level 1 will become important later in the
chapter when we discuss autism (see Section 5). We turn now to research on the
neuroscience of social understanding to see to what extent there is support for
the model of social understanding we have outlined here. We should note, how-
ever, that whereas the evidence from neuroscience indicates a particular pattern
of brain organization underlying social understanding in adult human beings as
well as nonhuman primates, there is of course no guarantee that the same organi-
zation exists at all earlier stages of human development.

4. Neuroscience and social understanding

In reviewing research on the neuroscience of social understanding, we will
organize the initial review into sections dealing with action IRs, emotion IRs, and
epistemic IRs, respectively. In these sections our concern will be to distinguish brain
regions and processes that deal primarily with first- and third-person information
separately from areas where first- and third-person information meet and
where their integration makes possible relatively uniform application of these
representations to both self and others. Where first- and third-person information
is separated, we would expect them to apply differentially to self and other,
with first-person information tending to apply mostly to self and third-person
information mostly to other. Where they are integrated, the question becomes
how we use this integrated information to distinguish between self and other. We
will also identify regions in which lower level perceptual processing can be distin-
guished from higher level metacognitive processing. Finally, we identify research
indicating that first- and third-person information is sometimes represented
independently, in particular in the case of autistic individuals. Figure 1 depicts
essential components of IRT along with possible anatomical correlates that will be
described in subsequent sections of this chapter.

4.1 Action intentional relations

Since the discovery of mirror neurons in the premotor cortex in monkeys that
respond to the goal-directed actions of others (Rizzolatti et al. 1996), studies have
investigated whether evidence can be found for similar neural structures in hu-
mans. A standard paradigm used in a number of these studies is to compare an
observation condition, where participants watch the activity of another person, an
execution condition, where participants perform the action on cue, and an imitation
condition, where participants perform the action that they observe another
person perform. Transcranial Magnetic Stimulation (TMS) studies affecting processing
in the relevant neural systems have attempted either to facilitate/produce actions
in observation conditions, or to interfere with actions in action or imitation con-
ditions (see Iacoboni 2005, for a review). Taken together these studies affirm that
premotor and parietal cortices in humans show mirror properties similar to those
in individual neurons of monkeys. Both of these areas are active when performing
the actions or observing the actions of others, and more active than in either of
these conditions when these actions are both observed and imitated. In contrast
to the additional activation found in these two regions (premotor and parietal)
when imitating compared to mere observing, a third region, the Superior Tem-
poral Sulcus (STS), tends to show the same level of activation in both observation
and imitation conditions but is inactive in the action-only condition.
Iacoboni, Kaplan and Wilson (in press) have proposed a model incorporating
IRT in accounting for these findings. They propose that the STS provides third-
person visual information of the action that is being performed. This information
is transferred to the Posterior Parietal, where it is matched with first-person infor-
mation on the kinesthetic, kinematic and somatosensory properties that might go

Figure 1. Essential components of Intentional Relations Theory and possible anatomical correlates. Third-person information involves exte-
rior senses, which tend to apply more to others than to self; first-person information involves action intentions and interior senses, which tend
to apply more to self than to others. Intentional schemas are posited to involve multimodal association areas where first- and third-person
information gets integrated. Although our main focus is on object-directed intentional relations, non-object-directed integration is expected
to occur at body-schema levels as well. The central site for the integration of first- and third-person information involving agents in intentional
relations is hypothesized to be the temporal-parietal junction (TPJ) and/or inferior parietal (IP), which is hypothesized to involve an egocen-
tric or first-person representation of the agent in space in the right hemisphere and a connected allocentric or third-person representation of
the agent in the left. Second order, or reflective representations of intentional relations are hypothesized to occur in the prefrontal cortex. The
directions of arrows represent the dominant direction of information processing, though feedback and other connections also occur between
anatomical regions both within and between boxes of the model.

with the action information provided by internal first-person sources of
information integrated in the inferior parietal (see Figure 1). This matched
representation of embodied action is then forwarded to the pre-motor area, where
alternative action plans can be compared to this input. This feed-forward information is
then matched to information being fed back from alternative pre-motor plans,
and a selection is made, in the inferior parietal, among alternative
interpretations. In their model, both the pre-motor area and the inferior parietal area
involve matching between first- and third-person properties and so are taken
to involve integration of first- and third-person information by intentional
schemas. One way to conceive of the relationship between these two areas is that the
inferior parietal (and/or nearby temporal-parietal junction, TPJ) provides an
egocentric, body-centered representation of the source of action of an agent-in-
world, while the pre-motor area represents the goal or object of the action. Both
require matching of first- and third-person information and together provide a
full representation of the action intentional relation.
From the point of view of IRT, the more important area of integration of
first- and third-person information is the inferior parietal or TPJ, rather than
the pre-motor area, particularly as this area seems to reappear on complex ToM
tasks, and may be crucial for distinguishing self and other as intentional agents.
Whereas mirror neurons in the pre-motor area may be insensitive to the differ-
ence between self and other and focus mainly on the goals of actions, something
that monkeys and young infants can represent, we would hypothesize that left
and right parietal regions represent agents in intentional relations, and might be
used to distinguish self from other as intentional agents. Studies by Decety and his
colleagues (see Decety and Grezes 2006 for a review) provide support for the idea
that the TPJ is the locus of a body-centered integration of first- and third-person
information that applies both to self and to other but that may also be used to
distinguish self from other. In these studies, imitations of other-by-self or self-by-
other are compared. The general finding is that TPJ (they include studies citing
inferior parietal as well as posterior STS) is more active on the right side when
other imitates self, but more active on the left side when self imitates other. One
way to interpret this difference is that left TPJ is more active when a third-person
representation of a human body in space is more dominant than a first-person
representation, and that the reverse is true for right TPJ. In other words, when the
participant is the original source of the action, the right hemisphere is dominant
and when the participant is imitating the other, the left hemisphere is dominant.
More typically, we would suggest that when left TPJ is dominant, another person
is being represented, where third-person information is perceived but first-per-
son information is imagined (what might be called an allocentric representation
of a person in space). However, when right TPJ is dominant, it is the self that
is typically being represented, where first-person information is perceived and
third-person information imagined (what might be called an egocentric representation).
Independent support for this idea comes from studies of brain damage
on these two sides. As we shall see, damage at the left TPJ is found to be associated
with failure at false belief tasks involving representations of others, whereas other
studies have demonstrated that damage at the right TPJ is associated with spatial
neglect, a distortion of egocentric or first-person perspective of space (see Halligan,
Fink, Marshall and Vallar 2003). Furthermore, damage at the TPJ (or IP) has
recently been shown to produce autoscopic hallucinations (seeing oneself), with
right-sided damage associated with a non-egocentric out-of-body experience of
self, and left-sided damage associated with an egocentric seeing of one's double
(Blanke and Mohr 2005).
Taken together, these findings support Iacoboni et al.'s application of the IRT
to their imitation studies, and their attribution of our notion of intentional sche-
ma to the inferior parietal, as they provide independent evidence that the inferior
parietal or TPJ is the main center for an integrated representation of a person in
space, whether it is self or other. But these findings also highlight how we can
distinguish self from other through the source of information that drives the rep-
resentation, third-person if it is other and first-person if it is self. These findings
also provide a basis for connecting the more complex human activities involved
in traditional false belief tasks, which have been shown also to require representations
involving the TPJ, and the more mundane actions that are investigated in imitation
tasks.
However, in considering imitative tasks, it should be noted that imitation of
novel actions requires skills that do not appear in monkeys, and only appear in
humans in a full-blown state during the second year of life, when the infant is
forming its concept of an intentional agent. Indeed, two-year-olds find it particu-
larly fascinating to engage in mutual imitation, where they take turns leading and
following each other in novel intentional actions, in a manner analogous to the
Decety studies. This play behavior can be interpreted as working out possibilities
made available at this time by developments in the use of the intentional schema,
both to understand self and other individually and to discriminate self from other
even in contexts where both actors are performing similar actions.

4.2 Affective and motivational intentional relations

Typically, when dealing with action IRs, first-person information directly involves
motor plans, proprioception, and kinesthetic feedback, while third-person infor-
mation directly involves visual and auditory information. The integration of these
sources of information yields representations of a body acting in space with these
first- and third-person resources integrated into a representation that can be applied
to self or other, possibly through the use of vision and audition as bridging
modalities that provide information about actions of self as well as other. Even
so, there is a residual motor component, including a readiness to act (see, e.g.,
Ramnani and Miall 2004), as well as the sense of agency previously discussed that
tends to distinguish self from other. When it comes to affective and motivational
IRs, the focus is more on sensation than on action. So, the distinction between
first-person information and third-person information, and their integration, will
tend to focus more on internal states within the body rather than on external
appearances and expressions. Research involving such affective intentional rela-
tions has been consistent in showing the importance of integrated somatic rep-
resentations of internal feeling states of a person whether such representations
are applied to self or other. Generalizing such representations of internal states to
another person occurs even when there is no social judgment involved in the task
and where the participant merely observes the other. Recent research on pain has
been particularly revealing. With respect to pain in self and other, single cells in
the Cingulate Cortex (CC) have been found to respond not only to own pain, but
also to the appearance of pain in another (Hutchinson et al. 1999). This response
occurred even though no instructions to empathize were involved. In an fMRI
study of empathy for another's pain, where again no instructions to empathize
were involved, Singer et al. (2004) had female participants and their partners
receive mild shocks following a signal which indicated who was to receive the
shock. The participants could see the hands of self and other as well as the signals
while they were in the magnetic resonance chamber. It was found that certain
primary somatosensory areas responded only to pain in self, but that the Anterior
Insula, and the CC responded to the shock signal and anticipated pain both in
self and in other. It has been hypothesized by Craig (2003) and Damasio (1999)
that the anterior portion of the Insula, particularly on the right side, is a recently
evolved region of the brain that represents a "feeling self". This region and the
CC may both be involved in conscious representation of pain, in contrast to the
primary sensory cortex, which may measure the intensity and sensory quality of
the pain stimulus, but which may not always contribute to consciousness of pain.
Part of the evidence for the distinction is that placebo effects, where perception
of pain is induced, produce activations in Anterior Insula but not in the primary
sensory areas (Wager et al. 2004). It seems then that, like mirror regions in the
pre-motor and parietal areas, this "feeling self" level of representation of pain is
responsive, not only to one's own feeling of pain, but to the expressed, or merely
inferred, pain of another person.
What the Singer et al. (2004) study seems to show is that the areas involved
in conscious perception or feeling of one's own pain are also active for the anticipated
pain of another. Without instructions to do so, the participants seem
to participate empathically in the anticipated pain of the other, thus sharing in
it, and presumably being aware of their pain by sharing in it. Further support for
this interpretation comes from the fact that dispositional measures of empathic
ability were obtained in this study and a correlation between degree of disposi-
tional empathy and degree of activation in the Insula and CC for the observe-
other condition was found. Therefore, not only does the third-person perception
of the other's behavioral situation apparently result in a conceptual understanding
of the feeling state of the other, but it actually induces a comparable feeling state
in the observer, which may be the ground upon which conceptual understand-
ing is based. The degree to which this internal feeling state is induced seems to
depend on the capacity for empathy, or sympathetic imagination, of the observer.
However, as a subsequent study shows (Singer et al. 2006), it also depends on how
one feels about the other person. If one has reason to like the other, then there is
a stronger tendency to show an empathic response to the other's pain than if one
has reason to dislike the other person. In the latter case, men, but not women,
were shown not to have this empathic response to the other's pain, but instead
showed evidence of personal pleasure at seeing the other in pain. So the story here
is fairly complex. Unlike the action mirror system in the pre-motor area, which
seems to depend only on attention to the activity of the other, the degree of iden-
tification with or caring for the other may matter in representing the feeling states
of the other in the same mode as one's own feeling states.
In the original Singer et al. (2004) study, as well as in similar studies on observing
touch (Keysers et al. 2004) and disgust (Wicker et al. 2003) in others,
primary sensory areas could be used to provide first-person information that dis-
tinguished between self and other. However, subsequent research on observation
of localized pain-inducing stimuli on another person raises the issue of whether
primary sensory areas are immune to empathically induced responses. For instance,
Avenanti et al. (2005) had participants observe needles being inserted into
the hand of another person and found, using TMS of the motor cortex, inhibitory
responses of hand muscles that matched those that would have occurred in their
own case. Based on this and other findings, Singer and Frith (2005) have suggested
that whether one is attending to or imagining the emotional response of the
other person or the sensory quality of the pain may be what distinguishes these
two kinds of results. The implication of this is that, to the extent that one can project
oneself into the particular situation and experiential state of the other, to that
extent will one tend to display a matching embodied state. According to IRT, it is
because one has at one's disposal this personal shared experiential base upon
which to understand the state of the other person that one succeeds in accurately
imagining that state. But to elicit such an internal state that typically applies to self
when observing another, a matching must occur between the expressed state of
the other and one's own associated experience of being in a comparable state, or
the state must be elicited by attending to the situation that the other is in as if it were shared. In
the case of an expressed state this requires matching of first-person information
about the appropriate internal state to third-person information about expressed
state. So motor aspects of the behavior of others may be a mediating factor in
situations where we have no direct personal experience of emotional responses in
those situations, or where we would respond differently from the other person.
Several other studies conducted by Iacoboni and his colleagues indicate that
mirroring of expressed affective states may be an important basis for understanding
emotions in others. In these studies fMRI brain imaging of participants occurred
while they were engaged in either observing or imitating a variety of emotional
expressions depicted in photos (Carr et al. 2003; Dapretto et al. 2006). In
the study reported by Carr et al. (2003), observing and imitating emotional ex-
pressions in others activated regions involved in those emotional expressions for
self; in particular, the amygdala and insula were involved, but also the pre-motor
area and STS. Again these results can be interpreted as eliciting from third-per-
son information (STS) the matching first-person action information necessary
to understand the internal state of the other individual. Because the observation
and imitation condition had similar pre-motor findings to action studies, this
suggests that implicit if not explicit matching of emotional expression is involved
in emotional empathy, which may feed into the representation of the "feeling self"
in the insula.
So far we have seen that matching between first- and third-person information
seems to occur when observing another person's affective state, and it may
not require active use of imagination to feel and understand another's affective
IRs, in that a form of affective sharing may occur directly in response to the situation
or the other's expression. Indeed, from a phylogenetic as well as developmental
perspective, contagion of emotional states from one organism to another is the
original basis of emotional sharing (cf. Zlatev this volume). However, as we have
argued, sharing a psychological state is not the same as understanding that state.
Other evidence suggests that understanding affective states in the sense of attrib-
uting emotions and other affective states to individuals as well as discriminating
one's own from another's emotional state, likely requires frontal activity and occurs
later in human development. It appears necessary to have the involvement
of frontal areas, in particular, the Medial Prefrontal Cortex (MPFC), in order to
reflect on and understand the mental state as either one's own or another's.
The role of the MPFC in understanding at a reflective level pain states in self
and other is highlighted in another recent study directly comparing imagination
of self and other in pain as compared to damage to a manikin figure (Jackson et
al. 2006). While in a magnet, participants viewed images of arms and legs appar-
ently from a first-person perspective in situations likely to be painful or neutral.
They were told to imagine the body part as their own, or another person's, or that
of a manikin. In line with the notion that the MPFC is involved in representing
second order IRs, there was a strong response in this region only for the humans,
but not for the manikin. In addition, there was differential activation in the posterior
cingulate, which responded to pain in self and other, and in the inferior
parietal. As in previous studies the insula and ACC were responsive to both self
and other in a comparison between pain and non-pain conditions. But differences
between self and other also occurred. The comparison between self and other
found several regions of difference, indicating different routes to representing the
same pain state in self and other, and the ability to distinguish between our own
and another's pain.
Taken together, the results on emotional processing show that matching can
occur not only in the motor system where actions or expressions of others are
mimicked, perhaps subpersonally, but that feeling states that are connected to those
expressions in ourselves are often also active when observing others or in infer-
ring their emotional states in conditions where sympathetic contagion or empathy
might be elicited. These internal feeling states are then processed further in frontal
areas when we are attempting to understand the emotional state of the other as dis-
tinguished from our own emotional response. Both the matching in the premotor
area and in the "feeling self" can be viewed as first-person aspects of emotional IRs,
while the visual expressions can be viewed as third-person aspects. However, for
second-order representations of these IRs, frontal activity is necessary.

4.3 False belief and complex social inference tasks

A considerable amount of research has been devoted to establishing the brain ba-
sis of the understanding of the more complex intentional relations characteristic
of theory of mind. The focus of studies using ToM tasks is on determining brain
regions functionally involved in the interpretation of complex stories of social
interaction that are visually or verbally presented and in attributing mental states
to individuals in these stories. Two brain regions have been shown to be most ac-
tive in brain imaging studies using various techniques, when compared to control
conditions involving comparable processing of non-ToM stimuli: (1) The Tem-
poral/Parietal Junction (TPJ; including neighboring Superior Temporal regions
incorporating STS as well as Inferior Parietal regions, cf. Decety and Grezes 2006);
(2) The Medial Prefrontal Cortex (MPFC).
The TPJ is believed to be an area in which complex visual stimuli, often in-
volving biological motion and social interaction, are analyzed or represented per-
ceptually and semantically. In the section on action IRs our discussion of the TPJ
focused only on intentional actions of a single agent, but the TPJ is also crucial
for social interactions and for interpreting more complex mental states than actions.
Hence, in terms of IRT, the TPJ, at least on the left side, can be understood as
representing the third-person information about IRs of one or more organisms
involved in simple or complex object and interpersonal interactions. For instance,
even in monkeys this area has individual neurons that are sensitive to the eye direction
of a person being observed by the monkey and to the congruence of the person's
behavior involving another object with their direction of gaze. Comparable
findings with humans, involving more complex IRs, for instance those involving intentions,
have been made using fMRI (see, e.g., Pelphrey et al. 2004). So this region
can pick up epistemic as well as action IRs and is also involved in emotion IRs
involving multiple agents. The second region of importance for the ToM tasks is
the MPFC. This region appears to be important for decoupling (Leslie 1994), or
creating second order representations of IRs that can be attributed to individu-
als. Reflective or conceptual understanding of the intentionality of the behavior
seems an important activity for this region. Indeed, merely noting a stimulus as
an act of an intentional agent rather than a machine seems sufficient to involve
this region (Ramnani and Miall 2004). But this region has a number of other functions
of a metacognitive, or executive, sort, and there appear to be subregions
with specialized functions, some of which we will consider shortly.
Some recent elegant research using simple false belief tasks presented in stories
and in videos, along with a number of important controls, to brain-damaged
patients with frontal and/or temporal-parietal lesions (Apperly, Samson,
Chiavarino and Humphreys 2004; Samson, Apperly, Chiavarino and Humphreys
2004; Samson, Apperly, Kathirgamanathan, and Humphreys 2005) has provided
evidence in partial congruence with these imaging studies. These researchers found that
damage to the left TPJ produces a fairly specific deficit in false belief reasoning
about others, but that damage in the frontal regions does not. So it appears that
a functional TPJ at least on the left (no tested patients had right TPJ damage) is
necessary for false belief reasoning. By contrast, it appears that the impact of brain
damage in frontal regions is less specific and more diverse, including effects on
performance on tasks involving executive function but not on ToM tasks. Indeed,
in one of their patients with frontal damage, there was evidence that problems oc-
curred only on false belief tasks that required the inhibition of first-person knowl-
edge of the real location but not on false belief tasks for which the participant
did not have knowledge of the real location (Samson et al. 2005). This result is
congruent with other findings which suggest that executive function associated
with frontal activity may be necessary to differentiate between mental states of
self and other, and thus for attributing distinct mental states to individuals. In
these circumstances a single mental state that is shared between self and other,
and that might be used in cases of passive observation or empathic responding, will
not be sufficient for mental state attribution.
If we consider just the two main regions involved in research with complex
ToM tasks, these results fit well with what we would expect based on the theory
theory (TT) approach to social understanding. The TPJ provides third-person
behavioral analysis of animate activity or apparently animate activity, while the
MPFC decouples or represents abstractly IRs, presumably in a theoretical or con-
ceptual format. That the same behavioral analysis and conceptual representation
could be applied to self and other is suggested by the fact that the MPFC shows
overlap in activity for a variety of tasks involving self and other (e.g., see Decety
and Sommerville 2003, for a summary of this research). It is possible that, in line
with TT, TPJ analyzes and represents animate activity and IRs based mostly on
visual or third-person information. As such the matching problem may not arise
if the IRs of self and other are both analyzed in a behavioristic (or third-person)
mode. MPFC could then provide decoupled (second order) representations of
intentional relations of agents, whether they are of self or other (or jointly self
and other).
However, the fact that the MPFC (and perhaps the TPJ, particularly on the
right side) is activated in cases of self-representation that seem not to be based
entirely on third-person information about the self suggests that integrated rep-
resentations involving both first- and third-person information of the kind pos-
tulated by IRT are involved. Furthermore, the frontal region and other regions
along the midline have been postulated to be part of a system for representa-
tion and regulation of self (Northoff and Bermpohl 2004). So, perhaps, the MPFC
generates a second order representation of another's mental states, through prior
association between a third-person behavioral analysis mainly from the left TPJ
that applies more often to another person than to self and a simulation of first-
person components of mental states found in the rest of the typically right-sided
self-system. This latter interpretation is consistent with studies showing differen-
tial responses for self and other in high level processing of social stimuli (e.g., Lou
et al. 2004).
The main conclusion to be drawn from these studies is that complex ToM
tasks involve two main regions of the brain, a posterior one associated with per-
ceptual representation of IRs and an anterior one associated with metarepresen-
tation of these perceptual representations. Furthermore, there is a good deal of
overlap between the regions involved in representing self and other. Nevertheless,
differences that occur suggest a mapping of third-person information typical of
what we have from others to first-person information more typically associated
with self. While the need for perceptual and metarepresentational processes for
understanding individual IRs in complex ToM tasks is congruent with TT, the
overlap between self and other, and use of first-person information as well as
third-person information in these tasks fits better with the IRT approach to social
understanding.

5. Level 1 understanding of intentional relations: the case of autism

Finally, it is worth mentioning some imaging research that supports the notion
that representations of intentional relations can occur in distinct forms. Dapretto
et al. (2006) studied high functioning autistic children and matched controls us-
ing the same imitation task as in the Carr et al. (2003) study mentioned earlier.
However, in addition to observing and imitating emotional expressions, the au-
tistic participants were measured on severity of autism, using several standard-
ized scales. The behavioral findings were that the autistic participants were as
able to imitate emotional expressions as other children, but the imaging findings
suggested that the means that they used were different. The typically developing
children replicated the results of the adult study, where mirror neuron pre-motor
and insula areas were involved in observation and imitation of emotions, along
with other areas. But in autistic children these mirror neuron areas were not as
involved, and degree of involvement of these areas during imitation was inversely
related to severity of autism in the social domain. Furthermore, other areas, the
left anterior parietal and the right visual association areas, were more involved for
autistic than for typical children. It was suggested that these latter areas served as
an alternative route to imitation in this group instead of the usual one involving
the mirror neuron system.
These results, combined with other findings, support the notion put forward
by Barresi and Moore (1996) that the main reason why autistic people have dif-
ficulty in ToM tasks as well as emotion understanding and imitation is that they
do not match and integrate first- and third-person information through an in-
termodal intentional schema, hence that they acquire and deploy independent
first-person (or egocentric) and third-person (or allocentric) theories of mind. At
the time that we wrote our article we had no idea how the notion of intentional
schema might relate to brain activity. However, with the discovery of mirror neu-
rons at about the same time, we, as well as others (e.g., Iacoboni et al. 2007) have
been able to make the connection. The Dapretto et al. study probably provides the
best confirmation for the view that it is the lack of matching of these two types
of information through an intentional schema that is at the heart of problems
in social understanding of autistic individuals. The inability to readily transform
third-person perceptions into first-person matching experiences, as well as to
make the reverse mapping, and thus to engage in mind sharing, makes it difficult
for autistic individuals to make sense of mind, because of the absence of a direct
connection between the two necessary, inseparably tied aspects of all mental phenomena:
an externally available bodily expressive component and an internally
available feeling component. As a result of this deficiency in ability to share mind
with others, they lose interest in other people, and have difficulty learning from
them. Eventually, if they do attempt to reflect on and understand mind in self and
others, they form two radically different accounts: on the one hand they develop
rather complex TT-like accounts of mind from a third-person view of their own
and other people's behavior; and on the other hand they overgeneralize, in apparent
simulation, their own egocentric first-person perspective to others (cf. Frith
and de Vignemont 2005). Because of lack of mind sharing during infancy and
beyond, they are faced with intractable problems in understanding mind beyond
those that appear as purely third-person TT types, or purely first-person ST types,
instead of integrated theories where matching of first- and third-person informa-
tion is involved as we have proposed in IRT.

6. Conclusion

Recent discoveries in the neuroscience of social understanding have opened a new
window through which to evaluate theories of social understanding. In the present
chapter we have primarily examined our own intentional relations theory (Barresi
and Moore 1996) in light of these new discoveries. IRT has three important ele-
ments. First, it postulates a distinction between first- and third-person informa-
tion pertaining to intentional relations, as well as a requirement that both forms
of information be combined in order to generate representations of intentional
action that are shared between, or equally applicable to, self and others. Second,
it postulates that a distinction may be made between a level of social understand-
ing at which first- and third-person information are integrated without being at-
tributable to individual agents and more complex levels of social understanding
at which integrated representations are recognized to be properties of individual
agents. In human ontogeny (and possibly in phylogeny), the latter levels of social
understanding are founded on the former level. Third, it postulates that under
certain conditions first- and third-person information about intentional relations
may be processed separately so that the activities of self and other are represented
independently. In humans, such a condition is seen in autism.
The neuroscience of social understanding shows that integrating first- and
third-person information through matching these two types of information occurs
in the understanding of action, emotion, and epistemic IRs of self and
other. The fact that matching between first- and third-person aspects of IRs for
self and other occurs immediately on-line for a variety of IRs is congruent with
the notion that both aspects are necessary to fully extract the meaning of these
activities. Such matching occurs early on in life, though this process of mind-
sharing does not develop into understanding individual minds until later in de-
velopment. On our account, it is only through processes that bring about shared
psychological states between individuals early on, and provide the initial basis
for social understanding, that later development of our usual understanding of
individual minds becomes a possibility.
Although TT might account for some instances of theories of mind generated
purely from behavior, it is only in autistic individuals that we see exaggerated
theories of this type. However, in autistic individuals there is evidence of a
failure in mapping first- and third-person information from very early on in life,
which prevents shared mental activity in dyadic interactions. ST does better than
TT in accounting for a variety of phenomena involving emotional empathy, and
understanding epistemic states. But it cannot account, without special pleading,
for matching phenomena involved in action understanding. Again, autistic indi-
viduals provide a window into the problem. They can generalize either first- or
third-person representations separately from self to other or the reverse. However,
because they did not initially engage in shared mental life with others, they have
problems understanding the meaning of social activity when the integration of
both first- and third-person information is involved. Without prior matching and
integration of these two types of information in earlier shared mental activity
associated with dyadic and triadic interactions, the concepts that they generate based
either on behavior alone or internal states alone are diminished when compared
to our usual understanding of IRs of self and other. Thus, we believe that match-
ing theories like IRT provide the best account of how we come to understand our
own as well as other minds.

References

Apperly, I.A., Samson, D., Chiavarino, C. and Humphreys, G.W. 2004. Frontal and left
temporo-parietal contributions to theory of mind: Neuropsychological evidence from a false belief
task with reduced language and executive demands. Journal of Cognitive Neuroscience 16:
1773–84.
Avenanti, A., Bueti, D., Galati, G. and Aglioti, S.M. 2005. Transcranial magnetic stimulation
highlights the sensorimotor side of empathy for pain. Nature Neuroscience 8: 955–960.
Barresi, J. 2001. Extending self-consciousness into the future. In The Self in Time:
Developmental Perspectives, C. Moore and K. Lemmon (eds.), 141–161. Mahwah, NJ: Lawrence
Erlbaum Associates.
Barresi, J. 2004. Intentional relations and divergent perspectives in social understanding.
In Ipseity and Alterity: Interdisciplinary Approaches to Intersubjectivity, S. Gallagher and
S. Watson (eds.), 74–99. Rouen: Presses Universitaires de Rouen.
Barresi, J. and Moore, C. 1996. Intentional relations and social understanding. Behavioral and
Brain Sciences 19: 107–154.
Blanke, O. and Mohr, C. 2005. Out-of-body experience, heautoscopy, and autoscopic
hallucination of neurological origin: Implications for neurocognitive mechanisms of corporeal
awareness and self consciousness. Brain Research Reviews 50: 184–199.
Carr, L., Iacoboni, M., Dubeau, M.C., Mazziotta, J.C. and Lenzi, G.L. 2003. Neural
mechanisms of empathy in humans: A relay from neural systems for imitation to limbic areas.
Proceedings of the National Academy of Sciences, USA 100: 5497–5502.
Craig, A.D. 2003. Interoception: The sense of the physiological condition of the body. Current
Opinion in Neurobiology 13: 500–505.
Damasio, A.R. 1999. The Feeling of What Happens: Body and Emotion in the Making of
Consciousness. New York: Harcourt Brace.
Dapretto, M., Davies, M.S., Pfeifer, J.H., Scott, A.A., Sigman, M., Bookheimer, S.Y. and Iacoboni,
M. 2006. Understanding emotions in others: Mirror neuron dysfunction in children with
Autism Spectrum Disorder. Nature Neuroscience 9: 28–30.
Decety, J. and Grezes, J. 2006. The power of simulation: Imagining one's own and other's
behavior. Brain Research 1079: 4–14.
Decety, J. and Sommerville, J. 2003. Shared representations between self and other: A social
cognitive neuroscience view. Trends in Cognitive Sciences 7: 527–533.
Frith, U. and de Vignemont, F. 2005. Egocentrism, allocentrism, and Asperger syndrome.
Consciousness and Cognition 14: 719–738.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Gallese, V., Keysers, C. and Rizzolatti, G. 2004. A unifying view of the basis of social cognition.
Trends in Cognitive Sciences 8: 396–403.
Gallese, V., Fadiga, L., Fogassi, L. and Rizzolatti, G. 1996. Action recognition in the premotor
cortex. Brain 119: 593–609.
Goldman, A. 1992. In defense of the simulation theory. Mind and Language 7: 104–119.
Gopnik, A. 1993. How we know our minds: The illusion of first-person knowledge of
intentionality. Behavioral and Brain Sciences 16: 1–14.
Gordon, R. 1986. Folk psychology as simulation. Mind and Language 1: 158–171.
Halligan, P.W., Fink, G.R., Marshall, J.C. and Vallar, G. 2003. Spatial cognition: Evidence from
visual neglect. Trends in Cognitive Sciences 7: 125–133.
Harris, P. 1989. Children and Emotion. Oxford: Basil Blackwell.
Hobson, R.P. 1991. Against the theory of 'theory of mind'. British Journal of Developmental
Psychology 9: 33–51.
Hobson, R.P. 1998. The intersubjective foundations of thought. In Intersubjective
Communication and Emotion in Early Ontogeny, S. Braten (ed.), 283–296. Cambridge: Cambridge
University Press.
Hobson, R.P. 2002. The Cradle of Thought. Exploring the Origins of Thinking. London:
Macmillan.
Hobson, R.P. and Hobson, J. this volume. Engaging, sharing, knowing: Some lessons from
research in autism.
Humphrey, N. 1984. Consciousness Regained. Oxford: Oxford University Press.
Hutchison, W.D., Davis, K.D., Lozano, A.M., Tasker, R.R. and Dostrovsky, J.O. 1999.
Pain-related neurons in the human cingulate cortex. Nature Neuroscience 2: 403–405.
Iacoboni, M. 2005. Understanding others: Imitation, language, empathy. In Perspectives on
Imitation: From Neuroscience to Social Science, S. Hurley and N. Chater (eds.), 77–99.
Cambridge, MA: MIT Press.
Iacoboni, M., Kaplan, J. and Wilson, S. in press. A neural architecture for imitation and
intentional relations. In Imitation and Social Learning in Robots, Humans and Animals:
Behavioural, Social and Communicative Dimensions, C. Nehaniv and K. Dautenhahn (eds.).
Cambridge, UK: Cambridge University Press.
Jackson, P.L., Brunet, E., Meltzoff, A.N. and Decety, J. 2006. Empathy examined through the
neural mechanisms involved in imagining how I feel versus how you feel pain.
Neuropsychologia 44: 752–761.
Keysers, C. and Perrett, D.I. 2004. The neural correlates of social perception: A Hebbian
network perspective. Trends in Cognitive Sciences 8: 501–507.
Keysers, C., Wicker, B., Gazzola, V., Anton, J., Fogassi, L. and Gallese, V. 2004. A touching sight:
SII/PV activation during the observation and experience of touch. Neuron 42: 335–346.
Knoblich, G. and Jordan, J.S. 2002. The mirror system and joint action. In Mirror Neurons
and the Evolution of Brain and Language, M.I. Stamenov and V. Gallese (eds.), 115–124.
Amsterdam: John Benjamins.
Leslie, A.M. 1987. Pretense and representation: The origins of 'theory of mind'. Psychological
Review 94: 412–426.
Leslie, A.M. 1994. ToMM, ToBy, and Agency: Core architecture and domain specificity.
In Mapping the Mind: Domain Specificity in Cognition and Culture, L.A. Hirschfeld and
S.A. Gelman (eds.), 119–148. New York: Cambridge University Press.
Lou, H.C., Luber, B., Crupain, M., Keenan, J.P., Nowak, M., Kjaer, T.W., Sackeim, H.A. and
Lisanby, S.H. 2004. Parietal cortex and representation of the mental Self. Proceedings of
the National Academy of Sciences, USA 101: 6827–6832.
Matsuzawa, T. 2007. Comparative cognitive development. Developmental Science 10: 97–103.
Moll, H. and Tomasello, M. 2005. 12- and 18-month-old infants follow gaze to spaces behind
barriers. Developmental Science 7: F1–F9.
Moore, C. 1999. Intentional relations and triadic interaction. In Developing Theories of
Intention, P.D. Zelazo, J.W. Astington and D.R. Olson (eds.), 43–62. Mahwah, NJ: Lawrence
Erlbaum Associates.
Moore, C. 2006. Representing intentional relations and acting intentionally in infancy: Current
insights and open questions. In Human Body Perception from the Inside Out, G. Knoblich,
I. Thornton, M. Grosjean and M. Shiffrar (eds.), 427–442. New York: Oxford University
Press.
Moore, C. 2007. Understanding self and other in the second year. In Transitions in Early
Socioemotional Development: The Toddler Years, C.A. Brownell and C.B. Kopp (eds.), 43–65.
New York: Guilford Press.
Northoff, G. and Bermpohl, F. 2004. Cortical midline structures and the self. Trends in
Cognitive Sciences 8: 102–7.
Prinz, W. 1997. Perception and action planning. European Journal of Cognitive Psychology 9:
129–154.
Ramnani, N. and Miall, C.R. 2004. A system in the human brain for predicting the actions of
others. Nature Neuroscience 7: 85–90.
Repacholi, B.M. and Gopnik, A. 1997. Early reasoning about desires: Evidence from 14- and
18-month-olds. Developmental Psychology 33: 12–21.
Rizzolatti, G., Fadiga, L., Gallese, V. and Fogassi, L. 1996. Premotor cortex and the recognition
of motor actions. Cognitive Brain Research 3: 131–141.
Samson, D., Apperly, I.A., Chiavarino, C. and Humphreys, G.W. 2004. Left temporoparietal
junction is necessary for representing someone else's belief. Nature Neuroscience 7: 499–500.
Samson, D., Apperly, I.A., Kathirgamanathan, U. and Humphreys, G.W. 2005. Seeing it my
way: A case of a selective deficit in inhibiting self-perspective. Brain 128: 1102–1111.
Sebanz, N. and Frith, C. 2004. Beyond simulation? Neural mechanisms for predicting the
actions of others. Nature Neuroscience 7: 5–6.
Singer, T. and Frith, C. 2005. The painful side of empathy. Nature Neuroscience 8: 845–846.
Singer, T., Seymour, B., O'Doherty, J., Kaube, H., Dolan, R. and Frith, C. 2004. Empathy for
pain involves the affective but not sensory components of pain. Science 303: 1157–1162.
Singer, T., Seymour, B., O'Doherty, J.P., Stephan, K.E., Dolan, R.J. and Frith, C.D. 2006.
Empathic neural responses are modulated by the perceived fairness of others. Nature 439:
466–469.
Sommerville, J.A., Woodward, A.L. and Needham, A. 2005. Action experience alters
3-month-old infants' perception of others' actions. Cognition 96: B1–B11.
Tomasello, M. 1992. First Verbs: A Case Study of Early Grammatical Development. New York:
Cambridge University Press.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard Uni-
versity Press.
Tomasello, M., Carpenter, M., Call, J., Behne, T. and Moll, H. 2005. Understanding and sharing
intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–691.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding and acts
of meaning in the first year. In Action, Gesture, and Symbol: The Emergence of Language,
A. Lock (ed.), 183–229. New York: Academic Press.
Wager, T.D., Rilling, J.K., Smith, E.E., Sokolik, A., Casey, K.L., Davidson, R.J., Kosslyn, S.M.,
Rose, R.M. and Cohen, J.D. 2004. Placebo-induced changes in fMRI in the anticipation
and experience of pain. Science 303: 1162–1167.
Wicker, B., Keysers, C., Plailly, J., Royet, J.-P., Gallese, V. and Rizzolatti, G. 2003. Both of us
disgusted in my insula: The common neural basis of seeing and feeling disgust. Neuron
40: 655–664.
Wilson, M. and Knoblich, G. 2005. The case for motor involvement in perceiving
conspecifics. Psychological Bulletin 131: 460–473.
Woodward, A.L. 1998. Infants selectively encode the goal object of an actor's reach. Cognition
69: 1–34.
Woodward, A.L. 2005. The infant origins of intentional understanding. Advances in Child
Development and Behavior 33: 229–262.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 4

Engaging, sharing, knowing


Some lessons from research in autism

Peter Hobson and Jessica A. Hobson

Our aim in this chapter is to consider how intersubjective co-ordination is
integral to human forms of interpersonal engagement, sharing experiences with
others, and acquiring knowledge about persons with minds. We dwell on three
studies involving children and adolescents with autism, each concerned with
different aspects of non-verbal communication (in greetings and farewells,
conversation, and imitation, respectively). Other researchers' reactions to these
studies illustrate how scientists tend to be sceptical of measures (however reliable)
intended to capture the intersubjective dimension of personal relatedness. On
a more theoretical note, we suggest that intersubjectivity acquires the structure
that it does, and has the developmental implications that it does, in virtue of
human beings' propensity to identify with others' attitudes.

1. Introduction: Qualities of relatedness

Why do we need to bother ourselves with intersubjectivity? To many scientists,
the concept has all the wrong kinds of qualities. It is vague; it seems to be trying to
capture something that exists between or among individuals, a systemic property,
rather than an identifiable feature or function of a given organism; and it is
difficult to operationalize, to quantify or otherwise objectify. Moreover, it smacks of
emotion, a matter that should not be so much of a problem, were the concept not
framed in a manner so unaccommodating to cognitive/computational,
information-processing models of the mind.
Our aim in this chapter is to illustrate why the concept of intersubjectivity is
indispensable for any account of the development of psychological functioning
early in life, and pivotal for understanding the syndrome of autism. We begin by
describing three studies of children and adolescents with autism, and indicate
why the systemic and emotional qualities of intersubjectivity are so important
for interpreting the findings. Each of the studies focuses on different aspects of
non-verbal communication (participants' greetings and farewells, their bodily
expressions during conversation, and their sharing looks during tests of imitation),
but they have in common a concern with what such expressions mean
as expressions of affective relatedness and interpersonal co-ordination. A
primary purpose of the studies is to pinpoint atypical qualities of relatedness
among individuals with autism, and the results provide a basis for a discus-
sion of what autism might reveal about the nature of intersubjectivity. We also
consider the developmental implications of intersubjective engagement for the
ability to share experiences with others and to arrive at knowledge of persons-
with-minds.
Perhaps we should provide brief clarification of our use of four terms: 'social',
'interpersonal', 'intersubjective', and 'identification'. We shall employ the word
'social' to refer to happenings between people, without prejudging the degree
to which the participants in the exchanges experience each other as persons
(rather than as, say, things). The word 'interpersonal' is intended to reflect a
special form of relatedness to other embodied persons that includes the potential
for intersubjective engagement (Trevarthen 1979), that is, connectedness and
co-ordination between the subjective orientations of each person involved (see
also Susswein and Racine this volume). We shall be suggesting that a process of
identifying with the attitudes of others is what structures intersubjectivity in
human beings (but not, we believe, in other primates) and gives intersubjective
transactions the power to shape the course of human cognitive as well as social
development. Rather than attempting to define identification at this point (an
especially difficult task, given that it is a process that operates on different
levels at successive points in development), in what follows we shall illustrate its
meaning through specific instances of its expression. Critically, to identify with
someone else is to assume (and paradigmatically, be moved by) attitudes
perceived in the other, in such a way that those attitudes become a part of one's own
experience (for example, when one shares experiences) and, potentially at least,
part of one's own emotional repertoire.
The picture of scientific scepticism that we painted at the beginning of the
chapter may seem an unfair caricature. So in describing our three empirical
studies, we shall recount some stories of scientists' reactions to our own research. In
each case we shall convey the responses we received when we submitted the
findings to mainstream academic journals.

2. Three empirical studies

2.1 Hello and goodbye

The first study we shall report appeared in a paper entitled 'Hello and goodbye:
A study of social engagement in autism' (Hobson and Lee 1998). In order to
capture the spontaneous greetings and farewells of children and adolescents with and
without autism relating to an unfamiliar person, a colleague, Tony Lee, videotaped
participants as they entered and departed from a familiar but empty classroom in
which there was a stranger (PH, the first author) to whom they were introduced,
and from whom they later took their leave.
In outline, the findings were as follows. Compared with participants without
autism, about half as many of those with autism gave spontaneous
expressions of greeting in the 'Hello' episode, and a substantial proportion failed
to respond even after prompting. All the young people without autism made eye
contact, but a third of those with autism failed to do so; no fewer than 17 out of
24 of the former group smiled, but only six out of 24 of those with autism. In the
'Goodbye' episode, half the individuals without autism but only three of those
with autism made eye contact and said goodbye. And not only were there few
participants with autism who waved in response to PH's final prompt, but also
their waves were strangely uncoordinated and limp.
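The smiling counts just reported (17 of 24 participants without autism, six of 24 with autism) lend themselves to a simple check of how unlikely such a split would be by chance. What follows is a minimal sketch in Python and not the analysis reported in the original paper: the choice of Fisher's exact test and the function name `fisher_exact_two_sided` are illustrative assumptions, and only the four counts are taken from the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper written for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


# Counts from the text: who smiled in the 'Hello' episode, 24 per group.
group_size = 24
smiled_without_autism = 17
smiled_with_autism = 6

p_value = fisher_exact_two_sided(
    smiled_without_autism, group_size - smiled_without_autism,
    smiled_with_autism, group_size - smiled_with_autism,
)
print(f"proportion smiling, without autism: {smiled_without_autism / group_size:.2f}")
print(f"proportion smiling, with autism: {smiled_with_autism / group_size:.2f}")
print(f"two-sided Fisher exact p-value: {p_value:.4f}")
```

A count-based test of this kind speaks only to the behavioural tallies; it says nothing about the quality of engagement that the ratings described next were designed to capture.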
When we designed this study, we were aware that behavioural data would
fail to do justice to the intersubjective phenomena we were attempting to cap-
ture, even though such conventional ratings might do the job of highlighting
atypical forms of social exchange. We expected that we should have to resist
others' attempts to impose a conceptual framework in which the phenomena were
reduced to the social transmission of non-verbal communicative cues. There-
fore we also asked our judges to look at the greeting episode up to the time the
child sat down at the table, and to rate the degree of personal engagement with
PH. It turned out that different judges who made these ratings independently
were in good agreement with each other. The results were that 14 participants
without autism but only two with autism were judged to be in the most strongly
engaged category, and only two without autism but 13 with autism in the least
engaged category.
What do these findings really mean? Here is how one female adolescent with
autism negotiated the greeting and farewell. This person gave only the briefest
glance towards PH as she entered, and then looked away. As Tony said 'This is
Peter', she continued to look away for about a second, then looked towards PH
without moving her rather set facial expression, and gave a brief and toneless
'Unn' in acknowledgement of PH's presence. Then she looked away to one side,
and maintained this lack of eye contact as she walked across the room with her
hands linked together in front of her body. She sat down without looking at PH.
Once seated, she did not look up at PH across the table. She fixed her gaze towards
her lap. Throughout the sequence, she gave little sense of any emotional contact
with either adult present. Then when she was told that our session was over, she
stood up rather abruptly without making eye contact with PH, and only made any
gesture towards PH when he said a first, rather insistent 'Goodbye' as she turned
to leave. Even here, the gesture was to flap her left hand behind her vaguely in
PH's direction, a wave that hardly seemed like a wave, especially since she was
still looking away and her only remark was a rather nasal and flat 'Bye'. PH's
final 'Goodbye' was met with the faintest of head-turns, another quiet (and hardly
expressive) 'Bye', and what seemed like a stiff extension of her right wrist behind
her body, which might have been a further wave. Although she had seemed aware
of PH's presence, he felt this involved little sense of himself as a person.
When we reviewed the videotapes, something else struck us. This concerned
PH's own behaviour. Although he had been trying to relate to each participant in
a consistent manner, it seemed that in being unable to sustain a fluency and
spontaneity of exchange with the participants who had autism, his own behaviour and
gestures became stiff and forced. We shall return to this observation in due course.
When we first submitted our paper for publication, it carried the title 'Hello
and goodbye: A study of interpersonal engagement in autism'. The journal editor
who dealt with our manuscript favoured something more neutral about the
greeting and farewell behaviors of the children we studied. Thanks to our ratings
of engagement, we were able to negotiate a compromise title, replacing the word
'interpersonal' with 'social'. We had weathered what was to prove the first of a
succession of encounters over our attempts to measure and describe patterns of
intersubjective relatedness.

2.2 Head-nodding

The second study (García-Pérez, Lee and Hobson 2007) arose out of a previous
investigation in which Tony Lee had engaged adolescents with and without au-
tism in conversation in the form of a semi-structured interview (Lee and Hobson
1998). We had videotaped the interviews, and now we wanted to test whether, as
we supposed, the intersubjective impairments that characterize autism would be
manifest in atypical patterns of interpersonal co-ordination in this conversational
setting. Two previous studies of this issue by Capps, Kehres and Sigman (1998)
and Tantam, Holmes and Cordess (1993) had yielded surprisingly few indications
of such abnormalities. In keeping with these previous studies, we decided to apply
behavioural measures such as the amount of smiling and head-shaking and nod-
ding, but also predicted group differences when ratings of videotaped interactions
were made of two relational characteristics: participants' degree of affective
engagement with the interviewer, and the flow of the dyadic exchange.
Beyond this, on the basis of an hypothesis that individuals with autism are
seldom moved to adopt the bodily expressed psychological orientation of others
(Hobson 1993a), a phenomenon we consider to reflect a limited propensity to
align one's own subjective stance with that of someone else through the process
of identification, we anticipated that participants with autism would show fewer
episodes of nods and shakes of their heads and a smaller proportion of time looking
to the interviewer's face. We anticipated that these group differences might
be more marked at those times when the interviewer was talking than in periods
when they (the participants) were talking. The point here is that according to
our hypothesis, there should be a specific difficulty when individuals with autism
need to accommodate to and connect with someone else's stance-in-talking, rather
than simply needing to show non-verbal communicative expressions.
The results were striking for the discrepancy between the very marked group
differences that appeared on subjective (but objectively reliable) judgments of af-
fective engagement and interactive flow between the conversational partners, and
what seemed to be either absent, or subtle but modest, group differences on be-
havioural measures of amounts of looking, smiling, and head-nods/shakes. The
participants with autism were rated as low in affective engagement and even more
markedly discrepant from the control group in the smoothness of their exchanges
(in keeping with clinical descriptions by Bosch 1970; Hobson 2002; Kanner 1943),
yet they appeared more similar than different in the behavioural components of
non-verbal communication.
Perhaps there is something about the subtle yet powerful interplay between
conversational partners that eludes capture by measures of behavioural events.
And when we looked more closely at certain of the behavioural measures, there
appeared to be tell-tale signs that all was not well in the interpersonal co-ordina-
tion of the exchanges. More specifically, we found that exactly as in the study by
Capps et al. (1998), participants with autism often showed an absence of head-
shakes/nods when the partner was talking, even though the group difference was
not significant when the participants themselves were talking. Therefore it seemed
that the group difference was not reducible to a general disinclination to nod the
head among participants with autism.
What might this set of results signify? In our view, it probably signifies that
children with autism are limited in the degree to which they identify with an-
other person in conversation. We suggest that in the case of people who do not
have autism, one individual nods in accordance with what he/she is saying when
he/she is talking, and nods in accordance with him/herself in identification with
the other person when the other person is talking. In other words, it is because
of the kind of engagement people have with the stance (and corresponding ideas)
expressed by the other person's speech and expressive behaviour (an engagement
that leads one to adopt the other person's cognitive-affective orientation in the act
of comprehending the other) that the natural, unselfconscious kind of
nodding-in-communicating follows. Individuals with autism are specifically impaired in
this kind of intersubjective linkage and attunement. This interpretation accords
with other recent evidence that children with autism are limited in the degree to
which they identify with the actions of others in imitative contexts (Hobson and
Lee 1999; Hobson and Meyer 2005; Meyer and Hobson 2004). And it is in keeping
with the fact that there were marked group differences in affective engagement
and the flow of interpersonal exchanges during conversation.
In fact, half-way through the study we made one additional prediction. We
reasoned that if children with autism have a lowered propensity to identify with
and feel moved by another person, then the other person might have a reciprocal
difficulty in identifying with individuals who have autism. Therefore we predict-
ed that the interviewer would also show less head-shaking/nodding specifically
when the participant was talking, owing to the interviewer's difficulty in
identifying with the stance of the participant. This prediction was borne out, even
though the interviewer did not look significantly less to the participants with
autism when they were talking, nor was he lacking in smiles. Therefore the result
was not simply a reflection of his looking less to the participants, nor showing
less feeling. It was, we considered, a reflection of the fact that the intersubjective
system of two people in relation to one another was awry. We were reminded of
PH's own stiffness towards participants with autism in the Hello-Goodbye study.
In both cases, the workings (or not-workings) of intersubjectivity were reflected
in each individual's behaviour, and in all probability each individual's experience,
towards the other.
When we submitted this paper for publication, the reviews were positive.
However, one anonymous reviewer expressed concern that the interviewer had
failed to interview all the participants in the same way, and recommended that
we should control for this effect statistically. Otherwise, the reviewer explained, it
is not possible to be sure that it was the participants rather than the interviewer
who contributed to the outcome on measures of intersubjective engagement. Of
course this raises an important point, for it is possible (though much in the results
suggested this was not actually the case) that the interviewer was systematically biased
in his approach to the participants with autism. Yet here the dangers of trying to
dissect the essentially interpersonal phenomenon of intersubjectivity into
independent parts threatened to undermine the measurement of what was
(necessarily) expressed by both components in an interlinked system.

2.3 Sharing looks and self/other-orientated imitation

Our third study (Hobson and Hobson 2007) proved the most problematic of all
when it came to arguing for its scientific respectability. This was because we
invited independent raters to watch videotapes of participants interacting (one at a
time) with an adult, and to judge the quality of each look they made to the adult's
face. We had not anticipated how strongly other researchers would contest the
appropriateness or even the feasibility of judging what we defined as 'sharing' looks
in distinction to 'checking' looks or 'orientating' looks.
The set-up (originally described in Meyer and Hobson 2004) involved a tester
demonstrating actions that might or might not be imitated from the tester's
viewpoint, and then instructing the child: 'Now you.' Imitation of self/other-orientation
occurred when the child adopted the examiner's demonstrated self/other-anchored
orientation, thereby reversing the positioning of the object and directedness
of the action. An example of this is when participants saw the tester rolling a
wheel far-from-herself and close-to-the-participant, and imitated by rolling the
wheel far-from-themselves and close-to-the-tester. It turned out that participants
with autism were less likely to respond in this way. What we now wanted to test
was our prediction that those participants who imitated the tester's self/other-orientation
would also be those most likely to manifest sharing looks towards the
tester in the imitation task itself. The rationale was that sharing looks, too, serve
as indices of the quality of intersubjective engagement that implicates a degree of
identification with the person related-to.
The methodology was as follows. As a first step, two raters were found to
agree in the amounts of time for which children directed their gaze to the tester.
The next stage involved raters judging each look with respect to the quality and/or
function of the look according to the following mutually exclusive and exhaustive
scheme. 'Sharing' looks were defined as those looks directed to the tester that
could be seen to express a participant sharing experience through interpersonal
contact with the tester. They involved a deep gaze which conveyed personal
involvement with reciprocity, depth and affective contact, in contrast to 'checking'
looks, which involved superficial glances at the tester and were lacking in
mutuality. 'Checking' looks were defined as those looks towards the
tester that were used in order to assess or check out the situation or the tester's
response, or to determine what might happen next. 'Orientating' looks were those
that appeared to occur in direct response to an action, sound, or movement on the
part of the tester. It proved that, for the most part, such looks were easily distinguished: two
independent judges agreed on 89% of the sample of looks they rated according to
this three-way classification.
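An agreement figure such as the 89% reported here is a raw proportion of matching judgements, and it can be supplemented by a chance-corrected index. The sketch below, in Python, shows one conventional way of computing both for a three-way classification. It is offered as an illustration under stated assumptions, not as the procedure the authors used: Cohen's kappa is not mentioned in the chapter, the function names are hypothetical, and the ten toy ratings are invented purely to exercise the code.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Proportion of looks that both raters placed in the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal use of a category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented toy ratings (NOT data from the study); the raters differ on one look.
judge_1 = ["sharing", "checking", "orientating", "checking", "sharing",
           "orientating", "checking", "checking", "sharing", "orientating"]
judge_2 = ["sharing", "checking", "orientating", "checking", "checking",
           "orientating", "checking", "checking", "sharing", "orientating"]

print(f"raw agreement: {percent_agreement(judge_1, judge_2):.2f}")
print(f"Cohen's kappa: {cohens_kappa(judge_1, judge_2):.3f}")
```

Kappa is always at most the raw agreement, because it discounts the matches two raters would produce simply by using the categories at similar rates.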
The results were that each of the three forms of looking was less prevalent
among participants with autism, and most of the participants in each group
showed some checking and orientating looks. However, two-thirds of participants
with autism never showed a sharing look, whereas this was the case for
one-third of the comparison group. Concerning our critical prediction that sharing
looks, and only sharing looks, would relate to imitation of self/other-orientation,
it turned out that indeed, participants in each group who showed sharing
looks tended to be those who imitated the demonstrator's self/other-orientation,
whereas all those with the lowest scores of self/other-orientation also showed a
complete absence of sharing looks. Our interpretation of the findings was that
sharing involves a structure of interpersonal engagement (involving identifica-
tion) that is easily overlooked until it becomes manifest through imitative self/
other-orientation.
When the paper describing this study was sent out for review, all three of the
anonymous reviewers made criticisms of the ratings and/or definition of sharing
looks. The essence of their objections was that, despite the fact that we had estab-
lished how the ratings could be made with high inter-rater reliability, there need-
ed to be some better kind of behavioural operationalization that defined when a
look could be considered a sharing look.
We shall hold back from citing chapter and verse of the reviewers' points,
because this seems uncharitable when they are not here to argue their case. It
should be evident from the two studies already described that we are not averse
to measuring the kinds of behavioural components of communicative exchanges
that these reviewers had in mind for our sharing looks. Yet it seems important to
make two observations that, in our view, have a bearing on methodological ap-
proaches to measuring or evaluating intersubjectivity.
Firstly, was it the case that we had failed to operationalize the concepts in
terms of which our hypotheses and predictions were framed? Is it not standard
science, for example in the domains of psychiatry and social psychology, to appraise
all kinds of goings-on through human beings' judgements of complex
processes with the only requirement being that independent judges arrive at
acceptable levels of agreement in their ratings? Why would one suppose that al-
ternative kinds of behavioural ratings would provide more reliable or more valid
estimates of sharing looks? Of course this might be so, although as a matter of
fact, our subsequent attempts to find behavioural indices that corresponded
with judgements of sharing looks failed miserably: for example, sharing looks were
often, but often not, accompanied by smiles, and participants sometimes showed
smiles that were associated with other kinds of looks. The issue here is whether
the kinds of subjective ratings applicable to intersubjective phenomena are going
to be accorded the kind of status that appears to be justified, provided, of course,
that the usual scientific criteria of inter-rater reliability in judgement are satisfied.
Secondly, and related to this, recall that our hypothesis was specifically concerned
with looks that reflected sharing of experiences. Why? Because we considered
that it was precisely because sharing was happening that the looks
reflected a participant being so engaged with the tester, through identification,
that imitation of self/other-orientation would be likely to occur. Now if this is
the case, how are we to establish that any behavioural index or indices of sharing
do indeed reflect sharing, without also making subjective judgements of sharing?
Such behavioural components do not come ready-flagged, as it were. One review-
er pointed out that previous researchers had found ways of making behavioural
ratings that (in his/her view) seemed to capture phenomena that were similar to
those we were trying to measure. Yet if this judgement was not corroborated by
ratings of sharing, then it seems the interpretation of the data as reflecting shar-
ing must be questionable. At some point, measures of sharing need to enter the
picture if claims are going to be made about the behavioural expressions and de-
velopmental implications of sharing.

2.4 Overview of the studies

One theme has gained prominence through this research. In the Hello-Goodbye
study, we saw that individuals with autism are less likely to orientate to and af-
fectively engage with a stranger, or to depart with typical gestures of farewell.
For example, all the participants with autism who waved did so with a strangely
configured and ill-directed gesture that hardly seemed a wave at all. Why should
that be? How do children without autism come to adopt and shape their waving,
so that observed waves-from-others-to-self become waves-from-self-to-others?
In the study of non-verbal communication during conversations involving a per-
son with autism, again there was a lack of smooth and affectively co-ordinated
exchanges, but also evidence of a subtle but deep failure of each conversational
partner to link in with the subjective states that found bodily expression in head-
shakes and nods. Finally, in the tests of imitation and sharing looks, there ap-
peared to be a relation between adopting the self/other-orientation of someone
else, and making a connection with that person through sharing looks (see also
Barresi and Moore this volume). Connectedness through identifying with another
person's psychological stance is what makes intersubjectivity a system of
selves-in-relation-to-other.
Studies such as these yield fresh insights into qualities of interpersonal en-
gagement that are pivotal for human social and cognitive development. It is with
clinical observations and research on childhood autism firmly in mind, that we
turn to consider the nature and implications of intersubjectivity.

3. The irreducibly social

The answer we give to the rhetorical question with which we began, is that it is
only by bothering with intersubjectivity, that one can pinpoint what is essential
to the social relations of human beings. If this were not enough, it is only if we
accord intersubjectivity an appropriate place within our account of early human
development that we shall be able to explain how social and cognitive develop-
ment take the course that they do. Our purpose in the remainder of this chapter
is to explicate what is involved in these claims, from two points of view: firstly,
what intersubjectivity entails in terms of person-with-person engagement, and
secondly, why it is so central an influence on the growth of the human mind, and
so indispensable when we come to consider the kind of profound social impair-
ment that occurs among children with autism.
One important line of development that threads its way through an account of
very young children's increasingly differentiated and sophisticated social and cognitive
lives concerns their ability to acquire concepts of other people that encompass
people's experiences and psychological orientations: their minds as well as
bodies. We put the matter this way, with explicit reference to persons, for reasons
that will soon become clear; in short, relative neglect of how the mind is a feature
of embodied persons and selves has been damaging for much recent theorizing in
developmental psychology. Having said this, such theorizing has also been hugely
beneficial in re-focussing attention on the intimate connections among interper-
sonal understanding (as broadly conceived), communication, and thought.
Our first claim, then, is that intersubjective relations between bodily ex-
pressive persons are at the core of what is irreducibly interpersonal. Perhaps
the quintessential, but by no means only, example of intersubjectivity is to be
found in human forms of sharing. To put it bluntly: you cannot share experi-
ences with a stone, a tree, or a squirrel. It is only in a very limited sense that you
can share experiences with a dog or chimpanzee: more limited, we believe,
than with a two-month-old human infant, and infinitely more limited than with
a 10-month-old. Already, then, we may distinguish between what is social, and
what is interpersonal and, more specifically, intersubjective in a uniquely human
sense. One can have social relations with a dog or chimpanzee, just as dogs and
chimpanzees can have social relations with conspecifics, and such relations are
not completely devoid of intersubjectivity; on the other hand, one can relate to
other people without this entailing much that is intersubjective. So it would seem
worthwhile to think through what sharing might entail, so that we are in a better
position to consider the origins and development of the ability to share in human
ontogeny and phylogeny.
In order to do so, it is necessary as a preliminary to distinguish between two
kinds of sharing, what Trevarthen (1979; Trevarthen and Hubley 1978) called
primary and secondary intersubjectivity. Primary intersubjectivity concerns the
transactions that go on between two people, paradigmatically in face-to-face en-
gagements, where the subjective states of each are closely co-ordinated one way
or another, for example when they experience joy together, or one is angry and
the other upset; secondary intersubjectivity concerns transactions that implicate
shared experience of some real or imagined object or event external to the two or
more people involved, for example when they share pleasure in watching cricket
at Lord's or argue about who pays for the tickets. Then of course one might wish
to distinguish among many forms as well as degrees of intersubjectivity or (more
specifically) sharing, so that sharing a joke is not the same as sharing a friend-
ship. Our thesis is that human forms of sharing have qualities that are special to
humankind, from very early in life.
Let us focus on simple cases. Firstly, consider the following description of
a typically developing two-month-old whom we videotaped during a still-face
procedure as part of a study of mother-infant relations (previously described and
discussed in Hobson 2005). When the mother assumed the still-face and unreac-
tive posture as requested, the infant responded by becoming uneasy, restless, and
jerky in her movements, and lost the infectious smiling and smooth tonguing
movements that had been evident just moments before. Her bright, protracted
gazes into her mother's eyes were transformed into brief, checking glances. More
important for the present purposes, after about 40 seconds her behaviour changed
again, and she began to give longer looks to her mother accompanied by forced
smiles. There was a strong impression that she was seeking to re-establish contact
with her mother, trying to elicit a resumption of the joyful interpersonal exchange
that was now missing.
If all this is correct, then we see how the infant participates in experience with
the other. Sharing experience with someone else is not merely like having one's
own experience of the world, and then adding something. It seems more like having
one's own subjective state and registering something of the other's attitudes
conjointly, in a qualitatively new form of experience (also Tronick et al. 1998).
To explore this at a later point in development, we consider two further examples
that come from videotaped interactions with typically developing children at
the end of the first year of life, during semi-structured interactions with a tester.
In the first example, a 13-month-old girl and a tester were seated across a table
from one another. After they had played together, the tester secured the child's gaze to her
face and then looked to her right while extending her right arm and finger into a
point and exclaiming "look at that". Initially, the child's gaze lingered on the tester's
outstretched hand. Next she looked back to the tester's face as if to ascertain what
the tester might be trying to communicate. For a moment she seemed to dwell on
her face and then suddenly she shifted her gaze to the target of the tester's still
outstretched finger. This case illustrates something about the means by which infants
achieve shared reference by their psychological movement through the other. It is
not simply that the other's point or gaze or other gesture serves as a signpost to
objects and events in the world. It is also that the infant is drawn into alignment
with the other's orientation toward a shared world.
In a second videotaped interaction from a similar testing arrangement, a
12-month-old girl swiftly followed the tester's gaze and outstretched finger to locate a
poster on the wall and then, giving herself barely enough time to take in the contents
of the poster, quickly looked back to the tester's face with an engaging smile.
The tester (JH) spontaneously commented on this sharing, with a playful and
affirming "Did you see Big Bird?", and the child immediately turned, still smiling, to
look back at the poster again. Sharing attention may therefore involve more than
being reorientated to see what another person sees, or even being moved to adopt
a new attitude to the world; it can also mean that the people involved register and
share that sharing is going on.
Researchers have come a long way in measuring behavioural accompaniments
to sharing experiences. Especially important and influential has been research into
the forms of joint attention that occur at the end of the first year of life (Bruner
1983; Seibert, Hogan and Mundy 1982; Sugarman 1984; Walden and Ogan 1988).
In joint attention, infants monitor and follow the gaze of others and point to, show,
and/or alternate eye contact with reference to objects and events in order to direct a
person's attention, share experiences, request things, or inform. In one form of joint
attention (initiating joint attention, or IJA), infants use gaze and gesture to achieve
sharing of experience with another (Kasari, Sigman, Mundy and Yirmiya 1990).
Mundy, Kasari and Sigman (1992) provided evidence that directing the attention
of others to objects, rather than seeking to obtain objects, is especially associated with
positive affect (see also Bruner 1981; Hornik and Gunnar 1988; Rheingold, Hay and West
1976), and infants show more positive affect during joint attention involving both
objects and caregivers than when playing with objects by themselves (Adamson
and Bakeman 1985). Mundy et al. (1992) considered how such measures of positive
affect in joint attention may contribute to operationalizing aspects of intersubjectivity,
or the sharing of one's inner subjective experiences with others.
The important thing here is not to allow such concepts as joint attention, dis-
plays of positive affect, or even smiles to displace the meaning of intersubjective
engagement, and in particular, to diminish the importance of joint attention as an
expression of the human propensity to share the experience of sharing. In order to
avoid this, it is essential to return to the meaning of what has been operationalized.
By way of illustration, here are two examples of studies which seek behavioural
evidence for what might be referred to as "sharing looks" (Venezia, Messinger, Thorp
and Mundy 2004) and "knowing looks" (Rakoczy, Tomasello and Striano 2005). Both
approaches involve the timing of smiles in relation to looks at a social partner dur-
ing triadic experiences with objects, and both remain alert to the need to consider
the intersubjective meaning toward which the evidence points.
Venezia et al. (2004) provided evidence to suggest that although the frequen-
cy of smiling during joint attention remains stable between eight and 12 months,
the timing changes so that the older infants begin to smile in anticipation of shar-
ing the event with a social partner. These anticipatory smiles involve smiling at
an object and then gazing at the tester while smiling. The authors suggest that
this developmental change may reveal increasing ability to communicate pre-ex-
isting positive affect about an object to another person, or be an index of social
referencing in which the infants attempt to confirm their emotional response, but
they also go further when they raise the possibility that anticipatory smiling may
index an intersubjective sense of the social partner as someone with whom experiences
can be shared (Venezia et al. 2004:404). We would argue even further
that such sharing looks might reflect the pleasure of psychological linkage with
another person vis-à-vis a shared world; after all, the infant is more likely to
be smiling in the first place if someone else is there sharing the experience. The
smile toward the person with whom one is sharing might be expressive of sharing
rather than communicating information. It might even be part of the communicative
act: "We are enjoying this together."
Consider this vignette. A 15-month-old toddler seated across the table from a
tester utters the word "Look" as he extends his index finger into a point while gazing
intently at the tester's face. He ascertains that he has her attention, and then
begins to point vaguely to one side while turning to a poster just behind him.
He seems to pass this by, as he swings his body to look and point definitively to
another poster on the wall to his right side. At the moment he extends his arm
to a fully-fledged point, he looks back to the tester's face with a broad smile. His
gaze and smile seem to affirm that he has created an experience for them to share.
He had decided to show something to his conversational partner, long before he
knew what that something would be.
Moments later, the tester activates a pop-up clown. When this emerges from
hiding, the boy lets out a shriek of delight as he watches the clown dance, but
his full "exuberance" (a way of characterizing the intense positive affect and social
gregariousness that can be associated with initiating joint attention; Vaughan, van
Hecke and Mundy, personal communication, September 2006) is only manifest
when he lifts his gaze to connect with the tester with whom he is sharing the experience.
At this moment, his face bursts into a smile: he is sharing the sharing. His
heightened positive affect is an expression of the pleasure in sharing.
Our second example comes from research conducted by Rakoczy et al.
(2005) with 24-month-olds. These researchers gave an operationalized definition
for smiles: "retracting both lip corners upwards and backwards" (Rakoczy et al.
2005:61). Among the toddlers they studied, looks accompanied by smiles (let's
just call them smiles) were equally likely to occur during the imitation of instrumental
and of pretence actions, but the smiles associated with pretence were more likely
to coincide with looking at the social partner, while those associated with in-
strumental actions were more likely to have their onset while the child was still
looking at the object. The investigators considered their data in terms of some
special interpersonal behavior by the child in pretending and perhaps more
knowing smiles than in instrumental actions (Rakoczy et al. 2005:59). Here we
find pointers to the interpersonal nature of symbolizing (also Hobson 1993), and
again, reflections of intersubjective engagement.

4. From engaging and sharing to knowing

The key idea here is that it takes emotional engagement to be moved by others, not
least to be moved into sharing experiences. Movements in subjective orientation
are essential for much of what is so special about human life (for transitions in
thought as well as feeling), and their early beginnings are well illustrated by the
phenomena of social referencing (e.g., Sorce et al. 1985). An especially important
developmental implication of the ability to apprehend, respond to, and be
moved by the subjective states of other people in relation to the world, is that
such movements create mental space for negotiating attitudes and meanings. It
is only once an infant has experience of shifting across person-anchored stances
(through assimilating another person's attitudes but, at the same time, registering
the source of those attitudes as other) that it becomes possible for the infant to
achieve understanding of what it means to have and to pass through alternative
perspectives.
Understanding what it means to have a perspective is critical if the infant is
to develop the capacity knowingly to introduce new meanings/perspectives on
to the materials of symbolic play. It is also part of achieving a conceptual grasp
of the relation between selves and the world, where the very idea of a self is that
people are each a self with his or her own take on the world that may be shared
or challenged, aligned-with or repudiated. So by the middle of the second year of
life, pre-reflective and intuitive forms of role-taking are yielding both the means
to conceptualize (symbols), and contents to conceptualize: selves-with-attitudes in
correspondence with a world that is the focus of attitudes and co-reference.
It is here we find the beginnings of explicit forms of knowing, not least know-
ing about selves-with-minds. Humans think and know in virtue of having a form
of life that is held in common with others, and in which agreements in judgement
are possible (Wittgenstein 1958).
In our view, all of this becomes possible through special forms of interper-
sonal engagement. Initially, there is dyadic engagement as we described in the
two-month-old infant above. Subsequently, there is engagement with others'
engagement with the world, through which one may be moved in attitudes towards
the world (as in social referencing). Out of this emerges the propensity to move
within our own minds, from one psychological orientation to another as if from
one person's stance to another's. It is in the propensity to dwell in the experience
of the other, and to experience the world through the other (that is, to identify
with the attitudes of another person from the stance of the other), that we believe
the specialness of human forms of sharing is grounded.

5. Four questions with brief replies

In the opening pages of this contribution, we summarized three studies in autism.
Our aim was not only to illustrate the possible yield from such research for our
understanding of typical as well as atypical development, nor merely to share our
struggles to publish measures of intersubjectivity. We also wanted to introduce
the notion that intersubjective engagement might be structured by the process of
identifying with the bodily expressed attitudes and actions of other persons. We
proceeded to develop the idea that intersubjective sharing is pivotal for develop-
ment, and highlighted some of the motivational as well as cognitive implications
of this fact.
At this point, we shall not attempt to explicate the notion of identification
in detail, nor try to unravel the details of development from interpersonal en-
gagement to interpersonal understanding (and understanding of everything else,
come to that) over the first two years of life. Instead, we offer some brief responses
to four rhetorical questions that we consider important in thinking about intersubjectivity.
We hope that the studies in autism cited earlier, and the vignettes
of typically developing young children offered subsequently, will give personal
colouring to this schematic account.
Our four questions are as follows:
Firstly, what is the structure to human interpersonal engagement that is so
special, and that is unfolded (as it were) in the forms of joint attention and social
referencing towards the end of the first year of life? Secondly, is this same mode of
relatedness operative when a toddler achieves not simply interpersonal co-ordi-
nation, but also understanding of what it means to be a person or self who is able
to respond to other persons and to move through a variety of person-anchored
perspectives in his or her own mind? Thirdly, what has this to do with the origins
of theory of mind? Finally, what does it mean that intersubjectivity works as a
system of self-in-relation-to-other, and what does autism tell us about this?
In brief, our answers to these questions are as follows. We believe that in order
for sharing to be possible by, say, the second month of life, there needs to be
a capacity to register the attitudes expressed through another person's body in
such a way that they can be experienced in relation to one's own state. This is what
sharing entails, after all. We suppose this to be an early form of identifying with
the attitudes of others. This natural propensity not only to respond to, but also to
assume (in part) another person's feelings, gives motivational impetus to what we
have referred to as being moved by others. Importantly, the boundary within the
self-other mode of experience (and of course, this is prior to any concept of self
and other) is also operative in settings of joint attention and social referencing.
When an infant is moved to adopt the orientation of other people, it is vital
that he or she has the pre-reflective capacity to register the source of the new orientation
as other. This is important because the child has to reach the
point of adopting such perspectives knowingly and, in part, by adopting an
other-person-anchored perspective on himself or herself. This would not be possible without
the differentiation of person-anchored stances within the child's own mind.
Now to theory of mind. This is a theoretically loaded expression that is or
should be concerned with concepts of the mind such as those of thinking, be-
lieving, feeling, intending, and so on. In some quarters, the theory is framed in
terms of a computational metaphor, with talk of representations, metarepresenta-
tions, and computations. Although there are strengths to this approach, among
many limitations is the lack of a developmental theory (beyond innatism) to ex-
plain the acquisition of concepts of persons-with-minds. The present approach
begins with interpersonal co-ordination of bodily-anchored expressive acts, and
through a developmental pathway that opens such co-ordination to encompass
co-orientation with others towards a shared world, leads to the acquisition of
symbolic representations along with newfound abilities to relate to one's own relations
with that world.
Of course there are further developments after the second year of life, notably
around the end of the third year, when children seem to acquire a new grasp of
the concept of belief. In our view, the critical acquisition here is the concept of
reality, so that children can now grasp how people may falsely hold something as true of
reality (and again, we take it that the notion of reality is that which transcends
individual human viewpoints, insofar as it is the way things are). In other words,
children who understand what it is to hold a belief grasp what Perner (1990) calls
the representational relation. We think that it is through intersubjective engagement
and communication with others that children come to see the force of this
supra-personal characterization, this arbiter between my view and yours: reality
as a given in relation to which humans should merely assent.
Finally, the system of self-in-relation-to-other, although remarkably robust, is
not immune to disruption from various sources. Autism is instructive for helping
us see not only the implications of such disruption, probably including impaired
creative symbolic play, limited interpersonal (and theory of mind) understanding, and
impaired self-monitoring (see also Barresi and Moore this volume),
but also how different forms of dysfunction may underlie the serious impairments
in interpersonal engagement that characterize the syndrome. One of our special
interests has been in congenital blindness (e.g. Hobson 2005), where the lack of
visual (and probably, to a lesser extent, affective) co-orientation with others towards
a shared world is a serious risk factor for developing the syndrome of autism.
Through the association between congenital blindness and autism, we may
discern the importance of interpersonal engagement and sharing attitudes towards
particular, visually perceived objects and events in the environment for coming to
know about symbols, about selves, and more specifically about people's minds.

6. Methodology revisited

In this final section of the chapter, we return to issues of methodology. We have al-
ready indicated how one way to investigate intersubjectivity is to study human be-
ings in whom the ability to engage with others is compromised, specifically those
with childhood autism. Here we turn the argument around, and consider how in
order to understand the nature of autism, and even to diagnose the condition,
we need to be attuned to the significance of intersubjective engagement. What-
ever the resistances to according intersubjective phenomena a central place in our
accounts of autism and in our methodological approaches to its investigation, it
is already the case that judgements about intersubjectivity pervade authoritative
diagnostic approaches.
Among individuals with autism, something is unusual or limited in the kind
of interpersonal engagement we have been describing. In early clinical descriptions
of the condition, Kanner (1943) highlighted the children's "inborn autistic
disturbances of affective contact" (Kanner 1943:250). The phrase "affective
contact" is notable for attempting to capture what it is like for a person to be in
relation to another, an issue that is still marginalized in much theorizing about
autism. Kanner captured something of this when he wrote that people, so long
as they left the child alone, figured in about the same manner as did the desk, the
bookshelf, or the filing cabinet (Kanner 1943:246).
It is instructive to consider how we measure whether or not a child has
autism. One approach is to focus upon a relative lack of the presence of par-
ticular forms of social-communicative exchange described earlier. For example,
Mundy (2003) describes the children's fundamental social disturbance in terms
of early and robust impairments in joint attention development. These are those
specific forms of eye contact, affect and gestures used for the singularly social
purpose of sharing experiences with others (Mundy and Neal 2001). When one
administers the Early Social Communication Scales (ESCS: Seibert et al. 1982),
for example, young children with autism between three and seven
years of age are likely to initiate only one or two joint attention bids (Mundy et al.
1986), far below what one would expect in children without autism. This approach
succeeds in characterizing a pivotal aspect of the children's social-communication
impairments, and such impairments have serious implications for
the children's language and cognitive development. Still, one needs to ask in
what sense such impairments in joint attention are the crux of the developmental
psychopathology of autism and to what extent they represent prominent
early manifestations of something deeper and yet earlier in the children's atypical
interpersonal engagement.
But to return to the assessment of autism: one of the most widely-respected
approaches to formal research diagnosis is the Autism Diagnostic Observation
Schedule (ADOS-G: Lord et al. 2000), a semi-structured series of planned social
presses to prompt requests, engagement in joint attention, and communication
with the tester. It is very effective in eliciting certain kinds of interpersonal ex-
change that are often limited in autism. Not only this, but also the standardized way
of administering and scoring the measure means one can interpret scores across
research laboratories for diverse populations around the world. Interrater agreement
standards are high. Items are typically scored on a three-point scale ranging from 0
(no evidence of abnormality related to autism) to 2 (definite evidence of
abnormalities related to autism). Some of the items are as follows (and these are
just a few examples): unusual eye contact, quality of social overtures, quality of
social response, and overall quality of rapport. Note the degrees of appropriately
subjective judgement in the ratings. Or again, here are a few examples of items
from the Parent Interview for Autism (PIA: Stone and Hogan 1993). Parents are
asked whether the child enjoys interacting with familiar adults, enjoys playing
with other children, looks through people as if they weren't there, seems to be
hard to reach or in his or her own world.
Our point here is that intersubjective judgements are a necessary and im-
portant part of conventional diagnostic procedures. So why is there such resis-
tance to incorporating them into the broader domain of scientific investigation?
Even though eminent researchers will informally refer to the lack of emotional
sparkle in imitative events (Rogers 2006) or the absence of warm, joyful expressions
(Wetherby 2006) among children with autism, both examples from presentations
at the recent International Meeting for Autism Research (IMFAR) in
Montreal, we have a long way to go before the developmental significance of the
phenomena that such expressions capture is fully appreciated.
As a final flourish, here is one last study of our own (Hobson et al. 2006). In
this study, we asked children and adolescents to pose for a picture. While one
tester took the participant's photograph with a Polaroid camera, a second tester
filmed the encounter. All of the participants looked at the camera when their
photograph was taken, and there were no group differences in the tendency to
sustain or avert gaze during the episode. But when it came to the quality of these
looks, there were marked and highly significant differences between the partici-
pants with and without autism. The participants without autism were judged to
show self-consciousness in their looks away and their looks to the tester. Those
with autism were judged to give blank looks away and to the tester. So, even
when the numbers of looks were similar, the quality of the encounter, the feel
of the exchange, contrasted sharply between the groups. Once again, here were
looks that raters could judge reliably. Although one might now try to track down
which behavioral/expressive characteristics corresponded with the self-consciousness
of participants' looks, it was only through the ratings of intersubjectively
attuned human beings that such looks could be identified as self-conscious
in the first place.
These issues matter when we think about treatment for children with autism.
Only when those involved in treatment cease to consider emotion recognition
or interpersonal skills as abilities that can be taught or trained (even by
computer), and instead seek ways to foster development in intersubjective engage-
ment through efforts to draw individuals into appropriate kinds of socially co-or-
dinated experience by providing manageable and positively engaging exchanges
(e.g., Gutstein, Burgess and Montfort 2007) do we come closer to finding ways to
help children with autism shift to a more fruitful developmental trajectory.
To conclude: there are indeed intersubjective foundations for engaging, shar-
ing, and knowing. Intersubjectivity has self-other structure, in our view a struc-
ture inherent in the process of identifying with others. One way to explore this
claim is to study children with autism. And one of the most promising starting-
points for understanding, diagnosing, and treating autism is to consider how in-
tersubjective engagement shapes the development of social relations and creative
symbolic thinking.

Acknowledgements

This chapter was written while we were at the Center for Advanced Study in the
Behavioural Sciences, Stanford, California. We are hugely grateful to the Center,
and also to the Tavistock and Portman NHS Trust, for making our stay possible
and so rewarding.

References

Adamson, L. and Bakeman, R. 1985. Affect and attention: Infants observed with mothers and
peers. Child Development 56: 582–593.
Bosch, G. 1970. Infantile Autism. New York: Springer-Verlag.
Bruner, J. 1981. Learning how to do things with words. In Human Growth and Development,
J. Bruner and A. Garton (eds.), 62–83. London: Oxford University Press.
Bruner, J. 1983. Child's Talk: Learning to Use Language. New York: Norton.
Capps, L., Kehres, J. and Sigman, M. 1998. Conversational abilities among children with autism
and children with developmental delays. Autism 2: 325–344.
García-Pérez, R.M., Lee, A. and Hobson, R.P. 2007. On intersubjective engagement: A controlled
study of nonverbal communication in autism. Journal of Autism and Developmental
Disorders 37: 1310–1322.
Gutstein, S.E., Burgess, A.F. and Montfort, K. 2007. Evaluation of the Relationship Development
Intervention Program. Autism 11: 397–411.
Hobson, J.A. and Hobson, R.P. 2007. Identification: The missing link between joint attention
and imitation? Development and Psychopathology 19: 411–431.
Hobson, R.P. 1993. Autism and the Development of Mind. Hove, Sussex: Erlbaum.
Hobson, R.P. 2002/2004. The Cradle of Thought. London: Pan Macmillan & New York: Oxford
University Press.
Hobson, R.P. 2005. Autism and emotion. In Handbook of Autism and Pervasive Developmental
Disorders (3rd ed), F.R. Volkmar, R. Paul, A. Klin and D. Cohen (eds.), 406–424. New
Jersey: John Wiley & Sons, Inc.
Hobson, R.P., Chidambi, G., Lee, A. and Meyer, J.A. 2006. Foundations for self-awareness: An
exploration through autism. Monographs of the Society for Research in Child Development,
Serial No. 284, 71.
Hobson, R.P. and Lee, A. 1998. Hello and goodbye: A study of social engagement in autism.
Journal of Autism and Developmental Disorders 28: 117–126.
Hobson, R.P. and Lee, A. 1999. Imitation and identification in autism. Journal of Child Psychology
and Psychiatry 40: 649–659.
Hobson, R.P. and Meyer, J.A. 2005. Interpersonal foundations for the self: The case of autism.
Developmental Science 8: 481–491.
Hornick, R. and Gunnar, M. 1988. A descriptive analysis of infant social referencing. Child
Development 59: 626–634.
Kanner, L. 1943. Autistic disturbances of affective contact. Nervous Child 2: 217–250.
Kasari, C., Sigman, M., Mundy, P. and Yirmiya, N. 1990. Affect sharing in the context of joint
attention interactions of normal, autistic, and mentally retarded children. Journal of Autism
and Developmental Disorders 20: 87–100.
Landis, J.R. and Koch, G.G. 1977. The measurement of observer agreement for categorical
data. Biometrics 33: 159–174.
Lee, A. and Hobson, R.P. 1998. On developing self-concepts: A controlled study of children and
adolescents with autism. Journal of Child Psychology and Psychiatry 39: 1131–1141.
Lord, C., Risi, S., Lambrecht, L., Cook, E.H., Leventhal, B., DiLavore, P.C. and Rutter, M. 2000.
The Autism Diagnostic Observation Schedule–Generic: A standard measure of social
and communication deficits associated with the spectrum of autism. Journal of Autism
and Developmental Disorders 30: 205–223.
Meyer, J.A. and Hobson, R.P. 2004. Orientation in relation to self and other: The case of autism.
Interaction Studies 5: 221–244.
Mundy, P. 2003. The neural basis of social impairments in autism: The role of the dorsal
medial-frontal cortex and anterior cingulate system. Journal of Child Psychology and
Psychiatry 44: 793–809.
Mundy, P., Kasari, C. and Sigman, M. 1992. Nonverbal communication, affective sharing, and
intersubjectivity. Infant Behavior and Development 15: 377–381.
Mundy, P. and Neal, R. 2001. Neural plasticity, joint attention, and a transactional social-orienting
model of autism. International Review of Mental Retardation 23: 139–168.
Mundy, P., Sigman, M.D., Ungerer, J. and Sherman, T. 1986. Defining the social deficits of
autism: The contribution of non-verbal communication measures. Journal of Child Psychology
and Psychiatry and Allied Disciplines 27: 657–669.
Perner, J. 1990. Understanding the representational mind. Cambridge, MA: Cambridge University
Press.
Rakoczy, H., Tomasello, M. and Striano, T. 2005. On tools and toys: How children learn to act
on and pretend with virgin objects. Developmental Science 8: 57–73.
Rheingold, H., Hay, D. and West, M. 1976. Sharing in the second year of life. Child Development
83: 898–913.
Rogers, S.J. 2006. Imitation difficulties in autism. Paper presented at the International Meeting
for Autism Research, June 1–3, Montreal, Canada.
Seibert, J.M., Hogan, A.E. and Mundy, P.C. 1982. Assessing interactional competencies: The
Early Social Communication Scales. Infant Mental Health Journal 3: 244–245.
Sorce, J.F., Emde, R.N., Campos, J. and Klinnert, M.D. 1985. Maternal emotional signaling: Its
effect on the visual cliff behavior of 1-year-olds. Developmental Psychology 21: 195–200.
Stone, W.L. and Hogan, K.L. 1993. A structured parent interview for identifying young children
with autism. Journal of Autism and Developmental Disorders 23: 639–652.
Sugarman, S. 1984. The development of preverbal communication. In The Acquisition of Communicative
Competence, R.F. Schiefelbusch and J. Pickar (eds.), 23–67. Baltimore: University
Park Press.
Tantam, D., Holmes, D. and Cordess, C. 1993. Nonverbal expression in autism of Asperger type.
Journal of Autism and Developmental Disorders 23: 111–133.
Trevarthen, C. 1979. Communication and cooperation in early infancy. A description of
primary intersubjectivity. In Before Speech: The Beginning of Human Communication,
M. Bullowa (ed.), 99–136. London: Cambridge University Press.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding, and
acts of meaning in the first year. In Action, Gesture and Symbol: The Emergence of Language,
A. Lock (ed.), 183–229. London: Academic Press.
Tronick, E.Z., Bruschweiler-Stern, N., Harrison, A.M., Lyons-Ruth, K., Morgan, A.C., Nahum, J.P.,
Sander, L. and Stern, D.N. 1998. Dyadically expanded states of consciousness and the process
of therapeutic change. Infant Mental Health Journal 19: 290–299.
Venezia, M., Messinger, D.S., Thorp, D. and Mundy, P. 2004. The development of anticipatory
smiling. Infancy 6: 397–406.
Walden, T.A. and Ogan, T.A. 1988. The development of social referencing. Child Development
59: 1230–1240.
Wetherby, A.M. 2006. Social communication profiles of children with autism spectrum disorders
in the 2nd and 3rd years of life. Paper presented at the International Meeting for
Autism Research, June 1–3, Montreal, Canada.
Wittgenstein, L. 1958. Philosophical Investigations. Oxford: Blackwell.
chapter 5

Coming to agreement
Object use by infants and adults

Cintia Rodríguez and Christiane Moro

According to the naturalistic view of the object, children give meaning to
objects in a natural, direct, and spontaneous manner, without the need of others.
The myth underlying the spontaneity of the subject-object encounter is that, in
contrast to the widely assumed opacity of social reality within modern psychological
theory, there exists an alternative, non-social physical reality that
is literal and transparent. We challenge this by adopting a pragmatic approach
to objects. In everyday life, objects are situated in communicative contexts and
used for doing things. During their first year of life, children achieve triadic
interactions (baby-object-adult) involving very different degrees of agreement
with adults concerning an object's use and meaning by means of diverse semiotic
systems in contexts of joint communicative action.

1. Introduction

The ways in which the subject approaches the object constitute one of the major
themes in early cognitive development; in one way or another, the whole Geneva
School was devoted to this issue. Despite this and the growing recent influence
of Vygotsky, the naturalistic view of the object is predominant in studies of
infant cognitive development. This naturalistic view implies that the child relates
to the object in a natural, direct, and spontaneous way, often eliminating not only
any action and use by the child upon the object, but also any adult-baby joint communicative
action on the world (Rodríguez 2007). According to this viewpoint,
the child is supposed to promote his early cognitive development through his en-
counters with an obvious and transparent reality which can be accessed without
the need for any educational guidance through semiotic mediation, in order for
him to share different degrees of meaning with the people surrounding him. The
myth underlying this spontaneous encounter between the subject and the world
is the existence of a literal, obvious, transparent, non-social physical reality,
which declares itself nakedly, as opposed to the opacity of the social reality that
exists only through communicative and conventional channels. This dichotomy
between the opacity of the social and the immediacy of the meaning of objects has
had a profound impact on research into early typically and atypically developing
children.
In this chapter we challenge this naturalistic view, and focus instead on a
pragmatic and semiotic approach to objects. Objects are used for doing things in
everyday life; these meanings and functions are socially established and are linked
to the use we make of them. From the beginning of their lives, children are co-
opted by the people around them as co-protagonists of their activities when doing
things with objects. Adults communicate with children through and about objects
used in everyday contexts. They achieve different levels of agreement about the
meaning of things being used in such contexts. As a consequence, children be-
come involved. They are not in contact with any syntactic formal object, how-
ever, but with a pragmatic one, involving shared uses in communicative contexts
where people do things with them as part of the social world.
In the first section of this chapter we will challenge the widely established
opposition in the theory of mind literature between the referential opacity of the
mental world versus the self-evidence of so-called physical reality. Then we will
question the solitary nature of the encounter between the child and the world and
the spontaneous categorizations resulting therein. We will refer to the perma-
nence of objects determined by their social function (before they have names).
This permanence through use is already in place by the end of the first year of
life and is presented by adults to children from the outset. Before objects are per-
manent to children they are regarded and used as permanent by adults. Then we
will consider some voices arguing against the passivity in which babies are placed
in laboratories and we will confront this with how the classical theories (Piaget,
Wallon, Vygotsky) conceived subjects always as active, as always transforming
their surroundings and hence as a key factor in development. In the next section
we will consider the difference between what is meant in the literature by joint at-
tention and our view of joint action, which implies a pragmatic and semiotic per-
spective where objects are an integral part of baby-adult communication. In the
last section we present five observations of children (from 2 to 12 months old)
in contexts of triadic interaction with adults around the same object, and consider
the different levels of agreement they reach within this 10-month interval between
the first and last set of observations. As an essential tool of analysis, we have iden-
tified the different semiotic systems that are involved. Thanks to these, different
levels of adult-child agreement about objects, events and situations in the world
can be established about the meanings and the uses that can arise.

2. The social nature of objects

2.1 Referential opacity of the social versus the evident physical reality

This dichotomy can be illustrated by an excellent (if extreme) and well-known example
that has contributed much to the persistence of this naturalistic and oversimplified
view of the object. This example was provided during the 1980s and 1990s
by traditional theories of mind that characterize mental states by their referential
opacity as opposed to the evidence, transparency and literality of physical reality.
The so-called physical reality is evident, transparent and literal (Leslie 1987).
For instance, according to Baron-Cohen, "Belief, knowledge, desire, and pretence
are all opaque mental states. That is, they suspend normal truth-conditions governing
the propositions they prefix [...] In contrast, 'I saw a mouse' is true only if
I did indeed see one. Thus, perception is transparent, not opaque" (Baron-Cohen
1993:65, emphasis added). As Costall and Dreier (2006:4) point out:
the Theory of Mind approach, which has been dominant within developmental
psychology for the last decade, has largely removed the issue of children's use of
objects from the research agenda, not only because of its emphasis upon mind,
but also because of its explicit separation of children's understanding of other
people from their understanding of things.

The vast influence of the theory of mind perspective when trying to explain what
is missing in children with autism is well known. Autistic children have difficulties
in pretend play, in true and false belief and in communication, whereas
"such impairment will not fundamentally affect [their] apprehension of physical
artifacts, including representational artifacts such as photographs or maps"
(Leslie and Roth 1993:92, emphasis in original). According to this view, autistic
children should have no difficulties in their dealings with objects. In other words,
the literal reality of objects that is assumed to be directly evident is unproblematic.
Since the 1980s, this position has been extremely influential regarding views
about what should be looked at in early stages of development. Although we cannot
develop this here, we would like to stress that this view of the good relation
of autistic children with objects is contradicted by Williams, Kendell-Scott and
Costall's (2005) findings concerning the difficulty autistic children show with regard
to using everyday objects in a conventional way (see also Arango, Chávez,
and Lasprilla 2003; Pardos and Rodríguez 2005; Moore 2004).

1. We would like to refer here to the claim made by Jeremy Carpendale and Charlie Lewis
(2004) in relation to children's developing understanding of mind. They consider that "the
development of children's social understanding occurs within triadic interaction involving the
child's experience of the world as well as communicative interaction with others about their
experience and beliefs" (2004:79). There is maybe only a small, subtle difference in what we say:
we consider that not only social understanding, but development (including what is called
cognitive development) occurs within triadic interaction, following Vygotsky's well-known maxim:
"The path going from the child to the world goes through another person."

2.2 Early cognitive development: The kingdom of spontaneity

The naturalistic view of the object in early cognitive development is not only the
result of the impact of the classical theory of mind perspective, but is also char-
acteristic of the enormous influence of the mainstream cognitive approaches that
have typically ignored the world when trying to explain the mind. We encourage
readers to have a careful look at the different sections included in recent infant de-
velopment textbooks. Almost invariably one will find a section devoted to social
development, clearly differentiated from another section dedicated to cognitive
development. Now we invite the reader to concentrate on the cognitive development
section and see what it says about the relation between the child and the objects
presented (or represented).
In research, babies are frequently situated in a context (usually the laboratory)
on their own, with no other people to interact with, thus implicitly eliminating
social interaction as a cause of any cognitive development. In this solitary context
their (active) actions on the world are neglected in favour of their (passive) reac-
tions to the stimuli presented to them. When any characteristic of the object is
included, most of the time it is merely shape or color (e.g., Bremner et al. 2005).
In such contexts, the child becomes a big solitary looking eye who does not
transform anything in the world but only reacts on his own towards what is being
presented.
A good example of this is Spelke's nativist approach, according to which infants,
from a very early age, have object permanence and representations of objects
as coherent wholes without any social intervention, and without any action
by the child. Her comment that "newborn infants also may have a functional system
of object representation" (Spelke 1998:185) seems an extremely vague and
general statement. Mandler (2000, 2004b), for instance, adds something to this
early representation of objects when she says that there is more than one kind of
object categorization. She distinguishes two types: perceptual categorization and
conceptual categorization. The latter is based not on what objects look like, but on
what objects do (2000:3), and meaning accrues from what things do, not what
they look like (2004a:168). Both perceptual and conceptual categories serve different
functions: Infants, just like adults, make their inductive generalizations
on the basis of kind, not on the basis of perceptual similarity (2004a:199), and
when discussing the classification of animals by infants, she notes that these data
are at least partly culturally determined (2004a:204). Adding this pragmatic level
seems to us essential. Nevertheless, it is extremely surprising to find Mandler go
on to conclude with the following claim, especially in a chapter in a book entitled
The Development of the Mediated Mind, dedicated to Katherine Nelson.
In contrast to other chapters in this book, there was no mention here of the influence
of sociocultural context. That is because infants are to some extent shielded
from such influence by their lack of language ... by and large, the early development
of the ability to categorize objects and to learn the important basics of language
such as that it is used for communication, are all governed by universal factors
common to infants in all cultures. It is when the foundations have been laid down
and the naming practices of the culture begin to teach the infant which details are
important that more cultural influence can be seen.
 (Mandler 2004b:27–28, emphases added)

But how is it that the baby is shielded from the influence of other people precisely
at a time when they are most vulnerable, and completely dependent on them
for every aspect of their existence? Our own findings about how children start
using everyday objects in a conventional way by the end of the first year stress the
cultural and social aspects. If children start to understand objects as signs of their
[public] use, it is because adults deploy an intense semiotic activity when using
the objects with, or in the presence of, children. Thanks to this, children come
to categorize them according to their social and public use (as adults do). The
mediation through signs (not all signs are linguistic) with the object itself is the
central tool of adult-child communication allowing the child to appropriate, over
an extended process, these public meanings of use (Rodríguez and Moro 1998,
1999; Moro and Rodríguez 2005). It is possible that later concepts have their roots
in this kind of canonical use. The first conventional uses of objects could be regarded
as concepts in action (Rodríguez 2006).

2.3 The permanence of the object: Only one?

The permanence of the object has been investigated over many years. According
to Piaget, the child is able during the second half of the first year to remove an
obstacle in order to grasp the hidden object. In more recent research, in order
to test whether the child does or does not have object permanence earlier than
Piaget suggested, images of objects moving autonomously, on their own, are projected
onto screens (Moore and Meltzoff 1999; Bremner et al. 2005). The visual
reaction of the child is then recorded. There is no doubt that important findings
are obtained with this method. Nevertheless, several things should be said. First, the
infant's knowledge of the object is an extremely complex thing that involves a long
process of shared meanings through ontogenesis, at many levels, which cannot
be simply reduced to its permanence as assessed with this paradigm. Given these
preconceptions, it is hardly surprising to read that "infant object permanence is
still an enigma after four decades of research" (Moore and Meltzoff 1999:623).
According to Piaget (1937), there are degrees of permanence. In fact, he considers
the object as a limit, in the mathematical sense: one is continually approaching
objectivity but the object itself is never reached.
Second, there is another kind of permanence of objects, functional permanence,
which is related to their everyday uses, social meanings and cultural functions,
whose acquisition by children takes place in educational contexts (Rodríguez
and Moro 1998). This kind of permanence was largely ignored by Piaget and by
mainstream cognitive psychology (Rodríguez 2006). Such functional permanence
is different from the abstract conceptual categorization that is assumed in modern
cognitive theory to be independent of education and culture, and it cannot be captured
in the appearance/disappearance paradigm, which simply ignores the function
and cultural use of objects. (We will come to this problem later.) Third, in real
life, as we previously pointed out, objects do not engage in their own spontaneous
motion back and forth without the intervention of intentional agents. Finally, in
real life, children are active subjects, not mere re-acting entities.

3. Some voices against the passivity of the subject

Many voices from the ecological tradition are very active nowadays against
the passivity imposed on subjects in experimental situations. As Alan Costall
(2004:76) has put it, "if we really did spend all our lives just waiting for things to
happen to us (as the participants in psychology experiments are typically required
to do), then whatever activity is involved in perceiving would necessarily be
confined to internal processing." Eleanor Gibson and Ann Pick (2000:14), in opposition
to common accounts of perception, refer with a great sense of humor to
theories that begin with a motionless creature haplessly bombarded by stimuli.
The stimuli presented belong to the evident, transparent and obvious reality with
which the baby is supposed to be directly confronted.
A similar objection to the absence of action is raised by some researchers
in relation to the paradigm used, which tells only part of the story about
the cognitive abilities of very young infants, as the visual observation paradigm
treats the infant as a couch potato who merely sits, watches, and offers an opinion
(by showing varying degrees of interest) (Willatts 1997:132). If we check our
not-so-remote past, this complaint about the neglect of actions in the world when
dealing with cognitive development is a recurrent theme. In their well-known
paper "If you want to get ahead, get a theory", Karmiloff-Smith and Inhelder emphasized
that "unlike previous Genevan research articles in which extensive quotations
were given from what children said, this study's protocols consist mainly
of detailed descriptions of children's actions" (1974:200, emphasis added). To be
fair to Piaget, we must say that their explicit disapproval is part of the Genevan
movement on Microgenesis and problem solving, developed by Le groupe de stratégies
(Inhelder et al. 1992) during the 1970s and 1980s while trying to restore the
central place of actions upon the objects by subjects as the way of changing their
representations, and thus provoking development. This criticism about Piaget's
neglect (omission) of action, which is very paradoxical given his epistemological
emphasis upon the active subject, does not apply to his studies of sensorimotor
development but rather to his work with children after 4 or 5 years old.
In addition to this neglect of the active nature of the child, there has also been
a disregard of communication between the baby and adults about objects in the
world. In other words, we need to take seriously into account joint action in the
world as a central scenario from where children build their meanings by acting in
collaboration with others.
Let us now take a brief look at the classic developmentalists and see what
status was given by them to subjects' actions upon objects and to others' interventions
when dealing with cognitive development.

4. Action, communication and objects

4.1 Classical developmentalists on action, communication and objects

The situation in early developmental psychology has not always been the same. If
we look back through the 20th century to the classics, the picture we find is very
different. The American biologist Stephen Jay Gould, when complaining about
the anti-intellectualist varnish of American culture, used to say that the nuggets
of the authentic discoveries abound in the primary literature (1998:14). Over the
last 50 or 60 years, researchers concerned with cognitive development have
usually looked only at babies and objects. This was the classical Piagetian position.
For him, the social world of conventions and communication had nothing to do
with the origin of intelligence, as there is no causal link between social facts
and psychological development (Bronckart 1997). Nevertheless, Piaget was very
much concerned with babies' actions in the world, not just their responses to
stimuli. The method employed by Piaget was observation of common situations
from everyday life, which allowed him to look at microgenetic processes from a
very qualitative standpoint; the complexity of the object was beyond doubt. According
to Costall, both Piaget and Gibson insisted upon the primacy of being
in the world (2004:85) since, "Contrary to the dominant approaches within cognitive
psychology and artificial intelligence, they did not take our capacities for
representation and symbolism for granted, but saw clearly that representational
activities need to [be] grounded in our interactions with our surroundings." A classical
objection had been addressed to Piaget in relation to the primacy of action: How
active must a subject be to be considered active (see Moore 2004)? The recent discovery
of mirror neurons in humans (see Barresi and Moore this volume; Zlatev
this volume), whereby "a number of areas of the brain are activated when people make a
movement that involves reaching out and grasping an object, as well as when they
watch grasping movements made by others" (Corballis 2002:47), will probably
provide answers to this important question.

2. Even when researchers refer to development as social-cognitive, as soon as they consider
objects or tasks involving objects, they treat them as the non-social part of the situation (e.g.
Striano and Bertin 2005:563).
It is not by accident that Piaget was very much influenced by Köhler's work
on chimpanzees. If we look at the kind of situations Köhler considered as indicating
that there was intelligence without language, we have to bear in mind
that chimpanzees always had some practical problem to solve. They were active
subjects inventing new solutions (see Leavens, Hopkins and Bard this volume,
Pika this volume).
Two other important figures in European Developmental Psychology are
Vygotsky and the French psychologist Henri Wallon. Vygotsky is ambiguous
about how many protagonists should be considered in early cognitive development.
Sometimes it seems that there is room only for the child and the object.
In his manuscript about the first year of life, however, Vygotsky says explicitly that
the main way babies engage in activities at this early age is through other persons.
He says something very important: "objects appear and disappear from the child's
visual field thanks to the other's will" (1984/1996:285). The subject considered
by Vygotsky and by Wallon (1942/1970) was active, always involved in the transformation
of the world thanks to the use of tools and signs. Wallon, for instance,
used to say that biology is socially oriented.
As stressed earlier, if we compare this situation with the subject being studied
during the last 50 years by mainstream cognitive psychology, the picture is quite
different:

1. No room is made for education and communication with other people (simi-
lar to Piaget).
2. The object under consideration is assumed to be self-evident, constitutes the
literal reality, and is neither defined by its social use (absence of pragmatics
of the object), nor by its roles in the communicative activities of the people
about it, around it, and with it.
3. The subject no longer acts, but only reacts following the stimuli
presented by the experimenter (who is playing the active role).
4. There is an almost total lack of interest in the processes of construction, in
microgenesis. There is also a disregard of qualitative analysis and longitudinal
studies.

4.2 Acknowledging subjects as active means considering not only joint attention but also joint action

Some researchers refer to the need to consider triadic interactions (adult-object-baby),
which focus on how the child shifts his interest from the adult towards the
world and how he includes others and the world at the same time. This adult-
world inclusion is known as joint attention. For instance, according to Legerstee
et al. (1987) when children around 17 weeks of age start to be able to grasp a
doll, they manifest less interest towards adults. This does not mean a halt in their
communicative development, but rather this new phase of orientation towards
an object prepares the way for a new kind of communication involving objects
(1987:228). By the end of the first year, communication changes dramatically and
typically developing children start to refer to events taking place beyond the lim-
its of interpersonal exchanges (Bakeman and Adamson 1986:228). Other studies
of joint attention have involved children with Down syndrome (Legerstee and
Weintraub 1997). Others show how children as young as 6 months old use adults
instrumentally; from 6 to 9 months there was a rapid increase in their initiation
of eye contact and stylized communication, and after 10 months they began to
employ conventional symbolic means such as pointing, object offers, and nods
(Mosier and Rogoff 1994:71). Other studies suggest that a link exists between
dyadic and triadic interactions in children aged 7 and 10 months, and the lack of
age effects suggests a somewhat more gradual process of social cognitive development
than that implied by the suddenly emerging "9-month revolution" suggested by
Tomasello (1999; see Striano and Rochat 1999; Striano and Bertin 2005).
What distinguishes these views about joint attention from the main argument
we develop here is that we need to know how the mutual understanding between
adults and babies about objects in the world takes place. This is why we need to
know how the long process of joint action on the world involving babies in inter-
action with adults works. In other words, we need to understand how the place of
the adult as a mediator between the child and the world changes.

4.3 Triadic interactions through the first year of life: A pragmatic perspective

If we agree that objects have functions (Nelson 1974) and shared meanings that
are part of social history and belong to systems of use shared by the
community (Sinha 2005), an important question for developmental
psychologists is how, when, and through which processes babies come to
appropriate the pragmatics of objects situated in communicative contexts and used
for doing things. How do children achieve, through diverse semiotic systems in
contexts of triadic interaction, very different degrees of agreement with adults
about the meaning and uses of objects? These different agreements between adults
and children are possible thanks to a long process of construction in which babies
are actively involved with other people.
This means putting back on the agenda of early development the study of
processes with the help of observational and microgenetic methods. Our focus on
objects according to their uses is far removed from the culture-free scenario of
mainstream cognitive psychology. There are two major differences. Firstly, early
cognitive development is not such a solitary business. We need to understand
how the educative influences of other people on the baby operate in commu-
nicative contexts, which semiotic systems are at work in ontogenesis, and how
they develop. This means substantially extending the Vygotskian hypothesis ac-
cording to which communication through signs is taken to be a central cause of
cognitive development. Secondly, there is an urgent need to restore to children
their status as active agents, thereby rescuing them from the characteristic passive
position where they can only re-act to stimuli. As a consequence, we reach the
triadic interaction between baby-object-adult, in which both adults and children
act together and communicate by different degrees of shared meanings about the
uses they make of objects in the world. We come to an exceptional online scenario
in which development takes place and new meanings grow from shared uses, and
different levels of adult-child agreement are possible. Understanding these means
seriously taking into account materiality as a central focus around which two
different subjects communicate and establish a process of new, more powerful,
shared meanings and conventions.
The reality of objects, with their everyday social meanings of use, is a very
complex issue. Our studies concerning the uses of objects made by children,
during the second half of their first year and first half of the second, indicate that
many levels of representation, of meaning and use have to be carefully specified
(Rodríguez and Moro 1998; Moro and Rodríguez 2005). When objects are
considered from a social point of view, rather than reduced to a non-social physical
reality, they may have multiple meanings, and can be used according to very
different functions, both by the children and by the adults in the interaction.
Distinguishing these different functions, putting them into a developmental
perspective, and understanding the adult's influence as a cause of cognitive development
is an urgent task that early developmental psychology must address.
As we pointed out when considering the permanence of the object from its
functional perspective, adults in their interactions with babies use objects as per-
manent long before the babies themselves are able to consider them as permanent
entities. This has important consequences for the child since children already live
in a world that is considered and treated as permanent by those surrounding them.
The same thing can be said about symbols. Before children are able to produce
their first symbolic uses of objects, the adults surrounding them produce symbols
(see below, observation 1) when using objects with different functions. This serves
various didactic purposes: showing how things should be done, producing
gestures, or referring to absent things, actions or events. The adult acts with
the child as a symbol maker, thus introducing them into symbolic scenarios of
uses of objects long before they are able to understand symbols as symbols, or, of
course, to produce them.
The same must be said in relation to other semiotic systems, such as ostensive signs
involving objects. Before children are able to produce their first ostensive signs,
whether giving or showing (Moro and Rodríguez 1991) or with a private self-reflexive
function (Rodríguez and Palacios 2007), the adult segments the world by
highlighting certain aspects of it in a space of joint action, thus provoking shared
meanings whose complexity grows through development. Objects first appear in
children's lives when they are used by another person in a communicative
context (see observations below). The same can be said about the conventional uses
of objects. Adults use objects in everyday life in a conventional way almost
constantly, directly introducing the child into these practices a long time before
objects become for the child signs of their social and public use. Objects are first
used (in shared contexts) before being understood by the child. After all, meaning
arises from use.
The same thing happens with indicative gestures. Before the child indicates
with the help of a sophisticated pointing gesture, adults manage in very different
ways to make clear which events or actions are indicated, provoking situations
of joint action and attention. Of course the same thing happens with language.
All this implies a long process and a variety of levels of adult-baby agreement in
communicative contexts about the meaning of objects and events. This variety of
levels of agreement involves different semiotic systems with different degrees of
complexity.
According to our view, infants do not begin to interact with objects only from
5 or 6 months of age as is usually claimed in the literature (Messer 1997). It
depends on how things are analysed, in which context, and how many partners are
considered. When we look microgenetically at babies as young as 2 months in
naturalistic interactions with adults and objects, we can see how adults bring ba-
bies into contact with objects in the world and how they do things with them. Of
course, the main responsibility and the initiative come from the adult since the
baby is not yet capable of finding them on their own. We will consider all this in
the next section.

5. Joint uses of objects by adults and babies from 2 to 12 months

We will present five observations of infants from 2 to 12 months old to illustrate the
evolution of triadic interactions where the adult frequently acts as a guide in
contexts involving the use of a very common object. Through them we will see very
different levels of agreement between adults and children about the same object.
The multiplication and differentiation of new meanings is possible in contexts of
shared use. New meanings involving different semiotic systems grow little by little
through shared contexts of use and joint action.
The videotaping took place in the families' homes in three suburban areas of
Madrid. The same object, Chico, very commonly used in schools as well as in
family homes, was used with all three dyads. It consists of six hoops of different
diameters that can be placed over a tapered pivot. The biggest goes at the bottom,
the smallest at the top. Three transparent hoops are filled with small pieces of
plastic and sound like a rattle when shaken. The three other coloured hoops are
empty, and so produce no rattling effect.
Two main conventional uses can be realised with this object (although, as we
can see in the observations, other symbolic uses are also evident):

1. The simplest consists in shaking the hoops as if they were rattles.
2. The most complex consists in inserting the hoops onto the support.

Of course, these observations are only illustrations of what happens at each age.

Through these illustrative observations we will see how different levels of agree-
ment between adults and infants can be reached about the same object. At the
beginning the adult focuses on the easiest uses at the level at which the child can
be included (ostensive signs are very important then), mainly shaking the hoops
as rattles (see observations 1 to 3). Later the tendency is to use the object in more
complex ways and to place the hoops onto the pivot (see observations 4 and 5).

5.1 Ostensive signs and immediate demonstrations allow the beginning of joint actions

Observation 1: Alejandro, 0;1,30. Duration: 52 sec.


[Joint action (father-baby-hoops) where the father realises several rhythmic
ostensive uses of a hoop as a rattle and two symbolic uses. Alejandro follows
his father's uses with great attention and allows himself to be introduced into
them]
[Alejandro is lying on the sofa; his father is leaning towards him] The father
takes a hoop and produces an ostensive sign, showing it and shaking it to Alejandro
and saying: "this makes a sound". Then, once again, he shakes the hoop close
to Alejandro's ear and stops, following a rhythm four times as follows: "Listen,
do you like it?", bringing the hoop towards the baby again. Alejandro, who
has been looking very attentively at his father all the time, now smiles at him,
vocalises, and moves his body (arms and legs). "Yes, this one you like it", his
father says. When for the fourth time his father shakes the hoop close to his
ear, Alejandro stops looking at his father's face and turns his head towards his
father's shaking of the hoop. His father then leaves the hoop and takes a
second hoop: "this one is bigger", producing a new ostensive sign when shaking it.
Alejandro watches his father's action with interest. The father then transforms
the hoop into a hat when he puts it on Alejandro's head, "and we put it on top
of your head". Alejandro turns his head and the hoop slides from his head. His
father takes it again, shaking it in front of the baby a couple of times. Alejandro
attentively watches his father's shaking action and at the same time directs his
arm towards the hoop. [...] "Look, let's see if you can pull it off your hand" [...]
introducing the hoop onto Alejandro's arm as if it were a bracelet, then shaking
his hand. This shaking-baby's-hand-with-the-hoop action once again provokes
a rhythmic sound. Alejandro, who is looking towards his father all the time,
co-participates thanks to the immediate demonstration of the object made
by his father when he involves the child in the same use, in his shared space
of rhythmic action, allowing some sharing of meaning. The child also thus
becomes a protagonist in the double function of the object in its conventional
use (as a rattle) and in its symbolic use (as a hat and as a bracelet).

The child is not only confronted with an object, but with an object intentionally used
by his father and presented to him in a certain way as a result of a particular point
of view, with particular intentions, constituted by meanings of use by the adult.
When an object is being used, this implies that a choice is being made, a
perspective is being adopted; it is an object to which certain signs are applied. If
we look carefully at the father's action, we see that he is using different semiotic
systems to communicate different things about and through the hoop to his son.
Through its use, he transforms the object into something else, a particular thing
that would not be accessible to this very young child in isolation. The hoop shown
to the baby becomes something to look at, to listen to, and eventually to be used
conjointly. The hoop becomes an object of shared attention and, to some degree,
of shared action by both protagonists. This is only possible thanks to the father's
ostensive gestures when using the object, although this does not mean that the
meaning given by each of them is identical.
At the beginning of the observation, Alejandro looks at his father's face with
interest. Only after a while does he shift his attention and look at the ostensive
actions of his father with the hoop. The father uses several semiotic systems and at a
certain point the ostensive signs (hoop shown as a rattle) become a symbol for the
father (when it is used as a hat or as a bracelet). Looking at this situation carefully
allows us to understand that babies are exposed to complex uses related to
different systems of signs (including symbols and language) from a very early age, long
before they are able to understand them as such, as symbols or as words, or to
produce them.
Even with a two-month-old child, we are far from dealing with a re-acting
subject, as a big looking eye. This is so because adults constantly introduce the
child into their own ongoing meaningful activity. They intentionally introduce
the child into different semiotic systems long before the child is able to understand
and share the same meanings. The child himself is active at three levels: (1) As a
subject who is very much interested in the other's actions. (2) The adult introduces
the child constantly into his own uses. (3) The child himself acts at his level with
his own resources; at the very end he directs his arm towards the hoop used by his
father, and this provokes in his father a new symbolic use of the object as a bracelet.

5.2 Ostensive categorization uses by the father by sound and uses by the child

Observation 2: Alejandro, 0;4,7. Duration: 37 sec.


[Ostensive presentation / classification by the father of different hoops accord-
ing to whether they make sounds or not]
Alejandro is sitting on his father's lap. Both are looking towards the support
with the hoops, which is very close on the table. The father takes three hoops in his
hands, two empty ones that are yellow and orange and one transparent with
little pieces inside, and presents all of them to the child while exclaiming:
"which one do you want? This one makes a sound", shaking the transparent one
as a rattle, then leaves it on the table. "Or do you want this one?", showing the
orange one, which has nothing inside (there is no sound). Alejandro, who is
looking with great interest at what his father is doing, stretches out his hand
and takes the orange one. He opens his mouth, thus indicating he is going to
use it for sucking, which is what he immediately does. Then his father pulls
out from the support another transparent hoop with little pieces inside and
creates new ostensive signs when showing them all again to Alejandro.
Alejandro keeps the orange hoop in his hands and looks at the two others
his father is showing him: "or do you prefer those?" [...] Alejandro drops the
orange ring and stretches his hand out towards the rattle hoop, trying to
grasp it. But it is too far away to reach, and then he starts crying [...].

As in observation 1, here the interaction between the two protagonists takes place
through the use of the hoops, and once again the father does not perform the
complex conventional use of introducing the hoops into the support, as it
would be too difficult to share with Alejandro at such an early age. Instead,
he focuses on the easiest conventional use of some hoops as a rattle. He
segments his presentation, realising a series of ostensive signs when showing them,
according to a classification/categorization based on whether or not the hoops
produce a sound when shaken. In his ostensive presentation
the father is organizing the reality shown to the child according to this
rhythmic musical criterion. This raises a very interesting question about the degree to
which children's early categorisations are actually spontaneous. We see here
how Alejandro's father is active in providing a meaningful framework to the child
according to the uses that can be made of the hoops. His intervention provides the
child with a shared context of classification (and subsequently of meaning). It comes
to be a really important way of discretization, of putting order into reality
from a rhythmic musical and pragmatic point of view.
As in observation 1, Alejandro is, at his level, very active. He gets readily
involved in his father's ostensive signs when he shows the hoops from a rhythmic
point of view. The big difference with what happened at 2 months is that now he
is much more involved with the uses of the hoops he is able to perform himself.
He readily understands the meaning of the ostensive signs of his father as "take
them and do something"; most of the time Alejandro just sucks them. A very
important shift takes place now in relation to the place of the object in the
interaction, as Alejandro always participates actively, making a certain selection in response to
the ostensive signs with the objects realised by his father. In this sense, it is a
triadic interaction, with joint action. Both the father and child share important
degrees of agreement about what to do with what is being shown. The father puts
the object into a certain framework of meaning. The child accepts this point of
view and does something with it. Whereas at 2 months Alejandro got involved
most of the time in the uses made by his father, he is now able to use and choose
the objects in a much more autonomous way.

5.3 The object as sign of its social use for the child related to its musicality

Observation 3: Alejandro, 0;6. Duration: 86 sec.


[Alejandro is able to obtain a variety of rhythmic sounds when he shakes the
hoops. The father is providing the support, allowing the child an exploratory
use of the object according to its musicality, sonority and rhythm]
Alejandro is sitting on his own (while his father is anxious about whether he
will lose his balance). In front of the child four of the six hoops are inserted
into the support. He directs his hands towards the support, which is too far from him.
His father brings it toward the child. Alejandro then hits the hoops inserted
into the support with both hands, provoking a rhythmic sound. This action
of rhythmically hitting the support gives way to trying to get one hoop, but
without success as this is too difficult for him (even though he is looking
attentively at the hoops). His father reads his intentions, evaluating at the
same time the difficulties he is having, then takes the last hoop from the
support (the orange one), dropping it closer to the child (between his knees and the
support). Meanwhile Alejandro is looking at and trying to reach
another hoop outside the support, too far from his reach. His father moves it
towards Alejandro, but he finally takes the (orange) hoop previously dropped
by his father close to him. Alejandro then hits his hoop against the support or
against the table, obtaining a variety of sounds. His father holds the support
close to the child (otherwise it would slip out of his reach).

If we compare this observation with the previous ones, we see a major difference.
In observation 1, it is the father who, through different ostensive signs, presents
the hoops and introduces the child directly into their use: the sounds they can
produce when shaken. In the second observation he presents several hoops,
making an ostensive sign and proposing that the child make a choice from the point of
view of musicality, something that Alejandro does at once. In this
third observation, by contrast, the child is already able to do it himself, obtaining with
the hoops two types of musicality (when hitting the table or the support). The common
agreement shared between the child and the father about the use of the object
is very easy to see. In this case the father does not give the hoops to the child; he
only brings them closer in such a way that the child himself can realise and explore
different uses related to their musicality. Both share the meaning of an object to be
taken. The father reads his intentions and makes adjustments to make it easier for the
child to take the hoops. These adjustments are far less ostensive (bringing to attention)
than the ostensive signs when presenting the hoops (observations 1 and 2)
because (1) the hoops are already the object of the child's attention, and (2) both
protagonists share the definition of the (easiest) conventional use of the object (if hoop,
then take and produce musicality with it). At 6 months of age the child is able to
explore different kinds of rhythmic sounds on his own.
The common feature of these three observations is that the uses selected,
promoted and facilitated by the father are related to sonority and rhythm (uses more
basic than introducing the hoops into the support). What is different
is the degree of involvement of each protagonist in the use. We will see that the
situation is quite different in the next observations with Javier, a 9-month-old child,
and with Nerea, a 12-month-old girl, where their mothers promote
the more complex conventional use of introducing the hoops into the support.

5.4 The child understands the ostensive signs but does not understand
the pointing gestures related to the more complex conventional
use of the object

Observation 4: Javier, 0;9. Duration: 28 sec.


[The mother tries to elicit the more complex conventional use of the objects
(to introduce hoops into the support). Javier understands without problems
his mother's ostensive signs, but does not understand her pointing gestures as
indicating a certain use of the object]
Javier is sitting. His mother is sitting close to him. Javier is manipulating the
support with both hands while the hoops are lying all around. Then his
mother takes the biggest hoop and, while showing/offering it to the child, says:
"this one first, it is bigger, insert it my son, insert it". Javier looks at the hoop
presented by his mother, and takes it with one hand while holding the support
with the other, but as the support is not vertical the hoop touches it in a
parallel position. The mother says "there, there", doing a pointing that is immediate
(the finger is touching the support) and multiple (it happens more than once,
in this case two times). "Insert it, here, here, here", putting the support, held
by the child, in a quasi-vertical position and doing again an immediate and
multiple pointing. "Put it here". Javier manipulates the hoop very close to the
top of the support, looking at it very carefully but not following his mother's
indicative gesture as related to the conventional use of the object of inserting the
hoop into the support. Then the mother says, "[the hoop] has pellets", perhaps
interpreting the child's interest in looking at the hoop, "does it sound?"
After attending to the hoop, Javier once again takes the support and brings the
hoop towards it, but without trying to insert it. Then he puts the support far
away to the right, while keeping hold of the hoop in his other hand [...].

Compared with the previous observations, this is the first time the mother uses
highly elaborated conventional indexical signs (pointing gestures) as semiotic
mediators and, this time, not only ostensive signs with the object, as was the
case in the previous observations. This is also the first time that the mother tries to get
the child to recognise the more complex conventional use of the hoop by
introducing it into the support. In the previous observations the adult was much more
concerned about the use of the hoops according to their sounds (the other
possible conventional use of the object). It is true that the degree to which the child
is introduced into this use varies greatly, from what happened with Alejandro
in observation 1 (where the musicality comes from the father's use) to
observation 3 (where the child explores different sonorities and rhythms). What seems
evident to us is that, at his level, Alejandro from 2 to 6 months was always part of this
universe of sound and rhythm.
Now we would like to stress how the ostensive signs function here. The
mother shows/offers the hoop and Javier readily understands his mother's intentions
about this particular use of the object: he takes the object being offered. Clearly,
there is no problem with this level of agreement between mother and child. Both
share this kind of meaning related to this object use. Nevertheless, with the
indexical signs (the pointing gestures in relation to the conventional use expected)
things are quite different. The mother points towards the support and shows her
intentions about the way the child should use the object (i.e., the more complex
conventional use of the hoops by introducing them into the support), but Javier
does not follow her intentions in relation to this particular use. He does not
interpret the signs according to the use she proposes. To interpret the mother's
pointing gesture correctly means to use this indexical sign as a tool to help carry out
the conventional use of the hoop by introducing it into the support. Making this
inference ("if pointing gesture, then introduce this hoop into this support") is very
complex and non-transparent. It requires sharing several levels of signs (and
several semiotic systems involving the meaning of the pointing, the social
meaning of the object as sign of its use, and the articulation between both,
considering that the object becomes a sign of its use thanks to the ostensive and indexical
signs applied to it). Objects themselves do not demonstrate how they should be
used. We always have to do that on their behalf. This is why objects are opaque
entities. Under no circumstances is their meaning self-evident. For the child to
understand his mother's intentions necessarily means to be able to articulate them
with the state of affairs of the world. In this case Javier has no problem
understanding the ostensive gestures, but does not understand the conventional
indexical one relating to a public use. Parent and child share important meanings related
to what to do, but only to a certain degree. Only later in development do more
complex uses (canonical, or symbolic, for instance) become systematic.

5.5 Agreements about more complex use: the child understands the ostensive and the pointing gestures related to the use of the object

Observation 5: Nerea, 0;12. Duration: 25 sec.


[Mother elicits the most complex conventional use of the objects (introduce
hoops into support). Nerea readily understands ostensive signs, as well as
the pointing gestures of her mother as related to the conventional use of the
object]
[At the beginning of the session, Nerea does not make any connection
between the hoops and the support as the place where they should be inserted
but sucks them instead. Her mother makes several distant demonstrations
(ostensive signs of the whole conventional use) showing how to do it. Finally
Nerea starts trying to do it herself and she introduces herself into the new
meanings according to the practice proposed by her mother]
Nerea and her mother are sitting on the floor. The support is standing between
the two of them. One hoop is already inserted into the support. The mother
inserts another, saying: "come on, I'll insert them" and, while she is saying
this, Nerea moves the red hoop in her hand towards the support in an attempt
to comply with the conventional use of the object. But it is too far away, and
so she fails. Her mother, changing her initial intention of introducing the
hoops herself, says, "Yes, this one", pointing towards Nerea's approaching hoop.
As it is too far, Nerea withdraws the hoop, keeping it in her hand. She looks
with great attention at her mother's action. "Right, the red, put it here, here",
indicating with a gesture that is immediate (touching the support with several
fingers) and multiple (she does it two times), provoking a sound. She says
again: "here, put it here", realising an immediate multiple pointing towards the
support. And taking the whole support and shaking it to indicate where to
insert the object, the mother exclaims, "The red one", while pointing towards
the hoop held by Nerea. "Here", pointing again towards the support with an
immediate and multiple pointing. Nerea, who has been closely attending to
her mother's gestures, once again moves the hoop towards the support in an
attempt to insert it, but it slips away and falls down. The mother takes it, and
segments her action into two phases: First she brings the hoop towards the
support, and says, "look", and then she stops for a while with the hoop in the
air above the pivot. In the second phase, she says "like this", inserts the hoop
and thus completes the action. "And now, the orange one", she says,
showing/giving this new hoop to Nerea. Nerea, who was reaching out towards her
mother's action, immediately takes it. The mother says "here", pointing to the
support, and Nerea reads her mother's intentions as she interprets correctly
the meaning of the pointing as related to the use of the object, directing the
hoop towards the place indicated by her mother. Finally she inserts the
hoop onto the support and her mother claps her hands, "very good". "Now,
the yellow one", says the mother, giving it to Nerea, who again without
hesitation takes it and directs it towards the support. However, as she brings the hoop
towards the support at the wrong angle (i.e. parallel rather than vertically),
the hoop hits the support and so she fails to insert it. Even though she did not
succeed, her intention was to achieve the conventional use of the object.

If we compare what happened in observations 1 to 3 and in observations 4 and
5, we see how the mothers focus, with the oldest children, Javier (observation 4)
and Nerea (observation 5), on the more complex conventional use of the object,
inserting the hoops onto the support, something that was completely ignored by
the adults with the youngest children. In this latter case, adults were more focused on the
conventional use of the hoop as a rattle. In other words, adults are extremely
sensitive to the abilities of the children, to what they can and cannot do. This
is why they ignore some uses (and meanings) of the object (too complex for a
2- or 4-month-old child) and privilege those uses that allow levels of agreement
with the child.
If we then compare Javier and Nerea, we see that Javier
understands the meaning of his mother's ostensive gestures inviting him to take the object
offered, whereas the pointing gestures related to the use were too complex for him
and he does not direct the hoop towards the support being pointed to by his
mother. With Nerea the situation was similar at the beginning of the session, but
in the observation we have presented, she was already able to understand many
of the semiotic mediators at work when her mother was trying to elicit from her
the conventional use of the object. Consider, for instance, her comprehension of
the different modalities of the pointing gestures, of the distant demonstrations or of the
ostensive signs when the mother shows the hoops to Nerea. At this moment, the
object on its own becomes for Nerea the sign of the (complex) use that should
be made of it. New and much more complex levels of agreement between the
child and adult now begin to emerge about how to use the object.
As an open question for further research, we would like to stress that in
another study concerning the private ostensive and pointing gestures that Nerea
produces with the hoops and the support at 18 months (Rodríguez and Palacios 2007)
we see a certain parallelism with the segmentations of the use made by her mother
that we have seen in this last observation. Whether or not there is any connection
between the two activities needs further research.

6. Conclusions

Twentieth-century philosophy has centered its reflection on language (Eco 1997).
The linguistic turn has strongly affected our vision of language, putting the
question of its meaning into a pragmatic perspective of everyday contexts of use
(Wittgenstein 1953; see Itkonen, this volume). There is no need to recall how
influential this pragmatic position has been in Psychology. Bruner's work with babies
is an excellent example (Tomasello 2001; Shotter 2001). However, this pragmatic
perspective has hardly been extended to objects, the very things we do things with.
The emphasis placed by Bruner (1975) on contexts-of-use in relation to language
acquisition has, paradoxically, not been applied to objects. It is as
if language were used, but objects were not. Perhaps this happens because objects
seem to be evident, no matter how complex they may be. The naturalistic view of
objects, according to which they are obvious and show their meaning in a literal
and direct way, is predominant nowadays in early developmental research
(Rodríguez 2006, 2007).
This position has been challenged in this chapter. In opposition to the
oversimplification implied by the naturalistic view of objects, we have situated them
in a semiotic and pragmatic position. That is to say, we have considered their
use(s) in everyday life. Objects cannot be reduced to any physical self-evident
non-social reality (Costall 2004; Rodríguez and Moro 1999, 2002; Sinha 2005;
Sinha and Rodríguez, this volume). Many of the things we do with them are only
possible within communicative contexts, and communication itself changes
according to the object or use selected. From a developmental perspective, very
different levels of adult-baby agreement can be reached when they interact around
an object.
As examples illustrating this oversimplification, we have focussed on a few
topics extremely well known in the literature, such as the dichotomy in
theory-of-mind research between the opacity of the social world and the evident physical
reality. This dichotomy does not fit with our findings when looking at
baby-object-adult interaction. The so-called physical reality is complex, as objects are
included in normative practices.
Our position is not far from Racine's (2004) who, following Wittgenstein,
distances himself from a naturalistic logic of meaning: intentions, beliefs and
desires are not in our head; they exist and are understood in language-games
(2004:271). What Racine says is that the theories of mind we use should not be
placed exclusively inside the head because they are actually born, and take place,
in language-games. Our focus here is on what happens below the level of
language-games. This means that meaning exists, to borrow Wittgenstein's words, in sign-games
with objects (Rodríguez 2006) long before, and probably as a precondition
for, the language-games that appear later in development. Two subjects, such as an
adult and a 2-month-old baby, can reach important levels of agreement around
the use of an object because adults make their intentions clear when using objects
in a communicative context.
Concerning the question of the subject when dealing with objects, in our view,
there is an urgent need to place the active child back on the agenda of Psychology,
in place of a merely reactive one. Studies in early cognitive development often
belong to the kingdom of spontaneity, ignoring that children interact with
others from the beginning of their life in contexts of joint action. The permanence
of the object is another important but oversimplified topic. How many kinds of
object permanence are there? In our view, it is necessary to include
something like a pragmatic permanence that takes into account the object's social use. After
all, if objects have public names, they also have public functions. How does
the conventional use of the object (once the child has acquired it) affect the object's
permanence? Is there any link? After all, objects are used according to certain
rules shared by the community and these represent some kind of stability. The
conventional use confers on the object a permanence of its social use, a sort of
label of use (Rodríguez 2007). This means a detachment from the concrete and
strict individuality of the case.
A similar thing can be said in relation to early categorizations. Is there any
relation between the public uses of objects and early conceptual categorizations?
To use an object in a conventional way implies the application of a certain level
of categorization. Maybe we should conclude that the early categorizations made by
children are not so culture-free, nor as spontaneous, as has been widely
assumed.
When we put the object into this pragmatic position, and examine the uses of
objects microgenetically within triadic interactions (baby-object-adult), we
discover the multiple scenarios at work and how each new shared meaning emerges
through use in such communicative contexts. Objects do not afford the same
things at different moments in ontogenesis and cannot be excluded from
normative practices. Adults seem to know this very well.
In the observations presented in this chapter, we have seen how children
reach, through use, very different degrees of agreement with adults involving
diverse semiotic systems around the meaning of objects. All observations
involve situations of joint action between two protagonists and an object (not only
joint attention, where babies most of the time do not act). Different kinds and
levels of conventional uses and different semiotic systems are at work. When
objects are considered from this pragmatic position, the fiction that sees them as a
literal, self-evident, non-social reality disappears. In fact, many communicative
processes take place through them. Meanings can be very diverse, as different
semiotic systems with different levels of complexity are involved. The child reads the
social meaning of the object and the adult's intentions because signs and objects
are articulated through use.
Dealing with all that implies putting the child in an active position (as is
commonplace among the classical theorists). Triadic interactions appear as a royal road
to considering the processes at work long before the child is able to actively involve
the adult and the world in one and the same communicative act. The magic number three
as the unit of analysis, together with the educational role of adults, may help to clarify
how this growth of meaning takes place, starting with the simplest signs, with the
simplest uses. The triadic interaction we are dealing with cannot be only joint
attention, that is to say, a communality of attention around something (see Zlatev,
this volume). We need more; we need to know how what we call joint action works
and evolves, how both protagonists transform the object by using it.
A long time before the child is able to produce his first ostensive,
conventional and symbolic uses of objects, once they become signs of their [social] use, his
first pointing gestures towards something, or his first words, the adult acts with
the child as a symbol maker, produces ostensive gestures with objects, points
to them to make her intentions clear, uses the world in a canonical manner and
talks to the child almost constantly. This implies a long process and an enormous
variety of levels of adult-baby agreement about objects and situations during the
first year of life.
In a word, the meanings of objects in everyday life are a function of their
use. If objects are not excluded from normative practices, then we need to know
how this pragmatic position of the object and communication with others affects
early cognitive development. After all, we do things with words, but we also do
things with things (Costall and Dreier 2006), in agreement with others, and this is
extremely important from the very beginning of life. Therefore, we have to
understand how these sign-games in which objects are included work in development. Many
levels of convention concern not only language but also objects. Theories of early cognitive
development and linguistic theories cannot ignore all that any longer.
Acknowledgements

We would like to thank Alan Costall and the Editors of this book for their excel-
lent comments and suggestions on this chapter.

References

Arango, S., Chávez, L. and Lasprilla, A. 2003. Uso de un Objeto por Seis Niños Autistas.
Unpublished manuscript, Universidad del Valle, Instituto de Psicología.
Bakeman, R. and Adamson, L. 1986. Infants' conventionalized acts: Gestures and words with
mothers and peers. Infant Behavior and Development 9: 215–230.
Baron-Cohen, S. 1993. From attention-goal psychology to belief-desire psychology: The
development of a theory of mind, and its dysfunction. In Understanding Other Minds:
Perspectives from Autism, S. Baron-Cohen, H. Tager-Flusberg and D. Cohen (eds.), 59–82. Oxford:
Oxford Medical Publications.
Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Bremner, G., Slater, A., Foster, K., Johnson, S., Mason, U., Cheshire, A. and Spring, J. 2005.
Conditions for young infants' perception of object trajectories. Child Development 76:
1029–1043.
Bronckart, J.-P. 1997. Semiotic interaction and cognitive construction. Archives de Psychologie
65: 95–106.
Bruner, J. 1975. From communication to language: A psychological perspective. Cognition 3:
255–287.
Bruner, J. 1990. Acts of Meaning. Cambridge, MA: Harvard University Press.
Carpendale, J.I.M. and Lewis, C. 2004. Constructing an understanding of mind: The
development of children's social understanding within social interaction. Behavioral and Brain
Sciences 27: 79–151.
Corballis, M. 2002. From Hand to Mouth. The Origins of Language. Princeton: Princeton
University Press.
Costall, A. 2004. From direct perception to the primacy of action. In Theories of Infant
Development, G. Bremner and A. Slater (eds.), 70–89. Oxford: Blackwell.
Costall, A. and Dreier, O. 2006. Doing Things with Things. The Design and Use of Everyday
Objects. Hampshire: Ashgate.
Donald, M. 2001. A Mind So Rare. The Evolution of Human Consciousness. New York: W.W.
Norton.
Eco, U. 1999 [1997]. Kant y el Ornitorrinco. Barcelona: Lumen.
Gibson, E.J. and Pick, A. 2000. An Ecological Approach to Perceptual Learning and Development.
Oxford: Oxford University Press.
Gould, S.J. 2001 [1998]. Les Coquillages de Léonard: Réflexions sur l'Histoire Naturelle. Paris:
Seuil.
Inhelder, B., Cellérier, G., Ackermann, E., Blanchet, A., Boder, A., de Caprona, D., Ducret, J.-J.
and Saada-Robert, M. 1992. Le Cheminement des Découvertes chez l'Enfant. Recherche sur les
Microgenèses Cognitives. Neuchâtel-Paris: Delachaux et Niestlé.
Itkonen, E. this volume. The central role of normativity for language and linguistics.
Karmiloff-Smith, A. and Inhelder, B. 1974. If you want to get ahead get a Theory. Cognition
3: 195–212.
Leavens, D.A., Hopkins, W.D. and Bard, K.A. this volume. The heterochronic origins of
explicit reference.
Legerstee, M., Pomerleau, A., Malcuit, G. and Feider, H. 1987. The development of infants'
responses to people and a doll: Implications for research in communication. Infant Behavior
and Development 10: 81–95.
Legerstee, M. and Weintraub, J. 1997. The integration of person and object attention in infants
with and without Down syndrome. Infant Behavior and Development 20: 71–82.
Leslie, A. 1987. A Language of Thought approach to early pretense. In Symbolism and
Knowledge/Symbolisme et connaissance, J. Montangero, A. Tryphon and S. Dionnet (eds.),
133–144. Genève: Cahiers de la Fondation des Archives Jean Piaget, 8.
Leslie, A. and Roth, D. 1993. What autism teaches us about metarepresentation. In
Understanding Other Minds. Perspectives from Autism, S. Baron-Cohen, H. Tager-Flusberg and
D. Cohen (eds.), 83–111. Oxford: Oxford Medical Publications.
Messer, D. 1997. Referential communication: Making sense of the social and physical world.
In Infant Development: Recent Advances, G. Bremner, A. Slater and G. Butterworth (eds.),
291–309. Sussex: Psychology Press.
Mandler, J. 2000. Perceptual and conceptual processes in infancy. Journal of Cognition and
Development 1: 3–36.
Mandler, J. 2004a. The Foundations of Mind. Origins of Conceptual Thought. Oxford: Oxford
University Press.
Mandler, J. 2004b. Two kinds of knowledge acquisition. In The Development of the Mediated
Mind: Sociocultural Context and Cognitive Development, J. Lucariello, J. Hudson, R. Fivush
and P. Bauer (eds.), 13–32. Mahwah, London: LEA.
Moore, C. 2004. George and Sam. London: Penguin Books.
Moore, K. and Meltzoff, A.N. 1999. New findings on object permanence: A developmental
difference between two types of occlusion. British Journal of Developmental Psychology
17: 563–584.
Moro, C. and Rodríguez, C. 1991. ¿Por qué tiende el niño el objeto hacia el adulto? La
construcción social de la significación de los objetos? Infancia y Aprendizaje 53: 99–118.
Moro, C. and Rodríguez, C. 2005. L'objet et la Construction de son Usage chez le Bébé: Une
Approche Sémiotique du Développement Préverbal. Berne, New York: Peter Lang.
Mosier, C. and Rogoff, B. 1994. Infants' instrumental use of their mothers to achieve their
goals. Child Development 65: 70–79.
Nelson, K. 1974. Concept, word and sentence: Interrelations in acquisition and development.
Psychological Review 81: 267–285.
Pardos, A. and Rodríguez, C. 2005. The importance of the use of objects in the early detection
of autism. Paper presented in the symposium Uses of objects and semiotic mediation in
impaired children. First ISCAR Congress, Seville, 20–24 September.
Piaget, J. 1977 [1937]. La Construction du Réel chez l'Enfant. Neuchâtel-Paris: Delachaux et
Niestlé.
Pika, S. this volume. What is the nature of the gestural communication of great apes?
Racine, T.P. 2004. Wittgenstein's internalistic logic and children's theories of mind. In Social
Interaction and the Development of Knowledge, J.I.M. Carpendale and U. Müller (eds.),
257–276. Mahwah, NJ: Erlbaum.
Rodríguez, C. 2007. Object use, communication and signs. The triadic basis of early cognitive
development. In The Cambridge Handbook of Socio-Cultural Psychology, J. Valsiner and
A. Rosa (eds.), 257–276. New York: Cambridge University Press.
Rodríguez, C. 2006. Del Ritmo al Símbolo: Los Signos en el Nacimiento de la Inteligencia.
Barcelona: ICE-Horsori.
Rodríguez, C. and Moro, C. 1998. El uso convencional también hace permanentes a los
objetos. Infancia y Aprendizaje 84: 67–83.
Rodríguez, C. and Moro, C. 1999. El Mágico Número Tres. Cuando los Niños aún no Hablan.
Barcelona: Paidós.
Rodríguez, C. and Moro, C. 2002. Objeto, comunicación y símbolo. Una mirada a los primeros
usos simbólicos de los objetos. Estudios de Psicología 23: 323–33.
Rodríguez, C. and Palacios, P. 2007. Do private gestures have a self-regulatory function? A case
study. Infant Behavior and Development 30: 180–194.
Shotter, J. 2001. Towards a third revolution in Psychology: From inner mental representations
to dialogically-structured social practices. In Jerome Bruner. Language, Culture and Self,
D. Bakhurst and S.G. Shanker (eds.), 167–183. London: Sage Publications.
Sinha, C. 2005. Blending out of the background: Play, props and staging in the material world.
Journal of Pragmatics 37: 1537–1554.
Sinha, C. and Rodríguez, C. this volume. Language and the signifying object: From convention
to imagination.
Spelke, E. 1998. Nativism, empiricism, and the origins of knowledge. Infant Behavior and
Development 21: 181–200.
Striano, T. and Bertin, E. 2005. Social-cognitive skills between 5 and 10 months of age. British
Journal of Developmental Psychology 23: 559–568.
Striano, T. and Rochat, P. 1999. Developmental link between dyadic and triadic social
competence in infancy. British Journal of Developmental Psychology 17: 551–562.
Thelen, E. and Smith, L. 1998 [1994]. A Dynamic Systems Approach to the Development of
Cognition and Action. Cambridge: The MIT Press.
Tomasello, M. 2001. Bruner on language acquisition. In Jerome Bruner. Language, Culture and
Self, D. Bakhurst and S.G. Shanker (eds.), 31–49. London: Sage.
Vygotsky, L. 1996 [1984]. El primer año. In Obras escogidas IV. Psicología infantil, L. Vygotsky,
275–318. Madrid: Visor.
Wallon, H. 1970 [1942]. De l'Acte à la Pensée. Paris: Flammarion.
Williams, E., Kendell-Scott, L. and Costall, A. 2005. Parents' experiences of introducing
everyday object use to their children with autism. Autism 9: 521–540.
Willatts, P. 1997. Beyond the couch potato infant: How infants use their knowledge to
regulate action, solve problems and achieve goals. In Infant Development: Recent Advances,
G. Bremner, A. Slater and G. Butterworth (eds.), 109–135. Sussex: Psychology Press,
Erlbaum.
Wittgenstein, L. 1958 [1953]. Philosophical investigations. Englewood Cliffs: Prentice Hall.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 6

The role of intersubjectivity in the development of intentional communication

Ingar Brinck

The present account explains (i) which elements of nonverbal reference are
intersubjective, (ii) what major effects intersubjectivity has on the general
development of intentional communication and at what stages, and (iii) how
intersubjectivity contributes to triggering the general capacity for nonverbal
reference in the second year of life. First, intersubjectivity is analysed in terms of
a sharing of experiences that is either mutual or individual, and either dyadic or
triadic. Then it is shown that nonverbal reference presupposes intersubjectivity
in communicative intent indicating and referential behaviour, and indirectly in
modifications of previous behaviour in response to communication failure. It is
argued that different forms of intersubjectivity entail different types of commu-
nicative skills. A comprehensive analysis of data on gaze-related intersubjective
behaviour in young infants shows that interaffectivity and interattentionality
enable referential skills early in development and together allow for complex
behaviour. Early referential skills, it is proposed, arise by other mechanisms
than in nonverbal reference. Reliable and consistent use of nonverbal reference
occurs when interaffectivity and interattentionality coalesce with interintention-
ality, which affords general cognitive skills that together permit a decontextuali-
sation of communicative behaviour.

1. Introductory remarks on the approach and method

Few people would disagree with the statement that intersubjectivity plays a critical
role for language acquisition and is central to nonverbal reference, yet not many
would agree about its exact significance. Intersubjectivity is a complex relation
that manifests itself in many types of context and in different ways, which makes
it difficult to elucidate exactly how it contributes to these capacities. The aim of
the present chapter is to account for the role that intersubjectivity plays for the
development of nonverbal, intentional communication in human infants. The
developmental psychologist's definition of intersubjectivity in terms of a sharing of
experiences will provide the starting-point for an approach in three steps. First, it
will be determined which elements in the act of nonverbal reference are intersub-
jective, then the major effects of intersubjectivity on the developmental trajectory
of intentional communication will be established, and finally it will be explained
how intersubjectivity contributes to eliciting the capacity for nonverbal reference.
The approach is interdisciplinary and takes experimental research in
developmental psychology and related areas as the foundation for generating new
hypotheses and explanations from a general, cognitive perspective. In the present
context this means analyzing and systematizing existing data on intersubjective
and referential skills in infants, evaluating current explanations of such skills,
establishing connections among different types of data that can be expected to have a
general explanatory value, and developing the necessary theoretical and conceptual
framework for explaining any observations that are made during the course of
the investigation.

2. Sharing experiences requires complementary capacities

In developmental psychology, intersubjectivity is frequently defined as a
deliberate sharing of experiences about objects and events (cf. Trevarthen and Hubley
1978; Stern 1985). There is no reason to contest the central part of the definition,
that the infant's capacity for intersubjectivity concerns a sharing of experiences.
We will return to it below.
The first part of the definition that describes intersubjectivity as deliberate is
ambiguous. A weak interpretation in terms of goal-directedness or intentionality
is presupposed by the concept of intersubjectivity, which implies interaction, and
is therefore unproblematic. Any wider implications that stronger interpretations
might have are of no particular concern to the present discussion, and will not be
further considered.
The last part of the definition according to which experiences are about
something expresses an obvious truth, yet is easily misunderstood. Intersubjectivity can
be triadic (relating two subjects relative to a third element) or dyadic (relating two
subjects). In the former case, the subjects are said to exchange experiences around
an object. In the latter case, the experiences concern the subjects themselves and
the interaction between them, and therefore reflect the fundamental reflexivity of
the relation of intersubjectivity.

1. In line with current praxis in developmental psychology and research on language
acquisition, the term intentional communication will be used to refer to nonverbal, referential
communication by gesture, gaze, vocalisation, etc., in preverbal human infants.
2. Experiences are felt qualities, i.e., the qualitative ways in which objects, states, and events
present themselves to conscious awareness when perceived from a first-person perspective.
They vary in intensity and vividness, and are sometimes value-laden.
The definition of intersubjectivity as a sharing of experiences draws atten-
tion to the interactive aspect of the relation. Intersubjectivity first materializes in
the form of early imitation between the newborn infant and an adult (usually its
caretaker), and soon develops into mutual engagement. During these exchanges,
infant and adult characteristically take a second-person perspective toward each
other. Mutual engagement develops in the first month from the repetitive but ac-
tive and rhythmic matching of facial expressions of emotion between adult and
infant. In turn, proto-conversation is built around turn-taking, that is, the recip-
rocal co-ordination and sequencing of more or less spontaneous behaviours and
actions in time (Trevarthen 1979; Trevarthen and Aitken 2001). By two months,
the infant can produce differentiated responses to the adult's attention and so take
turns, and one month later, begins to actively call others' attention to the self, as it
seems, to initiate mutual engagement (Reddy 2005).
The newborn infant's primordial experience of similarity with the other is a
precondition for developing intersubjectivity. Repeated episodes of mutual
engagement during the first months of life further promote the implicit recognition
of similarity between self and other. Meltzoff and Brooks (2001) argue that this
recognition is based in a cross-modal mapping of felt and observed actions, which
prepares the infant for adjusting his or her individual behaviour to the needs and
demands of the other, as in later and more complex forms of interaction. It seems
reasonable to think that the cross-modal mapping of actions causally depends
on activity in the mirror neurons. Research on mirror neurons has revealed that
manual actions are recognised by a mapping of the observed action onto a
motor representation of it in the observer's brain. When an observer is watching an
agent perform an action, there is a concurrent activation of the motor circuits
that would have been recruited had the observer performed the action herself
(Gallese, Keysers and Rizzolatti 2004:397). Furthermore, noticing another agent's
facial expressions of emotion will activate similar areas of the observer's brain as
in the agent whose face is observed, and gives rise to similar sensations and
negative or positive experiences (Gallese et al. 2004; Rizzolatti et al. 2002; Wicker et
al. 2003; Barresi and Moore, this volume). In line with this, the neural correlates
of mutual engagement can be described in terms of spreading neural activation
in the brain's motor representations, which trains the perception-action system to
recognise and produce instrumental actions and facial expressions of emotion and
to fine-tune its reactions (cf. Decety and Ingvar 1990; Jeannerod 1997).

3. Other brain regions, e.g., the inferior parietal cortex, make it possible for the brain (hence,
the agent) to discriminate actions of the self from those of another (observed) agent. See Barresi
and Moore (this volume) on the role of mirror neurons for intersubjectivity.
Meltzoff and Brooks (2001) maintain that the similarity between self and
other presents itself directly to the senses of the infant in mutual engagement.
Yet thinking of mutual engagement as based in an experience of mere similarity
makes it hard to see how it might engender a dynamic interaction or exchange.
Reddy (2003, 2005) resolves the dilemma by proposing that the experience of
self-other equivalence presupposes both the infant's original identification with
the other and the experience of being different. Once made, the point is obvious:
Similarity is only noticeable against the background of an experienced dissimilar-
ity. It explains why mutual engagement is in fact dynamic and not purely contem-
plative although intersubjectivity presupposes sameness, or identity.
Reddy's remark implies that the experience that we have of the self and its
relation to others is inherently double. In other words, self-awareness presuppos-
es the awareness of another subject who will reflect the similarity by displaying
points of difference. To follow up on what was said above about the neural basis of
self-other equivalence, the present idea also can be expressed by reference to the
perception-action system in the brain. Original identification is likely to be based
in the activity of mirror neurons that automatically map observations of behav-
iour onto the corresponding areas in the brain of the observer. The experience of
difference then will emerge from the need to calibrate the motor representations
in the brain that are caused by the perception of other agents' behaviour.
The sharing of experiences requires complementary capacities. First, there is
the capacity to recognise the experiences of another individual, and second, there
is the capacity to make available one's own experiences to somebody else. Even
if the two correspond, they are distinct. The newborn infant needs to practise
both, and learns how to harmonize them by engaging in early imitation soon
after birth. Given that both the similarity and difference between self and other
are perceivable, the infant will directly experience the bi-directional relation be-
tween self and other in mutual engagement, and eventually develop an intuitive
understanding of intersubjectivity. Furthermore, since early intersubjectivity is
grounded in concrete contexts of reciprocal interaction, the two complementary
capacities automatically are brought together.

4. The present view has affinities to the theory advanced by Goldman (2006), who argues that
mind-reading is simulation, and that some forms of mind-reading, which do not depend on
verbal capacities, are based in the mirror neuron system.
3. Sharing experiences individually or mutually

It might seem problematic that while the definition of intersubjectivity emphasizes
a mutual sharing of experiences in the second person, just a fraction of the infant's
skills that are characterised as intersubjective in the literature is genuinely
reciprocal. For instance, nonreciprocal gaze following in the direction of a target involves
checking where the other agent is looking by attending to head and body orienta-
tion, visually searching for a target in common space, and then localising the target
by gaze alternation between the agent and salient items in the shared context. In
such cases, experiences are shared on an individual basis in the third person.
Since behaviour that is built around individual sharing is not reciprocal in the
sense of involving attention contact, it is a fair question whether there are other,
independent reasons for calling it intersubjective, except for its being a sharing
of experiences by definition. If not, it might seem that individual sharing is inter-
subjective in a merely derived or figurative sense, and we might choose not to call
behaviour that relies on it intersubjective. However, it is clear that this behaviour
requires a sharing of experiences, even if both parties are not aware of doing so.
If an agent's perceptual experiences could not be shared, or accessed, by observation,
behaviour such as nonreciprocal gaze following would be impossible.
That individual sharing is intersubjective is also evidenced by the fact that
similar mechanisms in the brain underlie individual and mutual sharing of expe-
riences. As described in the previous section, mirror neurons make it possible for
another agent's experiences to resonate in the observer. The underlying mapping
mechanism will only function properly if it originates from the experience of self-
other equivalence and receives training in episodes of mutual engagement. Be-
sides, to decide to call only behaviour that involves attention contact and mutual
engagement intersubjective will prove impractical, given that in developmental
and comparative psychology the term often is used for social behaviour that involves
attending to another agent's attention, whether or not the agents have
attention contact.
Thus, sharing is individual when a subject gains access to another subject's
experiences via observation (this still concerns attending to the other subject's
attention). In contrast to mutual sharing, it is one-way, distanced, and instrumen-
tal. Another example is gaze alternation as part of social referencing, when the
infant seeks information from the adult by looking at his or her facial expression
of emotion. Many states of mind can be manifest in overt behaviour and then are
perceptually available to others in the third person (Bühler 1934). Thus attention,
emotion, attitude, interest, and perceptual knowledge can be read from posture,
movement, gesture, facial expression, gaze, head and body turn, and (nonverbal)
vocalisation. Attention reading, the recognition of goal-directed intentions in
others from observations of their behaviour relative to salient entities in the local
context, is a generic form of individual sharing (Brinck 2004).
A mutual sharing of experiences occurs when two agents gain access to each
other's experiences via either observation or attention contact. Mutual sharing is
bi-directional. It does not necessarily entail attention contact, because the agents
may be looking repeatedly towards each other's faces but not simultaneously.
It will still be a case of mutual sharing, since each agent is aware of the other's
experiences even if they are not mutually aware of sharing experiences. Two
examples of mutual sharing that do include attention contact are (triadic) joint
attention and (triadic) ritualised behaviour in the form of structured play. Thus,
consider the repeated giving and taking back of an object while exchanging emo-
tions around it by vocalisation and facial expression, as illustrated by the toddler
who repeatedly hands an old sock to her parents' dinner guest, expecting to receive
it in return from the adult.

4. The act of nonverbal reference

To explain what is intersubjective in intentional communication and how it is
so calls for an account of intentional communication that identifies its basic ele-
ments. Such an account is offered below, where the range of intentionally com-
municative behaviour that has been observed among human preverbal infants is
categorised into four types, according to the standard uses the behaviour has been
observed to have in acts of nonverbal reference.
Intentional communication may be defined as the nonverbal, spontaneous
and purposively produced social interaction between (typically) two agents relative
to a distal object in common space. Its primary use is to establish joint attention
to a third entity, typically for some further purpose, according to the sender's
needs and desires (Brinck 2001, 2003, 2004). The process that leads to joint atten-
tion is flexible, and can be adjusted to meet the behaviourally manifest idiosyn-
crasies and expectations of individual agents. The act of nonverbal reference, with
the function of directing the observer's attention to a distal target, is the paradigmatic
example of intentional communication. Although its principal means is the

. The present analysis of intentional communication globally agrees with the analyses in
Bard (1992), Bates et al. (1979), Leavens, Hopkins and Thomas (2004), and Leavens, Russell,
and Hopkins (2005), and applies to human infants and Great apes alike.

pointing gesture, it can be accomplished by other means, such as gaze or head and
body turn in the direction of the target.
Intentionally communicative acts are complex in the sense of being com-
posed from a selection of behaviour. First, the varieties of behaviour that occur in
intentional communication can be categorised into four distinct types according
to the functions that such behaviour has been observed to have in contexts of use.
The types consequently constitute the basic acts or units of the complex act of
nonverbal reference. Which of these acts should be performed to achieve nonver-
bal reference on a given occasion is determined in the context of use.
Below, the act of nonverbal reference is described from the sender's perspective
in terms of the four types of behaviour that are used to perform it (cf.
Bard 1992; Bates et al. 1979; Leavens, Hopkins and Thomas 2004; and Leavens,
Russell and Hopkins 2005). Taken together they are intended to subsume the
multitude of communicative behaviour that recurs in nonverbal reference. Each
type of behaviour will be characterised, first, operationally by its observable fea-
tures, and then, in terms of its use, or function, in concrete situations. The present
concept of a function identifies behaviour by the observable use it has across
contexts. It stands in contrast to the concept of a teleological or design function,
which identifies behaviour in terms of its intended goal: the effect that is achieved
by performing the behaviour successfully.

. A wide interpretation of the pointing gesture is used in the research on intentional commu-
nication, typically as the extension of the arm and hand towards a distal target, with or without
the index finger outstretched.
. Categorising behaviour with respect to its major detectable use in acts of nonverbal ref-
erence may be valuable for comparative psychology, which investigates cognitive differences
between species, in simplifying the identification and explanation of behavioural similarities
across species in spite of existing dissimilarities among surface properties. The present account
of the basic units of referential acts provides the tools for further investigation of the varieties
of behaviour in intentional communication.
. It is very difficult to check whether a teleological interpretation of nonverbal behaviour is
correct, which is why an interpretation in terms of use generally is preferable to one in terms of
desired effect. Austin (1962) introduced a distinction between an utterance's illocutionary force
and perlocutionary effect that is similar to the distinction made here. It captures the difference
between what is achieved in saying something and by saying something. The illocutionary force
concerns changes that an utterance (if successful) will produce as a result of its regular use and
meaning (informational content) in the context of utterance, but not any changes that occur in
the wider environment. The perlocutionary effect concerns changes that an utterance can be
used to achieve outside of the utterance context given that the non-linguistic contextual conditions
are appropriate. Brinck (2003) argues against definitions of pointing in terms of perlocutionary
effect, or what an act of pointing is meant to achieve, and claims instead that declarative
pointing is an illocutionary act with an indicating function, and that it can be used for different
purposes, and thus can have different perlocutionary effects.

i. Preparatory (attention-getting) behaviour, e.g., gesturing, vocalising, and similarly
conspicuous sounds and behaviours, drawing the observer's attention to the
sender.
Given that the behaviour is successful, and the observer reacts to it by turning
his or her attention to the sender, it will permit the sender to subsequently manipulate
the observer's behaviour via his or her attention. Thereby the behaviour
prepares for the ensuing interaction between sender and observer.

ii. Communicative-intent indicating behaviour, e.g., looks to the observer's face
and eyes, gesturing, gazing, vocalising, and touching, performed relative to the
attentional status of the observer.
These communicative-intent indicators signal the sender's attempt to have
attention contact and interact face-to-face with the observer. One might say that
they make the sender's intention to communicate mutually manifest to sender and
observer.

iii. Referential behaviour, e.g., pointing, gazing, visual orienting towards a (distal)
target, head and body orienting in the direction of the target, gaze alternation
between the observer's face and the target, and looking back, performed relative
to the attentional status of the observer.
Referential behaviour displays the sender's interest in, on the one hand, a target
of attention in common space, and on the other, the observer's attention. If
successful, it will make the observer shift his or her attention to the sender's target.
In guiding the observer's attention to the target, referential behaviour fixes the
content of individual acts of nonverbal reference.
. Developmental psychology defines the pointing gesture of human infants by what an act
of pointing is meant to achieve, thus distinguishing between an imperative and a declarative
form. Pointing can be used imperatively to request an object, or declaratively to achieve joint
attention to an object for some further purpose, such as exchanging experiences of it with the
observer, informing the observer about its location, initiating play that involves it, etc. (Bates
1976; Brinck 2003). Declarative pointing usually is characterised in referential terms, while imperative
pointing is described as partly communicative, partly instrumental. Yet, given that imperative
pointing is performed relative to the observer's attentional state, it involves a referential
element in directing the observer's attention to the requested object (cf. Hopkins, Leavens, and
Bard, this volume). Henceforth, the term referential pointing will be used for any pointing
gesture that satisfies this requirement.

iv. Essentially intentional behaviour, i.e., persistent behaviour until reward, and
elaboration of behaviour when repeated attempts to communicate fail.
Essentially intentional behaviour shows that the sender understands that dif-
ferent means may be directed toward the same end and that the same means may
be used for different ends (Tomasello and Call 1997:361), and consequently is
distinctive of intentional communication.

Given that the four basic acts have counterparts in the real world, intentional
communication has systematic properties. The acts have a similar use, or function
(meaning) across contexts, independently of individual agents, and in different
combinations. Several distinct behaviours share the same function and so can
realise the same act. This permits using them selectively on different occasions to
meet contingent contextual demands. As a consequence of this flexibility in use,
both as to the selection of basic acts and the behaviour that realises the selected
act, nonverbal reference can take a great many forms.
Essentially intentional behaviour has a fundamentally different function than
the other basic acts, which gives it a unique status (Bates et al. 1979). It reinforces
the sender's previous act by introducing changes in the way the act was expressed,
while preserving its quality, or content. In spite of the fact that over time essentially
intentional behaviour has become the scientist's litmus test for intentional
capacities in the sender, it does not have a strictly communicative function, and
is not in general necessary for the success of acts of nonverbal reference. To be
more precise, essentially intentional behaviour constitutes a meta-operation that
the sender uses to enhance previous behaviour that has failed to achieve the act of
nonverbal reference. This makes it a resource for improving and repairing ongoing
communicative behaviour.

5. Intersubjectivity in intentional communication

Having identified and described the building blocks of acts of nonverbal reference,
it is now possible to examine whether intentional communication is intersubjective.
Remember that the present definition of intersubjectivity is in terms
of sharing experiences, and that both mutual and individual ways of sharing have
been specified.
Of the four behaviour types, only preparatory (attention-getting) behaviour is
not intersubjective. It neither depends on, nor directly issues in a sharing of sorts,
but its role is exactly to prepare for the sharing of experiences. It functions as a
signal that during normal circumstances will elicit a certain response in the ob-
server, namely, a re-orientation of her attention towards the sender. This response
may be explained in terms of reflexive behaviour as a reaction to the perception
of a salient event in common space, say, the sender's waving his arms, vocalising,
or tapping on the ground.
In contrast, communicative-intent indicators and referential behaviour are
both intersubjective. This is demonstrated by the fact that unless they cause a shar-
ing of experiences, they will misfire. Communicative-intent indicating behaviour
is successful if the observer notices the sender's attempts to communicate,
and in addition alerts the sender to this recognition, preferably by establishing
attention contact with him or her. Referential behaviour is successful if it makes
the observer attend to the sender's object of attention in common space. It does
not necessarily require that the agents are mutually aware of sharing experiences
to be effective, although reciprocal intersubjectivity is typical for intentional com-
munication. Whenever salient properties of the shared context can be expected
to direct the observer's attention towards the target, the sender will not have to
invest herself in the other agent. As a result of the sender's purposive behaviour,
contextual scaffolding sometimes can replace mutual sharing relative to referen-
tial behaviour.
Essentially intentional behaviour, finally, is directed at improving the condi-
tions for communication by changing the way in which a particular act of non-
verbal reference is expressed. It inherits its intersubjectivity in the context of use
from the particular behaviour for whose enhancement it has been invoked. This
means that essentially intentional behaviour presupposes the capacity for inter-
subjectivity in the sender.
The present inquiry into the intersubjectivity of intentional communication
suggests that although intersubjectivity often is ascribed to intentional communi-
cation on a general level, in fact, only a limited set of intentionally communicative
behaviour is fundamentally intersubjective. Such behaviour has the function of
either making manifest the sender's intention to communicate by the attempts to
engage in attention contact with the observer, or guiding the observer's attention
towards the sender's target of attention in common space, by the sender's referential
behaviour performed relative to the observer's attention.
To conclude, intersubjectivity plays an integral role for intentional communi-
cation. Yet, it is still uncertain whether intersubjectivity is the triggering factor of
intentional communication that elicits its onset. Data concerning the intersubjec-
tive and communicative skills in young infants that apparently speak against this
hypothesis will be presented in the next section. The claim that intersubjective
skills pave the way for intentional communication will be investigated relative to
these data in Section 7.

6. Referential skills in early intersubjective behaviour

There is a wealth of experimental and observational data on intersubjective behaviour
in human infants. According to the received view, intersubjectivity and
intentional communication are intimately related, and in contemporary research,
intersubjectivity is one of the most commonly cited triggering factors of nonver-
bal reference. Despite this, recent data about mainly referential and gaze-related
behaviour in young infants cast doubt on the received view by showing that quite
a few intersubjective skills emerge several weeks, even months, before the capac-
ity for nonverbal reference.
First, point following: 7-month-olds have been reported to follow point when
the gesture is initiated by eye contact and a subsequent head and gaze turn towards
the object (Striano and Bertin 2005). This is puzzling, considering that normally
the pointing gesture emerges around 10 months of age, and referential pointing
is produced reliably some time between 12 and 15 months. Why does intentional
communication occur so late, when apparently similar capacities emerge much
earlier? The fact that point following pertains to the passive role of an observer
does not in itself constitute a problem for the infant in this respect, because as a
rule, infants first learn to understand communicative acts, and only shortly after
that to perform them (Camaioni et al. 2004).
Second, gaze following: A preference for eye contact that is similar to the one
that occurs in contexts of intentional communication has been observed already
in 2- to 5-day-old babies (Farroni et al. 2002). They attend to direct, but not averted
gaze. Experiments on slightly older infants indicate that eye contact in newborns
may facilitate later face processing and perception of gaze direction (Farroni,
Johnson and Csibra 2004). Thus, 3–4-month-olds have been observed to follow
gaze-shift (D'Entremont, Hains and Muir 1997), and perceived lateral motion will
cue spatial location in 4-month-olds if preceded by eye contact (Farroni et al.
2003). The cueing effect is analogous to the one that results from referential gaze
reading in older infants. These data suggest that communicative-intent indicating
behaviour has an early precursor in the baby's preference for direct gaze. They
both signal the sender's attempt to communicate to the observer. In contrast, by 6
months of age, infants will match direction of gaze as signalled by mere head turn
within their own visual field (D'Entremont 2000; D'Entremont et al. 1997; Morales,
Mundy and Rojas 1998). By this age an initial eye contact is less important
for the matching of direction than it is for younger infants.
Third, gaze alternation: Around 7 or 8 months, infants start to alternate gaze in
contexts of social referencing (Campos and Steinberg 1981). In ambiguous or distressing
situations, they look to the adult's face and then back at the object, seeking
emotional information from the adult by looking at his or her face, and, as it seems,
using it to evaluate the situation. A similar behaviour occurs in unpredictable situations,
and after infants' intentional actions (Reddy 2005), when emotional cues
appear to be sought for confirmation of the infant's evaluation of the event.
The significance of these data is not fully clear. However, granted that infants
engage in attention contact, follow point and line of regard, and alternate gaze sev-
eral months before the onset of intentional communication, the view that all the
relevant intersubjective skills emerge at a precise age seems incorrect (cf. Tomasello
1995, 1999 on the 9-month revolution). The data seem to support the opposing view
that intersubjectivity develops gradually (Reddy 2005; Striano and Bertin 2005;
Striano and Rochat 1999). However, infant communication does show signs of a
transition from one kind of behaviour to another around 12 months, when triadic
interaction becomes more important and elaborate. This suggests that while inter-
subjectivity is crucial for intentional communication, the presence of intersubjec-
tive skills as such does not imply skills for intentional communication.

7. Why does nonverbal reference not emerge before 12 months?

The question is why nonverbal reference does not emerge before 12 months,
since the capacity for it appears to be in place much earlier. Three answers will
be considered with regard to this issue. A first answer is based in the research
on socio-emotional behaviour and autism. It takes its starting-point in the claim
that human infants have a natural inclination to interact with their caregivers by
exchanging facial expressions of emotion, irrespective of other goals than shar-
ing experiences (Hobson 2002; Tomasello 1999; Trevarthen 1979). Infants who
go on to be diagnosed with autism behave differently. For instance, they do not
show a distinct preference for visual attention contact or for pointing merely to
share attention. Observed differences in the communicative behaviour of autistic
and non-autistic children are taken as evidence that certain intersubjective abili-
ties, missing in autistic children, are necessary for intentional communication.
Among these we find social reciprocity, emotional relatedness, the recognition
of psychological self-other equivalence (by matching of mental states), and the
capacity to identify with the other (Barresi and Moore, this volume; Hobson 2002;
Hobson 2005; Meltzoff and Brooks 2001). Further evidence for the view that so-
cio-emotional factors cause the emergence of intentional communication comes
from experimental research on apes that tests for typically human intersubjective
capacities. The tests demonstrate behavioural differences between human infants
and apes that in certain respects parallel those between non-autistic and autistic
children (cf. Tomasello 1999; Tomasello et al. 2005). Because the observed differ-
ences are correlated with differences in the capacity for intentional communica-
tion, they are held to support the view that intentional communication requires
specifically human, socio-cultural capacities.
A second answer relies on experimental evidence about, on the one hand,
perceptual skills for reading individual goal-directed action, and on the oth-
er, referential skills for understanding action, bodily manifest emotion, and
gaze as linking agent and world. These skills manifest themselves in behaviour
such as gaze following and social referencing and are held to be in place by
9 to 12 months, if not before (Brooks and Meltzoff 2005; Moses et al. 2001;
Woodward 2005). Csibra (2003) holds that they represent two kinds of action
understanding that operate independently of each other, but rely on a similar
blind tracking of perceptual cues in respectively instrumental and communica-
tive contexts. The tracking capacity functions without the attribution of mental,
intentional, or representational states. Csibra (2003:448) further conjectures
that the two skills combine into a higher-order mentalistic understanding dur-
ing the second year.
In focusing on capacities for dyadic emotional engagement, the first answer
emphasizes the importance of primary, face-to-face intersubjectivity for engag-
ing with others, leaving the referential relation between agent and object aside
(Trevarthen and Hubley 1978). The second answer revolves around capacities
for attention reading and co-ordinated joint attention that do not involve social
reciprocity. It stresses the role of secondary, side-by-side intersubjectivity (fac-
ing the shared target) for intentionality, but neglects the importance of explicit
attentional engagement for triadic relations. Nevertheless, there is no reason to
think of the two as mutually exclusive; on the contrary, they are complementary.
As argued in Section 5, both kinds of intersubjectivity are required for intentional
communication. Communicative-intent indicating behaviour involves face-to-
face intersubjectivity, or a mutual sharing of experiences by attention contact,
whereas referential behaviour relies on side-by-side intersubjectivity.
Both answers provide explanations of why the early presence of intersubjec-
tive capacities does not trigger intentional communication. The first one refers to
an insufficient understanding of socio-cultural practices by dyadic infants as the
reason why they develop intentional communication only at 12 months. Even so,
it does not specify precisely which mechanisms socio-cultural practices
can modulate, or in what respects they are conducive to intentional communication.
The second answer explains early referential behaviour as the side
effect of a blind tracking of motion cues, and is detailed enough to have a bearing
on the data. Still, the hypothesis that action understanding is mechanistic and
automatic before the age of 1 year, and that by 2 years the automatic skills com-
bine to yield an interpretative system, does not account for the progress from one
level to the other.
Tomasello (1999) provides a third, reconciliatory answer in an attempt to
combine the previous two approaches. He asserts that both dimensions of intersubjectivity
(the motivation to share emotions and the understanding of
intentional action) are necessary for human-like cognition. In the same vein,
Tomasello and Rakoczy (2003:125) declare that the understanding of persons as
intentional agents who have a perspective on the world that can be followed into,
directed, and shared constitutes a crucial social-cognitive skill. Tomasello et al.
(2005) emphasise the importance of shared (we) intentionality and of collabora-
tive (as against competitive) co-operation for achieving typically human ends.10
They maintain that dialogic cognitive representations emerge by 14 months that
integrate the first- and third-person perspectives via internalisation and allow for
specifically human, collaborative, cultural practices (2005:683f., 689). However,
it is not clear that the present form of integration of perspectives would permit
such practices, or exactly how it would do so. The claim that dialogic representa-
tions permit specifically human forms of collaboration trades on an ambiguity
in the concept of a third-person perspective. Although infants can integrate the
first- and third-person perspectives relative to a shared (physical) space, this abil-
ity will not extend to regular kinds of future-directed collaborative co-operation.
Such co-operation requires representations that do not contain indexical and demonstrative
components (Brinck and Gärdenfors 2003), and are not available to
the infant by 14 months. Furthermore, because of the all-inclusive nature of the
theory, the significance of dialogic cognitive representations for the onset of in-
tentional communication is uncertain.
None of the answers fully explains why global capacities for intentional communication
emerge by 12–15 months and not before. A satisfactory account
would have to be explicit about the details. Nonetheless, recognising the con-
tinuous development of nonverbal reference is important for identifying changes
in intersubjective behaviour that may be relevant. Next, we return to the data
presented in Section 6 to search for a pattern that might indicate the direction in
which to pursue the inquiry.

10. Brinck and Gärdenfors (2003) express a similar view. They describe collaborative co-operation
as directed at future goals that are unrelated to the context of (inter)action. Sharing intentions
about future goals requires the capacity for sharing representational content that does not
depend on the actual context. In contrast, competitive co-operation concerns goals that exist as
resources in the shared environment and sometimes appear in the context of (inter)action.

8. The effect of direct gaze and eye contact on gaze following

Direct gaze and eye contact have been seen to have a strong impact on the capac-
ity for intersubjective behaviour. 4-month-olds follow gaze if it is preceded by eye
contact, 7-month-olds alternate gaze in contexts of social referencing, and follow
point if initiated by eye contact and a subsequent head turn towards the target.
Gaze will be perceived differently and provoke other reactions if initially oriented
towards the infant's attention, than if directly oriented towards a target of action.
Eye contact prompts a referential reading (in the direction of the target) of be-
haviour in the infant, first of averted gaze, and a few months later of the pointing
gesture. Do these data support the hypothesis that infants have capacities for in-
tentional communication already by this age? It will be argued that the answer is
negative, and that the referential reading is an effect of contextual enhancement.
The context in which referential, gaze-related behaviour is performed clearly
plays a decisive role for how the young infant interprets the behaviour. Together,
the physical layout and the sender's behaviour simultaneously bolster
and determine the infant's response, which makes it likely that the mechanism
that produces the response is a property of the local environment rather than of
the infant. Consequently, the referential response to gaze requires an increased
awareness of the layout of the physical context, but does not imply that the infant
has acquired similar capacities to those demanded for nonverbal reference. On
the contrary, to benefit from the positive effects of eye contact on the capacity for
reading behaviour referentially, further cognitive capacities than those acquired
during the first months in life are not necessary. Seen from an evolutionary point
of view, this is not surprising, since recycling is nature's way of increasing performance
while minimizing the costs. Gaze plays a vital role for initiating and
maintaining dyadic engagement in early infancy. Such engagement in turn un-
derlies the understanding of attention, the experience of self-other equivalence,
and intersubjectivity (see Section 2). Given that early infant-adult interaction is
centred on attention contact, it comes as no surprise that the adult's attempt to
establish eye contact a few months later still will signal an invitation to interact
to the infant, who as a consequence will expect the adult's ensuing action to be
of concern to the self. In summary, gaze has a particular social function for the
infant in episodes of prolonged eye contact that most probably originates from
dyadic engagements in early infancy.
By 3 or 4 months, infants are capable of recognising the social function of
gaze in contexts of joint attention. The information that the infant picks up from
having had eye contact with the adult concerns the adult's readiness to interact
with the infant. Direct gaze has an imperative dimension that directs the infant to
respond to the adult's action and follow his or her gaze in the direction of a target.
The fact that eye contact enhances the infant's capacity for the referential reading
of behaviour strengthens the view that intersubjectivity is a major driving force in
the development of intentional communication.
By 6 months, eye contact loses its importance for gaze following. That this be-
haviour changes its function indicates that at this stage of development direct gaze
is primarily supportive of other behaviour, and cognitive rather than genuinely
communicative. Thus, in Section 4, it was argued that the functions of the four ba-
sic acts of intentional communication are stable and do not vary between contexts
or users. Because it is unnecessary to negotiate the meaning of these acts, more
effort can be put into producing and responding to the message. Furthermore, the
expression of an act can be adjusted to the context without a change in meaning.
In contrast to the behaviour discussed above, it is uncertain that the intersub-
jective skills that occur in contexts of social referencing predict capacities for in-
tentional communication. The nature of social referencing makes it unlikely that
the infant by this age would use referential skills, although such skills are avail-
able. Moses et al. (2001) submit that there is evidence for referential understand-
ing in the emotions domain by 12 months. Yet infants begin to seek information
from adults about distal objects already by 7 months by alternating gaze between
object and adult. This behaviour might be taken to demonstrate spontaneous ref-
erential behaviour already by this age. However, it is not very probable that gaze
alternation is referential in social referencing. First, it would mean that the capac-
ity for nonverbal reference would be functional in an isolated domain for a long
period, several months, before it was generalised, which is atypical. Second, the
significant information to the infant concerns the mother's attitude to the object,
not the object itself. The infant is seeking information about a target that has been
individuated before social referencing is initiated and that moreover motivates
initiating it. Therefore the infant is unlikely to attend to the referential relation
between the adult's facial expression and the target, but rather attends to how the
target is affecting the adult emotionally. Third, on the assumption that this
information is available by a strategy that is cognitively cheaper than the
referential one, the referential interpretation of social referencing is gratuitous.
An alternative, more plausible interpretation of social referencing is that the
behaviour ultimately relies on learning about the function of facial expressions of
emotion during early dyadic engagement. Eventually this knowledge generalises
to the adult's responses to events in the surroundings. The infant knows that
facial expression signals the observer's reaction to a salient target. When perceiving
a salient object of uncertain value, information about what to do can be found
by looking to the adult's face. In all likelihood, gaze alternation here indicates
that the infant is monitoring the situation. Even though intersubjectivity can en-
able behaviour typical for intentional communication, evidently it does not do
so automatically, but only in certain kinds of contexts. To young infants, reading
behaviour referentially is demanding. Since there is no need for it in social
referencing, the cost would hardly be motivated.
To conclude this section, the examination of the data has established that
some intersubjective behaviour constitutes precursors to intentional communica-
tion, but that intersubjective skills nevertheless do not entail skills for intentional
communication. The fact that referential gaze will acquire different functions in
different contexts of action depending on whether eye contact is involved makes
it interesting to pursue the analysis of intersubjectivity in a new direction. Below
the hypothesis will be investigated that the influence of intersubjectivity on the
capacity for intentional communication is not uniform, but differs according to
which kind of experience a given behaviour concerns.

9. Interaffectivity, interattentionality, interintentionality

Stern's (1985) distinction between interaffectivity, interattentionality, and
interintentionality will constitute the basis for developing the proposal that not merely how
experiences are shared (mutually or individually, in dyadic or triadic behaviour)
matters for the developmental trajectory of intentional communication, but what
kind of experiences also matters. It will be argued that the three forms of intersub-
jectivity enable distinct forms of behaviour; thus, referential behaviour depends on
interattentionality, and communicative-intent indicators on interaffectivity. Addi-
tional data on gaze-related and referential behaviour will be discussed with the
double purpose of illustrating and testing the adequacy of the hypothesis.
The distinction between emotional, intentional, and attentional mental states
corresponds to the traditional classification of mental acts in terms of affective,
cognitive, or conative function. The analogy reveals the major ways in which a
sharing of experiences may contribute to over-all behaviour. Affect concerns the
experience and expression of emotion and is associated with positive or negative
attitude and evaluation; cognition concerns the monitoring and control of inten-
tional states from a general, sometimes a meta-, perspective; and conation concerns
selective attention, interest, and action readiness with regard to a concrete context.
Depending on which kind of experiences the agents are sharing, the interaction
will be predominantly interaffective, interattentional, or interintentional.
Stern (1985) describes the three forms of intersubjectivity as follows (italics
added): Interaffectivity consists in the infant's matching its own feeling state as
experienced within with the feeling state seen on or in another (1985:132), and
still by 12 months, affective exchange is the predominant mode and substance of
communications (ibid:133). Interattentionality means that the infant has some
sense that persons, including the infant itself, can have individual, different, at-
tentional foci, which can be brought into alignment and shared (ibid:130). In-
terintentionality impl[ies] that the infant attributes an internal, mental state to
the adult, namely, comprehension of the infant's intention and the capacity to
intend to satisfy that intention. Intention is a shareable, but not necessarily self-
aware, experience (ibid:131).
The above definitions involve assumptions about the qualitative nature of
intersubjective states, which make them problematic for the present inquiry into
the nature of intersubjectivity. Because the definitions focus on qualitative expe-
riences instead of observable behaviour, they cannot be used to identify inter-
subjective behaviour except from the first-person perspective. This may reduce
their value for some types of empirical and experimental research. To avoid these
problems while preserving Stern's intuitions, the concepts will be redefined as
follows:11

Interaffectivity is the simultaneous matching of affects and emotions to the
affects and emotions displayed by another agent in overt behaviour. It emerges
in the guise of emotional contagion, and the infant soon develops skills for
monitoring the emotions of self and other. Interaffectivity is dyadic, and un-
derlies behaviour such as proto-conversation, social referencing, and atten-
tion contact.
Interattentionality is the alignment of attention to the attention displayed by
another agent in overt behaviour. It can equally well be a result of contagion
as of purposive behaviour. Attentional states spring from arousal, and reflect
the agent's interest and action readiness with respect to a target of action.
Therefore, interattentionality only makes sense relative to a context of action,
although it is not always explicitly organised around a shared focus of atten-
tion or third entity. It supports behaviour such as gaze alternation, gaze fol-
lowing, attention reading, and forms of joint attention that do not require a
mutual sharing of experiences.
Interintentionality is the sharing of information with another agent about the
intentions and beliefs of the self and others, first by ostensive, bodily-based
means, such as gaze, gesture, and vocalisation, later in development by sym-
bolic means, for instance, verbally. It is essentially triadic and permits taking
an allocentric or de-centred perspective.

11. The descriptions of the three forms of intersubjectivity are not intended to reveal the whole
truth about intersubjectivity, nor do they constitute an attempt to replace phenomenological
descriptions.
These three forms of intersubjectivity are not intrinsically related, and can de-
velop independently of each other. During normal conditions, they interact in
different ways at different stages to enable complex behaviour. The infant begins
to explore interaffectivity soon after birth in mutual engagement, and later in-
terattentionality, whereas the capacity for interintentionality emerges some time
after 12 months and continues to develop for years to come. Irregularities in their
respective developmental trajectories give rise to weaknesses or disturbances in
an agent's action repertoire relative to the range of behaviour that the failing form
of intersubjectivity normally supports.
Data about gaze reading will now be used to exemplify how different forms
of intersubjectivity sustain behaviour, and illustrate how the interaction between
interattentionality and interaffectivity causes developmental changes in
intersubjective behaviour during the infant's first 6 months in life.
By four months, infants react differentially to perceived motion depending
on the triggering conditions (Farroni et al. 2000, 2003). Cue-driven saccades are
elicited when the model's pupils shift from a central position to either side, and
occur before the appearance of the target is recorded. The pupils do not have to
occur in the context of an upright face to have this effect. Target-driven saccades,
in the direction of the target, are elicited only after a period of eye contact with
an upright face. The infant will then react equally to directed movement of either
head or eye gaze (pupil shift). Both responses depend on the capacity for interat-
tentionality, but occur by different processing.
Similarly to the data presented in Section 8, the data on the effect of triggering
conditions on response demonstrate that from quite early on infants intuitively
read gaze referentially if the gaze shift is preceded by eye contact. Eye
contact has a special role for social interaction, appreciated already in the first month,
which shows that mutual attention appeals to other ways of behaving than do
contingent perceptual cues (Farroni et al. 2000; Farroni et al. 2002; cf. Emory 2000).
In view of the previous discussion, the social function of gaze can be identified
with reference to interaffectivity. Thus, the fact that eye contact produces a mutual
sharing of affect would cause the infant to pursue the interaction.
However, eye contact will also cancel cue-driven gaze shift, because only
reflexive responses to unidentified stimuli are cue-driven.12 Consequently, eye
contact might instead cause target-driven gaze shift indirectly, by eliminating the

12. Gaze is re-oriented automatically by salient, contingent behaviour, which triggers quick and
effortless, cued gaze shifts (Driver et al. 1999; cf. Chawarska, Klin and Volkmar 2003). Because
the shift occurs while the behaviour is processed in the early vision or pre-attention system of
the brain that is unavailable to conscious awareness, the subject cannot access the process to
directly intervene and inhibit behaviour matching.
cue-driven alternative. In spite of this, it is attractive to explain the enhancing role
of eye contact for referential gaze reading as an effect of interaffectivity, because
eye contact later returns in a similar role and again results in a referential reading,
but of other actions, such as pointing. Therefore interaffectivity seems to be the
cause of the referential response, rather than the response occurring merely
because the cue-driven gaze shift was cancelled.
By 6 months, the infant begins to interpret gaze as goal-directed even in the
absence of eye contact. This behaviour indicates a developmental generalisation of
the response to gaze to target-driven saccades. The infant now can read any gaze
referentially, and the referential reading will replace the earlier cue-driven reac-
tion to gaze shift in most contexts, although cue-driven saccades remain opera-
tive in reflexive behaviour such as contagious attention (also in the adult). Hence,
interaffectivity has stopped being a factor in gaze following, although direct gaze
retains its social function. Slightly later, eye contact with the adult will prompt the
infant to follow pointing. Another developmental generalisation has occurred,
of the infant's response to direct gaze to new forms of communicative behaviour.
Eye contact can now be used to disambiguate novel types of interaction, and thus
enhances the general capacity for intentional communication.
To summarize, shortly after 6 months the infant can discriminate between
the referential and social uses of gaze. This means that several months before
the onset of intentional communication, two mechanisms for reading gaze have
emerged: an instrumental one for intentions to act and a communicative one for
intentions to interact. The instrumental mechanism attributes a referential
function to the gaze of other agents, by which gaze indicates the sender's target of
attention and current interest. The communicative mechanism causes direct gaze
to signal communicative intent to the observer, and enhances the observer's
attention to the other agent.
The influence of direct gaze on social interaction suggests that intersubjec-
tivity can be an instrument for meta-cognition. Interaffectivity enables an under-
standing of communicative intent-indicating behaviour by relating the agents in
the mutual exchange of emotions from a second-person perspective.
Interattentionality enables referential behaviour by causing a target-driven, still unattended
and implicit, attention shift in the observer of the behaviour, which reflects the
understanding of action as goal-directed. In mutual engagement, interaffectivity
will introduce interattentionality into the monitoring of the interaction by
drawing the agents' attention to their mutual affective experiences, and then to the fact
that they are mutually attending to each other. Consequently, interaffectivity and
interattentionality together are able to dynamically increase the agents' control
over the on-going interaction, making the interaction mutually transparent.
The chapter so far has investigated and expanded on the major elements of
intersubjectivity in infants aged up to about 9 months. It remains to determine
the triggering factors of the over-all capacity for intentional communication. In
the following section, it is claimed that by 12 months of age, infants start to fa-
miliarize themselves with a few of the general skills that interintentionality leads
to, and attempt to disengage from the previously mandatory egocentric and situ-
ated point of view. This cognitive change characterises the decontextualisation of
communicative skills, which is necessary for the complete mastery of intentional
communication.

10. The onset of intentional communication

Some time between 12 and 15 months, infants begin to reliably produce and
understand referential pointing. This behaviour is usually taken as a sign of the
capacity for intentional communication, because it shows that the infant's grasp of
the gesture is consistent and bi-directional. It is hard to say which single factors
make it possible for the infant to access intentional communication
at this very moment, because by this age, behaviour is quite complex and results
from the interaction between several underlying capacities. Moreover it is en-
hanced by environmental and interactional scaffolding.13 It seems clear, though,
that one factor is distinctive of the period: the decontextualisation of both com-
municative and cognitive skills (Bates et al. 1979). Context-independent cogni-
tive skills start developing, and interintentionality will slowly mesh with the two
existing forms of intersubjectivity. This will radically enhance the infant's ability
to control the environment.
Two abilities are especially important for the development of nonverbal refer-
ence. First, there is the ability to distinguish instrumental, goal-directed intention-
ality that is pulled by the context of action from communicative, intention-guided
intentionality that the agent pushes into the world. The infant must recognise the
difference between goal-directed and intention-guided actions to fully appreciate
the reference relation, in referential pointing as in verbal language. Second, there
is the meta-cognitive ability to regulate behaviour with respect to a general prin-
ciple or rule of action.
By 15 months, most infants can both produce and understand acts of nonverbal
reference, although they do not achieve this by a mechanism for reading speaker
intention similar to that of adults. They still exploit physical-functional action

13. See Brinck (2007) for the constitutive role of environmental scaffolding for the evolution of
cognition.
properties and contextual cues to understand gaze, point, and reach alike (Sodian
and Thoermer 2004; Thoermer and Sodian 2001). The fact that gaze following
now may be driven by any contextual features that happen to support it on a given
occasion shows that infants by this age do not yet conceive of the sender-object
relation as determined by the sender's intention. Still, their use is reliable and
appears consistent to an observer. Apparently they have acquired a general meth-
od for mastering referential acts, which in most contexts is good enough. They
can use a range of referential behaviour in different combinations, in a variety of
contexts, and during different conditions. The major requirement for having this
competence is the ability to detach communicative behaviour from the original
context in which it was learnt, and then to extend its use to an open-ended
number of contexts. This means that understanding that the sender's intention
determines the reference relation is less important for mastering referential behaviour
than is the ability to generalise.
Decontextualisation is a gradual process that proceeds in what might seem
like a random manner. In previous sections it has been argued that
interaffectivity and interattentionality contribute to a general understanding of
behaviour by increasing the infant's awareness of social interaction and the
process of communication, and also by providing the instruments for managing
this process in particular contexts. Accordingly, decontextualisation concerns
changes in, on the one hand, the infant's global understanding of communication,
and on the other, concrete contexts of interaction. It has several aspects.
Behaviour can be extended to new kinds of contexts, while keeping the original
function; it can be generalised to cover new situations, which will produce a
change in the original function; it may be detached from any specific context of
use by abstraction, losing exactitude while becoming inclusive; and it may be
idealised, and changed into a general behaviour-guiding principle that controls
certain contexts of (inter)action. This will transfer the behaviour to the meta-
cognitive domain.
Although robust skills for intersubjective behaviour occur already by 6
months, they are only intermittently available to the infant, because they are
linked to particular contexts and patterns of (inter)action, usually the ones in
which they were learned. Much of the behaviour shows traces of ritualisation,
in contrast to intentionally communicative behaviour that characteristically is
decontextualised (Brinck 2001; Brinck 2003). What from a general perspective
appears to be identical, equally demanding contexts may still be handled in very
different ways by the same infant, because behavioural competence is not a simple
matter of development, but depends on the local affordances and constraints. To
optimize performance, behaviour is fine-tuned to local properties, physical and
functional as well as social ones. As a consequence, in one and the same infant
one may find quite different behaviours in two related situations.
In Section 9, it was argued that the sharing of experiences concerns affect,
attention, or intention, and that each of these kinds of experience enables certain
behaviour. This view suggests that communication depends on intersubjectivity
in quite specific ways. Interaffectivity and interattentionality are operative soon
after birth and support the development of communication during the first year,
while skills for interintentionality that sustain complex and general forms of in-
tentional communication develop later. Thus separate intersubjective behaviours
first develop in parallel, but after a few months start to interact. Decontextualisa-
tion begins towards the end of the first year. Eventually the many manifestations
of intersubjectivity organise themselves into the four functional units (the basic
acts) that are constitutive of intentional communication (see Section 4).
In contrast to views that describe the development of intentional commu-
nication as clear-cut, linear, and stage-like, the present one has similarities with
both Woodward's (2005:124) and Reddy's (2005). Woodward attests that infants
accrue knowledge about particular actions gradually during the first year of life,
while Reddy argues that the development of attention is continuous and proceeds
from engagement. Woodward's findings suggest that infants initially encode
actions at a detailed level, which causes them to respond to similar actions in
different ways. Woodward (2005:125) further proposes that there are varied
developmental relations between different levels of responses, a statement that is in
agreement with the present view.
Nevertheless, a systematic account of the development of intentional commu-
nication is within reach. Although intersubjectivity is complex and its develop-
ment follows variable trajectories, its progress is continuous. Mapping it out not
only provides for tying the emergence of various behaviours to specific periods
in time and understanding how they interact, but also for elucidating the general
principles behind the development of intersubjectivity as well as of intentional
communication.

Acknowledgements

Work on this chapter was generously supported by The Swedish Research Council
and has profited from research conducted within the project Stages in the Evolu-
tion and Development of Sign Use (SEDSU), funded by the European Commission
under the FP6 programme. I am grateful to Jordan Zlatev, Tim Racine, and Chris
Sinha for comments on earlier versions of the chapter.

References

Austin, J.L. 1962. How to Do Things with Words. Oxford: Oxford University Press.
Bard, K. 1992. Intentional behaviour and intentional communication in young free-ranging
Orangutans. Child Development 63: 1186–1197.
Barresi and Moore, this volume. The neuroscience of social understanding.
Bates, E. (ed.). 1976. Language and Context. The Acquisition of Pragmatics. New York: Academic
Press.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L. and Volterra, V. 1979. The Emergence of Sym-
bols. New York: Academic Press.
Brinck, I. 2001. Attention and the evolution of intentional communication. Pragmatics and
Cognition 9 (2): 255–272.
Brinck, I. 2003. The pragmatics of imperative and declarative pointing. Cognitive Science
Quarterly 3 (4): 429–446.
Brinck, I. 2004. Joint attention, triangulation and radical interpretation: A problem and its
solution. Dialectica 58 (2): 179–205.
Brinck, I. 2007. Situated cognition, dynamic systems, and art. On artistic creativity and
aesthetic experience. JanusHead 9 (2): 407–431.
Brinck, I. and Gärdenfors, P. 2003. Co-operation and communication in apes and humans.
Mind and Language 18 (5): 484–501.
Brooks, R. and Meltzoff, A.N. 2002. The importance of eyes: How infants interpret adult
looking behaviour. Developmental Psychology 38 (6): 958–966.
Brooks, R. and Meltzoff, A.N. 2005. The development of gaze following and its relation to
language. Developmental Science 8 (6): 535–543.
Bühler, K. 1934. Rpt. Theory of Language. The Representational Function of Language.
Amsterdam: John Benjamins.
Camaioni, L., Perucchini, P., Bellagamba, F. and Colonnesi, C. 2004. The role of declarative
pointing in developing a theory of mind. Infancy 5 (3): 291–308.
Campos, J.J. and Steinberg, C.R. 1981. Perception, appraisal, and emotion: The onset of
social referencing. In Infant Social Cognition: Empirical and Theoretical Considerations,
M.E. Lamb and L.R. Sherrod (eds.), 273–314. Hillsdale, N.J.: Erlbaum.
Chawarska, K., Klin, A. and Volkmar, F. 2003. Automatic attention cueing through eye
movement in 2-year-old children with autism. Child Development 74 (4): 1108–1122.
Csibra, G. 2003. Teleological and referential understanding of action in infancy. Philos. Trans.
R Soc. B Biol. Sci. 29: 447–458.
Decety, J. and Ingvar, D.H. 1990. Brain structures participating in mental simulation of motor
behavior: A neuropsychological interpretation. Acta Psychologica 73: 13–24.
D'Entremont, B. 2000. A perceptual-attentional explanation of gaze following in 3- to
6-month-olds. Developmental Science 3: 302–311.
D'Entremont, B., Hains, S.M.J., and Muir, D.W. 1997. A demonstration of gaze following in
3- to 6-month-olds. Infant Behavior and Development 20: 569–572.
Driver, J., Davis, G., Ricciardelli, P., Kidd, P., Maxwell, E. and Baron-Cohen, S. 1999. Gaze
perception triggers reflexive visuospatial orienting. Visual Cognition 6 (5): 509–54.
Emory, N.J. 2000. The eyes have it: The neuroethology, function and evolution of social gaze.
Neuroscience and Biobehavioral Reviews 24: 581–604.
Farroni, T., Johnson, M.H., Brockbank, M. and Simion, F. 2000. Infants' use of gaze direction
to cue attention: The importance of perceived motion. Visual Cognition 7 (6): 705–718.
Farroni, T., Csibra, G., Simion, F. and Johnson, M.H. 2002. Eye contact detection in humans
from birth. PNAS 99 (14): 9602–9605.
Farroni, T., Mansfield, E.M., Lai, C. and Johnson, M.H. 2003. Infants perceiving and acting on
the eyes: Tests of an evolutionary hypothesis. Journal of Experimental Child Psychology
85: 199–212.
Farroni, T., Johnson, M.H. and Csibra, G. 2004. Mechanisms of eye gaze perception during
infancy. Journal of Cognitive Neuroscience 16 (8): 1320–1326.
Gallese, V., Keysers, C., and Rizzolatti, G. 2004. A unifying view of the basis of social
cognition. Trends in Cognitive Sciences 8 (9): 396–403.
Goldman, A. 2006. Simulating Minds: The Philosophy, Psychology and Neuroscience of
Mindreading. New York: Oxford University Press.
Hobson, R.P. 2002. The Cradle of Thought. London: Pan Macmillan.
Hobson, R.P. 2005. What puts jointness into joint attention? In Joint Attention,
Communication, and Other Minds, N. Eilan, C. Hoerl, T. McCormack and J. Roessler (eds.), 185–204.
Oxford: Oxford University Press.
Jeannerod, M. 1997. The Cognitive Neuroscience of Action. Oxford: Blackwell Publ.
Leavens, D.A., Hopkins, W.D., and Thomas, R.K. 2004. Referential communication by
chimpanzees (Pan troglodytes). Journal of Comparative Psychology 118 (1): 48–57.
Leavens, D.A., Russell, J.L., and Hopkins, W.D. 2005. Intentionality as measured in the
persistence and elaboration of communication by chimpanzees (Pan troglodytes). Child
Development 76: 291–30.
Leavens, D.A., Hopkins, W.D. and Bard, K. this volume. The heterochronic origins of explicit
reference.
Meltzoff, A.N. and Brooks, R. 2001. "Like me" as a building block for understanding other
minds: Bodily acts, attention, and intention. In Intentions and Intentionality. Foundations
of Social Science, B. Malle, L.J. Moses and D.A. Baldwin (eds.), 171–191. Cambridge, Mass.:
MIT Press.
Morales, M., Mundy, P. and Rojas, J. 1998. Gaze following and language development in
six-month-olds. Infant Behavior and Development 21: 373–377.
Moses, L.J., Baldwin, D.A., Rosicky, J.G. and Tidball, G. 2001. Evidence for referential
understanding in the emotions domain at twelve and eighteen months. Child Development 72
(3): 718–735.
Reddy, V. 2003. On being the object of attention: Implications for self-other consciousness.
Trends in Cognitive Sciences 7 (9): 397–402.
Reddy, V. 2005. Before the third element. In Joint Attention, Communication, and Other
Minds, N. Eilan, C. Hoerl, T. McCormack and J. Roessler (eds.), 85–109. Oxford: Oxford
University Press.
Rizzolatti, G., Fadiga, L., Fogassi, L. and Gallese, V. 2002. From mirror neurons to imitation:
facts and speculations. In The Imitative Mind, A.N. Meltzoff and W. Prinz (eds.), 247–266.
Cambridge: Cambridge University Press.
Sodian, B. and Thoermer, C. 2004. Infants' understanding of looking, pointing, and reaching
as cues to goal-directed action. Journal of Cognition and Development 5 (3): 289–316.
Stern, D.N. 1985. The Interpersonal World of the Infant. New York, NY: Basic Books.
Striano, T. and Bertin, E. 2005. Social-cognitive skills between 5 and 10 months of age. British
Journal of Developmental Psychology 23: 1–11.
Striano, T. and Rochat, P. 1999. Developmental links between dyadic and triadic social
competence in infancy. British Journal of Developmental Psychology 17: 551–562.
Thoermer, C. and Sodian, B. 2001. Preverbal infants' understanding of referential gestures.
First Language 21: 245–264.
Tomasello, M. 1998. Reference: Intending that others jointly attend. Pragmatics and
Cognition 6 (1/2): 229–243.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard
University Press.
Tomasello, M. and Call, J. 1997. Primate Cognition. New York: Oxford University Press.
Tomasello, M., Carpenter, M., Call, J., Behne, T., and Moll, H. 2005. Understanding and sharing
intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–735.
Tomasello, M. and Rakoczy, H. 2003. What makes human cognition unique? From individual
to collective shared intentionality. Mind and Language 18 (2): 121–147.
Trevarthen, C. 1979. Communication and cooperation in early infancy: A description of
primary intersubjectivity. In Before Speech, M. Bullowa (ed.), 321–347. Cambridge:
Cambridge University Press.
Trevarthen, C. and Aitken, K. 2001. Infant intersubjectivity: Research, theory and clinical
applications. Journal of Child Psychology and Psychiatry 42 (1): 3–48.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding, and
acts of meaning in the first year. In Action, Gesture, and Symbol: The Emergence of
Language, A. Lock (ed.), 183–229. New York: Academic Press.
Wicker, B., Keysers, C., Plailly, J., Royet, J.-P., Gallese, V., and Rizzolatti, G. 2003. Both of us
disgusted in my insula: The common neural basis of seeing and feeling disgust. Neuron
40: 655–664.
Woodward, A.L. 2003. Infants' developing understanding of the link between looker and
object. Developmental Science 6: 297–311.
Woodward, A.L. 2005. Infants' understanding of the actions involved in joint attention. In
Joint Attention, Communication, and Other Minds, N. Eilan, C. Hoerl, T. McCormack and
J. Roessler (eds.), 85–109. Oxford: Oxford University Press.
Chapter 7

Sharing mental states

Causal and definitional issues in intersubjectivity

Noah Susswein and Timothy P. Racine

In this chapter we analyse intersubjectivity and related psychological concepts.
We focus on distinguishing between causal and definitional issues in
early social development, between categorical explanations of what an organ-
ism is doing and causal explanations of how or why it is doing it. We argue
that intersubjectivity is a taxonomic rather than a causal explanatory concept,
a technical concept used to classify interactive behaviours and abilities rather
than to denote vehicles or causes of those behaviours and abilities. We begin
by examining the idea that intersubjective engagement involves the sharing of
mental states and argue that the role of mental states and experience in inter-
subjective engagement is misconstrued. In the final sections we consider the
meaning of declarative pointing.

1. Introduction

Trevarthen's (1977, 1979; Trevarthen and Hubley 1978) landmark research on
early infant social development explicitly focussed the attention of developmental
psychologists on the issue of intersubjectivity. In longitudinal observations of the
first few months of life, Trevarthen showed that even 2-month-old infants have
richly coordinated and complex interactions with their caregivers and respond
differently to persons than to objects. He described these early activities in terms of
"mutual intentionality and sharing of mental state" (Trevarthen 1977:228). Based
on this elegant and careful research, Trevarthen proposed an innatist theory to
explain this "primary intersubjectivity", in which he argued, "the infant is born with
awareness specifically receptive to subjective states in other persons" (Trevarthen
and Aitken 2001:4). Intersubjectivity on this view is meant to denote a biologically
specified psychological capacity that is a causal precondition for early human
social interaction.

Infants' behaviour towards animate and inanimate objects is strikingly different
early in life and their interactions with others become increasingly coordinated
and complex. This coordinated activity involves mutual sensitivity to one
another's emotional and attentional states, forms of activity that seem lacking in
non-human primates (Tomasello and Carpenter 2005) and which are truncated
in children with autism (Baron-Cohen 1995; Hobson and Hobson this volume).
But the degree to which infants' discriminatory and interactive abilities and
preferences can be explained in terms of their experience is unclear, although infants'
motivation to interact may be partly explained by the fact that they experience
pleasure while interacting. Thus, although there can be no doubt about the
importance of Trevarthen's findings, there can be doubt about his interpretation of
those findings.

Trevarthen's is just one characterization of intersubjectivity by developmental
psychologists. Reimers and Fogel (1992:82), for example, conceive of intersubjectivity
as "a shared understanding of what an interaction is about". This argument
turns in part on what exactly Reimers and Fogel mean by "understand" (Racine
and Carpendale 2007b, 2007c). This definition could be read as meaning that
intersubjectivity is not a vehicle of interactive abilities, but that it is manifest in and
inseparable from interaction. This emphasises the logical connections between
understanding and doing, whereas Trevarthen's use of "awareness" seems to locate
understanding in the experience of persons able to do such and such. Babies
discriminate between persons and objects early in life, preferentially attend to faces,
manifest pleasure in interaction, imitate, play, etc. In short, they do all of the things
that count as intersubjective engagement. However, conceiving of intersubjectivity
as denoting a psychological vehicle of infants' interactive abilities is problematic,
conflating causal and definitional relations. And the claim of innate intersubjectivity
goes beyond the claim that human biology is some sort of precondition
for human social life. Specifying the unique structural properties of a life form
is an important part of understanding how it is able to do what it does; e.g., the
eye, optic nerve, visual cortex, etc. are vehicles of our ability to see. A causal
explanation of visual perceptive abilities would involve specifying how processes in
these vehicles interact with specified features of the environment in the process
of seeing. It is not clear that intersubjectivity similarly refers to vehicles of our
interactive abilities, although it may appear to do so.

In this chapter, we explore the distinction between causal and definitional
issues in early social development. We distinguish between conceptual explanations
of what an organism is doing and causal explanations of how it is doing it
(Dupré 1993). We argue that intersubjectivity is a refinement of our commonsense
notion of interaction, specifying a form of interaction that is chronologically
primary and developmentally significant in that its absence typically foreshadows
serious disruption in a person's social cognitive and emotional functioning. Rather
than answering a "how" question by denoting a vehicle of interactive abilities,
we argue that intersubjectivity is a taxonomic concept that speaks to questions of
what an organism is capable of doing. In the first sections we examine the familiar
idea that intersubjective engagement involves the sharing of mental states.

2. Sharing mental states

Intersubjectivity appears relatively well understood in the sense that we know it
when we see it. People typically agree on whether two (or more) agents are doing
something together or are just near one another by happenstance. Echoing
Trevarthen, intersubjectivity is understood widely as involving matched or
shared mental states between or among individuals (Barresi and Moore this volume;
see also Gallagher and Hutto this volume; Hobson and Hobson this volume).
We think this definition is unclear and incomplete. What exactly does it mean to
share a mental state with another? Does it mean something in addition to
coordinating one's actions in shared activities with others? Let us first take sharing a
mental state to mean having the same or similar thoughts or feelings. We assume
that no one would claim that 2-month-olds and their caregivers typically literally
think and feel the same things during protoconversations. For example, a caregiver
can experience feelings of aching pride and can see a resemblance between
the infant and her father, but infants cannot. So we cannot define intersubjectivity
as an interaction involving two agents having the same thoughts and feelings.
However, dramatic differences in emotional states between infants and caregivers
might be treated as a clear criterion for a lack of intersubjective engagement. For
example, when a caregiver tries to soothe a wailing infant, they might be said to be
interacting without being intersubjectively engaged. So perhaps episodes of
intersubjective engagement are partially characterized by a global similarity in mental
state, by two parties experiencing some form of pleasure. However, two cheerful
strangers standing side by side on an elevator would satisfy this condition, so
this definition is obviously too inclusive. It is misleading to define intersubjective
engagement in terms of shared mental states, if it is also necessary that two agents
be coordinating their activities with one another.

Perhaps both the interactional requirement and the emphasis on mental states
can be met by stipulating that parents and infants must share a common attentional
state for intersubjectivity to occur. However, as with the cheerful elevator
passengers above, a distinction is often made in the research literature between
passive coordinated attention, where agents might be looking at the same state
of affairs by happenstance, and both agents being aware that their attention is
coordinated (e.g., Bakeman and Adamson 1984). So even in clear cases of agents
sharing an attentional state by attending to the same stimulus, it is necessary to
stipulate that they be attending to the stimulus together if the activity is to count as
intersubjective engagement. From a third person perspective, infant and caregiver
might be described as sharing the state of attending to each other (i.e. A and B
are intersubjectively engaged just in case A is attending to B while B is attending
to A). This liberal definition of shared attentional states focuses on coordinated
activity rather than two actors having the same or similar experiences. That is,
knowing that two actors are attending to one another may not entail knowing
what it is like to be behind the eyes of either of the pair. So there are reasons
to question the assumption that intersubjectivity involves shared mental states
even in the sense of having similar thoughts, feelings, or perceptual experiences,
although it may require that interacting parties both manifest a degree of pleasure.
There does seem to be a role for shared mental states in considerations of
infant intersubjectivity, but a more limited role than is widely assumed. Perhaps
manifesting pleasure or interest in coordinated interactions is a more accurate
description of intersubjectivity.

2.1 Understanding and experience

We must distinguish between intersubjectivity as shared mental states and
intersubjectivity as shared understanding. The first notion of intersubjectivity seems
to focus on experience whereas the second emphasises intersubjectivity as a kind
of knowledge. Because it is widely assumed that understanding is a type of mental
state, it may seem like these definitions of intersubjectivity are roughly equivalent.
We argue that they are quite different, and that it is very misleading to conceive
of understanding as a mental state. Mental states have genuine duration, with
beginnings and ends, and vary in intensity (Bennett and Hacker 2003; Racine 2004;
Wittgenstein 1958). They are states that a person is in for some period of time,
unlike dispositions or traits, which might characterize a person's thoughts or
feelings for a lifetime.

The category of mental states covers a lot of mental ground, including emotional
states such as states of anxiety; appetitive states, such as states of hunger or
thirst; perceptual states, such as seeing an apple or hearing a piano; and attentional
states, such as attending to one's hunger or to the apple one sees on the table. This
list is meant to be illustrative, not exhaustive. The category of mental states is
diverse, populous, and not sharply bounded. Nor is it sharply divided, as illustrated
by the overlapping of attentional states with perceptual and appetitive states in
the list above. However, not everything mental should be construed as a mental
state. Understanding is one example. First, understanding, unlike a mental state,
is a potentiality, not an actuality (Baker and Hacker 1984; Kenny 1989). For
example, to understand attention is, among other things, to be able to gaze follow
and direct others' attention. But one's understanding of attention does not hinge
upon continuously exercising those abilities, that is, it does not hinge upon being
in a mental state of attending to others' attention. Furthermore, understanding
is typically general, a family of abilities (gaze following, social referencing, and
declarative pointing all index infants' understanding of attention), rather than a
specific ability such as the ability to pass a particular perspective taking task.

To ascribe an understanding of X to a person is to characterize what they are
able to do, not what they currently or continuously experience (see also Leavens,
Hopkins, and Bard this volume). If an infant follows others' gazes, observers
know that this infant has some rudimentary knowledge of attention. But observers
do not know what it is like for the infant to follow others' gazes. The criteria
for ascribing an understanding of attention (what counts as understanding
attention) are actions, not experiences of actions. Discussing the experience of infants
always risks adultocentrism, so we will use a different example to further illustrate
this point. Whether a person experiences algebra as dreadfully boring or deeply
satisfying is irrelevant to considerations of her understanding of the subject. It
is the action of correctly solving for x or y and not one's experience of doing so
that determines whether one understands some aspect of algebra. That said, we
might expect a person who greatly enjoys algebra to be or go on to become better
at it than persons who enjoy it less or not at all, due to the algebra-enjoyer's spending
more time practicing, etc. So it would be false to say that persons' abilities to V are
unrelated to their experience of Ving. However, a person's enjoyment of an activity
and their skill at it are distinct, as many hobbyists know firsthand.

It might be tempting to think of understanding as an experience or mental
state because we do have experiences of trying to understand things, coming to
understand them, realizing that we do not or only partially understand something
we thought we understood, remembering or relearning something we previously
understood, etc. These experiences are related to the phenomena of understanding.
But they are experiences of trying, learning, realizing, remembering, and
relearning, not experiences of understanding per se. Furthermore, such experiences
would seem to necessarily involve attentional states, and might also involve
perceptual states, emotional states, and even appetitive states, for example,
coming to understand that one always gets hungry soon after eating at a particular
restaurant. So there are connections between mental states and the phenomena of
understanding. However, there does not appear to be any reason to regard
understanding itself as a type of mental state. We develop a more positive assessment of
what understanding is in the next section by considering the new forms of
understanding that are often characterized as new forms of intersubjectivity.

2.2 Understanding, intersubjectivity, and abilities

Many theorists argue that the new activities that appear towards the end of the
first year, such as reliable gaze following, pointing and social referencing, herald
a shift from a basic to a more sophisticated form of intersubjective engagement.
Trevarthen and Hubley (1978) characterize this new level of engagement as
secondary intersubjectivity, which Bretherton and colleagues claim is suggestive of a
burgeoning implicit theory of mind (Bretherton 1991; Bretherton, McNew and
Beeghly-Smith 1981) and which Tomasello (1995, 1999) argues may show that
infants are able to experience the intentions of others as similar to or different
from their own. Developmentalists often refer to the appearance of these shared
activities that require simultaneous visual attention on the part of both infant
and caregiver as episodes of joint attention (see chapters by Brinck and Leavens
et al. this volume and in Eilan, Hoerl, McCormack and Roessler 2005; Kita 2003;
Moore and Dunham 1995; for a review see Racine and Carpendale 2007c). Many
accounts of joint attention treat the emergence of these skills as behavioural
effects of the infant's having new kinds of experiences or of possessing some
rudimentary understanding of psychological concepts. A possibly caricatured but
illustrative paraphrase of such views is that infants discover or hypothesize that
people possess mental things called "attention" and "intention", and use this
knowledge to navigate the social world. For example, gaze following or pointing
skills are sometimes conceived of as varied behavioural consequences of a general
conceptual insight, as the claim that infants come to represent other persons
as beings with attentional and/or intentional capacities (Tomasello 1995, 1999)
seems to suggest.

1. Although Tomasello's recent revision of his theory has tempered some of his earlier claims
(Tomasello, Carpenter, Call, Behne and Moll 2005), his revised theory is still based on a
mentalist metaphysics (Racine and Carpendale 2007a, 2007b, 2007c).

We agree that joint attention behaviours are related to new experiences and
new understandings, but not that these relations are of cause and effect. First we
consider the notion of new experiences. There is a clear sense in which experiences
can be causally related to new understandings. For example, the experience of
listening to a good lecture may cause a person to understand a difficult argument.
And experiencing a serious illness may lead someone to understand the value of
good health and supportive family. In these clichéd examples, the experiences
precede the new understandings, in keeping with the conventional requirement
that operative causes precede their effects (Mill 1843/1875). An operative cause is
that which brings about a change. However, causal explanations can also involve
causal preconditions, the vehicles by which a change is brought about. For
example, the operative cause of hearing an infant cry would be that infant's crying
while the vehicles of hearing her cry would include the eardrum, auditory cortex,
etc. We think it is misleading to regard new experiences of others' attention as
either operative causes or as causal vehicles of joint attention behaviours.

The statement "I understand the argument because of her lecture" bears a
superficial grammatical similarity to "infants begin to react to others' attention
because they begin to experience others' attention". However, in this latter case,
"because" does not specify a contingent, causal relation, as in "A understands
something because B explained it". Rather, it is more like a semantic stipulation:
the fact that infants react differentially is claimed to mean that others' attention
is an object of their experience. In one sense, the emergence of joint attention
behaviours or secondary intersubjectivity necessarily involves new experiences;
the emergence of gaze following entails a new experience of following another's
gaze, and the emergence of declarative pointing entails a new experience of
directing others' attention. However, such experiences of others' attention and
reactions to others' attention are not perfectly correlated but conceptually
related, as are being an unmarried man and being a bachelor. It is not as if infants'
experience of others' attention can be detected independently of their reactions to
others' attention (Leavens et al. this volume). Thus, the experience of others'
attention cannot be detected prior to reactions to others' attention. So it seems
misleading to regard new experiences as operative causes of new behaviours
because operative causes precede their effects. But the bigger problem is logical, not
chronological, nor methodological. There is simply no way to determine what
any person, infant or adult, experiences independently of observing their behaviour,
or at least, some consequence of their behaviour, e.g. reading an autobiography
(Bennett and Hacker 2003). To be clear, this is not to say that experiencing
is behaving. A comatose person may hear or see without behaving at all. But if
it is impossible to assess what infants experience independently of what they do,
then it cannot be an empirical claim that new experiences lead to new forms of
interactive behaviours.

It is also awkward to regard new understandings as the causes of such new
behaviours. Rather than a more advanced understanding of mentality causally
underlying these behaviours, we think it is clearer, albeit less familiar, to say that a
more advanced understanding of mentality is manifest in these behaviours. Gaze
following, pointing, social referencing, object-directed imitation and so forth are
among the most rudimentary behaviours that count as understanding attention.
To understand attention is, among other things, to be able to gaze follow and
direct others' attention. Being able to explain what "attention" means or use
"attention" correctly in a sentence are more advanced criteria of understanding
attention. What an agent understands determines what they can do. But "determine"
can mean both "cause" and "define". A man's marital status determines whether or
not he is a bachelor; this relation is logical. If this man has such objectionable
body odor as to drive away all would-be spouses, this relationship is causal. It is
mistaken to regard e.g. understanding of attention as cause and gaze following,
pointing, social referencing as effects, because these behaviours define rudimentary
forms of understanding attention. We list these behaviours when explaining
what it means to say that infants begin to understand attention at around one year
of age. Ascribing an understanding of attention to infants specifies what they are
capable of doing, not how or why they do it (Dupré 1993). That is, the relationship
between joint attention behaviours and understanding attention is logical
rather than causal. Dyadic interactive behaviours (manifesting pleasure or interest
while interacting with another) define primary intersubjectivity. Triadic
interactive behaviours define secondary intersubjectivity and an understanding
of attention. Invoking a state or stage of secondary intersubjectivity should not be
viewed as a causal explanation of these behaviours but as a categorical explanation.
Rather than cause and effect, the relations between primary intersubjectivity and
dyadic interactions, and between secondary intersubjectivity and triadic interaction
behaviours are relations between types and tokens.

3. Intersubjectivity as a taxonomic concept

We have said a lot about what we think intersubjectivity is not. Now we elaborate on
what we think it is. Intersubjectivity is a theoretically motivated taxonomic concept
that helps researchers distinguish different forms of interaction that unfold
in early development. A taxonomy is a scientific system of classification. A general
question about how children develop social understanding must be broken down
into smaller parts, into questions about more specific abilities, in order to be studied
empirically. In distinguishing, for example, between primary and secondary
intersubjectivity, social developmental theorists are defining technical concepts
for the purposes of dividing development into interesting stages or domains.
Intersubjectivity does not explain how it is that human infants are motivated to
play and proto-converse with others. Reliable neurological differences between,
for example, children with autistic spectrum disorders and non-autistic children
may help answer "how" questions, specifying the causal preconditions necessary
for doing what we call intersubjectivity. Other operative causal questions related
to intersubjective abilities may involve asking which if any environmental factors
reliably predict the early emergence of, or more skilful, interactive abilities.
However, for any such empirical questions to be asked about social cognitive
development, conceptual questions regarding what counts as social cognition and
development must be specified in advance.

Not all genuine explanation is causal explanation. An explanation of social
understanding involves specifying what counts as social understanding. And an
explanation of infant development in terms of phases of intersubjectivity explains
infant development in the same sense that a species/phylum/family/genus diagram
explains the animal kingdom. That is, to discuss the emergence of secondary
intersubjectivity is to pick out a class of actions (triadic interactions involving two
persons attending to one another as well as to some other feature of the environment) as
theoretically interesting. Where a naïve observer may see nothing more than baby's
first point, not obviously more alluring than baby's first tooth, the developmentalist
appreciates the theoretical significance of the first pointing gesture, and teaches
her students to do the same. But in doing so, she is teaching a technical, taxonomic
vocabulary, not a lesson in discovered causes of different behaviours.

We might also ask, in ascribing an understanding of attention or a stage of
secondary intersubjectivity to infants who gaze follow, point, socially reference
and so on, whether developmentalists have discovered or created order. We argue
that this is a case of creating order, of specifying a theoretically interesting aspect
of what infants do rather than discovering how they do it. This is important
because phenomena must be conceptualized in order to be investigated. However,
conceptualization is a creative act. Categories do not exist independent of language
even though some categories (e.g., human infants) are more natural than
others (e.g., things over one pound) in the sense that knowing that X is a human
infant allows one to know much more about X than knowing that X is a thing that
weighs over one pound (Dupré 1993:62-64). Creating as well as discovering order
is an essential aspect of empirical science. But mistaking creations for discoveries
is, by definition, mistaken.

3.1 Ontological diversity of the mental

Does our denial that primary and secondary intersubjectivity are causes of dy-
adic, and triadic behaviours respectively entail a general scepticism about mind
or mental causation? It does not. However, we do believe that causal explanations
are only one type of explanation in which psychological concepts appear, and
that the role of causal explanations of behaviour in social understanding is often
150 Noah Susswein and Timothy P. Racine

e xaggerated. In this section we wish to explain why we think our analysis is not at
all anti-realist about mind.
It is widely assumed that mental states are causes of behaviours (Racine and
Carpendale 2007b, 2007c). And there are cases in which it is perfectly sensible to
regard a mental state as the cause of a particular behaviour, for example, when a
person shudders involuntarily at the memory of a disturbing image. In a differ-
ent vein, our reflective thinking often provides us with reasons for choosing one
course of action over another. Thus, it would be false to claim that mental phe-
nomena are never causes of behaviour or, more generally, that mental phenomena
are somehow irrelevant to considerations of behaviour. But it is grossly overgen-
eralized to think that mental states are causes of all behaviours. For one thing, this
conflates actions with reactions. And if we erase the distinction between action
and reaction, we vitiate the concept of responsibility, the practice of apologiz-
ing, the distinction between on purpose and, by accident, and a host of other
distinctions that seem to partially define human social life. An understanding of
behaviour as, in a variety of ways, purposeful is an important aspect of the very
commonsense view of the mind (folk psychology) that developmentalists study
in nascent forms. For example, to claim that A Ved intentionally is to claim that A
is responsible for his Ving to a degree that A is not if A Ved unintentionally, out of
ignorance or by accident. Responsibility and causality are related but distinct. A
person can be held responsible for an outcome that she only indirectly caused, e.g.
hiring another person to commit a crime. We also typically hold persons respon-
sible to some degree for the mental states of jealousy or drunkenness that may
cause them to act foolishly. And we readily forgive others for causing damage to
our belongings if they didnt do it on purpose. That these are truisms is precisely
our point. A common-sense understanding of action involves more than causal
analyses of behaviour. It seems that the psychology of the folk is subtler than most
accounts of folk psychology would have it.
A thorny feature of psychological predicates is that they are used in a variety
of relatively unrelated ways (Bennett and Hacker 2003; ter Hark 1990). When
subsuming mental state concepts under the superordinate category of the mind,
it is easy not to notice this feature. It might seem counterintuitive or anti-realist
to suggest that the mentalness of psychological predicates is a consequence of,
rather than justification for, treating mental as relatively uniform and sharply
bounded category (e.g., as opposed to physical or behavioural). But to claim
that the mental is a heterogeneous category is not to claim that it is a sense-
less category (as might be, for example colourless green ideas). The mental is an
ontologically diverse category. And the diversity of the mental can be illustrated
by examining the use of even a single term. For example, think can be used to
issue a threat (I think you better leave now), to make a request (do you think
Sharing mental states 151

you can pass me the salt?), to issue a command (I think youve had enough ice
cream), to express uncertainly (its raining, I think), to delay action (okay, Im
going to think about that), and far less frequently to actually express subjective
experiences (I am thinking of the time that we went to Barbados). That is, a
single mental state term can be manifestly used to perform very different social
acts (Austin 1975).
Now, some readers may object that we are too focused here on the word
think itself, and insist that it is the nature of the phenomena of thinking, and
not the utterance of the word think that is really at issue here. We agree that
it is the phenomena of thinking that is at issue. But, given that the only way to
determine which phenomena count as thinking is to examine the application of
the concept, we insist that considering the use of think is of central importance
to determination of what thinking is. However, for those who are not persuaded
by this argument and hold that ontological and semantic questions are entirely
separate, we offer another, roughly non-linguistic example of ontological diversity
of a single mental concept.
Any human being can intend to grab a rock. But only a human being in a
very specific social context can intend to write a check (Dupr 1993). One cannot
intend to V unless there is such a thing as Ving (unlike imagining Ving which
may mean either that such a thing as Ving exists, and one imagines oneself doing
what counts as Ving, or that that one is imagining that there is such a thing as Ving,
the meaning of which must be explained if one is to describe what one has imag-
ined). It was simply not possible for even the cleverest of our Stone Age ancestors
to intend to write a check, whereas even the most Neanderthal of contemporary
humans can do so (provided she has a checking account). Thus, many intentions
are ontologically dependent upon highly specific social structures, while other in-
tentions (e.g. to pick up a rock) denote capacities that seem universal (but extend
far beyond the species boundary seagulls also intend to pick up rocks). Again,
the mentality common to both of these intentions does not appear to be well
explained by shared or even similar ontological statuses. Furthermore, beyond
pragmatic diversity in the application of 'think' and ontological diversity within the
phenomena of intending, there are logical differences of a different order between
mental state concepts that, as we noted earlier, have to do with issues of duration
and which mental phenomena can be sensibly regarded as states. Mental state
concepts do not necessarily share much in common other than seeming to refer to 'the
inner', but the examples of 'think' and 'intention' above suggest this view of mental
predicates is misleading. This seeming to refer to 'the inner' is a consequence, and
not a cause, of conceptualizing activities in psychological terms.
Action explanations that involve properties of agents rather than situational
factors refer to 'the inner'. Although we typically think of our thoughts and feelings
as inside us, talk of 'the inner' is metaphorical. Lakoff and Johnson (2003) argue
that common sense is fundamentally metaphorical and describe one such foundational
'container metaphor', a tendency to conceive of non-physical entities as
contained by or within a physical structure (Slaney and Maraun 2005). Conceiving
of experiences and abilities as being inside of the agent whose experiences and
abilities they are is an example of this container metaphor. The fact that it is
sometimes possible to conceal what we think and feel should not tempt one to think of
'the inner' in mistakenly concrete terms (Bennett and Hacker 2003: 88–90). Now,
to state baldly that 'the inner' is metaphorical seems uncontroversial. But the
corollary, that it is the use of psychological concepts and not their putative inner
referents that determines their meaning, is more difficult. We describe a usage-based
account of understanding attention in the next section.
However, first we wish to address a potential misunderstanding. We are not
arguing for a psychological nominalist or eliminativist position that psychological
concepts are mere artefacts of language, 'just a way of talking', or that we
don't really have intentions or understandings (Churchland 1986). We do really
have intentions and understandings, but this 'having' is not a relation of containing.
Rather, this having is akin to possessing. Containing involves a vessel
while possessing involves an agent. What makes something an agent is that it acts
or behaves. Not all agents but only agents can be characterized in psychological
terms; only agents have psychological properties. It is the connections between
psychological phenomena and activity, not descriptions of activity, which we wish
to elucidate here. Psychological concepts are heterogeneous, but if we must speak
in generalities, we ought to think of the verb and adverbial forms of psychological
predicates (e.g., attending, intentionally) as primary and the noun forms (atten-
tion, intention) as usually harmless reifications.

4. Social interaction and meaning

In our earlier discussion of psychological capacities, we attempted to draw attention
to activity, and argued that it is what persons do that determines whether or
not they are attending to X, intending to Y, that they understand some aspect of
attending and intending, or that they are in a stage of primary or secondary in-
tersubjectivity. If attributions of psychological capacities and understandings are
based on behavioural criteria, how do we know when such ascriptions are justi-
fied? Pointing, and especially declarative pointing (see Brinck; Leavens et al. this
volume), is thought to unambiguously reveal infants' understanding of attention.
However, other than Bates and colleagues' (Bates, Camaioni and Volterra 1975)
and Bruner's (1983) pioneering work, the issue of how pointing in general comes
to be recruited and used as an interactional (and what is often, and derivatively,
understood to be a referential) device is largely absent in empirical work.
If one takes an inductive approach to joint attention, ethnomethodological
and, derivatively, conversation analytic (CA) methods for studying social interaction are
useful (e.g., Antaki 2004; Atkinson and Heritage 1984; Garfinkel 1967; Sharrock
and Coulter 2004; Turnbull 2003; Wootton 1997). Conversation analysts study
the orderly structure of conversation in order to examine how interaction is ac-
complished. CA studies of early infant development are relatively rare because
with preverbal infants there is, by definition, no conversation to analyze. Wootton
(1994) focuses on third position repair sequences (i.e., where an infant has acted,
the parent has responded to that action in the second turn/position, and the infant
responds to that response in the third turn/position). An infant is in a position to
display her congruence or lack thereof with the parent's previous turn, which is a
contingency that Wootton reports 12-month-olds can manage. However, a prior
concern for us is to first reconcile the logic of ethnomethodological investigation
with the concerns discussed earlier.
CA approaches to meaning assume that the meaning that participants attri-
bute to their interactions is manifest in the details of that sequential and negoti-
ated activity. Conversation analysis is a means of using interactional sequences to
instantiate criteria in ongoing interaction. We now turn to sequences of interac-
tion to illustrate the claim that it is what they do that determines whether two
agents have a shared understanding of what an interaction is about (Reimers
and Fogel 1992).

4.1 The attribution of intention

To demonstrate this approach, to which we will return in more detail in the next
section, we present two brief sequences of interaction involving the second author's
daughter, T, shortly after her second birthday. These examples show how
criteria for the application of 'intention' were warranted in the contingencies of
this child's interactions with her mother, C, and that T's mother attributed intention
to her daughter's activity. That is, mother and child both acted in ways that
manifest an understanding of intention. Readers and the participants, to the degree
that they possess reasonably complex language, can see the criteria that are
contained therein. However, we do not see a mental state in T that is her goal-directed
behaviour. Rather, the ascription of a mental predicate to T presupposes her
ability to act in goal-directed ways.
T and C were recorded while sharing a meal (cf. Canfield 1993). The exam-
ples were transcribed according to CA conventions (e.g. Atkinson and Heritage
1984). Square brackets represent overlap in speech. Numbers in round brackets
represent pauses in seconds. Dots in round brackets indicate noticeable pauses
of under 0.2 seconds in length. Words in brackets represent a description of an
action. A colon represents a drawn out syllable. An upward arrow indicates rising
intonation in a pitch contour.
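As an illustrative aside, the conventions just listed can be read as a small notation that could, in principle, be processed mechanically. The following is only a minimal sketch under stated assumptions: it assumes plain-text transcript lines like those in the examples of this section, and the function name `tokenize`, the token labels and the regular expression are hypothetical choices made here for illustration, not part of any CA standard or existing tool. Drawn-out syllables (colons) and intonation arrows are simply left inside word tokens.

```python
import re

# Hypothetical token patterns for the conventions described above (an
# assumption made for illustration; this is not an established CA parser).
TOKEN = re.compile(
    r"\((?P<pause>\d*\.\d+|\d+)\)"   # (.5), (1.0): timed pause in seconds
    r"|\((?P<micro>\.)\)"            # (.): noticeable pause under 0.2 s
    r"|\((?P<action>[^()]*)\)"       # (T drinks from cup): described action
    r"|\[(?P<overlap>[^\[\]]*)\]"    # [ ... ]: overlapping speech
    r"|(?P<word>[^\s()\[\]]+)"       # anything else: transcribed talk
)


def tokenize(line):
    """Split one transcript line into (kind, value) pairs."""
    tokens = []
    for match in TOKEN.finditer(line):
        kind = match.lastgroup          # name of the alternative that matched
        value = match.group(kind)
        # Timed pauses become numbers; everything else stays as text.
        tokens.append((kind, float(value) if kind == "pause" else value))
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("i did (.5) oh i did (.) its right here"):
        print(kind, value)
```

Such a tokenization only labels surface features of a transcript; deciding what an overlap or a pause means in a given sequence remains an analytic task.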
Example 1
1 C: the [skin is really soft on those kind]
2 T: [too sour mommy] i han havva drink
3 (.5)
4 C: [you're going to have a drink]
5 [(T drinks from cup on table)]

We can see T doing something analogous to C in line 2 of Example 2 below. C's
intention in this case is to provide T with a fork with which to eat her dinner,
which T acknowledges.
Example 2
1 C: no that's mine (.) i think i forgot your fork thi (.5)
2 T: you did
3 C: i did (.5) oh i did (.) it's right here

How are we to make sense of this activity? Grasping the meaning of an action
(that is, what it is that an agent is doing) involves understanding the practice in
which it is embedded: e.g., 'too sour mommy i han havva drink' followed by '(T
drinks from cup on table)'. To grasp the local meanings of actions is to see intention
in this particular social situation and in so doing to attribute intentions to the actor
who performed the act in question: i.e., 'you're going to have a drink'.

Note: A full transcription of this interaction is available upon request from the second author.

4.2 Gesture as a window on intersubjectivity

In order to further discuss pointing, we need to address the issue of reference.
From a conversational analytic point of view, episodes involving joint attention
are meaningful because both agents, for example, understand a request is being
made, not simply because both agents appreciate the correspondence between an
action and a referent. From this point of view, reference presupposes, and is parasitic
upon, social practice. More generally, reference consists in a sign's having a
role in a language-game (Proudfoot and Copeland 2002: 338). That is, although
human reference requires particular mental and neurological capacities, reference
is not a mental or neurological phenomenon, but a social one (see also Leavens
et al. this volume). It is not a mental or a neurological event which makes a first
finger extended towards X a reference to X. It is the fact that this extended finger
is a technique or practice of orienting others towards X in a particular context.
A person may very well be imagining what they plan to prepare for dinner while
they point toward X, but the point of this distracted individual is no less referential.
That is, to know that A's point towards X is a reference to X is not to know
anything about A's state of mind. Now, it is of course possible to misinterpret A's
point toward X as a point toward Y. And doing so would constitute a misunderstanding
of A's intention. But this would be a case of misinterpreting what kind
of action that point was, not an incorrect guess as to the underlying cause of A's
pointing behaviour.
When thinking about gesture as a window on intersubjective engagement, the
canonical activity of interest is pointing. Declarative pointing has attracted par-
ticular attention because it is thought to not exist in non-human primates, apart
from those who are language-trained and humanly enculturated, and it has also
been argued to ground language acquisition (e.g. Butterworth 2003; Brinck this
volume; Leavens et al. this volume). Developmentalists tend to think of pointing
as evidence of joint attention, a criterion for secondary intersubjectivity. In terms
of CA, we would like to see evidence that a mother and baby treat these gestures
with the same understanding.
Racine (2005) charted the development of the interactional resources that one
9- to 12-month-old infant had at her disposal when interacting with her mother.
In so doing the roles that infant pointing, pointing-like gestures and reaching
might play in interaction were displayed. We now report on two of these obser-
vations that might be seen to involve declarative pointing. In these examples, M
represents mother; B represents baby. Vocal (V) elements of the interaction, gaze
(G) and positions of the right hand (RH) and left hand (LH) are identified on
separate lines. The temporal dimension of the interaction is represented by move-
ment from left to right along these lines and the horizontal relations between the
various channels were kept as tight as possible. Manual actions are surrounded
by ( ). Overlaps in actions (including speech) are surrounded by [ ]. Some less
relevant aspects of the interaction are simply summarized as gloss transcriptions.
Mother's hand gestures and positions are noted only when they make relevant the
infant's behaviour. In cases of an empty line (typically when either M or B did not
vocalize), the line is not listed.
All talk is transcribed orthographically. Numbers in round brackets represent
pauses in seconds, but timings have meaning only within the vocal line. Dots in
round brackets indicate noticeable pauses of under 0.2 seconds in length. Words
in brackets represent a description of an action. A colon represents a drawn out
syllable. Further conventions were adopted for this dataset. In cases where a vocalization
was whispered it is surrounded by ° °. Audible inbreathing or outbreathing
accompanying utterances is represented by > and < respectively. In
cases where a vocalization was inaudible the utterance is surrounded by ( ).
Gaze was inferred from orientation of the head. In cases where the target
of gaze was less clear it is surrounded by ( ). When targets are not stated in the
gaze line, M represents gaze at parent and C represents gaze at infant. Dashes
(---) represent sustained gaze at a target. With respect to hand positions (RH
or LH), dashes are used to indicate static positioning of the hand and commas
to represent movement of the hand from an initial position to another in the
sequence. The following additional conventions are employed: Re = reach, and
Po = point.
In the first of two related examples, at the 11-month visit B does what would
be typically coded as a declarative act with outstretched arm and index finger
prominent but without all other fingers in palm. However, M seems to treat this as
asking for permission or to at least minimally signal intent. Thus, this is not just
a point used to direct attention to enhance interaction (Liszkowski, Carpenter,
Striano and Tomasello 2006; Moore and D'Entremont 2001). The function of M's
behaviour seems to be a confirmation of some sort given that it is accompanied
by the utterance 'I know', but is also a prohibition because the infant ceases to
advance towards the prohibited object.
Example 1
1 B: (G) camera -- - - - - - - - - - - - - - - , , , M - - - - - - - - - - - - - - - - - - - - -
(LH) at side - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(RH) open hand (shoulder level) -, Po (camera) - - , lowers arm - - - - -
(walks towards camera, ~7' from where M is seated, stops 2' from target)
M: (V) no I don't think we can (chuckle) (1.0) yeh (.) i know (.) that's a camera
(G) B - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

At the 12-month visit, B gets up from M's lap and again takes steps towards the
camera that was forbidden at the previous month's visit. While she gets up out
of M's lap, her RH pointing finger begins to extend. B takes 2 steps towards the
camera with her RH changing into a grasping gesture. As in the previous visit,
we argue that the pointing gesture needs to be made sense of in relation to the
prohibitory social situation.
Example 2
1 B: (G) camera - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(LH) getting up - - , at side - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(RH) getting up - - , Po - , , Re (not fully extended) - - , drops to side - -
(walks towards camera) (stops) (walks backwards towards M)
M: (V) <<hh:hy>> (0.5) °get over here°
(G) B - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

4.3 The meaning of declarative pointing

How are we to make sense of this activity? As noted earlier, a basic form of in-
tersubjectivity is defined by interactions showing an ability to differentiate be-
tween people and objects, whereas another form of intersubjectivity manifests
itself in interactions showing an ability to sequence simultaneous activity with
people and objects. In the two examples we see clear evidence of satisfying the
criteria for this secondary form of intersubjectivity. In Example 1, at 11 months
of age B gets out of M's lap while playing with her to approach the researcher's
video camera while the mother chuckles at this after stating 'no I don't think we
can'. The infant then stops and points at the camera and shifts her gaze towards
her mother. Her mother acknowledges her turn by stating 'I know' and the infant
lowers her pointing arm and proceeds no further. In Example 2, videotaped one
month later, the point is briefer and turns into a reach and there is no eye contact
towards the mother. However, B again does not proceed any further towards the
camera when M whispers 'get over here'. Thus, unlike some of the other interactions
reported in Racine (2005), these are rich, tightly coupled interactions that
clearly involve an understanding of what these interactions are about on the part
of both mother and infant.

Note: Bates et al. (1975) argued that infants of this age may begin to be more confident in their
mothers' attention and therefore do not need to visually check as often. However, Morissette,
Ricard and Décarie (1995) report that visual checking increases from 12 to 18 months of age.

To focus on the role of the pointing gestures themselves, M responds to the
gesture in Example 1 with a verbal affirmation. But she is not responding to an
extended index finger, but to an entire sequence of interaction in a particular social
situation. Setting aside for the moment what exactly the point means, the gesture
and the mother's words can mean what they mean in this circumstance of
a camera being set up a few feet from a baby, that baby noticing the camera and
approaching it, that baby noticing that her mother shows that she does not want the
baby to approach the camera, etc. (Canfield 1981; Racine 2004). The behavioural
criteria apply in this particular social situation for humans and those who have
the capacity to act in ways similar to humans (Wittgenstein 1958). In Example 2,
M exhales in a manner that seems to express mild frustration, which is supported
by the fact that she then playfully instructs her infant to return to her lap and leave
the camera alone. Thus, both points are incorporated into the interaction. And, as
one might expect, interactions involving pointing gestures show intersubjectivity.
The next issue with regard to the gestures is whether we would be justified in
describing these as declarative points. A declarative point, as the name suggests,
is meant to denote an index finger point that functions to comment on an object
or some state of affairs rather than to request an object or action. But not only
would calling these index finger extensions declarative radically reduce the com-
plexity of these sequences of interaction, doing so would also not seem accurate.
In the research literature, points are said to be declarative, imperative or informa-
tive (Bates et al. 1975; Liszkowski, Carpenter, Striano, and Tomasello 2006, but
see Brinck this volume). There has also been the suggestion that infants might
sometimes point to request object names (e.g., Carpendale and Lewis 2006). But
the way these gestures are used in these examples (i.e., their meaning) does not
seem to be to comment to the mother in a way that could be glossed 'look, there's
a camera' (declarative), but rather 'I want to touch that camera', 'I am going to
touch that camera' or perhaps 'can I touch that object?' or even 'there's an object
that I am not supposed to touch'. The 'I want'/'I am going'/'Can I' possibilities
are all imperative in a sense in that they express desire for the object as opposed
to wanting to share attention around it. The last possibility, of showing understanding
of a prohibition, makes the most sense in this interactive context. This
is because this interpretation is consistent with the infant's prior turn of walking
towards a forbidden camera and then stopping to point at it and with the mother's
subsequent turn of measured confirmation in Example 1 and overt ruling out in
Example 2. But we are not arguing here for the creation of another category of
pointing; we are arguing for the importance of due care in the investigation
of pointing.

5. Summary and conclusion

Particular forms of social interaction define particular forms of intersubjectivity.
Forms of intersubjectivity do not causally explain the existence of interactive
abilities and behaviours. Very young typically developing infants discriminate
between animate and inanimate objects, and their interactions with caregivers
are richly coordinated and complex. Older infants can simultaneously attend to
object and interlocutor. Intersubjectivity is a taxonomic, technical concept used
to classify interactive behaviours and abilities rather than denoting the vehicles
or causes of those behaviours and abilities. Such taxonomic descriptions are easily
mistaken for causal hypotheses. That is, intersubjectivity and infants' partial
understanding of attention and intention appear to be misconstrued as the causes
of their socially coordinated behaviour. But in truth, it is their socially coordi-
nated behaviours which are logically primary and which justify ascribing inter-
subjectivity and understanding intention and attention to them. This chapter
was written to remind developmental researchers that the relationship between
causal and definitional issues in intersubjectivity is complicated, and is not
recognized as a problem to be addressed. Many interpretations of important empirical
findings are predicated on problematic definitions of intersubjectivity and related
psychological concepts. We hope that this chapter will stimulate further work on
conceptual and empirical investigations of intersubjectivity.

Acknowledgments

The empirical examples were taken from a study conducted by the second author,
which was funded by the Human Early Learning Partnership and supported by
doctoral fellowships from the Social Sciences and Humanities Research Coun-
cil of Canada and the Michael Smith Foundation for Health Research. Prepara-
tion of this chapter was supported by grants from the Manitoba Health Research
Council and the Social Sciences and Humanities Research Council of Canada to
the second author. We thank Max Bibok, Jeremy Carpendale, Esa Itkonen, Bill
Turnbull, Jordan Zlatev and Chris Sinha for helpful comments on earlier ver-
sions of this chapter.

References

Antaki, C. 2004. Reading minds or dealing with interactional implications. Theory & Psychology
14: 667–683.
Austin, J. 1975. How to do Things with Words (2nd ed.). New York: Oxford University Press.
Atkinson, J.M. and Heritage, J. (eds.) 1984. Structures of Social Action: Studies in Conversation
Analysis. Cambridge: Cambridge University Press.
Bakeman, R. and Adamson, L. 1984. Coordinating attention to people and objects in mother-infant
and peer-infant interactions. Child Development 55: 1278–1289.
Baker, G.P. and Hacker, P.M.S. 1984. Skepticism, Rules and Language. Cambridge: Blackwell.
Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Baron-Cohen, S. 1995. Mindblindness: An Essay on Autism and Theory of Mind. Cambridge,
MA: MIT Press.
Bates, E., Camaioni, L. and Volterra, V. 1975. The acquisition of performatives prior to speech.
Merrill-Palmer Quarterly 21: 205–226.
Bennett, M.R. and Hacker, P.M.S. 2003. Philosophical Foundations of Neuroscience. Oxford:
Blackwell.
Bretherton, I. 1991. Intentional communication and the development of an understanding of
mind. In Children's Theories of Mind: Mental State and Social Understanding, D. Frye &
C. Moore (eds.), 49–75. Hillsdale, NJ: Erlbaum.
Bretherton, I., McNew, S. and Beeghly-Smith, M. 1981. Early person knowledge as expressed
in gestural and verbal communication: When do infants acquire a theory of mind? In Infant
Social Cognition: Empirical and Theoretical Considerations, M.E. Lamb & L.R. Sherrod
(eds.), 333–373. Hillsdale, NJ: Erlbaum.
Brinck, I. this volume. The role of intersubjectivity in the development of intentional
communication.
Bruner, J. 1983. Child's Talk: Learning to Use Language. New York: Norton.
Butterworth, G. 2003. Pointing is the royal road to language for babies. In Pointing: Where
Language, Culture, and Cognition Meet, S. Kita (ed.), 9–33. Mahwah, NJ: Erlbaum.
Canfield, J.V. 1981. Wittgenstein, Language and World. Amherst: University of Massachusetts
Press.
Canfield, J.V. 1993. The living language: Wittgenstein and the empirical study of communication.
Language Sciences 15: 165–193.
Carpendale, J.I.M. and Lewis, C. 2006. How Children Develop Social Understanding. Oxford:
Blackwell.
Churchland, P.S. 1986. Neurophilosophy. Cambridge, MA: MIT Press.
Dupré, J. 1993. The Disorder of Things: Metaphysical Foundations of the Disunity of Science.
Cambridge, MA: Harvard University Press.
Eilan, N., Hoerl, C., McCormack, T. and Roessler, J. (eds.) 2005. Joint Attention: Communica-
tion and Other Minds. Oxford: Oxford University Press.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Garfinkel, H. 1967. Studies in Ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall.
Hacker, P.M.S. 1996. Wittgenstein, Mind and Will. Cambridge: Blackwell.
Hark, M. ter 1990. Beyond the Outer and the Inner: Wittgenstein's Philosophy of Psychology.
Dordrecht, The Netherlands: Kluwer.
Hobson, R.P. and Hobson, J.A. this volume. Engaging, sharing, knowing: Some lessons from
research in autism.
Kenny, A. 1989. The Metaphysics of Mind. Oxford, England: Oxford University Press.
Kita, S. 2003. Pointing: Where Language, Cognition and Culture Meet. Mahwah, NJ: Erlbaum.
Lakoff, G. and Johnson, M. 2003. Metaphors We Live By (2nd ed.). Chicago: University of Chi-
cago Press.
Leavens, D.A., Hopkins, W.D. and Bard, K. this volume. The heterochronic origins of explicit
reference.
Liszkowski, U., Carpenter, M., Striano, T. and Tomasello, M. 2006. Twelve- and 18-month-olds
point to provide information for others. Journal of Cognition and Development 7:
173–187.
Mill, J.S. 1875. System of Logic (8th ed). London: Longmans. (Original work published 1843).
Moore, C. and D'Entremont, B. 2001. Developmental changes in pointing as a function of
attentional focus. Journal of Cognition and Development 2: 109–129.
Moore, C. and Dunham, P.J. (eds.) 1995. Joint Attention: Its Origins and Role in Development.
Hillsdale, NJ: Erlbaum.
Morissette, P., Ricard, M. and Décarie, T.G. 1995. Joint visual attention and pointing in infancy:
A longitudinal study of comprehension. British Journal of Developmental Psychology 13:
163–175.
Proudfoot, D. and Copeland, B.J. 2002. Wittgenstein's deflationary account of reference.
Language and Communication 22: 331–351.
Racine, T.P. 2004. Wittgenstein's internalistic logic and children's theories of mind. In Social
Interaction and the Development of Knowledge, J.I.M. Carpendale and U. Müller (eds.),
257–276. Mahwah, NJ: Erlbaum.
Racine, T.P. 2005. The Role of Shared Practice in the Origins of Joint Attention and Pointing.
Unpublished doctoral thesis, Simon Fraser University, Burnaby, BC, Canada.
Racine, T.P. and Carpendale, J.I.M. 2007a. Shared practices, understanding, language and joint
attention. British Journal of Developmental Psychology 25: 45–54.
Racine, T.P. and Carpendale, J.I.M. 2007b. The embodiment of mental states. In Body in Mind,
Mind in Body: Developmental Perspectives on Embodiment and Consciousness, W.F. Overton,
U. Müller and J. Newman (eds.), 159–190. Mahwah, NJ: Erlbaum.
Racine, T.P. and Carpendale, J.I.M. 2007c. The role of shared practice in joint attention. British
Journal of Developmental Psychology 25: 3–25.
Reimers, M. and Fogel, A. 1992. The evolutions of joint attention of objects between infants
and mothers: Diversity and convergence. Análise Psicológica 1: 81–89.
Sharrock, W. and Coulter, J. 2004. ToM: A critical commentary. Theory & Psychology 14:
579–600.
Slaney, K.L. and Maraun, M.D. 2005. Analogy and metaphor running amok: An examination
of the use of explanatory devices in neuroscience. Journal of Theoretical and Philosophical
Psychology 25: 153–172.
Tomasello, M. 1995. Joint attention as social cognition. In Joint Attention: Its Origins and Role
in Development, C. Moore and P. Dunham (eds.), 103–130. Hillsdale, NJ: Erlbaum.
Tomasello, M. 1999. Having intentions, understanding intentions, and understanding
communicative intentions. In Developing Theories of Intention: Social Understanding and
Self-Control, P.D. Zelazo, J.W. Astington & D.R. Olson (eds.), 63–75. Mahwah, NJ: Erlbaum.
Tomasello, M. and Carpenter, M. 2005. The emergence of social cognition in three young
chimpanzees. Monographs of the Society for Research in Child Development, 70 (Serial
No. 279).
Tomasello, M., Carpenter, M., Call, J., Behne, T. and Moll, H. 2005. Understanding and sharing
intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–735.
Trevarthen, C. 1977. Descriptive analysis of infant communicative behavior. In Studies in
Mother-Infant Interaction, H.R. Schaffer (ed.), 227–270. New York: Academic Press.
Trevarthen, C. 1979. Communication and cooperation in early infancy. A description of primary
intersubjectivity. In Before Speech: The Beginning of Human Communication, M. Bullowa
(ed.), 99–136. London: Cambridge University Press.
Trevarthen, C. and Aitken, K.J. 2001. Infant intersubjectivity: Research, theory, and clinical
applications. Journal of Child Psychology & Psychiatry 42: 3–48.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding, and acts
of meaning in the first year. In Action, Gesture and Symbol: The Emergence of Language,
A. Lock (ed.), 183–229. London: Academic Press.
Turnbull, W. 2003. Language in Action: Psychological Models of Talk. New York: Psychology
Press.
Wittgenstein, L. 1958. Philosophical Investigations (3rd ed.). Englewood Cliffs, NJ: Prentice-
Hall.
Wootton, A.J. 1994. Object transfer, intersubjectivity and third position repair: Early
developmental observations of one child. Journal of Child Language 21: 543–564.
Wootton, A.J. 1997. Interaction and the Development of Mind. Cambridge, MA: Cambridge
University Press.
Part II

Evolution

Chapter 8

What is the nature of the gestural communication of great apes?

Simone Pika

Human speech is frequently accompanied by movements of the arms and hands
termed gestures. The majority of these gestures are invented spontaneously and
are highly iconic, but some gestures are used functionally in ways very similar to
speech, that is, symbolically, referentially, and based on intersubjectively learned and
shared social conventions. Our closest living relatives, the great apes, also use
gestures in their natural communication in a variety of contexts such as play,
grooming, sex and agonistic encounters. A deep understanding of apes' gestural
signalling might therefore be helpful to get insight into the evolutionary scenario
of human communication and cognition. The present chapter investigates the
nature of the gestural signalling of the four great apes, bonobos (Pan paniscus),
chimpanzees (Pan troglodytes), gorillas (Gorilla gorilla) and orangutans (Pongo
pygmaeus), with a special focus on the following three aspects: (1) the intention-
ality of gestures, (2) their referential use, and (3) similarities and differences to
gestures in prelinguistic or just-linguistic human infants.

1. Introduction

Human communication is unique in the animal kingdom in a variety of ways.
Most importantly, human communication depends crucially on linguistic symbols,
which are individually learned and intersubjectively shared social conventions
used to direct the attentional and mental states of others to real or imaginary
situations (Tomasello 1999). Human communication is also unique in the way it
employs manual and other bodily gestures. For example, to our knowledge only
human beings point to things deictically for conspecifics (the basic form of gestural
reference) simply to share attention or to comment on events and objects
(Tomasello et al. 2005). And only humans use gestures, ranging from conven-
tionalized gestures such as waving goodbye to iconic gestures such as drawing
a circle in the air to depict the shape of the sun. Thus, although the majority of
gestures are performed spontaneously and are highly iconic (McNeill 1992), some
human gestures are used functionally in ways very similar to language, that is,
symbolically, referentially, and based on intersubjectively learned and shared social
conventions.
Therefore the question arises: What is the nature of communicative signals in
our closest living relatives, the non-human primates, and how do they relate to
human gestures and language? The following chapter will address this question
by focusing on the gestural signalling of the four great ape species, bonobos (Pan
paniscus), chimpanzees (Pan troglodytes), gorillas (Gorilla gorilla) and orangutans
(Pongo pygmaeus), with a special focus on the following three aspects: (1) the
intentionality of gestures, (2) their referential use, and (3) similarities and differ-
ences to gestures in prelinguistic or just-linguistic human infants.

2. State of the art: Communicative signals in primates

In looking for the evolutionary roots of human language, researchers quite
naturally looked at the communication systems evolved in other animal species
and especially in our closest living relatives, the non-human primates (hereafter
primates). Until recently, the majority of studies focused on vocal communica-
tion (e.g., Marler 1980; Owings and Morton 1998; Seyfarth and Cheney 1997;
Snowdon, Brown and Peterson 1982), which might be due to the analogy to human
speech (Lieberman 1968; Seyfarth 1987; Snowdon 1988). This interest has
been stimulated even further by evidence that primates and especially monkeys
use vocalizations to communicate information about their social and physical
environment, in addition to their emotional states (e.g., Cheney and Seyfarth
1990; Zuberbühler 2001). As used in the primate literature, animal vocalizations
qualify as referential signals if they: (a) have a distinct acoustic structure, (b)
are produced in response to a particular external object or event, and (c) elicit a
similar response in nearby listeners as the external object or event normally does
(Zuberbühler 2000b). The finding that vervet monkeys (Cercopithecus aethiops)
use different alarm calls in association with different predators (leading to differ-
ent escape responses in receivers, Seyfarth, Cheney and Marler 1980) raised the
possibility that monkeys use vocalizations to make reference to outside entities
(Cheney and Seyfarth 1990). Referential signals have been reported from various
monkey species in their natural habitats (e.g., Zuberbühler 2000a; Zuberbühler
2003), suggesting that referential communication is a widespread and perhaps
universal characteristic of primate communication. These findings have been
taken to suggest that primate referential abilities are the output of a cognitive
ability that could be pivotal to language, namely the capacity to assign meaning
to arbitrary sound utterances (Zuberbühler 2002).

[Footnote: See Zlatev (this volume) on the difference between triadic mimesis, which does not
require conventionality, and language, which does.]

This conclusion remains
controversial, however, because it has been shown since then that alarm calls of
this type have arisen numerous times in evolution in species that also must or-
ganize different escape responses for different predators, including most promi-
nently prairie dogs and domestic chickens (for an overview see, Owings and
Morton 1998). In addition, related research on apes has provided mixed results.
For example, Uhlenbroek (1996) has demonstrated that East African chimpan-
zees at Gombe (Pan troglodytes schweinfurthii) produce acoustic variants of pant
hoots in three different contexts: travel, food, and encounters with other com-
munity members. However, a comparable study on the same subspecies at the
Mahale Mountains study site found no evidence of context specificity in pant
hoots (Marler and Hobbett 1975; Mitani and Brandt 1994; Mitani et al. 1992).
In addition, Clark and Wrangham's (1993; 1994) studies at Kanyawara have suggested
that arrival pant-hoots at fruiting trees provide information about the
social context rather than about the food itself. However, recent research shows
that some chimpanzee calls can differ in their fine acoustic structure depending
on the eliciting context, a crucial prerequisite for calls to function referentially
(Crockford and Boesch 2003; Slocombe and Zuberbühler 2005a). Furthermore,
Slocombe and Zuberbühler (2005b) showed in an experimental setting that one
chimpanzee was able to use the information conveyed by rough grunts given in
two distinct contexts by his group members to guide his search for food. These
results therefore suggest that the vocalizations of chimpanzees may also function
referentially. This is consistent with the fact that great apes clearly have demon-
strated referential abilities in captive conditions (see for laboratory trained apes,
Rumbaugh 1977; Savage-Rumbaugh et al. 1993).
Therefore, the absence of evidence might merely reflect a paucity of data,
rather than a lack of referential abilities on behalf of the apes in the wild. However,
one other possible explanation for the apparent lack of referential vocalizations
in great apes might be that they are specialized in a different kind of referential
skill: one based on the flexible use of manual gestural signals.

3. Gestural signals

To date, studies on gestural communication in primates are very unevenly distributed
among species (for an overview see, Tomasello and Call 2007). Almost
no systematic studies exist focusing on the gestural signaling of monkeys. The
most interesting observations concern hamadryas baboons (Papio hamadryas)
which were observed to engage in notifying behaviour, before leaving the troop
(Kummer 1968). The behaviour consists in approaching another animal and look-
ing directly into their face, presumably to make sure that the recipient is attend-
ing before engaging in certain activities. In addition, Kummer and Kurt (1965)
described a ground-slap behaviour that seems to serve as an attention getter and
a kind of teasing behaviour during play. Maestripieri (1997; 1999) compared the
gestural behaviour of three macaque species in captivity (Macaca arctoides, Ma-
caca nemestrina, Macaca mulatta) and suggested that characteristics of a social
structure, such as reduced influence of dominance and kinship may select along
with group size for a wider gestural repertoire. In addition, he described a very
interesting behaviour within mother-infant dyads: When pigtail macaque moth-
ers want their infants to follow them and they do not, the mothers sometimes
return and stare in the infant's face (or even poke the infant) before leaving again
(Maestripieri 1996).
The gestural communication of apes has received much more research at-
tention, but it has focused mainly on chimpanzees (Pan troglodytes) in captivity
(Ladygina-Kohts 1935; Van Hooff 1973), and in the wild (Plooij 1979, 1987; Van
Lawick-Goodall 1968). Goodall (1986) for instance observed that chimpanzees at
the Gombe National Park use more than a dozen distinct gestures in a variety of
contexts. In addition, signals such as the gesture leaf clipping (Nishida 1980) and
the grooming hand clasp (McGrew and Tutin 1978) provided evidence for the exis-
tence of population-specific differences in chimpanzee communities in the wild.
Concerning their closely related congener, the bonobo (Pan paniscus), Savage-
Rumbaugh and colleagues (Savage-Rumbaugh, Wilkerson and Bakeman 1977;
Savage-Rumbaugh and Wilkerson 1978) described the use of 20 gestures in a
sexual context. In addition, de Waal (1988) provided a comparison of the ges-
tural signaling of bonobos and chimpanzees and observed 15 distinct gestures for
bonobos that are linked to particular situations.
Tanner (1998) and Tanner and Byrne (1999) described the use of 30 gestures
in a gorilla (Gorilla gorilla) group in captivity. For western lowland gorillas in
the wild, Parnell and Buchanan-Smith (2001) reported a specific gesture called
the splash display which is used to intimidate other silverbacks, and Fay (1989)
observed hand-clapping behavior in females.
In contrast to the African great ape species, the gestural communication of the
Asian apes has received less research attention. MacKinnon (1974; but see
also Rijksen 1978) established a repertoire of tactile and visual gestures of wild
Bornean (Pongo pygmaeus) and Sumatran orangutans (Pongo abelii).
Overall, the above mentioned studies have provided evidence that great apes
make frequent use of gestures in their everyday communication. However, they
all use different definitions of the term gesture or none at all, and did not focus
on processes of social cognition such as the learning of the gestures or their inten-
tional use (with the exception of Plooij 1979, 1987). The major aim of the present
chapter is therefore to investigate whether communicative gestures in great apes
are used as flexible communicative strategies, with individual decision making
and cognitive processes that involve at least some degree of intersubjective
understanding.
This chapter will therefore provide an overview of the gestural signalling of
all four great ape species and focus in detail on the following three aspects: (1) the
intentional use of gestures, (2) their referential use, and (3) similarities and differ-
ences to gestures in prelinguistic or just-linguistic human infants.
The presented quantitative data are based on recent papers on the gestural
communication of subadult apes in captivity (Pongo pygmaeus: Liebal, Pika and
Tomasello 2006; Gorilla gorilla: Pika, Liebal and Tomasello 2003; Pan paniscus:
Pika, Liebal, and Tomasello 2005; Pan troglodytes: Tomasello et al. 1985, 1989,
1994, 1997).

4. Gestures in the great apes: Empirical evidence

Gestures are a subset of communicative signals. They can be defined as expressive
movements of the limbs or head and body postures that are directed toward a re-
cipient, are mechanically ineffective and receive a voluntary response. The follow-
ing behavioural criteria were used to infer their communicative intent: (1) gazing
at the recipient, and/or (2) waiting after the signal had been produced, expecting
a response. Thus, gestures that appear to have components of ritualised morphology
(e.g., chest beat) are also included in this definition if they meet the
above-mentioned criteria.
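
The definition and the two behavioural criteria above can be read as a simple coding rule. The following is a minimal sketch in Python, assuming a hypothetical record format for one observed behaviour; the class, field and function names are illustrative only and are not taken from the cited studies, whose actual coding procedures are described in the original papers.

```python
from dataclasses import dataclass


@dataclass
class ObservedBehaviour:
    # All field names are hypothetical; they mirror the wording of the definition.
    directed_at_recipient: bool     # movement or posture is directed toward a recipient
    mechanically_ineffective: bool  # it does not achieve its effect by physical force
    voluntary_response: bool        # it receives a voluntary response
    gazes_at_recipient: bool        # criterion (1): signaller gazes at the recipient
    waits_for_response: bool        # criterion (2): signaller waits after signalling


def is_gesture(b: ObservedBehaviour) -> bool:
    """Return True if the behaviour meets the definition and at least one intent criterion."""
    meets_definition = (
        b.directed_at_recipient
        and b.mechanically_ineffective
        and b.voluntary_response
    )
    shows_communicative_intent = b.gazes_at_recipient or b.waits_for_response
    return meets_definition and shows_communicative_intent


# A chest beat accompanied by gazing at the recipient counts as a gesture under this rule.
chest_beat = ObservedBehaviour(True, True, True, True, False)
# Physically pushing another animal is mechanically effective, so it does not count.
push = ObservedBehaviour(True, False, True, True, True)
print(is_gesture(chest_beat), is_gesture(push))
```

The "and/or" in the criteria is rendered as a logical or, so a behaviour that shows either gazing or response waiting is sufficient.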

4.1 Overview: Gestural repertoires

Based on auditory, tactile and visual components we formed three signal catego-
ries: (1) auditory gestures generate sound while performed, (2) tactile gestures in-
clude physical contact with the recipient, and (3) visual gestures generate a mainly
visual component with no physical contact. In addition, the accompanying context
was analyzed in our own studies.
The bonobos used 20 distinct gestures (see Table 1), one auditory
(5%), eight tactile (40%) and eleven visual gestures (55%), which were performed
mainly in the play context (55%), but also in the food (14%), travel (10%), nurse
(5%), ride (5%), sex (5%), affiliative (3%), and agonistic contexts (3%).

Table 1. Gestural repertoire of the great apes

                     Pan paniscus   Pan troglodytes   Gorilla gorilla   Pongo pygmaeus
Auditory                   1               3                 6                 0
Visual                    11              18                16                14
Tactile                    8               9                11                12
Total                     20              30                33                26
Average/individual        11             9.5                20                16

The chimpanzees used 30 distinct gestures (see Table 1), three auditory
(10%), nine tactile (30%), and 18 visual gestures (60%), in a variety of contexts
including affiliation, agonistic, feeding and nursing, sexual, grooming, travel, and
play. Play was the most important context accounting for between 47% and 70%
of the gestures depending on the studies (Tomasello et al. 1994; 1997).
Overall, the gorillas performed 33 distinct gestures, six auditory
(18%), 11 tactile (33%) and 16 visual gestures (49%). These gestures occurred
mainly in the play context (40%), but also in the food (15%), ride (10%), nurse
(10%), travel (10%), affiliative (10%) and agonistic (5%) contexts.
The orangutans used 26 distinct gestures, 12 tactile and 14 visual gestures.
These gestures occurred mainly in the play (32%), feeding (26%), affiliative
(17%), and agonistic contexts (7%), but also in the context of getting access to objects
(2%), sex (2%), walking (2%), and nursing (2%).
Overall, these data show that all four great ape species have multifaceted ges-
tural repertoires of auditory, tactile and visual gestures, which are used in a vari-
ety of contexts.
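
The category percentages quoted above follow directly from the counts in Table 1. As a rough check, the short Python sketch below recomputes them; the dictionary layout and the function name are assumptions made for illustration, and only the counts come from the table. Plain rounding gives 48% for the visual gestures of gorillas (16 of 33), whereas the text reports 49%, possibly so that the three shares sum to 100%.

```python
# Counts copied from Table 1 (number of distinct gestures per signal category).
REPERTOIRE = {
    "Pan paniscus": {"auditory": 1, "tactile": 8, "visual": 11},
    "Pan troglodytes": {"auditory": 3, "tactile": 9, "visual": 18},
    "Gorilla gorilla": {"auditory": 6, "tactile": 11, "visual": 16},
    "Pongo pygmaeus": {"auditory": 0, "tactile": 12, "visual": 14},
}


def category_shares(counts):
    """Return each category's share of the repertoire, rounded to whole percent."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}


for species, counts in REPERTOIRE.items():
    print(species, sum(counts.values()), category_shares(counts))
```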

4.2 Intentional action

A crucial milestone in human ontogeny is the onset of intentional behaviour,
which develops during the second half of the first year of life (Bates et al. 1979).
Piaget (1952) defined an intention (in the psychological sense) as the differentiation
of means and ends, but also emphasized the difficulty of formulating a valid
definition. This view is supported by the history of psychology which attests to
the difficulty of making a clear distinction between intentional and unintentional
behaviour. However, work in the study of pre-linguistic communication in human
infants has offered some relatively clear operational definitions that may be used
to differentiate between these two types of behaviours. Elaborating on Piaget's
view, Bruner (1981) for instance noted that
an intention is present when an individual operates persistently toward achiev-
ing an end state, chooses among alternative means and/or routes to achieve that
end state, persists in deploying means and corrects the deployment of means to
get closer to the end state, and finally ceases the line of activity when specifiable
features of the states are achieved.

It is worth mentioning that according to this definition much of intentional action
takes place beyond the threshold of reportable awareness (Bruner 1981). Follow-
ing Bratman (1989) an intention can be understood as a plan of action the organ-
ism chooses and commits itself to in pursuit of a goal. An intention thus includes
both a means (action plan) and a goal. In contrast to Bruner's (1981) definition,
in this definition the actor seems to be able to account for, or be conscious of,
the nature of his intentions.
In addition, distinctions have been made between perlocutionary and illo-
cutionary acts (Bates et al. 1979), or communicative behaviour and intentionally
communicative behavior (Golinkoff 1981). Perlocutionary acts are infant behaviours
in which communication occurs only because the receiver is adept at interpreting
the behaviour of the child (von Glasersfeld 1974, 1976). For instance, an
infant might cry because it cannot reach a toy, which causes the mother to come
and give it the toy. Although communicative behaviour occurred, the behaviour
of the infant was not intentionally directed toward the mother. Illocutionary or
intentionally communicative behaviours on the other hand are infant behaviours,
in which the sender is aware a priori of the effect that a signal will have on his
listener, and he persists in that behavior until the effect is obtained or failure is
clearly indicated (Bates et al. 1979:36). For example, the child turns its attention
from the toy to the mother and whines at her. The whining becomes a social-
communicatory act with the intention of obtaining the adult's help. This distinction
emphasizes the contributions of each partner to the communicative act and
provides the behavioural tools which permit us to reliably identify intentional
communication:

a. alternations in eye contact between the goal and the intended communica-
tion partner,
b. augmentations, additions and substitutions of signals until the goal has been
obtained, and
c. changes in the form of the signal toward the abbreviated and/or exaggerated
patterns that are appropriate only for achieving a communicative goal.

With respect to intersubjectivity, it can be argued that it is the illocutionary acts
that are most relevant. To investigate whether the behaviour of apes qualifies as
intentional acts, researchers have used the developmental Piagetian and the preverbal
communication perspectives. Using Speech Act theory (see Austin 1962;
Bates et al. 1975), Plooij (1978, 1979) for instance studied the development of
communicative signals between mother-infant dyads in chimpanzees in the wild.
He argued that with the onset of begging (also often called peering, Pika et al.
2005) between the ages of 9 and 12 months, followed by the use of gestures such
as initiating tickling, grooming and approach, the chimpanzee infant understands
his mother and conspecifics as social agents. This developmental stage there-
fore marks the onset of the use of imperative gestures (which are used to get an-
other individual to help in attaining a goal) and the developmental shift from
perlocutionary to illocutionary acts. Bard (1992) investigated the communica-
tive abilities of orangutan infants in a food sharing context and focused on the
transition from bifocal behaviour to behavioural sequences. Differentiation was
made between intentional behaviour as shown in bifocal behavioral sequences
involving either objects or social agents (Case 1985) and intentional communi-
cation (behavioral sequences involving coordination between social agents and
objects, e.g., Bretherton, McNew and Beeghly-Smith 1981). Bard (1992) found
that orangutans from the age of 16 months used intentional behaviours, whereas
intentional communication was observed in older orangutans only, ranging from
2 to 5 years of age. In intentional behaviour the action was either directed to the
food, for instance, with a grasp coordinated with eating, or directed to the mother,
for instance, by performing a pull on the mother's body and subsequently eating.
In intentional communication, on the other hand, the animal solicited food from
the mother by using one open, cupped hand, palm up, held underneath but not
necessarily touching the mother's chin.
To identify whether great apes use their gestures as intentional acts while communicating
with group members, the presented studies focused on (1) means-ends
dissociation of signalling behaviour and goal, and (2) adjustment to social
circumstances, such as adjustment to audience effects (for a detailed description
of the methods and animals see, Liebal et al. 2006; Pika et al. 2003; Pika, Liebal,
and Tomasello 2005; Tomasello et al. 1994; Tomasello et al. 1997).
Means-ends dissociation is characterized by the flexible relation of signalling
behaviour to the recipient and goal. For example, an individual uses a single ges-
ture for several goals (touch for nursing and riding) or different gestures for the
same goal (slap ground and body beat for play). Audience effects are characterized
by differential use of gestures or other communicative signals as a function of the
attentional states of the recipient.

Figure 1. Means-ends dissociation. Average number of gestures used in a single context
and average number of gestures used for several contexts. Error bars indicate the SD.

4.2.1 Means-ends dissociation


Our results showed that bonobos used on average in every context approximately
two (± 0.6) different gestures, the chimpanzees 3.2 (± 0.4), the gorillas 3.2 (± 1),
and the orangutans 5.3 (± 1.2) gestures (Figure 1). Concerning the use of gestures
in different contexts, the bonobos utilized on average 2.7 (± 1.48) gestures in more
than one context, the chimpanzees 1.3 (± 0.2), the gorillas 3.8 (± 2.6), and the
orangutans 1.5 (± 0.9) gestures.

4.2.2 Adjustment to audience effects


Focusing on audience effects we found a significant difference between the use
of tactile and visual gestures among all species based on a variation in the degree
of visual attention of the recipient (Wilcoxon-test: P < 0.05, for further details
see, Liebal et al. 2006; Pika et al. 2003; Pika, Liebal, and Tomasello 2005; Tomasello
et al. 1994). There was no significant difference between the uses of auditory
versus visual gestures and auditory versus tactile gestures. On average, the
bonobos performed 70% (± 10) of their visual gestures to an attending recipient,
the chimpanzees 86% (± 2), the gorillas 89% (± 12), and the orangutans 98.8%
(± 2). Tactile gestures were performed to an attending recipient in only 51% (bonobos,
± 10), 48% (chimpanzees, ± 10), 66% (gorillas, ± 13), and 67% (orangutans,
± 10.3) of the cases (see Figure 2).

Figure 2. Audience effects. The y-axis indicates the percentage of gestures, the x-axis
indicates the different species. The four different colours indicate the signal category and
the attentional state of the recipient. Error bars indicate the SD.
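
The size of the audience effect can be illustrated from the species means reported above. The Python sketch below simply subtracts the share of tactile gestures given to an attending recipient from the corresponding share of visual gestures; the variable and function names are assumptions for illustration. It is a descriptive comparison of the published means only, not a reproduction of the Wilcoxon test, which requires the individual-level data given in the cited papers.

```python
# Mean percentage of gestures performed to an attending recipient, as reported in the text.
ATTENDING = {
    "bonobos": {"visual": 70.0, "tactile": 51.0},
    "chimpanzees": {"visual": 86.0, "tactile": 48.0},
    "gorillas": {"visual": 89.0, "tactile": 66.0},
    "orangutans": {"visual": 98.8, "tactile": 67.0},
}


def visual_minus_tactile(means):
    """Difference in percentage points between visual and tactile gestures."""
    return round(means["visual"] - means["tactile"], 1)


differences = {species: visual_minus_tactile(m) for species, m in ATTENDING.items()}
for species, diff in differences.items():
    print(f"{species}: {diff} percentage points")
```

In every species the difference is positive, which is the pattern the text describes: visual gestures are directed at an already attending recipient more often than tactile gestures are.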

In sum, these results reveal that great apes use different means (gestures)
interchangeably in the same context toward the same end, but also use the same
means/gesture to achieve different ends/goals. Concerning audience effects, the
findings show that great apes preferentially use visual gestures to an already at-
tending recipient. Based on the key characteristics for intentional communication
in human children, we can therefore conclude that great apes use gestures that
classify as intentional acts.
Other relevant studies on intentional communication focused mainly on au-
dience effects. Tanner and Byrne (1993) for instance reported that a female gorilla
repeatedly used her hands to hide her playface from a potential partner, indi-
cating some flexible control of the otherwise involuntary facial expression as
well as a possible understanding of the role of visual attention in the process of
gestural communication. Liebal et al. (2004) showed that chimpanzees tended to
move into the attentional field of the recipient by walking in front of her and then
performed visual gestures. Furthermore, in an experimental setting, Liebal et al.
(2004) showed that all four great ape species take into account the attentional
state of a human experimenter, by using visual gestures preferentially when they
were facing the experimenter.
In addition, anecdotal evidence for intentional communication is available
for the language trained bonobo Kanzi. He was observed to hand a nut to a person
who was supposed to crack it open. He then slapped the nut and placed a stone on
top of it (Savage-Rumbaugh et al. 1986).

4.3 Referential gesturing

Researchers working on communicative signals of pre-linguistic and just-linguistic
children distinguish gestures in terms of direction and function. Concerning
the direction of gestures, differentiation has been made between dyadic and
triadic gestures. Dyadic gestures involve two individuals and are used to attract
the attention of others to the self; triadic gestures are used to attract the attention
of others to some entity, e.g. an event or an object. Although this third entity
mainly denotes an outside entity (Bates 1976), it can also be a part of one's body,
e.g. referring to one's own nose. Triadic gestures are therefore clearly referential
and develop in human children at the age of 12 months (Bates et al. 1979). The use
of these gestures has been linked with cognitive capacities such as mental state at-
tribution (Camaioni 1993; Tomasello 1995), because the recipient must infer the
signaller's intended meaning.
Concerning the function of gestures, differentiation has been made between
protoimperative and protodeclarative gestures (Bates et al. 1975). Protoimperatives
are defined as the child's preverbal intentional use of the listener as an agent
or tool in achieving some end (e.g. to request an object). Protodeclaratives are
defined as the child's preverbal effort to direct the adult's attention to some event
or object in the world. This approach suggests continuity between preverbal and
later verbal communication and is useful when focusing on human children
whose gestures precede speech (Bates et al. 1979). However, it is not coherent to
use these terms for species who will never exhibit verbal communication (Leav-
ens 2004; Leavens and Hopkins 1998). The term imperative will thus be used to
refer to gestures being used to get another individual to help in attaining a physi-
cal goal, such as getting an object, playing, etc., and the term declarative will be
used to characterize those gestures which are used to attain a non-physical goal,
namely to draw another's attention to an object or entity merely for the sake of
sharing attention. Leavens et al. (this volume) and Baron-Cohen (1999) explicitly
exclude protoimperative gestures from the category of intentional communica-
tion (in a similar vein see also, Povinelli et al. 2000; Povinelli et al. 2001), arguing
that only protodeclarative gestures imply the signaller's possession of a nascent
theory of mind.

[Footnote: See, however, Leavens (2004) for a critical view on defining these modes of communication
by reference to underlying psychological processes or mental states.]

The majority of gestures used between great apes in their natural communication
are dyadic (Pika et al. 2005). Exceptions are the gestures food begging (an animal
holds out the hand, palm up, to obtain food from another; see Bard 1992 for
orangutans and Tomasello et al. 1994 for chimpanzees), food offer (an animal offers
food placed on her arm to another one, Liebal et al. 2006) and pointing. These gestures
are clearly triadic: a request to another for food or an offer of food to another
is distal, since the signaller is not touching the recipient. Food begging and food
offer have been observed between conspecifics, but pointing has only been reported
for captive chimpanzees interacting with their human experimenters (e.g., Leav-
ens et al. 1996; Leavens et al. 2004) as well as human-raised or language trained
apes (e.g., Gardner and Gardner 1969; Miles 1990; Patterson 1978a; Woodruff and
Premack 1979). Although there is one anecdotal report about the declarative use of
this gesture in a single bonobo in the wild (Vea and Sabater-Pi 1998), it is not clear
yet whether these abilities represent natural communication abilities or are by-
products of living in a human encultured environment (Tomasello and Call 1997).
It may be argued that apes have no need for pointing because they rely on other
behaviors that serve a similar function, such as detection of body orientation and
eye gaze (Gomez 1991; Menzel 1973, 1974). When they walk, which they do mainly
quadrupedally, their whole body is pointing (Plooij 1987).
Interestingly, Savage-Rumbaugh and colleagues (Savage and Bakeman 1978;
Savage-Rumbaugh et al. 1977) and Tanner and Byrne (1996) described several ges-
tures that they consider iconic uses of gestures. Iconic gestures are related to their
referent by virtue of some actual physical resemblance between the two (Bates et
al. 1979), such as a desired motion in space or the form of an action. Two individu-
als (one bonobo, one gorilla) seemed to signal with their hand, arm, or head to a
playmate the direction in which they wanted her to move, the action they wanted
her to perform, or the position they wanted her to take. Roth (1995) and Pika et
al. (2003), however, who also focused on the occurrence of iconic gestures in three
groups of bonobos and two groups of gorillas in captivity, did not observe any instances
of the iconic use of gestures. It is possible that their analysis did not focus
in sufficient detail on the receiver's response to detect gestures of an iconic nature.
Another explanation would be that gesturing of an iconic nature could be a devel-
opmental phenomenon, appearing only at adolescence and promoted by special
social and physical conditions (Tanner and Byrne 1999).
Concerning the function of gestures, the majority of studies have shown that
apes mainly use imperative gestures in their natural communication. Focusing on
human-raised and language-trained apes, Patterson (1978b) reported observa-
tions of showing in one gorilla, and Savage-Rumbaugh (1988) in one bonobo.
Furthermore, Savage-Rumbaugh et al. (1998) described how a bonobo female who
had heard unusual sounds in the forest directed the human caretaker's attention
toward these sounds by looking and gesturing in that direction. It should be noted,
however, that in all these cases interpretation is an issue.

Figure 3. The directed scratch gesture. © Dorothee Classen

Interestingly, Pika and Mitani (2006) described the widespread use of a gesture,
the so-called directed scratch, by wild chimpanzees in the context of grooming
(see Figure 3). The gesture involved one chimpanzee making a relatively loud
and exaggerated scratching movement on a part of his body, which could be seen
by his grooming partner.
In the majority of the cases the indicated spot was groomed directly by the
recipient. Pika and Mitani (2006) argue that (1) the gesture may be used commu-
nicatively to indicate a precise spot on the body, and (2) the recipient of the signal
has an understanding of the intended meaning of the gesture. The authors suggest
that directed scratches therefore may qualify as referential.
In sum, the evidence shows that the majority of gestures of great apes used
between conspecifics are imperative and dyadic (see also, Pika et al. 2005). In ad-
dition, the use of referential gestures supports earlier findings that certain impor-
tant cognitive capacities pertaining to intersubjectivity are present in apes (for an
overview see e.g., Zuberbühler, Tomasello & Call 1997).

4.4 Similarities and differences of gestures in apes and human children

Our findings and those of other researchers provide evidence that, similar to prelinguistic
or just-linguistic human children, apes use their gestures as intentional acts, by
operating persistently toward achieving an end state, choosing among alternative
means and adjusting their use of gestures to social circumstances. The majority
of their gestures are used in dyadic interactions, whereas human children, from
their very first attempts, gesture triadically as well as dyadically (Carpenter
et al. 1998). However, it is worth noting that quantitative comparisons do not
yet exist.
Focusing on the type of gestures, human children perform conventionaliza-
tions, deictics, and symbolic gestures (Bates et al. 1979). Conventionalizations are
gestures in which the signaller uses an effective behaviour for getting something
done. For instance, many infants learn to request being picked up by raising their
arms over their heads while approaching an adult. Great apes often use similar
gestures, such as a stylized arm-raise to initiate play, ritualized from actual
acts of play hitting in the context of rough-and-tumble play. Many youngsters
also conventionalize signals for asking their mother to lower her back so they can
climb on, for example a brief touch on the top of the rear end, ritualized from
occasions on which they pushed her rear end down mechanically. The learning
process involved in such cases is most likely an individual learning process called
ontogenetic ritualization, in which a communicatory signal is created by two individuals
shaping each other's behaviour in repeated instances of an interaction
(Tomasello 1996). However, it does not involve understanding of communicative
intentions or cultural (imitative) learning of any sort and therefore it does not
create a shared communicative symbol. Note also that the production of
a variety of gestures of great apes, especially species-typical gestures (e.g. the chest
beat in gorillas), seems to be due to genetic predisposition, while only the use and
response have to be learned (Pika et al. 2003).
The second type of gestures is deictics, which are designed to direct adult
attention to outside entities. This does not automatically mean that the infant is
gesturing in order to induce the adult to share attention with her on that third
entity. Indeed, many infants use arm and index finger extension to orient their
own attention to things and only understand later the function of the gesture
(Carpenter et al. 1998; Franco and Butterworth 1996). Chimpanzees interacting
with human caretakers point to request food and these gestures are clearly deictic.
In addition, the gesture food offer seems to bear some relation to deictics,
because it is triadic and distal and directs the attention of the recipient to a specific
event and object. The gesture directed scratch also resembles in some ways a
deictic gesture. However, although it clearly involves some form of reference, this
reference is self-directed.

[Footnote: Note that this is quite different from the notion of conventions, as used by e.g. Zlatev (this
volume).]

The third gesture type produced by human children, symbolic gestures, are
communicative acts that are either associated with a referent metonymically (the
gesture refers to an element or attribute of something to mean the thing itself) or
iconically (Acredolo and Goodwyn 1988; Pizzuto and Volterra 2000). Examples
include gestures such as: sniffing for a flower, panting for a dog, holding arms out
for an airplane, raising arms for big things, and blowing for hot things. Empirical-
ly we do not know whether infants learn to produce symbolic gestures via ritual-
ization or via imitation (Lock 1978), but it is much more likely that in most cases
infants are learning these symbolic gestures via imitation. That is, they are learning
by understanding an adult's communicative intention in using the gesture first
and then engaging in role reversal imitation to use the gesture themselves when
they have the same communicative intention.
Although group-specific gestures such as leaf clipping (Nishida 1980), the
grooming hand clasp (McGrew and Tutin 1978), somersault (Pika et al. 2005), and
armshake (Pika et al. 2003) used by apes differ qualitatively from symbolic ges-
tures in human children, their existence suggests that social learning based on
intersubjectivity, in the form of some kind of group-specific cultural transmission,
is at work. In addition, gestures such as directed scratches used by chimpanzees
in the wild provide evidence that (a) chimpanzees have an understanding
of the intended meaning of the gestures, and (b) signallers and receivers shape
a non-communicative signal into a communicative one with a distinct meaning.
They thus might be useful tools to reconstruct the evolution and development of
symbolic gestures.
Thus, the crucial difference between the gestures of apes and those of prelin-
guistic or just-linguistic human children becomes obvious when focusing on the
function of gestures: Apes mainly gesture for imperative purposes, while human
children gesture for imperative purposes but also quite frequently for declarative
purposes to direct the attention of others to an outside object or event, simply
for the sake of sharing interest in it or commenting on it (Bates et al. 1975; Lisz-
kowski et al. 2004). The propensity to communicate about outside entities and
situations and comment on them seems to be unique to human beings and might
have triggered the onset of symbolic communication, i.e. language. The question
thus arises: why do humans comment on outside entities to share experiences?
This behaviour is probably linked with an increased level of intersubjectivity that
enables humans to understand other people as intentional agents with whom
they may share experience (Tomasello et al. 2005). It therefore might have been
derived from the need to create a new medium for social bonding in humans: a
medium to establish and service social relationships and to share experiences. In
our closest living relatives, the bonobos and the chimpanzees, social grooming
permeates virtually every aspect of social life. Grooming might therefore represent
an ancient medium for evaluating and investing in social relationships, one that
was lost in our ancestors, who developed different means to perform this function
through vocal grooming (Dunbar 1996) and/or gestural grooming, which at a
later stage developed into linguistic communication.

5. Conclusions

The ability to employ manual and bodily gestures provides a rich source of infor-
mation about the nature of human and primate communication. Similar to the
vocal modality, human beings use some of their gestures symbolically, based on
intersubjectively learned and shared social conventions, to direct the attention of
others referentially and for declarative purposes. This mode of communication
clearly depends on the ability to be aware of others' mentalities and an understanding
of the intentional states of others, which might be unique to the human
species (Tomasello et al. 2005).
Nonhuman primates use gestures mainly for imperative purposes and in dyadic interactions.
However, many of their gestures, in contrast to their vocalizations, are
clearly learned and are used intentionally, with adjustments for the attentional
state of the recipient and means-ends dissociation. This shows that the capac-
ity of great apes for intersubjectivity, while differing from that of humans, is not
negligible. In addition, since apes in the wild seem to make use of group-specific
and referential gestures, it seems plausible that the gestural modality of our clos-
est living relatives was the modality within which symbolic communication first
evolved (Arbib 2002; Condillac 1971; Corballis 2002; Hewes 1973; see also Hutto
this volume). Future studies will hopefully shed light on potential evolutionary
mechanisms by which the vocal and gestural signals of apes transformed into the
linguistic and gestural symbols of human beings.

Acknowledgements

I am grateful to Josep Call, Katja Liebal and Michael Tomasello, who shared their
data with me. For comments on an earlier draft and lively discussions, I would
like to thank Katja Liebal, Elena Nicoladis and Chris Sinha. I am indebted to
Dorothee Classen, the artist of the drawing of directed scratch in Figure 3.

References

Acredolo, L.P. and Goodwyn, S.W. 1988. Symbolic gesturing in normal infants. Child Development
59: 450–466.
Arbib, M.A. 2002. The mirror system, imitation, and the evolution of language. In Imitation in
Animals and Artifacts. Complex Adaptive Systems, K. Dautenhahn and C.L. Nehaniv (eds.),
229–280. Cambridge, Massachusetts, USA: MIT Press.
Austin, J. L. 1962. How to Do Things with Words. New York: Oxford University Press.
Bard, K.A. 1992. Intentional behaviour and intentional communication in young free-ranging
orangutans. Child Development 63: 1186–1197.
Baron-Cohen, S. 1991. Precursors to a theory of mind: Understanding attention in others. In
Natural Theories of Mind: Evolution, Development and Simulation of Everyday Mindreading,
A. Whiten (ed.), 233–251. Oxford, UK: Blackwell.
Bates, E. 1976. Language and Context: The Acquisition of Pragmatics. New York: Academic
Press.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L. and Volterra, V. 1979. The Emergence of Sym-
bols: Cognition and Communication in Infancy. New York: Academic Press.
Bates, E., Camaioni, L. and Volterra, V. 1975. The acquisition of performatives prior to speech.
Merrill-Palmer Quarterly 21(3): 205–226.
Bratman, M.E. 1989. Intention and personal policies. In Philosophical perspectives, Philosophy
of mind and action theory, J.E. Tomberlin (ed.), 443–469. Northridge: California State
University.
Bretherton, I., McNew, S. and Beeghly-Smith, M. 1981. Early person knowledge as expressed
in gestural and verbal communication: When do infants acquire a theory of mind? In
Infant Social Cognition: Empirical and Theoretical Considerations, M.E. Lamb and
L.R. Sherrod (eds.), 333–373. Hillsdale, New Jersey: Erlbaum.
Bruner, J. 1981. Intention in the structure of action and interaction. In Advances in Infancy
Research, L. Lipsitt (ed.), Vol. 1, 41–56. Norwood, New Jersey: Ablex.
Camaioni, L. 1993. The development of intentional communication: A re-analysis. In New
Perspectives in Early Communicative Development, J. Nadel and L. Camaioni (eds.), 82–96.
London: Routledge.
Carpenter, M., Nagell, K. and Tomasello, M. 1998. Social cognition, joint attention, and
communicative competence from 9 to 15 months of age. Monographs of the Society for
Research in Child Development 36(4): 176–179.
Case, R. 1985. Intellectual Development: Birth to Adulthood. New York: Academic Press.
Cheney, D.L. and Seyfarth, R.M. 1990. How Monkeys See the World. Chicago and London: Uni-
versity of Chicago Press.
Clark, A.P. and Wrangham, R.W. 1993. Acoustic analysis of wild chimpanzee pant hoots: Do
Kibale forest chimpanzees have an acoustically distinct food arrival pant hoot? American
Journal of Primatology 31: 99–109.
Clark, A.P. and Wrangham, R.W. 1994. Chimpanzee arrival pant-hoots: Do they signify food
or status? International Journal of Primatology 15(2): 185–205.
Condillac, E.B.D. 1971. An Essay on the Origin of Human Knowledge; Being a Supplement to Mr.
Locke's Essay on the Human Understanding. A Facsimile Reproduction of the Translation of
Thomas Nugent. Gainesville, Florida: Scholars' Facsimiles and Reprints.

Corballis, M.C. 2002. From Hand to Mouth, the Origins of Language. Princeton, New Jersey:
Princeton University Press.
Crockford, C. and Boesch, C. 2003. Context-specific calls in wild chimpanzees, Pan troglodytes
verus: Analysis of barks. Animal Behaviour 66(1): 115–125.
de Waal, F.B.M. 1988. The communicative repertoire of captive bonobos (Pan paniscus)
compared to that of chimpanzees. Behaviour 106(3–4): 183–251.
Dunbar, R. 1996. Grooming, Gossip and the Evolution of Language. London: Faber and Faber
Ltd.
Fay, J.M. 1989. Hand-clapping in western lowland gorillas. Mammalia 53(3): 457–458.
Franco, F. and Butterworth, G. 1996. Pointing and social awareness: Declaring and requesting
in the second year. Journal of Child Language 23: 307–336.
Gardner, R.A. and Gardner, B. 1969. Teaching sign language to a chimpanzee. Science 165:
664–672.
Golinkoff, R. 1981. The influence of Piagetian theory on the study of the development of
communication. In New Directions in Piagetian Theory and Practice, I.E. Sigel,
D.M. Brodzinsky and R. Golinkoff (eds.), 127–142. Hillsdale, New Jersey: Erlbaum.
Gomez, J.C. 1991. Visual behaviour as a window for reading the mind of others in Primates.
In Natural Theories of Mind: Evolution, Development and Simulation of Everyday
Mindreading, A. Whiten (ed.), 195–207. Oxford: Basil Blackwell.
Goodall, J. 1986. The Chimpanzees of Gombe, Patterns of Behaviour. Cambridge, Massachusetts:
The Belknap Press of Harvard University Press.
Hewes, G.W. 1973. Primate communication and the gestural origin of language. Current
Anthropology 12(1–2): 5–24.
Hutto, D.D. this volume. First communions: Mimetic sharing without theory of mind.
Kummer, H. 1968. Social Organization of Hamadryas Baboons. Chicago: University of Chicago
Press.
Kummer, H. and Kurt, F. 1965. A comparison of social behaviour in captive and wild
hamadryas baboons. In The Baboon in Medical Research, H. Vagtborg (ed.), 1–16. Texas:
University of Texas Press.
Ladygina-Kohts, N.N. 1935. Infant Chimpanzee and Human Child. A Classic 1935 Comparative
Study of Ape Emotions and Intelligence. New York: Oxford University Press.
Leavens, D.A. 2004. Manual deixis in apes and humans. Interaction Studies 5: 387–408.
Leavens, D.A. and Hopkins, W.D. 1998. Intentional communication by chimpanzees: A
cross-sectional study of the use of referential gestures. Developmental Psychology 34: 813–822.
Leavens, D.A., Hopkins, W.D. and Bard, K.A. 1996. Indexical and referential pointing in
chimpanzees (Pan troglodytes). Journal of Comparative Psychology 110(4): 346–353.
Leavens, D.A., Hopkins, W.D. and Thomas, R.K. 2004. Referential communication by
chimpanzees (Pan troglodytes). Journal of Comparative Psychology 118(1): 48–57.
Liebal, K., Call, J. and Tomasello, M. 2004. Chimpanzee gesture sequences. Primates 64:
377–396.
Liebal, K., Pika, S., Call, J. and Tomasello, M. 2004. Great ape communicators move in front of
recipients before producing visual gestures. Interaction Studies 5(2): 199–219.
Liebal, K., Pika, S. and Tomasello, M. 2006. Gestural communication of orangutans (Pongo
pygmaeus). Gesture 6(1): 1–38.
Liebermann, P. 1968. Primate vocalizations and human linguistic ability. Journal of the
Acoustic Society of America 44: 1574–1584.

Liszkowski, U., Carpenter, M., Henning, A., Striano, T. and Tomasello, M. 2004.
Twelve-month-olds point to share attention and interest. Developmental Science 7(3): 297–307.
Lock, A. 1978. Action, Gesture and Symbol: The Emergence of Language. New York: Academic
Press.
MacKinnon, J.R. 1974. Behaviour and ecology of Orang Utans. Animal Behaviour 22: 3–74.
Maestripieri, D. 1996. Maternal encouragement of infant locomotion in pigtail macaques
(Macaca nemestrina). Animal Behaviour 51: 603–610.
Maestripieri, D. 1997. Gestural communication in macaques. Evolution of Communication
1(2): 193–222.
Maestripieri, D. 1999. Primate social organization, gestural repertoire size, and
communication dynamics. In The Origins of Language: What Nonhuman Primates Can Tell, B.J. King
(ed.), 55–77. Santa Fe: School of American Research Press.
Marler, P. 1980. Primate Vocalization: Affective or Symbolic? New York: Plenum Press.
Marler, P. and Hobbett, L. 1975. Individuality in a long-range vocalization of wild
chimpanzees. Zeitschrift für Tierpsychologie 38: 97–109.
McGrew, W.C. and Tutin, C.E.G. 1978. Evidence for a social custom in wild chimpanzees?
Man 13: 234–251.
Menzel, E. 1974. A group of young chimpanzees in a one-acre field. In Behaviour of Nonhuman
Primates, A. Schrier and F. Stollnitz (eds.), Vol. 5, 83–153. New York: Academic Press.
Menzel, E.W. 1973. Chimpanzee spatial memory organization. Science 182(4115): 943–945.
Miles, H.L. 1990. The cognitive foundations for reference in a signing orangutan. In Language
and Intelligence in Monkeys and Apes, S.T. Parker and K.R. Gibson (eds.), 511–539.
Cambridge: Cambridge University Press.
Mitani, J.C. and Brandt, K.L. 1994. Social factors influence the acoustic variability in the
long-distance calls of male chimpanzees. Ethology 96(3): 233–252.
Mitani, J.C., Hasegawa, T., Gros-Louis, J., Marler, P. and Byrne, R.W. 1992. Dialects in wild
chimpanzees? American Journal of Primatology 27(4): 233–243.
Nishida, T. 1980. The leaf-clipping display: A newly-discovered expressive gesture in wild
chimpanzees. Journal of Human Evolution 9: 117–128.
Owings, D.H. and Morton, D.S. 1998. Animal Vocal Communication: A New Approach. Cam-
bridge: Cambridge University Press.
Parnell, R.J. and Buchanan-Smith, H.M. 2001. Animal behaviour: An unusual social display by
gorillas. Nature 412: 294.
Patterson, F. 1978a. Conversations with a gorilla. National Geographic 134(4): 438–465.
Patterson, F. 1978b. Linguistic capabilities of a lowland gorilla. In Sign Language and Language
Acquisition in Man and Ape, F.C.C. Peng (ed.), 161–201. Boulder, CO: Westview Press.
Piaget, J. 1952. The Origins of Intelligence in Children. New York: Norton.
Pika, S., Liebal, K., Call, J. and Tomasello, M. 2005. The gestural communication of apes.
Gesture 5(1/2): 41–56.
Pika, S., Liebal, K. and Tomasello, M. 2003. Gestural communication in young gorillas
(Gorilla gorilla): Gestural repertoire, learning and use. American Journal of Primatology 60(3):
95–111.
Pika, S., Liebal, K. and Tomasello, M. 2005. Gestural communication in subadult bonobos (Pan
paniscus): Gestural repertoire and use. American Journal of Primatology 65(1): 39–51.
Pika, S. and Mitani, J.C. 2006. Referential gesturing in wild chimpanzees (Pan troglodytes).
Current Biology 16(6): 191–192.

Pizzuto, E. and Volterra, V. 2000. Iconicity and transparency in sign languages: A
cross-linguistic cross-cultural view. In The Signs of Language Revisited: An anthology to honor
Ursula Bellugi, K. Emmorey and H. Lane (eds.), 261–286. New York: Erlbaum.
Plooij, F.X. 1978. Some basic traits of language in wild chimpanzees? In Action, Gesture and
Symbol, A. Lock (ed.), 111–131. London: Academic Press.
Plooij, F.X. 1979. How wild chimpanzee babies trigger the onset of mother-infant play. In
Before Speech, M. Bullowa (ed.), 223–243. Cambridge: Cambridge University Press.
Plooij, F.X. 1987. Infant-ape behavioural development, the control of perception and, types of
learning and symbolism. In Symbolism and Knowledge, A. Tryphon and J. Montangero
(eds.), Vol. 8, 29–58. Geneva: Jean Piaget Archives Foundation.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2000. Toward a science of other minds:
Escaping the argument by analogy. Cognitive Science 24: 509–541.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2001. Reasoning about beliefs: A human
specialization? Child Development 72: 691–695.
Rijksen, H.D. 1978. A Field Study on Sumatran Orangutans. Wageningen: Mededelingen Land-
bouwhogeschool.
Roth, R.R. 1995. A study of gestural communication during sexual behavior in bonobo (Pan
paniscus, Schwartz). Unpublished PhD dissertation, University of Calgary, Calgary.
Rumbaugh, D.M. 1977. Language Learning by a Chimpanzee. The Lana project. New York: Aca-
demic Press.
Savage, S. and Bakeman, R. 1978. Sexual morphology and behavior in Pan paniscus. In
Proceedings of the Sixth International Congress of Primatology, 613–616. New York: Academic
Press.
Savage-Rumbaugh, E.S. 1988. A new look at ape language: Comprehension of vocal speech and
syntax. In Comparative Perspectives in Modern Psychology, Nebraska Symposium on
Motivation, D.W. Leger (ed.), Vol. 35, 201–256. Lincoln, NB: University of Nebraska Press.
Savage-Rumbaugh, E.S., McDonald, K., Sevcic, R.A., Hopkins, W.D. and Rupert, E. 1986.
Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (Pan
paniscus). Journal of Experimental Psychology: General 115: 211–235.
Savage-Rumbaugh, E.S., Murphy, J., Sevcic, R.A., Brakke, K.E., Williams, S.L. and Rumbaugh,
D.M. 1993. Language comprehension in ape and child. Monographs of the Society for
Research in Child Development 58(3–4): 1–256.
Savage-Rumbaugh, E.S., Rumbaugh, D.M. and McDonald, K. 1985. Language learning in two
species of apes. Neurosciences and Biobehavioral Review 9: 653–656.
Savage-Rumbaugh, E.S., Shanker, S.G. and Taylor, T.J. 1998. Apes, Language, and the Human
Mind. New York: Oxford University Press.
Savage-Rumbaugh, E.S., Wilkerson, B.J. and Bakeman, R. 1977. Spontaneous gestural
communication among conspecifics in the pygmy chimpanzee (Pan paniscus). In Progress in
Ape Research, G.H. Bourne (ed.), 97–116. New York: Academic Press.
Savage-Rumbaugh, S. and Wilkerson, B. 1978. Socio-sexual behavior in Pan paniscus and Pan
troglodytes: A comparative study. Journal of Human Evolution 7: 327–344.
Seyfarth, R.M. 1987. Vocal communication and its relation to language. In Primate Societies,
B. Smuts, D.L. Cheney, R. Seyfarth, R. Wrangham and T. Struhsaker (eds.),
440–451. Chicago: University of Chicago Press.
Seyfarth, R.M. and Cheney, D.L. 1997. Some general features of vocal development in
nonhuman primates. In Social Influences on Vocal Development, C. Snowdon and M. Hausberger
(eds.), 249–273. Cambridge: Cambridge University Press.

Seyfarth, R.M., Cheney, D.L. and Marler, P. 1980. Vervet monkey alarm calls: Semantic
communication in a free-ranging primate. Animal Behaviour 28: 1070–1094.
Slocombe, K.E. and Zuberbühler, K. 2005a. Agonistic screams in wild chimpanzees vary as a
function of social role. Journal of Comparative Psychology 119(1): 67–77.
Slocombe, K.E. and Zuberbühler, K. 2005b. Functionally referential communication in a
chimpanzee. Current Biology 15: 1779–1784.
Snowdon, C. 1988. A comparative approach to vocal communication. In Comparative
Perspectives in Modern Psychology, Nebraska Symposium on Motivation, D.L. Leger (ed.), 145–199.
Lincoln: University of Nebraska Press.
Snowdon, C.T., Brown, C.H. and Petersen, M.R. 1982. Primate Communication. Cambridge:
Cambridge University Press.
Tanner, J.E. 1998. Gestural communication in a group of zoo-living lowland gorillas. Unpub-
lished PhD, University of St. Andrews, St. Andrews.
Tanner, J.E. and Byrne, R. 1996. Representation of action through iconic gesture in a captive
lowland gorilla. Current Anthropology 37(1): 162–173.
Tanner, J.E. and Byrne, R. 1999. The development of spontaneous gestural communication
in a group of zoo-living lowland gorillas. In The Mentalities of Gorillas and Orangutans,
Comparative Perspectives, S.T. Parker, R.W. Mitchell and H.L. Miles (eds.), 211–239.
Cambridge: Cambridge University Press.
Tanner, J.E. and Byrne, W.B. 1993. Concealing facial evidence of mood: Perspective-taking in
a captive Gorilla. Primates 34(4): 451–457.
Tomasello, M. 1995. Joint attention as social cognition. In Joint Attention: Its Origin and Role in
Development, C. Moore and P.J. Dunham (eds.), 103–130. Hillsdale, New Jersey: Erlbaum.
Tomasello, M. 1996. Do apes ape? In Social Learning in Animals: The Roots of Culture,
C.M. Heyes and B.G. Galef Jr. (eds.), 319–346. San Diego: Academic Press, Inc.
Tomasello, M. 1999. Emulation learning and cultural learning. Behavioral and Brain Sciences
21: 703–704.
Tomasello, M. 2003. Constructing a Language, a Usage-based Theory of Language Acquisition.
Cambridge, Massachusetts, and London, England: Harvard University Press.
Tomasello, M. and Call, J. 1997. Primate Cognition. New York: Oxford University Press.
Tomasello, M. and Call, J. 2007. The Gestural Communication of Monkeys and Apes. Mahwah,
New Jersey: Lawrence Erlbaum Associates.
Tomasello, M., Call, J., Nagell, K., Olguin, R. and Carpenter, M. 1994. The learning and use
of gestural signals by young chimpanzees: A trans-generational study. Primates 35(2):
137–154.
Tomasello, M., Call, J., Warren, J., Frost, T., Carpenter, M. and Nagell, K. 1997. The ontogeny
of chimpanzee gestural signals. In Evolution of Communication, S. Wilcox, B. King and
L. Steels (eds.), 224–259. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Tomasello, M., Carpenter, M., Call, J., Behne, T. and Moll, H. 2005. Understanding and sharing
intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 1–17.
Tomasello, M., George, B.L., Kruger, A.C., Farrar, M.J. and Evans, A. 1985. The development
of gestural communication in young chimpanzees. Journal of Human Evolution 14:
175–186.
Tomasello, M., Gust, D. and Frost, G.T. 1989. A longitudinal investigation of gestural
communication in young chimpanzees. Primates 30(1): 35–50.
Uhlenbroek, C. 1996. The structure and function of the long-distance calls given by male
chimpanzees in Gombe National Park. University of Bristol, Bristol.

Van Hooff, J.A.R.A.M. 1973. A structural analysis of the social behaviour of a semi-captive
group of chimpanzees. In Social Communication and Movement, Studies of Interaction and
Expression in Man and Chimpanzee, M. von Cranach and I. Vine (eds.), 75–162. London
& New York: Academic Press.
Van Lawick-Goodall, J. 1968. A preliminary report on expressive movements and
communication in the Gombe stream chimpanzees. In Primates. Studies in Adaptation and
Variability, P.C. Jay (ed.), 313–374. New York: Holt, Rinehart, and Winston.
Vea, J.J. and Sabater-Pi, J. 1998. Spontaneous pointing behaviour in the wild pygmy
chimpanzee (Pan paniscus). Folia Primatologica 69(5): 289–290.
von Glaserfeld, E. 1974. Signs, communication, and language. Journal of Human Evolution 3:
464–474.
von Glaserfeld, E. 1976. The development of language as purposive behavior. Annals of the
New York Academy of Sciences 280: 212–226.
Woodruff, G. and Premack, D. 1979. Intentional communication in the chimpanzee: The
development of deception. Cognition 7: 333–352.
Zuberbühler, K. 2000a. Interspecific semantic communication in two forest monkeys.
Proceedings of the Royal Society 267: 713–718.
Zuberbühler, K. 2000b. Referential labelling in Diana monkeys. Animal Behaviour 59(5):
917–927.
Zuberbühler, K. 2001. Predator-specific alarm calls in Campbell's monkeys, Cercopithecus
campbelli. Behavioral Ecology and Sociobiology 50(5): 414–422.
Zuberbühler, K. 2002. A syntactic rule in forest monkey communication. Animal Behaviour
63: 293–299.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 9

The heterochronic origins of explicit reference

David A. Leavens, William D. Hopkins and Kim A. Bard

Explicit reference is the communicative capacity to intentionally pick out a
specific object in the environment and make that object a manifest topic for
shared attention. Pointing is the quintessential example of non-verbal, explicit
reference. Chimpanzees, and other apes in captivity, spontaneously point
without overt training. Because wild apes almost never point, and because both
captive and wild apes are sampled from the same gene pool, this implies that,
for apes, hominoid genes interact with certain environments to elicit pointing.
We propose that changes in the patterns of hominid development interact with
ape-like cognitive capacities to produce features of explicit reference in human
infants, a capacity that emerges in our nearest living relatives when they experi-
ence similar circumstances.

1. Introduction

Despite a number of claims to the contrary (e.g., Butterworth and Grover 1988;
Petitto 1988; Povinelli, Bering and Giambrone 2003a), pointing is frequently displayed
by captive apes (e.g., Leavens 2004; Leavens and Hopkins 1998, 1999). Captive
apes usually point in apparent requests for delivery of food, but they will also
point out the location of tools required to gain access to food (Call and Tomasello
1994; Russell et al. 2005; Whiten 2000). One chimpanzee, Clint, often pointed to
experimenters' shoes, which he subsequently manipulated with apparent satisfaction
upon presentation of said shoes (Leavens, Hopkins and Bard 1996). Pointing
is, manifestly, a referential act, directing the attention, the movements or the actions
of an observer to a specific locus.
It has been argued that nonhumans both do not and cannot point because
the psychological basis for their pointing-like behaviour differs from that of hu-
mans who point (e.g. Povinelli et al. 2003a; Tomasello 2006). We believe that the
psychological aspects of pointing and other communicative acts are distributed
between signaler and receiver, and that the psychological basis of pointing cannot
be correctly attributed to an individual, but only to that individual together with any
and all observers, who jointly form a communicative system. Thus, any appeal to unseen
motivations or psychological bases of pointing individuals, specifically (and
communicating entities, more generally), is as much an attribute of the psychology of the
claimant as it is an attribute of the signaler (e.g. Johnson 2001).
It is common contemporary practice to analyze patterns of communicative
behaviour for evidence of hidden mental processes. For example, if a human child
points to an object and alternates her gaze between that distant object and a social
partner, this is interpreted to mean that the child is attempting to manipulate the
mind of her communicative partner, and is considered by many researchers to be
evidence that the child must, therefore, have a conception of others as mental
beings (e.g. Baron-Cohen 1999; Tomasello 1995). Because, for many years, it was
often erroneously stated that apes did not point to distant entities whilst alternat-
ing their gaze between these distant entities and their communicative partners
(e.g. Butterworth and Grover 1988; Petitto 1988), this was taken as evidence that
apes do not have conceptions of others as mental beings. In fact, as we will argue
below, pointing to distant entities with gaze alternation between those entities
and an observer is as ambiguous with respect to the signaler's conceptions
when that signaler is a human child as it is when an ape does the same thing. We
will describe the human transition to intentional communication and describe a
number of behavioural similarities between apes and young human children in
their communicative signaling. Finally, we will outline an evolutionary scenario
that might account for the near-ubiquity of pointing in human and captive ape
populations, and the relative paucity of apparent pointing in wild apes. We turn
first to a brief consideration of how communication and cognition are instantiated
in living systems.

2. Communication is distributed, cognition is communicative

Communication is an interaction between transducing elements. Transduction
is the codification of energy into information. Neither broadcast nor receipt of
information constitutes a communicative act; communication is a distributed
phenomenon, distributed across at least two transducing elements. Thus, cells of
the same or different tissue composition within a body may communicate and
organisms may communicate (transfer information) with organisms of the same
or different species. If caught in an avalanche, the boulder that pins our leg may
influence us, but it does not communicate with us, nor do we communicate with
that boulder when we push it off our leg; communication requires at least two
entities with transductive boundaries. As Gregory Bateson frequently noted (e.g.
1972a:315), communication is about "a difference that makes a difference".

Because communication is distributed across boundaries of transduction,
because networks of transduction exist at all levels of living systems, from cells
to organisms and societies, and because cognition (the discrimination and use
of information) is hence an inherently communicative act, it is a category
error to interpret communicative behaviour as an index to unseen cognitive
processes. Because cognition implies communication (between neurons, between
aggregates of neurons, between individuals), it is also a category error
to attribute cognitive processes to an individual element: cognition is a
manifestation of communicative processes; i.e., information is distributed across
at least two transductive boundaries. It might be argued that networks of
communication between neurons comprise functional cognitive systems, or modules,
that are properties of individual brains, but because of the distributed nature of
cognitive processes, where no cognitive activity can develop or be manifested in
a sensorimotor vacuum, all cognitive activity is co-constituted by organisms plus
the physical consequences of action and sensation (e.g. Barrett and Henzi 2005;
Bateson 1972b; Brinck 2007, this volume; Johnson 2001). We know, as an empirical
fact, that organisms are not material objects with clear boundaries, or as
William Bateson put it: "We commonly think of animals and plants as matter, but
they are really systems through which matter is continually passing" (1906, in
C.B. Bateson 1928:209). The same general principle is true for communication
and its special case: cognition. We are systems through which ideas (bits of infor-
mation) are continually passing. The distributed nature of cognition across the
transductive boundary of an individual can be masked by the sometimes deferred
nature of environmental input and effects of action, due to memory processes
influencing ongoing activity. But the fact that organisms can re-represent (as it
were) environmental events only means that experience (or learning history) is
important in understanding cognition; it does not mean that cognition can be
isolated inside the skull. Gregory Bateson (1972c) expressed it this way:
A priori it can be argued that all perception and all response, all behavior and all
classes of behavior, all learning and all genetics, all neurophysiology and
endocrinology, all organization and all evolution – one entire subject matter – must be
regarded as communicational in nature, and therefore subject to the great
generalizations or laws which apply to communicative phenomena.
(1972c:282–283)

In short, living systems are open systems at all levels of analysis. There are sub-
stantive implications for psychology, generally, and comparative psychology, in
particular, of the multiscalar dependence of the systems we study, of which we
wish to briefly note, here, the widespread dualistic assumption that individual
brains constitute the loci for computations, the products of which then cause
overt behaviour. We note that the communicative phenomena we discuss, here
(pointing and its accompaniments), are interactive, distributed phenomena that
are usually not manifest except in particular social and physical contexts (e.g.
Leavens et al. 1996; Leavens, Hopkins & Bard 2005a), which we describe below.
As Johnson (2001) argued, social cognition is manifest in communicative inter-
action; this suggests that media (contexts), modes, and both individual and rela-
tionship histories are all vital components of particular communicative episodes.
Communication (construed as manifest cognition, rather than an index to hid-
den mental processes) is simultaneously a preface and a denouement. Because
cognition has been historically defined as the mental processes of an individual,
there is a widespread contemporary misconception that cognition is a property
of individuals. It is not and cannot be, and therefore a revolution in our tradi-
tional approaches to the acquisition, storage, retrieval and use of information
is overdue (see e.g. Barrett and Henzi 2005; Bateson 1972c; King 2004; Shanker
and King 2002). In what follows, we will occasionally write of individuals making
discriminations and displaying evidence for having certain concepts; this is
shorthand for describing what organisms do in particular social, cultural, his-
torical, and experimental contexts.

3. Intentional communication and intersubjectivity

When we speak of intentional communication, we are specifying a sub-class of
communication that is manifest in its flexible accommodation to the behavioural
state of a social partner (e.g. Bard 1992; Bates, Camaioni and Volterra 1975; Leav-
ens, Russell and Hopkins 2005b; Pika et al. 2005a; Sinha 2004; Sugarman 1984;
Tomasello et al. 1994). Implicit within this definition is the idea that an organism
who is intentionally communicating can perceive and respond to the independent
agency of others and therefore intentional communication is manifest at higher
than sub-organismal levels (e.g. Trevarthen 1998). There is a contemporary intel-
lectual fashion towards re-defining intentional communication so as to limit it
only to organisms who have concepts of others as mental agents (as contrasted
with concepts of others as behavioural agents; e.g. Baron-Cohen 1999; cf. Povinel-
li, Bering and Giambrone 2000), but because we can perceive or attribute mental-
ity only through manifest behaviour (see e.g. Brinck 2001; Mitchell 2000; Racine
2005), therefore all putative instances of mental state attribution reduce to either
(a) behavioural analysis or (b) mental illness. Suppose we tell you that there is
an organism and ask you, "What does this organism desire/intend/believe, at this
moment?" Is any statement about that organism's intentional or epistemic status
rational in the absence of any more information than that the organism exists?
Obviously not. We interpret this inescapable opacity of mental states to indicate
simply that whatever publicly available information an observer uses to make
attributions of mental states must partially constitute (and therefore define) those
mental states. The fact that people, the world over, attribute complex motives and
beliefs to their pets, other animals, other people, and mythical entities does not
constitute evidence for the independent existence of these complex motives and
beliefs, which are not available to the senses. Thus, no organism actually
discriminates or attributes mentality to other organisms on a purely empirical or
inductive basis; people learn, for example, to characterize behaviour of others in
symbolic terms that are, by their very nature, distributed within a language-using
community. This is not to say that organisms do not perceive regularities in the
behaviour of others, nor is it to say that many behavioural regularities are not
publicly available; to interpret behaviour in terms of hypothetical constructs such as
belief or the notion that mental states are distinct from and, somehow, cause
behaviour is to make a commitment to a currently fashionable, dualistic model of
mental functioning that is historically situated in Western philosophy and its
narrative structures (e.g., Gallagher and Hutto this volume; Susswein and Racine this
volume). Commonsense models like these are acquired from our cultures, not from
inductive observation (cf. Mitchell 2000).

[Footnote] Are we really trying to assert that only crazy people attribute mental
states? No. What we are asserting is that there is no essential difference between
appeals to mental states as causes of behaviour and appeals to, for example, demons
as causes of behaviour (e.g. "The Devil made me do it"; see also Mitchell 2000;
Susswein and Racine this volume; Thompson 1994). We cannot see, hear, smell, taste,
feel, or take photographs, spectrograms, temperatures, weights, volumes or any other
measure of either demons or mental states. To attribute behavioural phenomena to
either demonic influences or mental state influences is, in both cases, a culturally
situated manner of speaking. It is no more or less crazy to appeal to mental states
than it is to appeal to demonic possession in describing what organisms do, depending
upon the cultural precepts of the individual attempting to account for the behaviour
of others. What definitely is irrational is to make these appeals to unseen entities
in the complete absence of any behavioural information whatsoever. Thus, if cognitive
scientists wish to use concepts like epistemic states in models of psychological
processes, then it is incumbent upon them to supply definitions of these hypothetical
constructs in measurable terms.
[Footnote] Are we advocating methodological behaviourism? Yes. The following
quotation from the recent obituary of Gregory A. Kimble describes his position, with
which we are in strong agreement:

 . . . Kimble believed that the so-called cognitive revolution had not in an Oedipal
 frenzy slain behaviorism, as was proclaimed by some cognitively oriented
 psychologists late in the 20th century. Instead, Kimble argued, cognitive psychology
 had not at all avoided the behaviorist requirement that intervening concepts and
 dependent variables be anchored to observables, in other words, to responses of some
 kind be they overt muscular movements, verbal responses, or electrophysiological
 readings. Cognitive psychology, thus, could not escape its behavioristic roots.
 (Boneau and Wertheimer 2006: 632)

Although none of us would describe ourselves as philosophical behaviourists, we are
unanimous in believing that all essential theoretical concepts in cognitive science
must rest on patterns of publicly observable behaviour, as broadly defined in this
quotation. From a methodological standpoint, cognitive scientists have not escaped the
same rigorous requirements for grounding their hypothetical processes in observable
behaviour under which behaviourists operate (cf. MacCorquodale and Meehl 1948: esp.
105–106). In short, we believe that scientifically useful concepts can be
operationalized, at least in principle (see e.g. Brinck in press). Finally, we note
that philosophers, theologians and other scholars are not necessarily subject to the
same narrow requirements for public availability of core theoretical concepts to which
scientists are required to adhere; we would not like to be construed as implying that
only scientific endeavour is worthwhile.

The concept of intersubjectivity was predicated on Trevarthen's (e.g. 1977)
observation that even very young babies act differently towards inert objects
and animate agents, implying that the discrimination of independent agency
occurs within two months of birth. According to Trevarthen, then, babies arrive
more-or-less equipped to engage their worlds with two different motivations: a
praxic mode for interaction with objects and a communicative mode for interaction
with agents. In primary intersubjectivity (roughly 2–5 months of age), babies
share emotional attitudes with their social partners, whereas in secondary
intersubjectivity (after about 9 months of age) babies share emotional attitudes
about events, objects and circumstances external to the dyad. In concrete terms,
then, intentional communication is defined by the display of publicly observable
behaviour which accommodates to the publicly observable behavioural correlates
of intersubjective propensity (expressed emotion, orientation of gaze, etc.).
If intentional communication requires the concept of independent agency and
if even very young babies manifest this concept as young as 2 months, then why
do so many researchers speak of the human developmental transition to intentional
communication much later, at 9 months of age? That young babies clearly
do discriminate states of engagement in their social partners is demonstrated by
the still-face procedure (e.g., Adamson and Frick 2003, for review); babies react to
sudden lapses in engagement by their mothers in dyadic contexts and they make
bids to re-engage with one or both parents in triadic contexts, long before the
traditional transition to intentional communication (see also Fivaz-Depeursinge
and Corboz-Warnery 1999; Reddy 2003).
In contrast, most researchers evoke certain novel behavioural capacities that
typically emerge at about 9 months of age as definitive of the transition to inten-
tional communication, including pointing and use of other manual gestures to
manipulate others to act on the world (e.g., Bates, Camaioni and Volterra 1975;
Butterworth 2001, 2003) and, crucially, certain concomitants of manual gestures,
such as visual monitoring of the social partner in explicitly triadic contexts (e.g.,
Bates et al. 1977; Franco and Butterworth 1996; Tomasello 1995). Another
behavioural capacity that emerges near the end of the first year of life is the ability
to follow another's gaze or pointing gestures to increasingly specific loci in the
environment (e.g. Butterworth and Grover 1988; Lock 2001).
Thus, the currently mainstream view of the transition to intentional
communication can be characterized as a focus on the dawning of attentionality:
the capacity to monitor, capture, and redirect the attention of a social partner.
Because the discrimination or attribution of intentional behaviour and attentional
behaviour are both predicated on (a) the specific interactive histories of the
organisms involved (the level of trust, the frequency of interaction, the amount of joy,
etc.), (b) the ongoing motivational states of the interactants, (c) spatial relations
obtaining between the interactants, and (d) specific manifestations of contex-
tual markers (i.e., proxemic, behavioural, and physical correlates of routines; cf.
Savage-Rumbaugh 1991), then the central difference between early intentional
communication (in the first year of life) and late attentional communication
(near the end of the first year of life and continuing into the second year) is the
development of the capacity to integrate actions on objects with communicative
acts directed toward people; this is the advent of Piaget's sensorimotor stage IV,
or coordinated secondary circular reactions (cf. Sugarman 1984), or secondary
intersubjectivity (Trevarthen and Hubley 1978).
Whether or not one cares to argue that intentionality characterizes babies'
communication throughout the first year of life or that intentionality dawns later
with the advent of triadic use and responses to deictic gestures, virtually all re-
searchers agree that there is a developmental elaboration of communicative be-
haviours in humans, near the end of the first year of life. Despite much debate over
whether this pattern is better characterized as a primary discontinuity in cogni-
tive development (e.g. Baron-Cohen 1995; Lock 2001) or the product of continu-
ous processes manifest in a developing organism of maturing motoric capabilities
(cf. Moore and Corkum 1994; Reddy 2001, 2003), it is empirically true that, in
many cultures, at the end of the 20th Century, babies begin to point to distant
events, agents, and objects with gaze alternation between these elements and their
social partners by about one year of age (Bates et al. 1975, 1977; Blake, O'Rourke
and Borzellino 1994; Franco and Butterworth 1996; Leung and Rheingold 1981;
see Butterworth 2001 and Lock 2001, for reviews). In these populations, during
the second year of life, additional changes occur in how babies deploy their man-
ual gestures and visual orienting behaviour: they become sensitive to whether
their social partners are attending to themselves or to distant loci; in other words,
there is a well-documented behavioural transition characterized by sensitivity to
the behavioural correlates of visual attention in others, or attentionality (e.g.
Bakeman and Adamson 1986; Franco and Gagliano 2001; O'Neill 1996). By 18
months of age, human babies in the cultures studied, to date, exhibit a robust
capability to monitor, capture, and direct the attention of their social partners,
through pointing; this is the capacity for explicit reference. In accordance with
the introductory remarks, our position is that reporting on the capacity to moni-
tor, capture, and direct attention is to describe typical behavioural development
in particular cultural contexts; moreover, organisms can exhibit these capabilities
in the absence of any explicit theory of mental functioning (see e.g. Brinck 2003;
Doherty 2006; Mitchell 2000).

4. The phylogeny of explicit reference

4.1 Pointing

Having briefly summarized the ontogeny, or development of explicit reference,
we turn now to the phylogeny, or evolutionary history of this capacity.
Humankind's nearest living relatives are the African great apes: gorillas (Gorilla gorilla
gorilla), chimpanzees (Pan troglodytes), and bonobos (Pan paniscus). We shared
a common ancestor approximately seven million years ago (e.g. Hacia 2001).
Orangutans (Pongo pygmaeus) are Asian apes with which we shared a common
ancestor approximately 15 million years ago (Hacia 2001). Apes and humans are
a group of close relatives that are relatively distantly related to monkeys: monkeys
and apes shared a common ancestor approximately 30 million years ago (Steiper,
Young and Sukarna 2004). Thus, humans and chimpanzees have approximately
23 million years of shared evolutionary history between the time at which we
shared a common ancestor with monkeys and the time at which humans and
chimpanzees diverged. To put this another way, the lineage of the last common
ancestor of chimpanzees and humans existed for more than 75% of the length of
the modern human lineage since the split with monkeys. In general, any strong
claim that a species-specific cognitive or behavioural capacity (of which speech is
the most salient example) evolved de novo in humans must therefore also claim
(a) that the selective contexts in which these traits appeared are strictly limited to
recent times (from slightly before the Miocene/Pliocene boundary to the pres-
ent), (b) that the selective contexts pre-dating the Miocene/Pliocene boundary
are irrelevant to understanding both the evolution and the development of those
traits, and therefore (c) the study of our nearest living relatives, the Asian and Af-
rican great apes, will not produce data relevant to understanding the development
in humans of the traits in question.
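The arithmetic behind the figures of 23 million years and "more than 75%" can be checked with a short sketch. It uses only the two divergence estimates quoted above (about 30 and about 7 million years ago); the function and variable names are illustrative assumptions, not part of the original analysis.

```python
# Divergence estimates quoted in the text, in millions of years ago (approximate).
APE_MONKEY_SPLIT_MYA = 30.0   # last common ancestor of apes and monkeys
HUMAN_CHIMP_SPLIT_MYA = 7.0   # last common ancestor of humans and chimpanzees


def shared_lineage(older_split_mya, recent_split_mya):
    """Return (shared duration, shared fraction) for a lineage.

    The shared duration is the time between the older and the more recent
    split; the fraction is taken relative to the whole period since the
    older split.
    """
    shared = older_split_mya - recent_split_mya
    fraction = shared / older_split_mya
    return shared, fraction


shared_my, shared_fraction = shared_lineage(APE_MONKEY_SPLIT_MYA, HUMAN_CHIMP_SPLIT_MYA)
print(f"{shared_my:.0f} million years shared, {shared_fraction:.1%} of the lineage")
```

With these inputs the shared portion is 23 of 30 million years, or roughly 77%, which is consistent with the claim of more than 75%.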
To date, there is only one published report of unambiguous pointing by any
wild ape, a bonobo (Veà and Sabater-Pi 1998). In this episode, a male bonobo
pointed repeatedly (with outstretched arm and ring and index fingers; pre-
sumably the 2nd and 4th rays) towards the location of several human observ-
ers who were partially hidden behind some shrubbery, whilst looking back-and-
forth between these observers and the rest of his troop. Wild apes do extend their
hands towards each other, in various contexts (e.g. van Lawick-Goodall 1968),
but it is the triadic use of the outstretched arm and hand that seems to be exceed-
ingly rare. Recently, Pika and Mitani (2006) reported that wild chimpanzees use
a characteristic, directed scratching behaviour to elicit grooming at that part of
the body from their grooming partners. Thus, although manual pointing by wild
apes appears to be extremely rare, the report by Pika and Mitani (2006) suggests
that the capacity for explicit reference may be expressed more commonly through
different kinds of behaviours.
In strong contrast to the rarity of pointing by wild apes, captive apes com-
monly and spontaneously point in the complete absence of overt training (e.g.
Call and Tomasello 1994; Krause and Fouts 1997; Leavens and Hopkins 1998;
Leavens et al. 2005a, b; Leavens, Hopkins, and Thomas 2004a; Menzel 1999;
Miles 1990; de Waal 1982; Whiten 2000; reviewed by Leavens 2004; Leavens and
Hopkins 1999; de Waal 2001). By far the most common context in which captive
apes use pointing gestures is one in which they point to desirable, but unreachable
food, in the presence of a human observer.
A brief digression is warranted about what constitutes pointing behaviour.
When human infants point with their whole hands (outstretched arms and most
or all fingers extended), researchers have long termed this reaching (Blake et
al. 1994; Leung and Rheingold 1981; Murphy and Messer 1977; see Leavens and
Hopkins 1999, for review). The term reach has two primary meanings: (a) to
attempt to grasp something and (b) to extend the arm and hand. Neither mean-
ing captures the communicative significance of these gestures, which has been
noted by many infancy researchers (e.g., Murphy and Messer 1977; Leung and
Rheingold 1981). Franco and Butterworth (e.g. 1996) have employed the term
indicate or indicative for these whole-handed gestures, but this implies the
same function as the term pointing. For these reasons, we refer to these as
whole-hand points. (Recently, at a public science lecture, a member of the
audience asked, sardonically, if we stayed up nights developing this terminology.) In
this usage, we join, for example, Kendon and Versante (2003), Haviland (2003),
and Wilkins (2003) in recognizing that adult humans in diverse cultures do, in-
deed, point with their whole hands in naturalistic contexts (see photographs and
drawings in these sources). Pointing has an attention-directing function, and
people can and do point with their eyes, with their lips (Enfield 2001; Wilkins
2003), with their whole hands (Wilkins 2003), and with their index fingers.
Some researchers argue that pointing with the index finger has a special status
as a human species-specific biological adaptation for definite reference (see, esp.,
Butterworth 2003), implying that the gesture is derived from our species-specific
adaptations for language and speech, but because (a) apes point with their index
fingers (see below) and (b) some humans do not point with their index fingers
(Wilkins 2003), we believe that both the claim for the species-specificity of point-
ing and the alleged adaptive derivation of the gesture from adaptations for speech
are challenged, at our present state of knowledge.
Most captive apes who point, point with their whole hands (Call and Tomasello
1994; Leavens and Hopkins 1998, 1999; Leavens et al. 2004a; de Waal 1982). That
these are communicative signals and not attempts to reach for obviously unreach-
able food is demonstrated by the necessity of an audience for the display of these
gestures (Call and Tomasello 1994; Hostetter, Cantero and Hopkins 2001; Leavens
et al. 1996; Leavens et al. 2004a; see Table 1). Thus, apes in captivity do not point
to obviously unreachable items, with either their index fingers or with all fingers
extended, in the absence of a human observer; these are communicative signals,
not abbreviated reaches for obviously unreachable food.

Table 1. Apes require an audience to display points and other manual gestures in the
presence of unreachable food: Summary of experimental studies.

Study                                     Species        N (subjects)   % Presence(a)
Call and Tomasello (1994)                 Orangutans           2            95(b)
Leavens, Hopkins, and Bard (1996)         Chimpanzees          3            99
Hostetter, Cantero, and Hopkins (2001)    Chimpanzees         49            97
Leavens, Hopkins, and Thomas (2004a)      Chimpanzees
  Visible Banana Condition                                   101            98
  Hidden Banana Condition                                    101            98
  Experiment 2                                                35           100

Notes. (a) This is the percent of trials in which subjects pointed (Call and Tomasello
1994), percent of gestures, some of which were points (Leavens et al. 1996; Hostetter
et al. 2001), or percent of subjects who gestured, including those who pointed
(Leavens et al. 2004a) in the presence, as compared to the absence of human observers.
(b) In 48 of the total of 96 experimental trials in this study, the human observer,
although present, either had his eyes closed or was facing away from the subjects.
However, there is considerable diversity in the preferred form of pointing
across captive groups of apes: language-trained apes point overwhelmingly with
their index fingers (Figure 1). As Figure 1 shows, language-trained chimpanzees
point preferentially with their index fingers. Similar observations were reported for
two orangutans, Puti (not language-trained) and Chantek (language-trained) by
Call and Tomasello (1994). There are numerous observations of language-trained
apes that support the generalization that they point primarily, or at least very fre-
quently, with their index fingers (Bodamer and Gardner 2002; Call and Tomasello
1994; Krause and Fouts 1997; Menzel 1999; Miles 1990; Savage-Rumbaugh 1986;
Whiten 2000). It is not clear which aspects of language-training result in this
group difference between different populations of captive apes, but it is clear that
apes who have more prolonged and direct interactions with humans point more
frequently with their index fingers than do apes who are raised with less human
contact. Thus, for captive apes, the preferred form of pointing is attributable to
differential environmental influences on communicative development; the form
of pointing is attributable to epigenetic (i.e., other than exclusively genetic) pro-
cesses (see Leavens 2004 and Leavens et al. 2005b, for elaboration of this specific
point, and Sinha 2004, for more general considerations of epigenetic effects on
human communicative development).

Figure 1. "Indexicality index" of pointing in captive chimpanzees: Differences in
percentages of index-finger to whole-hand extensions to objects distal to both
chimpanzees and their human observers. Negative numbers reflect a majority of
whole-hand extensions, whereas positive numbers reflect a preponderance of index-finger
extensions. Sources: Language-naive chimpanzees: Leavens et al. 1996, and a re-analysis
of data reported in Leavens and Hopkins 1998; Language-trained chimpanzees: data from
Krause and Fouts 1997, Experiments 1 and 2 combined. Figure and caption adapted from
Leavens and Hopkins (1999, their Figure 1).
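
As a minimal sketch of how an index of this kind could be computed, the following assumes that the two percentages are taken over the combined total of index-finger and whole-hand extensions; the caption does not state the denominator, so this is an assumption. The function name and the counts in the usage lines are hypothetical and are not data from the studies cited.

```python
def indexicality_index(index_finger_points, whole_hand_points):
    """Percent index-finger extensions minus percent whole-hand extensions.

    Assumes both percentages share one denominator (the sum of the two
    gesture types). Negative values mean mostly whole-hand extensions,
    positive values mean mostly index-finger extensions.
    """
    total = index_finger_points + whole_hand_points
    if total == 0:
        raise ValueError("no pointing gestures recorded")
    pct_index = 100.0 * index_finger_points / total
    pct_whole = 100.0 * whole_hand_points / total
    return pct_index - pct_whole


# Hypothetical counts, for illustration only.
print(indexicality_index(10, 30))  # -50.0: whole-hand extensions predominate
print(indexicality_index(45, 5))   # 80.0: index-finger extensions predominate
```

Under this assumption the index ranges from -100 (only whole-hand extensions) to +100 (only index-finger extensions), which matches the sign convention described in the caption.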

4.2 Gaze alternation

In humans, alternation of gaze while gesturing in triadic contexts is a defining
behavioural criterion for the development of mature intentional communication in
the second year of life (e.g. Bates et al. 1975, 1977; Franco and Butterworth 1996;
Leung and Rheingold 1981). That human infants do alternate their gaze between
distant entities and their human caregivers while pointing is well-established, but
what it might signify for the infants, themselves, is really quite unclear. A rich
interpretation of this behaviour was offered by Tomasello (1995) who argued that
gaze alternation while pointing signifies that the child understands that the adult
is a separate person who has intentions and attention that may differ from its
own (p. 109), largely because he claims that babies direct affectively laden facial
expressions toward adults, but not the objects to which they point. Empirically,
this turns out not to be the case: babies frequently smile while pointing to dis-
tant entities before they turn to look at their social partners (e.g. Jones and Hong
2001; Leavens and Todd unpublished raw data). In an earlier paper (Leavens et
al. 1996), we also suggested that this gaze alternation implied an awareness by
the signaler that the recipient of the gesture had a distinct visual perspective, in
accordance with Tomasello's (1995) claim. Since that time, we have rejected this
interpretation (e.g. Leavens 2004). A more cautious interpretation of this visual
orienting behaviour is that babies are monitoring the effect of their gestures, im-
plying that they have some expectations that their communicative bids will have
social or instrumental consequences.
In our studies of captive apes, housed in a research facility, we find that be-
tween 85% and 100% of individuals who gesture in the context of unreachable food
also display gaze alternation between the food and the social partner (Table 2).
As Table 2 illustrates, it takes human infants two years to reach the same levels of
accompanying gaze alternation displayed by chimpanzees; unfortunately, there
are no relevant data, to our knowledge, on the early development of gaze
alternation in chimpanzees. Even so, it is clear that chimpanzees raised in what may
be considered impoverished conditions nevertheless acquire this pattern of
visual orientation in the absence of any explicit training to do so, just as people do.
Also like people, chimpanzees in captivity monitor facial expressions of their hu-
man caregivers in social referencing contexts (Russell, Bard and Adamson 1997),
thus alternating their gaze between social partners and distant entities in both
information-providing and information-seeking contexts.

Table 2. Comparison between chimpanzees and human infants of the percent
of subjects displaying gestures accompanied by gaze alternation.

Species       Study(a)                                     N (Subjects)   % Subjects w/GA(b)
Chimpanzees   Leavens and Hopkins (1998)                        78              87
              Leavens et al. (2004a)
                Visible Banana Condition                        76              86
                Hidden Banana Condition                         73              85
                Experiment 2                                    11              91
              Leavens et al. (2005b)
                Predelivery: Banana                             22              91
                Predelivery: Half-Banana                        20             100
                Predelivery: Chow                               24              92
Humans        Bates et al. (1977)
                9.5 Months                                      25               0
                10.5 Months                                     25              28
                11.5 Months                                     25              36
                12.5 Months                                     25              56
              Lempers (1979)
                9.0 Months                                      36               8
                12.0 Months                                     36               8
                14.0 Months                                     36              64
              Desrochers, Morissette, and Ricard (1995)
                9.0 Months                                      25               0
                12.0 Months                                     25              13
                15.0 Months                                     25              54
                18.0 Months                                     25              79
                24.0 Months                                     25             100

Notes. (a) All chimpanzees sampled from the same population at the Yerkes National
Primate Research Center, Atlanta, Georgia, U.S.A., and aged between 3 and 56 years. All
human gestures are index-finger points. Methodological differences in the assessment of
visual orienting behaviour between studies of apes and humans render these comparisons
more qualitative than quantitative. Specifically, gaze alternation in apes was defined
as looks to an experimenter during an observational interval that varied substantially
between subjects and studies (typically, these observation intervals ranged between 1 s
and 60 s), whereas human visual orienting was typically defined as looks to caregivers
from 1 s before point onset to 1 s after point termination.
(b) "Subjects w/GA" means percent of subjects gesturing with gaze alternation.
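
The scoring window described in note (a) for the human studies, and the percentage reported in the last column, can be sketched as follows. This is an illustration under stated assumptions, not the coding scheme of any study cited: the function names are hypothetical, the timestamps are invented, and treating the window boundaries as inclusive is an assumption.

```python
def look_within_window(look_times, point_onset, point_offset, margin=1.0):
    """True if any look to the caregiver falls inside the scoring window.

    The window runs from `margin` seconds before point onset to `margin`
    seconds after point termination (1 s in note (a)). Boundaries are
    treated as inclusive, which is an assumption.
    """
    window_start = point_onset - margin
    window_end = point_offset + margin
    return any(window_start <= t <= window_end for t in look_times)


def percent_with_gaze_alternation(subject_flags):
    """Percent of subjects flagged as gesturing with gaze alternation."""
    if not subject_flags:
        raise ValueError("no subjects")
    return 100.0 * sum(subject_flags) / len(subject_flags)


# Invented timestamps (seconds): a point from 5.0 s to 6.5 s gives a window of 4.0-7.5 s.
print(look_within_window([4.2, 9.0], point_onset=5.0, point_offset=6.5))  # True
print(look_within_window([1.0, 9.0], point_onset=5.0, point_offset=6.5))  # False
print(percent_with_gaze_alternation([True, False, True, True]))           # 75.0
```

The ape studies used a longer and more variable observation interval, which in this sketch would correspond to a different window rather than a fixed 1 s margin; this is one reason the note describes the comparison as qualitative.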

4.3 Sensitivity to attentional status of an observer

There is a widespread misconception that captive apes are relatively insensitive to
the attentional status of others (e.g. Povinelli et al. 2000, 2003a), despite numerous
demonstrations to the contrary. Table 3 lists a representative sampling of experi-
mental and observational demonstrations of the sensitivity of chimpanzees to the
visual attention of both conspecific and human observers. These studies include
captive apes from both ends of an enrichment spectrum (i.e. the studies in Table
3 used apes in biomedical research centers and zoos, as well as much more so-
cially enriched language-trained apes). In short, as the studies listed in Table 3
make clear, apes discriminate direct gaze, they follow gaze, and they selectively
deploy their communicative signals in accordance with the attentional status of
an observer. Apes without any special training display a procedural awareness of
the attentionality that characterizes human infant communication between ap-
proximately 9 and 18 months of age. A recent spate of claims to the effect that in
their pre-verbal signaling behaviour human infants show evidence of some additional
representational capacity not available to non-human primates is subject
to criticism by appeal to the empirical data which clearly and almost unanimously
demonstrate that apes also discriminate visual attention in their social partners.

Table 3. Apes are sensitive to the visual attention of an observer: Representative studies.
These studies variously demonstrated that apes discriminated direct gaze, followed the
gaze of others, or displayed visual signals selectively when social partners were looking at
them.

Type            Study                                                Species
Observational   Tanner and Byrne (1993)                              Gorilla(a)
Observational   Tomasello et al. (1994)                              Chimpanzees
Observational   Tanner and Byrne (1996)                              Gorilla
Observational   Tomasello et al. (1997)                              Chimpanzees
Observational   Pika, Liebal, and Tomasello (2003)                   Gorillas
Observational   Liebal, Call, and Tomasello (2004a)                  Chimpanzees
Observational   Liebal, Pika, and Tomasello (2004b)                  Siamangs
Observational   Pika, Liebal, and Tomasello (2005)                   Bonobos
Observational   Liebal, Pika, and Tomasello (2006)                   Orangutans
Experimental    Call and Tomasello (1994)                            Orangutans
Experimental    Itakura (1996)                                       Orangutan
Experimental    Povinelli and Eddy (1996a)                           Chimpanzees(b)
Experimental    Povinelli and Eddy (1996b)                           Chimpanzees
Experimental    Krause and Fouts (1997)                              Chimpanzees
Experimental    Povinelli and Eddy (1997)                            Chimpanzees
Experimental    Tomasello, Call, and Hare (1998)                     Chimpanzees
Experimental    Itakura and Tanaka (1998)                            Chimpanzees, Orangutan
Experimental    Itakura, Agnetta, Hare and Tomasello (1999)          Chimpanzees
Experimental    Peignot and Anderson (1999)                          Gorillas(c)
Experimental    Povinelli, Bierschwale, and Cech (1999)              Chimpanzees(d)
Experimental    Tomasello, Hare, and Agnetta (1999)                  Chimpanzees
Experimental    Hare, Agnetta, Call, and Tomasello (2000)            Chimpanzees
Experimental    Hare, Call, and Tomasello (2001)                     Chimpanzees
Experimental    Hostetter, Cantero, and Hopkins (2001)               Chimpanzees
Experimental    Bodamer and Gardner (2002)                           Chimpanzees
Experimental    Okamoto et al. (2002)                                Chimpanzee
Experimental    Povinelli, Theall, Reaux, and Dunphy-Lelii (2003)    Chimpanzees
Experimental    Liebal, Pika, Call, and Tomasello (2004c)            Chimpanzees
Experimental    Leavens, Hostetter, Wesley, and Hopkins (2004b)      Chimpanzees
Experimental    Bräuer, Call, and Tomasello (2005)                   Bonobos, Chimpanzees,
                                                                     Gorillas, Orangutans
Experimental    Melis, Call, and Tomasello (2006)                    Chimpanzees
Experimental    Poss, Kuhar, Stoinski, and Hopkins (2006)            Gorillas, Orangutans
Experimental    Hostetter, Russell, Freeman, and Hopkins (2007)      Chimpanzees
Experimental    Hopkins, Russell, and Leavens (In press)             Chimpanzees

Notes: (a) A gorilla covered her facial expression with her hand, implying an awareness of
the social consequences of her expressions in the visual domain.
(b) This study is often cited as evidence against discrimination of human gaze by
chimpanzees but, in fact, the chimpanzees readily discriminated human gaze in almost all
experimental contexts either spontaneously or with a modicum of training.
(c) The gorillas in this study readily discriminated head orientation, but not eyes only.
(d) The authors interpreted their data to indicate that chimpanzees lacked a high-level
model of mental functioning, but the chimpanzees in this study outperformed the human
children in an object-choice task, using aspects of an experimenter's attentional cues and,
furthermore, reliably followed the experimenter's gaze to a point behind them.

There are two empirical grounds on which some researchers base claims for a
uniquely human cognitive adaptation in the domain of non-verbal communica-
tion: (a) an occasional absence of discrimination by apes of the focus of the eyes,
specifically, and (b) the phenomenon of pointing to distant events or objects by
human babies in the apparent absence of any attempt by the baby to manipulate
the social partner to act on that distal element (so-called protodeclarative com-
munication, Bates et al. 1975; see also Brinck this volume; Susswein and Racine
this volume). With respect to discrimination of eye direction, we have argued that
whether discriminations are based on eye direction, head orientation, or other
postural cues is irrelevant to the cognitive implications, which are simply that
humans and great apes discriminate different states of visual attention in others
(Leavens et al. 2005b:294). In other words, if an organism can use the behavioural
correlates of visual attention in social agents to exercise choice over their modality
of signaling, to find food, to effectively manipulate others to retrieve food, etc.
(all of which have been well demonstrated in apes), then the concept of visual
attention is manifest in the interplay of that organism with its social environment,
and this is true irrespective of the specific behavioural cues that organism might
use (see Table 3).
Every extant, published, alleged species difference between apes and humans
in the capacity to discriminate and use visual attention in others is predicated on
an experimental confound between early rearing history and species classification:
apes are typically orphans raised in cages without primary, stable adult attachment
figures and humans are raised by their biological parents in rich environments filled
with laughter, joy, and frequent face-to-face interaction with their primary caregiv-
ers. Consider the following thought experiment: raise human boys from birth in
the same relatively impoverished circumstances in which captive apes are typically
raised. Let the comparison group be human girls raised by their biological parents
in their homes, who are cherished, and unreservedly and reliably loved by their
caregivers. Years later, assess the sensitivity of the boys and the girls to subtle cues of
visual attention in human adults. Suppose the girls, unsurprisingly, perform better
than the boys: would any researcher in their right mind attribute the difference
to a gender difference between boys and girls? Of course not; rearing history is
clearly confounded with the gender of the subjects. Yet substitute apes for boys and
humans for girls in this research design, and how often have researchers trumpeted
a species difference between apes and humans in various aspects of sensitivity to
visual attention (e.g. Povinelli and Eddy 1996a; Theall and Povinelli 1999)? If the
practice of almost completely ignoring the effects of pre-experimental experience
(or what used to be called the "preparation" of the organisms under scrutiny) were
not so widespread in contemporary comparative psychology, this would be laughable
(for a notable exception to this general methodological failing, see Carpenter,
Tomasello and Savage-Rumbaugh 1995). The reports listed in Table 3 adequately
demonstrate that captive apes discriminate different states of visual attention in others,
despite the impoverishment of their early rearing histories.
With respect to so-called protodeclarative pointing, there are very few re-
ports of pointing by apes with the apparent goal of merely directing the attention
of their social partners to distant goals; this scarcity has been noted by, among
others, Baron-Cohen (1999), Butterworth (2001), Povinelli et al. (2003b), and
Tomasello (1999). These authors suggest that the absence or scarcity of protodec-
larative pointing in apes and in humans with autism is diagnostic of an inability
to represent the perspectives or mental states of others. However, in the original
formulation by Bates et al. (1975), protoimperatives were described as pre-verbal
attempts by human babies to elicit action from a social partner, and protodeclara-
tives were described as attempts to elicit positive emotional engagement from a
social partner. Thus, in the original formulation, both protoimperative and proto-
declarative pointing were presented as instrumental, imperative gestures, differing
only in the apparent goals of the signaler (retrieval of objects and social responses,
respectively). Hence, the term protodeclarative has undergone an equivocation,
changing from a label for an instrumental act that signifies the same cognitive
processes as protoimperative pointing, to a label for a communicative act that
indexes a nascent theory of mind. This has occurred in the absence of any signifi-
cant new empirical findings on typical human communicative development; in-
deed, recent research confirms the necessity of positive emotional engagement to
satisfy babies who point protodeclaratively (Liszkowski et al. 2004). If, as Bates and
her colleagues originally claimed, babies sometimes point to elicit positive affec-
tive responses from their caregivers, then any apparent absence of such pointing
in certain psychopathological human populations, or in nonhuman populations,
can only signify a difference in motivation, not a difference in representational
capacities, as the latter are not implicated in protodeclarative pointing. If human
babies typically receive positive emotional consequences to their signaling behav-
iour, it is not implausible to suggest that they may increasingly act to bring about
such consequences as they mature. Therefore, if these kinds of
consequences are necessary for the development of protodeclarative pointing in
human children, then it is reasonable to suggest that humans learn to point
protodeclaratively (cf. Moore and Corkum 1994). In other words, humans may develop
protodeclarative pointing in human species-typical caregiving contexts that may
not require human species-unique cognitive capacities.
Moreover, empirically, there are several reports of apparent declarative point-
ing by apes (pointing in the absence of any evidence that apes are attempting
to instrumentally manipulate a social partner to act on the indicated element):
the single report of pointing by a wild ape was an apparent declarative (Veà and
Sabater-Pi 1998), and there are numerous reports of apparently declarative point-
ing by language-trained apes (e.g. Miles 1990; Savage-Rumbaugh et al. 1998). These
language-trained apes are notable in particular for having experienced unusually
close emotional bonds with human caregivers; the putative species difference in
propensity to point declaratively may, therefore, be attributable to differences in the
degree of exposure to particular kinds of caregiving practices. To put this another
way, humans may inculcate the motivation to share attention to distant events by
making such shared attention reinforcing through extravagant displays of contin-
gent joy. Organisms with histories of joyful consequences to shared attention might
reasonably be expected to instigate such episodes in the future. The much-touted
species difference in propensity to point declaratively may simply reflect different
degrees of exposure to some, particularly Western, caregiving practices.

5. Heterochrony and the Referential Problem Space

Both the propensity to point and the form of pointing in captive apes are influenced
by rearing history (as may be also the motivation to engage in declarative behav-
iour). The propensity to point with outstretched arms and fingers can be charac-
terized as a cultural difference between wild and captive apes, in the same way
that differential propensities to point with the lips are a human cultural difference.
Apes are, thus, malleable in their gestural repertoires (e.g. Bard 1998; Leavens et
al. 2005b; Pika et al. 2005a; Tomasello et al. 1994). Given this flexibility, what is
it about captive environments, which are so impoverished in many respects, that
fosters the development of pointing in captive apes, in the absence of explicit
training? We believe a plausible answer lies in consideration of the circumstances
in which human infants begin to point.
Human infants have both endogenous and exogenous barriers to free move-
ment. With considerable inter-individual variability, humans do not achieve bi-
pedal locomotion until approximately a year of age and mastery of this mode of lo-
comotion takes several years (e.g. Cheron et al. 2001). Chimpanzees, on the other
hand, are capable of independent quadrupedal locomotion (technically known as
knucklewalking) by about five months of age (van Lawick-Goodall 1968). The
significant delay in locomotor development in our species, relative to other pri-
mates, is largely attributable to maturational factors affecting the stability of the
trunk in a vertical mode: humans are biomechanically unstable in this posture
for much of the first year of life (e.g., Adolph and Berger 2005). Direct evidence
for bipedal locomotion in hominids (Australopithecus afarensis) dates to over 3.5
million years ago, with the footprints in the Laetoli lava beds (Leakey and Hay
1979), and more controversial claims exist for bipedal locomotion in much older
hominids dated to nearly twice that age (Orrorin tugenensis; Senut et al. 2001). It
is an open question whether the relatively protracted, peripatetic ontogeny of lo-
comotor development that characterizes modern humans was also characteristic
of the earliest bipedal hominids or whether this is a more recent feature of human
development related to the very oversized heads of infant representatives of the
genus Homo, dating from about 2.5 million years ago. The difference between hu-
mans and apes in the attainment of independent locomotion constitutes a change
in our lineage in the relative timing of this motoric competency, or heterochrony.
In addition to these endogenous limitations that uniquely affect human babies,
secondary consequences of bipedal locomotion in adult caregivers give rise to
widespread exogenous barriers to free movement in babies: babies are carried
in restraining devices or left physically restrained in a variety of settings, such as
cribs or feeding chairs, typically for their own safety (e.g. Super 1990).
At the age at which modern human babies begin to point, near the end of the
first year of life, they exhibit novel capacities for means-ends reasoning, or tool
use (Bates, Thal and Marchman 1991; Leavens 2004; Sugarman 1984). Thus, at
the same age at which babies begin to point and to otherwise use communica-
tion instrumentally, there is a concomitant advent of the use of indirect means
to achieve goals, characteristic of late Piagetian sensorimotor sub-stage IV (co-
ordinated secondary circular reactions) and early sub-stage V (tertiary circular
reactions) cognitive development. These capacities are also well-demonstrated
in the great apes (e.g. Bard 1990; Gibson 1996; Parker 1999; Potì and Spinozzi
1994). However, in the wild, because apes develop independent locomotor com-
petence many months prior to the advent of means-ends reasoning, they are not
dependent upon others to act on the world for them. In contrast, human infants
experience multitudinous barriers to the direct attainment of distant objects and
are reliant upon others to retrieve those objects for them: this is the Referential
Problem Space, a related series of circumstances in which babies are dependent
upon the successful capture and re-direction of the attention of their caregiver
to specific loci for instrumental ends (Figure 2). In order to obtain distant items,
babies manipulate their caregivers to deliver them, and this requires a means to
capture and re-direct the attention of others. Because infants have long histories
in which their caregivers have retrieved distant items for their manipulation, the
caregivers become established means to an end. The innovation of pointing is that
it combines an established means (caregiver) with a novel means (pointing) to
numerous established and novel ends.
When apes are raised in captivity, we put them directly into the Referential
Problem Space. Captive apes have no access to food without the direct provision-
ing by human caregivers, so histories of dependencies upon caregivers are well-
established in these populations. Pointing to request food or other items develops
in this problem space (e.g. Call and Tomasello 1994; Krause and Fouts 1997; Leav-
ens and Hopkins 1998; Leavens et al. 1996, 2004a, 2005a, b; Figure 2). Because
wild apes develop early locomotor competence, they circumvent the Referential
Problem Space: they are never reliant upon others to retrieve distant objects for
them. They go on to manifest their problem-solving capacities in well-docu-
mented foraging contexts: fishing for termites, using stones to crack nuts, etc., but
pointing or otherwise manipulating others to act vicariously on the world does
not develop. In contrast, both human infants and captive chimpanzees face the
Referential Problem Space: they cannot retrieve distant objects except through
manipulation of others and they have the advanced sensorimotor problem-solving
capacities to use existing means (human caregivers) to act on the world for
them (Bard 1990; Leavens 2004).

Figure 2. The Referential Problem Space. Wild chimpanzees, because of their ability
to independently travel to virtually any object of interest, do not need to apply their
problem-solving skills to referential contexts (downward-pointing arrow at 4–5 months
indicates the onset of independent locomotion). Apes go on to display means-ends
problem-solving capacities in foraging domains, but do not develop pointing as an
instrumental tool because it is not required in those contexts (but see Pika and Mitani
2006). In contrast, humans and captive chimpanzees experience both barriers to direct
attainment of desirable objects and long histories in which caregivers deliver desirable
objects to them. By virtue of humans' long-delayed development of bipedal locomotion
(upward-pointing arrow), they are restricted in movement and dependent upon others
to act on their world at a time in development in which they have sophisticated
problem-solving capacities. Pointing emerges, then, in this problem space.

One implication of this hypothesis is that there is no human-specific adaptation
for definite reference through non-verbal means. This suggestion runs counter to
an existing body of theory that either explicitly or implicitly construes non-verbal
reference as being derived from human species-specific cognitive adaptations for
symbolic reference (e.g. Butterworth 2003). Because of the heterochronic changes
in locomotor independence in our lineage, possibly long before the development
of speech, our ancestors experienced novel ontogenetic conundrums for which
the development of referential behaviour had tangible payoffs. Because apes in
similar circumstances also spontaneously point, this implies that a trajectory into
referential communication does not require adaptations for speech or adaptations
for bipedal locomotion, as apes have evolved neither capacity. According to this
view, pointing derives from an interaction between a particular set of environ-
mental circumstances (the Referential Problem Space) and cognitive capacities
for means-ends reasoning that are shared by humans and the great apes. Babies
and some apes who point and then experience very positive emotional responses
from their caregivers to this pointing may come to point declaratively (yet nonetheless
instrumentally) to elicit these states of positive mutual engagement; i.e.,
they may generalize their pointing to request action on distant objects to contexts
in which pointing leads to positive emotional, rather than physical reinforcement,
when the social environment provides those affective contingencies (Leavens
2004; Moore and Corkum 1994).
Thus, we do not believe that pointing by captive apes implies any cognitive
capacity not also manifest in wild apes, in problem-solving contexts. Indeed,
we believe one of the reasons captive apes so frequently point to request food
from human observers may be because, in many captive contexts, humans are so
oblivious to subtle indicators of gaze by chimpanzees that these apes are forced
to deploy extraordinarily explicit means of capturing and re-directing human at-
tention. Through pointing, apes are able to scaffold humans into more responsive
interactions (cf. de Waal 2001).

Acknowledgements

We thank the editors of this volume, Jordan Zlatev, Tim Racine, Chris Sinha,
and Esa Itkonen for inviting this chapter. Special thanks to Tim Racine for his
helpful feedback on several earlier versions of this manuscript. We gratefully ac-
knowledge inspirational conversations about pointing and related matters with
Robin Banerjee, Irwin Bernstein, Joanna Blake, Ingar Brinck, the late George But-
terworth, Josep Call, Robert Corruccini, Deborah Custance, Susan Ford, Doro-
thy Fragaszy, Fabia Franco, Janet Frick, Juan-Carlos Gomez, R. Peter Hobson,
Autumn Hostetter, Jim Hurford, Mark Krause, Maria Legerstee, Katja Liebal, Ulf
Liszkowski, Chris Moore, B. E. Mulligan, Simone Pika, Vasu Reddy, Connie Rus-
sell, Jamie Russell, Chris Sinha, Roger Thomas, Mike Tomasello, Colwyn Trevar-
then, Katherine Whitcome, Nicola Yuill, and many others.

References

Adamson, L.B. and Frick, J.E. 2003. The Still-face: A history of a shared experimental
paradigm. Infancy 4: 451–473.
Adolph, K.E. and Berger, S.E. 2005. Physical and motor development. In Developmental
Science: An Advanced Textbook, 5th Ed., M.H. Bornstein and M.E. Lamb (eds.), 223–281.
Mahwah, NJ: Lawrence Erlbaum Associates.
Bakeman, R. and Adamson, L.B. 1986. Infants' conventionalized acts: Gestures and words with
mothers and peers. Infant Behavior and Development 9: 215–230.
Bard, K.A. 1990. Social tool use by free-ranging orangutans: A Piagetian and developmental
perspective on the manipulation of an animate object. In Language and Intelligence in
Monkeys and Apes: Comparative Developmental Perspectives, S.T. Parker and K.R. Gibson
(eds.), 356–378. Cambridge: Cambridge University Press.
Bard, K.A. 1992. Intentional behavior and intentional communication in young free-ranging
orangutans. Child Development 62: 1186–1197.
Bard, K.A. 1998. Social-experiential contributions to imitation and emotion in chimpanzees. In
Intersubjective Communication and Emotion in Early Ontogeny, S. Bråten (ed.), 208–227.
Cambridge: Cambridge University Press.
Baron-Cohen, S. 1995. Mindblindness: An Essay on Autism and Theory of Mind. Cambridge,
Mass.: MIT Press.
Baron-Cohen, S. 1999. The evolution of a theory of mind. In The Descent of Mind:
Psychological Perspectives on Hominid Evolution, M.C. Corballis and S.E.G. Lea (eds.), 261–277.
Oxford, UK: Oxford University Press.
Barrett, L. and Henzi, P. 2005. The social nature of primate cognition. Proceedings of the Royal
Society B 272: 1865–1875.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L. and Volterra, V. 1977. From gesture to the
first word: On the nature of cognitive and social prerequisites. In Interaction, Conversation
and the Development of Language, M. Lewis and L. Rosenblum (eds.), 247–307. New York:
John Wiley and Sons.
Bates, E., Camaioni, L. and Volterra, V. 1975. Performatives prior to speech. Merrill-Palmer
Quarterly 21: 205–226.
Bates, E., Thal, D. and Marchman, V. 1991. Symbols and syntax: A Darwinian approach to
language development. In Biological and Behavioral Determinants of Language Development,
N.A. Krasnegor, D.M. Rumbaugh, R.L. Schiefelbusch, and M. Studdert-Kennedy (eds.),
29–65. Hillsdale, NJ: Erlbaum.
Bateson, C.B. 1928. William Bateson, F.R.S.: His Essays and Addresses, together with a Short
Account of his Life. Cambridge: Cambridge University Press.
Bateson, G. 1972a. The Cybernetics of "Self": A Theory of Alcoholism. In Steps to an Ecology of
Mind, G. Bateson (ed.), 309–337. New York: Ballantine Books. [Original article published
1971 in Psychiatry 34: 1–18]
Bateson, G. 1972b. Form, Substance, and Difference. In Steps to an Ecology of Mind,
G. Bateson (ed.), 448–465. New York: Ballantine Books. [Original article published 1970 in
General Semantics Bulletin, no. 37, Institute of General Semantics]
Bateson, G. 1972c. The Logical Categories of Learning and Communication. In Steps to an
Ecology of Mind, G. Bateson (ed.), 279–308. New York: Ballantine Books.
Blake, J., O'Rourke, P. and Borzellino, G. 1994. Form and function in the development of
pointing and reaching gestures. Infant Behavior and Development 17: 195–203.
Bodamer, M.D. and Gardner, R.A. 2002. How cross-fostered chimpanzees (Pan troglodytes)
initiate and maintain conversations. Journal of Comparative Psychology 116: 12–26.
Boneau, C.A. and Wertheimer, M. 2006. Gregory A. Kimble (1917–2006) [Obituary].
American Psychologist 61: 632–633.
Bräuer, J., Call, J. and Tomasello, M. 2005. All great ape species follow gaze to distant locations
and around barriers. Journal of Comparative Psychology 119: 145–154.
Brinck, I. 2001. Attention and the evolution of intentional communication. Pragmatics and
Cognition 9: 255–272.
Brinck, I. 2003. The pragmatics of imperative and declarative pointing. Cognitive Science
Quarterly 3: 429–446.
Brinck, I. 2007. Situated cognition, dynamic systems, and art. Janus Head 9: 407–431.
Brinck, I. In press. From similarity to uniqueness: Method and theory in comparative
psychology. In Learning from Animals? Examining the Nature of Human Uniqueness,
L.S. Röska-Hardy and E.M. Neumann-Held (eds.). London: Psychology Press.
Butterworth, G. 2001. Joint visual attention in infancy. In The Blackwell Handbook of Infant
Development, J.G. Bremner and A. Fogel (eds.), 213–240. Hove, U.K.: Blackwell.
Butterworth, G. 2003. Pointing is the royal road to language for babies. In Pointing: Where
Language, Culture, and Cognition Meet, S. Kita (ed.), 9–33. Mahwah, NJ: Erlbaum.
Butterworth, G. and Grover, L. 1988. The origins of referential communication in human
infancy. In Thought without Language, L. Weiskrantz (ed.), 5–24. Oxford: Clarendon Press.
Call, J. and Tomasello, M. 1994. Production and comprehension of referential pointing by
orangutans (Pongo pygmaeus). Journal of Comparative Psychology 108: 307–317.
Carpenter, M., Tomasello, M. and Savage-Rumbaugh, S. 1995. Joint attention and imitative
learning in children, chimpanzees, and enculturated chimpanzees. Social Development
4: 217–237.
Cheron, G., Bouillot, E., Dan, B., Bengoetxea, A., Draye, J.-P. and Lacquaniti, F. 2001.
Development of a kinematic coordination pattern in toddler locomotion: Planar covariation.
Experimental Brain Research 137: 455–466.
Doherty, M. 2006. The development of mentalistic gaze understanding. Infant and Child
Development 15: 179–186.
Enfield, N.J. 2001. Lip-pointing: A discussion of form and function with reference to data
from Laos. Gesture 1: 185–212.
Fivaz-Depeursinge, E. and Corboz-Warnery, A. 1999. The Primary Triangle: A Developmental
Systems View of Mothers, Fathers, and Infants. New York: Basic Books.
Franco, F. and Butterworth, G. 1996. Pointing and social awareness: Declaring and requesting
in the second year. Journal of Child Language 23: 307–336.
Franco, F. and Gagliano, A. 2001. Toddlers' pointing when joint attention is obstructed. First
Language 21: 289–321.
Gibson, K. 1996. The ontogeny and evolution of the brain, cognition, and language. In
Handbook of Human Symbolic Evolution, A. Lock and C.R. Peters (eds.), 407–431. Hove, U.K.:
Blackwell.
Hacia, J.G. 2001. Genome of the apes. Trends in Genetics 17: 637–645.
Hare, B., Call, J., Agnetta, B. and Tomasello, M. 2000. Chimpanzees know what conspecifics do
and do not see. Animal Behaviour 59: 771–785.
Hare, B., Call, J. and Tomasello, M. 2001. Do chimpanzees know what conspecifics know?
Animal Behaviour 61: 139–151.
Haviland, J.B. 2003. How to point in Zinacantán. In Pointing: Where Language, Culture, and
Cognition Meet, S. Kita (ed.), 139–169. Hillsdale, NJ: Erlbaum.
Hopkins, W.D., Russell, J.L. and Leavens, D.A. In press. Multi-modal communication and its
social contextual use in captive chimpanzees (Pan troglodytes). Animal Cognition.
Hostetter, A.B., Cantero, M. and Hopkins, W.D. 2001. Differential use of vocal and gestural
communication in response to the attentional status of a human. Journal of Comparative
Psychology 115: 337–343.
Hostetter, A.B., Russell, J.L., Freeman, H. and Hopkins, W.D. 2007. Now you see me, now you
don't: Evidence that chimpanzees understand the role of the eyes in attention. Animal
Cognition 10: 55–62.
Itakura, S. 1996. An exploratory study of gaze monitoring in nonhuman primates. Japanese
Psychological Research 38: 174–180.
Itakura, S. and Tanaka, M. 1998. Use of experimenter-given cues during object-choice tasks
by chimpanzees (Pan troglodytes), an orangutan (Pongo pygmaeus), and human infants
(Homo sapiens). Journal of Comparative Psychology 112: 119–126.
Itakura, S., Agnetta, B., Hare, B. and Tomasello, M. 1999. Chimpanzee use of human and
conspecific social cues to locate hidden food. Developmental Science 2: 448–456.
Johnson, C.M. 2001. Distributed primate cognition: A review. Animal Cognition 4: 167–183.
Jones, S.S. and Hong, H.-W. 2001. Onset of voluntary communication: Smiling looks to
mother. Infancy 2: 353–370.
Kendon, A. and Versante, L. 2003. Pointing by hand in Neapolitan. In Pointing: Where
Language, Culture, and Cognition Meet, S. Kita (ed.), 109–137. Hillsdale, NJ: Erlbaum.
King, B.J. 2004. The Dynamic Dance: Nonvocal Communication in African Great Apes.
Cambridge, Mass.: Harvard University Press.
Krause, M.A. and Fouts, R.S. 1997. Chimpanzee (Pan troglodytes) pointing: Hand shapes,
accuracy, and the role of eye gaze. Journal of Comparative Psychology 111: 330–336.
Lawick-Goodall, J. van 1968. Behaviour of free-living chimpanzees in the Gombe Stream area.
Animal Behaviour Monographs 1: 163–311.
Leakey, M.D. and Hay, R.L. 1979. Pliocene footprints in the Laetolil Beds, at Laetoli, northern
Tanzania. Nature 278: 317–328.
Leavens, D.A. 2004. Manual deixis in apes and humans. Interaction Studies 5: 387–408.
Leavens, D.A. and Hopkins, W.D. 1998. Intentional communication by chimpanzees: A
cross-sectional study of the use of referential gestures. Developmental Psychology 34: 813–822.
Leavens, D.A. and Hopkins, W.D. 1999. The whole hand point: The structure and function of
pointing from a comparative perspective. Journal of Comparative Psychology 113:
417–425.
Leavens, D.A., Hopkins, W.D. and Bard, K.A. 1996. Indexical and referential pointing in
chimpanzees (Pan troglodytes). Journal of Comparative Psychology 110: 346–353.
Leavens, D.A., Hopkins, W.D. and Bard, K.A. 2005a. Understanding the point of chimpanzee
pointing: Epigenesis and ecological validity. Current Directions in Psychological Science
14: 185–189.
Leavens, D.A., Hopkins, W.D. and Thomas, R.K. 2004a. Referential communication by
chimpanzees (Pan troglodytes). Journal of Comparative Psychology 118: 48–57.
Leavens, D.A., Hostetter, A.B., Wesley, M.J. and Hopkins, W.D. 2004b. Tactical use of
unimodal and bimodal communication by chimpanzees, Pan troglodytes. Animal Behaviour
67: 467–476.
Leavens, D.A., Russell, J.L. and Hopkins, W.D. 2005b. Intentionality as measured in the
persistence and elaboration of communication by chimpanzees (Pan troglodytes). Child
Development 76: 291–306.
Leung, E.H.L. and Rheingold, H.L. 1981. Development of pointing as a social gesture.
Developmental Psychology 17: 215–220.
Liebal, K., Call, J. and Tomasello, M. 2004a. Use of gesture sequences in chimpanzees.
American Journal of Primatology 64: 377–396.
Liebal, K., Pika, S. and Tomasello, M. 2004b. Social communication in siamangs
(Symphalangus syndactylus): Use of gestures and facial expressions. Primates 45: 41–57.
Liebal, K., Pika, S., Call, J. and Tomasello, M. 2004c. To move or not to move: How apes adjust
to the attentional state of others. Interaction Studies 5: 199–219.
Liebal, K., Pika, S. and Tomasello, M. 2006. Gestural communication of orangutans (Pongo
pygmaeus). Gesture 6: 1–38.
Lillard, A.S. 1998. Ethnopsychologies: Cultural variation in theories of mind. Psychological
Bulletin 123: 3–32.
Liszkowski, U., Carpenter, M., Henning, A., Striano, T. and Tomasello, M. 2004.
Twelve-month-olds point to share attention and interest. Developmental Science 7: 297–307.
Lock, A. 2001. Preverbal communication. In The Blackwell Handbook of Infant Development,
J.G. Bremner and A. Fogel (eds.), 379–403. Hove, U.K.: Blackwell.
MacCorquodale, K. and Meehl, P.E. 1948. On a distinction between hypothetical constructs
and intervening variables. Psychological Review 55: 95–107.
Melis, A., Call, J. and Tomasello, M. 2006. Chimpanzees (Pan troglodytes) conceal visual and
auditory information from others. Journal of Comparative Psychology 120: 154–162.
Menzel, C.R. 1999. Unprompted recall and reporting of hidden objects by a chimpanzee (Pan
troglodytes) after extended delays. Journal of Comparative Psychology 113: 426–434.
Miles, H.L. 1990. The cognitive foundations for reference in a signing orangutan. In
Language and Intelligence in Monkeys and Apes: Comparative Developmental Perspectives,
S.T. Parker and K.R. Gibson (eds.), 511–539. Cambridge: Cambridge University Press.
Mitchell, R.W. 2000. A proposal for the development of a mental vocabulary: With special
reference to pretense and false belief. In Children's Reasoning and the Mind, P. Mitchell and
K.J. Riggs (eds.), 37–65. Hove, U.K.: Psychology Press.
Moore, C. and Corkum, V. 1994. Social understanding at the end of the first year of life.
Developmental Review 14: 349–372.
Murphy, C.M. and Messer, D.J. 1977. Mothers, infants and pointing: A study of a gesture.
In Studies in Mother-infant Interaction, H.R. Schaffer (ed.), 325–354. London: Academic
Press.
O'Neill, D.K. 1996. Two-year-old children's sensitivity to a parent's knowledge state when
making requests. Child Development 67: 659–677.
Okamoto, S., Tomonaga, M., Ishii, K., Kawai, N., Tanaka, M. and Matsuzawa, T. 2002. An
infant chimpanzee (Pan troglodytes) follows human gaze. Animal Cognition 5: 107–114.
Parker, S.T. 1999. The development of social roles in the play of an infant gorilla and its
relationship to sensorimotor intellectual development. In The Mentalities of Gorillas and
Orangutans: Comparative Perspectives, S.T. Parker, R.W. Mitchell, and H. Lyn Miles (eds.),
367–393. Cambridge, U.K.: Cambridge University Press.
Peignot, P. and Anderson, J.R. 1999. Use of experimenter-given manual and facial cues by
gorillas (Gorilla gorilla) in an object-choice task. Journal of Comparative Psychology 113:
253–260.
Petitto, L. 1988. Language in the prelinguistic child. In Development of Language and
Language Researchers, F. Kessel (ed.), 187–222. Hillsdale, NJ: Lawrence Erlbaum Associates.
Pika, S., Liebal, K. and Tomasello, M. 2003. Gestural communication in young gorillas
(Gorilla gorilla): Gestural repertoire, learning, and use. American Journal of Primatology 60:
95–111.
Pika, S., Liebal, K., Call, J. and Tomasello, M. 2005a. The gestural communication of apes.
Gesture 5: 41–56.
Pika, S., Liebal, K. and Tomasello, M. 2005b. The gestural repertoire of bonobos (Pan
paniscus): Flexibility and use. American Journal of Primatology 65: 39–61.
Pika, S. and Mitani, J. 2006. Referential gestural communication in wild chimpanzees (Pan
troglodytes). Current Biology 16: R191–R192.
Poss, S.R., Kuhar, C., Stoinski, T.S. and Hopkins, W.D. 2006. Differential use of attentional
and visual communicative signaling by orangutans (Pongo pygmaeus) and gorillas (Gorilla
gorilla) in response to the attentional status of a human. American Journal of Primatology
68: 978–992.
Potì, P. and Spinozzi, G. 1994. Early sensorimotor development in chimpanzees (Pan
troglodytes). Journal of Comparative Psychology 108: 93–103.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2000. Toward a science of other minds:
Escaping the argument by analogy. Cognitive Science 24: 509-541.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2003a. Chimpanzee pointing: Another error
of the argument by analogy? In Pointing: Where Language, Culture, and Cognition Meet,
S. Kita (ed.), 35-68. Hillsdale, NJ: Erlbaum.
Povinelli, D.J., Bierschwale, D.T. and Cech, C.G. 1999. Comprehension of seeing as a
referential act in young children, but not juvenile chimpanzees. British Journal of
Developmental Psychology 17: 37-60.
Povinelli, D.J. and Eddy, T.J. 1996a. What young chimpanzees know about seeing.
Monographs of the Society for Research in Child Development 61(3, Serial No. 247).
Povinelli, D.J. and Eddy, T.J. 1996b. Chimpanzees: Joint visual attention. Psychological Science
7: 129-135.
Povinelli, D.J. and Eddy, T.J. 1997. Specificity of gaze-following in young chimpanzees. British
Journal of Developmental Psychology 15: 213-222.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2000. Toward a science of other minds: Escap-
ing the argument by analogy. Cognitive Science 24: 509541.
Povinelli, D.J., Bering, J.M. and Giambrone, S. 2003a. Chimpanzee pointing: Another error
of the argument by analogy? In Pointing: Where Language, Culture, and Cognition Meet,
S.Kita (ed.), 3568. Hillsdale, NJ: Erlbaum.
Povinelli, D.J., Bierschwale, D.T. and Cech, C.G. 1999. Comprehension of seeing as a referen-
tial act in young children, but not juvenile chimpanzees. British Journal of Developmental
Psychology 17: 3760.
Povinelli, D.J., Theall, L.A., Reaux, J.E. and Dunphy-Lelii, S. 2003b. Chimpanzees
spontaneously alter the location of their gestures to match the attentional orientation of others.
Animal Behaviour 66: 71-79.
Racine, T.P. 2005. The Role of Shared Practice in the Origins of Joint Attention and Pointing.
Unpublished doctoral thesis: Simon Fraser University, Burnaby, Canada.
Reddy, V. 2001. Mind knowledge in the first year: Understanding attention and intention.
In Blackwell Handbook of Infant Development, J.G. Bremner and A. Fogel (eds.), 241-264.
Hove, U.K.: Blackwell.
Reddy, V. 2003. On being the object of attention: Implication for self-other consciousness.
Trends in Cognitive Sciences 7: 397-402.
Russell, C.L., Bard, K.A. and Adamson, L.B. 1997. Social referencing by young chimpanzees
(Pan troglodytes). Journal of Comparative Psychology 111: 185-193.
Russell, J.L., Braccini, S., Buehler, N., Kachin, M.J., Schapiro, S.J. and Hopkins, W.D. 2005.
Chimpanzee (Pan troglodytes) intentional communication is not contingent upon food.
Animal Cognition 8: 263-274.
Savage-Rumbaugh, E.S. 1986. Ape Language: From Conditioned Response to Symbol. New York:
Columbia University Press.
Savage-Rumbaugh, E.S. 1991. Language learning in the bonobo: How and why they learn.
In Biological and Behavioral Determinants of Language Development, N.A. Krasnegor,
D.M. Rumbaugh, R.L. Schiefelbusch, and M. Studdert-Kennedy (eds.), 209-233. Hillsdale,
N.J.: Lawrence Erlbaum Associates.
Savage-Rumbaugh, E.S., Shanker, S.G. and Taylor, T.J. 1998. Apes, Language, and the Human
Mind. Oxford: Oxford University Press.
Senut, B., Pickford, M., Gommery, D., Mein, P., Cheboi, K. and Coppens, Y. 2001. First hominid
from the Miocene (Lukeino Formation, Kenya). Comptes Rendus de l'Académie des Sciences
332: 137-144.
Shanker, S.G. and King, B.J. 2002. The emergence of a new paradigm in ape language research.
Behavioral and Brain Sciences 25: 605-656.
Sinha, C. 2004. The evolution of language: From signals to symbols to system. In Evolution of
Communication Systems: A Comparative Approach, D. Kimbrough Oller and Ulrike Griebel
(eds.), 217-235. Vienna Series in Theoretical Biology. Cambridge, MA: MIT Press.
Steiper, M.E., Young, N.M. and Sukarna, T.Y. 2004. Genomic data support the hominoid
slowdown and an Early Oligocene estimate for the hominoid-cercopithecoid divergence.
Proceedings of the National Academy of Sciences 101: 17021-17026.
Sugarman, S. 1984. The development of preverbal communication: Its contribution and limits
in promoting the development of language. In The Acquisition of Communicative
Competence, R.L. Schiefelbusch and J. Pickar (eds.), 23-67. Baltimore: University Park Press.
Super, C.W. 1990. The cultural regulation of infant and child activities. In Activity, Energy
Expenditure and Energy Requirements of Infants and Children, B. Schurch and N.S.
Scrimshaw (eds.), 321-333. Lausanne: Nestle.
Tanner, J.E. and Byrne, R.W. 1993. Concealing evidence of mood: Evidence for
perspective-taking? Primates 34: 451-457.
Tanner, J.E. and Byrne, R.W. 1996. Representation of action through iconic gesture in a captive
lowland gorilla. Current Anthropology 37: 162-173.
Theall, L.A. and Povinelli, D.J. 1999. Do chimpanzees tailor their gestural signals to fit the
attentional states of others? Animal Cognition 2: 207-214.
Thompson, N.S. 1994. The many perils of ejective anthropomorphism. Behavior and
Philosophy 22: 59-70.
Tomasello, M. 1995. Joint attention as social cognition. In Joint Attention: Its Origins and
Role in Development, C. Moore and P.J. Dunham (eds.), 103-130. Hillsdale, NJ: Lawrence
Erlbaum Associates.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard
University Press.
Tomasello, M. 2006. Why don't apes point? In Roots of Human Sociality: Culture, Cognition
and Interaction, N. Enfield and S.C. Levinson (eds.), 506-524. Oxford: Berg.
Tomasello, M., Call, J. and Hare, B. 1998. Five primate species follow the gaze of conspecifics.
Animal Behaviour 55: 1063-1069.
Tomasello, M., Call, J., Nagell, K., Olguin, K. and Carpenter, M. 1994. The learning and use
of gestural signals by young chimpanzees: A trans-generational study. Primates 35:
137-154.
Tomasello, M., Call, J., Warren, J., Frost, T., Carpenter, M. and Nagell, K. 1997. The ontogeny
of chimpanzee gestural signals: A comparison across groups and generations. Evolution of
Communication 1: 223-253.
Trevarthen, C. 1977. Descriptive analyses of infant communicative behavior. In Studies in
Mother-infant Interaction, H.R. Schaffer (ed.), 227-270. London: Academic Press.
Trevarthen, C. 1998. The concept and foundations of infant intersubjectivity. In
Intersubjective Communication and Emotion in Early Ontogeny, S. Bråten (ed.), 15-46. Cambridge:
Cambridge University Press.
Trevarthen, C. and Hubley, P. 1978. Secondary intersubjectivity: Confidence, confiding and
acts of meaning in the first year. In Action, Gesture and Symbol, A. Lock (ed.), 183-229.
New York: Academic Press.
Veà, J.J. and Sabater-Pi, J. 1998. Spontaneous pointing behaviour in the wild pygmy
chimpanzee (Pan paniscus). Folia Primatologica 69: 289-290.
de Waal, F.B.M. 1982. Chimpanzee Politics: Power and Sex among Apes. New York: Harper and
Row.
de Waal, F.B.M. 2001, January 19. Pointing primates: Sharing knowledge . . . without language.
Chronicle of Higher Education B7-B9.
Whiten, A. 2000. Chimpanzee cognition and the question of mental re-representation. In
Metarepresentation: A Multidisciplinary Perspective, D. Sperber (ed.), 139-167. Oxford:
Oxford University Press.
Wilkins, D. 2003. Why pointing with the index finger is not a universal (in sociocultural and
semiotic terms). In Pointing: Where Language, Culture, and Cognition Meet, S. Kita (ed.),
171-215. Hillsdale, NJ: Erlbaum.
Chapter 10

The co-evolution of intersubjectivity and bodily mimesis

Jordan Zlatev

This chapter presents an evolutionary and developmental model according to
which intersubjectivity is intimately tied to bodily mimesis (the use of the body
for communicative and representational purposes), to such an extent that
intersubjectivity can be said to co-evolve with it. I review some relevant evidence
concerning non-human primates, which shows that feral and captive apes are
capable of the first two levels (involving e.g. empathy, shared attention and
imitation), but not of the third level, which involves an understanding of
communicative signs, i.e. triadic mimesis. In contrast, enculturated language-trained
apes show some aspects of triadic mimesis, suggesting how our predecessors
could have bootstrapped themselves to this level without language (and without
a theory of mind). The emergence of language, on the other hand, opens the
way to the highest two levels of intersubjectivity, bringing forth the
understanding of beliefs and the use of folk psychology.

1. Introduction

Many if not most would agree that there is a close relationship between
intersubjectivity and language. But what, more precisely, is this relation? The first
impediment to answering this question is definitional. As several of the contributions
to this volume show, there are rather different understandings of the concept of
intersubjectivity. For the purpose of this chapter, intersubjectivity will be taken
to be the sharing of affective, perceptual and reflective experiences between two
or more subjects. Such sharing can take different forms, some more immediate,
others more mediated by higher cognitive processes, e.g. what Barresi
and Moore (this volume) call "understanding" as opposed to simply "sharing".
The phenomenon of joint attention (e.g. Moore and Dunham 1995), for example,
would qualify as a paradigmatic form of perceptual intersubjectivity (Zlatev,
Brinck and Andrén 2008).
If it is difficult to reach a consensus concerning the notion of intersubjectivity,
it is even more so when it comes to language. Therefore I will not argue here for,
but simply assume, that language is a conventional (normative) semiotic system for
communication and thought. The signs constituting language are predominantly
symbolic, i.e. conventional pairings of expression and content. The expressions can
be spoken (oral), signed (manual-brachial) or written, and the denoted concepts
(or "uses" in a more action-oriented Wittgensteinian approach) are commonly
known to those who are fluent in the language (see Itkonen 1978, this volume;
Zlatev 2007a, 2007b).
Given these provisional definitions, we can reformulate our question as
follows: is intersubjectivity a prerequisite for the learning and use of language, or is
language a prerequisite for understanding others' (and perhaps even one's own)
minds? Thus phrased, the question is a classic dilemma in the literature on
theory of mind, with arguments in favor of each side of the dependence relation.
On the one hand, Bloom (2000:2) argues persuasively that it is impossible to
explain how children learn the meanings of a word without understanding of
certain non-linguistic mental capacities, including how children think about the
minds of others. There is, indeed, strong evidence that children understand much
about adults' (visual) attention and communicative intentions prior to 18 months
(Baldwin 1991, 1993; Tomasello 1999, 2003), and that such understanding
appears pivotal for word learning. An often reported result is the following: a child
in his second year of life is given a novel toy A to play with, while another toy B
is placed out of view. As he is playing with toy A, the experimenter looks at toy B
and says: "It's an X". The child looks at the experimenter, follows his gaze and
discovers toy B for the first time. Importantly, the child assumes that X is the name
of toy B, not of toy A, which he was playing with when he heard X for the first time.
Such findings are problematic for purely associationist models of word learning
(e.g. Plunkett 1998).
On the other hand, there is accumulating evidence that language acquisition
itself is a determining factor for the development of certain forms of
intersubjectivity, especially those which have been linked to the understanding of (false) beliefs.
For example, deaf children who are not exposed to signed language at an early age
understand others' (false) beliefs significantly later than those with signing parents,
or hearing children (Peterson and Siegal 1995). Longitudinal correlational studies
indicate that language development predicts performance in tasks of theory of
mind, but not vice versa (de Villiers and Pyers 1997; Astington and Jenkins 1999).
Furthermore, training in sentential complement constructions (with or without
mental predicates) significantly improves performance in false belief tasks (de
Villiers and Pyers 1997; Hale and Tager-Flusberg 2003; Lohmann and Tomasello
2003) and exposure to discourse involving different perspectives independently
enhances false belief understanding (Lohmann and Tomasello 2003).

Footnote 1. For an interesting discussion of these issues, though from a predominantly
theory of mind perspective, cf. www.interdisciplines.org/coevolution.

So is it the chicken (language) or the egg (intersubjectivity) that comes first?
In order to resolve this dilemma, we first need to clear up our conceptual dusty
corners a bit more. Most importantly, it should be emphasized: Intersubjectivity
≠ theory of mind! Elsewhere (Zlatev 2007b) I address the different perspectives on
social cognition that these two concepts imply, but for present purposes it is
important to point out the following three characteristics of the approach to
intersubjectivity adopted here:

1. Intersubjectivity is not a unitary capacity: it involves understanding not only
beliefs and other proposition-like entities, but other less explicit forms of
consciousness: emotions, attentional foci and intentions (cf. Tomasello,
Carpenter, Call, Behne and Moll 2005).
2. Intersubjectivity develops in a stage-like manner (in ontogeny and phylogeny),
with lower stages serving as prerequisites for higher ones (e.g. the
relationship between empathy and cognitive empathy, Preston and de Waal
2002).
3. Intersubjectivity is bodily-based: understanding others involves identifying
with them on a direct bodily level (Merleau-Ponty 1962; Gallagher 2005,
2007; Gallagher and Hutto this volume; Hobson and Hobson this volume),
with recent progress in understanding the neural underpinnings of this
capacity (Gallese, Keysers and Rizzolatti 2004; Arbib 2005; Barresi and Moore
this volume).

Footnote 2. See also Sinha and Rodríguez (this volume) and Hutto (this volume) for
critiques of overly mentalist interpretations of intersubjectivity.

These three characteristics allow linking intersubjectivity quite naturally to bodily
mimesis (Donald 1991, 2001; Zlatev 2005, 2007a, 2007b), in a way first suggested
by Zlatev, Persson and Gärdenfors (2005a). In this chapter, I will elaborate on
this linkage, showing how five levels of the mimesis hierarchy, defined in the
following section, correspond to different levels of intersubjectivity. By looking at
recent evidence from primatology, and to a lesser degree child development, I will
argue that the first two of these levels/stages of intersubjectivity are (to a
considerable extent) common to human beings and great apes, and are therefore quite
clearly pre/non-linguistic. The last two are specific to us as a species, but they are
also clearly linked to language, and thus post-mimetic. The most theoretically
interesting level is the one in between, which appears to be both pre-linguistic
and specific to human beings (in its full form). This is our capacity for triadic
mimesis, as evidenced in e.g. declarative pointing and iconic gesturing (cf. Zlatev,
Persson and Gärdenfors 2005b). I will argue that this social-cognitive capacity is
the cradle for another cognitive capacity which appears to be uniquely human:
third-order mentality (e.g. to see that you see that I see), which in turn is central
to our ability to share (semantic) knowledge (cf. Itkonen, this volume).
Thus, part of the story to be presented is an egg-based solution to the dilem-
ma: intersubjectivity grounds language, which then propels the rocket to higher
levels. However, if we inquire about the evolutionary origins of triadic mimesis, it
appears likely that it is gestural communication itself that provided the evolution-
ary niche for its selection. This brings back focus to the precursors of language as
a causal factor in the development of intersubjectivity. Thus, the story here told is
one of co-evolution.

2. Bodily mimesis and the mimesis hierarchy

In his influential theory of human evolution, Donald (1991) proposed that a form
of cognition crucially based on mimesis, and a corresponding culture based on
mimetic skills such as tool use, imitation, ritual dance and gestural
communication, mediated between the episodic cognition of the common ape-human
ancestor and the emergence of language as a dominant mode of human
communication (see also Hutto, this volume). Mimetic representations are, according
to Donald, "conscious, self-initiated, representational acts that are intentional but
not linguistic" (Donald 1991:168). This rather broad definition includes a number
of different skills such as imitation, the re-enactment of actions in imagination
(and hence planning and rehearsal), and the use of iconic and deictic gestures
for intentional communication. Others have suggested a similar mimetic
stage in ontogeny, but have proposed quite different interpretations of its scope
(Nelson 1996; Zlatev 2002, 2003), making it clear that the concept of mimesis
requires a more precise definition. Building on Donald's work, but taking into
account some more recent evidence in social neuroscience (see Barresi and Moore,
this volume; Zlatev 2007b, in press) and evidence on the mimetic capacities of
non-human primates (Zlatev et al. 2005a), I have in a number of recent
publications (Zlatev 2005, 2007a, 2007b) proposed the concept of bodily mimesis, which
can be defined as follows:
Def: A particular bodily act of cognition or communication is an act of bodily
mimesis if and only if:
a. It involves a cross-modal mapping between exteroception (i.e. perception
of the environment, normally dominated by vision) and proprioception
(perception of one's own body, normally through kinesthetic sense);
b. It is under conscious control and corresponds either iconically or
indexically to some action, object or event, while at the same time being
differentiated from it by the subject;
c. The subject intends the act to stand for some action, object or event for an
addressee (and for the addressee to recognize this intention);
d. Without the act being conventional-normative, and
e. Without the act dividing (semi)compositionally into meaningful sub-acts
that systematically relate to each other and other similar acts.

This definition allows us to clarify the relationship between bodily mimesis and
a number of related phenomena, on the basis of an evolutionary and
developmental model referred to as the mimesis hierarchy. The model is
unashamedly progressivist, and defines each successive stage through the attainment
of a new semiotic capacity: (b), (c) and the positive versions of (d) and (e), i.e.
with conventionality-normativity (d-poss) and with compositionality (e-poss).
At the same time, it is not a classical stage model in the spirit of Piaget, where
each consecutive stage brings with it a total reorganization, but a layered
model (Stern 1985) where earlier capacities continue to co-exist with newer ones,
which may subsume but not abolish them. The model (in its current version)
distinguishes between 5 stages/levels, determined on the basis of the definition
of bodily mimesis given above:

1. Proto-mimesis: only (a), e.g. neonatal mirroring, contagion, mutual gaze (cf.
primary intersubjectivity, Trevarthen 1979)
2. Dyadic mimesis: only (a) and (b), e.g. action imitation, shared attention,
mirror self-recognition
3. Triadic mimesis: only (a), (b) and (c), e.g. declarative pointing, iconic gestures,
full joint attention (cf. pantomime, Arbib 2005)
4. Post-mimesis 1: (a), (b), (c) and (d-poss) (cf. protolanguage, Bickerton 2003;
protosign, Arbib 2005)
5. Post-mimesis 2: (a), (b), (c), (d-poss) and (e-poss), e.g. spoken/signed language
(cf. symbolic reference, Deacon 1997)
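
The cumulative logic of the five levels can be made explicit with a small sketch. It is only an illustration of how the levels are built from the conditions of the definition: the names LEVELS, classify and is_layered are hypothetical labels introduced here, and the strings "d-poss" and "e-poss" simply reuse the chapter's abbreviations for the positive versions of (d) and (e).

```python
# Illustrative sketch only. Each level of the mimesis hierarchy is represented
# by the set of conditions from the definition that hold at that level.
# LEVELS, classify and is_layered are hypothetical names, not the author's.
LEVELS = [
    ("Proto-mimesis", frozenset({"a"})),
    ("Dyadic mimesis", frozenset({"a", "b"})),
    ("Triadic mimesis", frozenset({"a", "b", "c"})),
    ("Post-mimesis 1", frozenset({"a", "b", "c", "d-poss"})),
    ("Post-mimesis 2", frozenset({"a", "b", "c", "d-poss", "e-poss"})),
]


def classify(conditions):
    """Return (level number, level name) if the conditions match a level exactly.

    Returns None for combinations that the hierarchy does not define,
    for example (b) without (a).
    """
    satisfied = frozenset(conditions)
    for number, (name, required) in enumerate(LEVELS, start=1):
        if satisfied == required:
            return number, name
    return None


def is_layered():
    """Check that every level strictly contains the conditions of the one below."""
    return all(
        LEVELS[i][1] < LEVELS[i + 1][1] for i in range(len(LEVELS) - 1)
    )
```

For instance, classify({"a", "b", "c"}) yields level 3, triadic mimesis, while a combination such as {"b"} alone is not a level of the hierarchy. The subset check in is_layered mirrors the layered character of the model: each level adds one capacity and keeps the earlier ones.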
The first stage, proto-mimesis, is made possible by a cross-modal mapping
between one's felt bodily actions and the observed actions of others. This serves
as a basis for bodily mimesis and intersubjectivity, but only a basis, since it is
neither under (full) conscious control, nor representational. It can nevertheless
account for certain forms of social cognition such as emotional and behavioral
contagion and neonatal mirroring. It is in many ways similar to the notion of
primary intersubjectivity and interaffectivity in the developmental literature
(Trevarthen 1979; Stern 1985).
For level 2, dyadic mimesis, a mimetic act needs to be volitional and
representational, as in Donald's original characterization of mimesis, given earlier. As
condition (b) of the definition states, the notion of representation is understood in
line with Piaget's (1945) criterion of differentiation between signifier and
signified from the subject's point of view (cf. Sonesson 2007), adding the requirement
that the signifier is a bodily act. Piaget's example of an infant opening and closing
her mouth to model the opening and closing of a matchbox would be an example
of an iconic correspondence. Children's acts of pointing for themselves in order to
help guide their attention (Bates, Camaioni and Volterra 1975) would qualify as
indexical mimetic acts.
Level 3 is brought about by adding condition (c), which introduces the
necessary triadic element in order to make bodily mimesis communicative: the
representation or sign is intended to be recognized as such by an addressee, along
with the communicative intention itself. This introduces a Gricean element of
intentional communication (Grice 1957), involving intentional attitudes but not
propositional ones (cf. Hutto, this volume). An example of an iconic sign that
fulfills all three conditions is the miming of eating by pretending to move a spoon to
one's mouth (made behind a glass door) in order to communicate to a colleague a
desire to go for lunch. An indexical mimetic sign would be, for example, a
paradigmatic form of declarative pointing (Brinck 2003).

Footnote 3. See Barresi and Moore (this volume) for a detailed neuroscientific model of how
first-person and third-person information appears to be matched in the posterior parietal cortex.
Footnote 4. The iconic, but not the indexical, aspects of triadic mimesis are similar to the notion
of pantomime, constituting Stage 5a in the somewhat similar evolutionary model proposed by
Arbib (2003, 2005).

Condition (d-poss) distinguishes triadic mimesis from post-mimesis, in which
the communicative representations are conventional (i.e. commonly known) and
normative (i.e. their application is governed by criteria of correctness, of which
the users are at least to some degree aware). This qualifies them as being symbolic,
though not in the sense of Deacon (1997), who insists above all on the property
mentioned in (e), the presence of systematic semantic and grammatical relations
between the symbols, as a definitional criterion for symbolic reference. In
previous work (Zlatev 2003, 2005) I combined these criteria and did not distinguish
between the corresponding two levels of post-mimesis. While it is possible that
conventionality and systematicity always come in tandem (in both evolution and
ontogeny), I now believe (in agreement with e.g. Sonesson 2007) that we
should not make this an inherent characteristic of the definition of symbolicity.
It is at least possible that the one-word stage in childhood, and the linguistic
skills of language-taught apes such as Kanzi (Savage-Rumbaugh and Lewin 1994),
qualify as a protolanguage, as suggested by Bickerton (2003), i.e. as an
inventory of symbols, but with very little knowledge of their interrelations. The more
holophrastic version of this hypothesis, according to which the first
expressions are/were more like formulaic sentences such as "How are you?" than words
(Zlatev 1997; Wray 2000), has been referred to as protosign (Arbib 2003, 2005).
The two possibilities are not exclusive: it seems that some early expressions are
more wordlike while others are more formulaic, and it is well known in the
child language literature that children vary in their preference for the two
strategies (Nelson 1985).
It is the pervasive grammatical and semantic systematicity of all human
languages, which has traditionally attracted most of the attention of philosophers
and linguists, that brings about the final level: Post-mimesis 2. However, as the
model implies, this level would be impossible if it did not, at least in part, rest on
the earlier levels. Nelson and Shaw's (2002) poignant definition of language as
a "socially shared symbolic system" implies this as well: language may indeed be
called a system, but if it were not for socially shared symbolism, it would not
be language.

Footnote 5. One case in which language acquisition relies (much) less on bodily mimesis is autism.
Predictably, however, even the language of high-functioning persons with autism displays
semantic and pragmatic abnormalities (Menyuk and Quill 1985; Zlatev forthcoming).

3. The mimesis hierarchy and levels of intersubjectivity

What gives us ground to apply the mimesis hierarchy to the development and
evolution of intersubjectivity? As suggested in the introduction: above all, the
approach arising from the combination of phenomenology and neuroscience that
links social cognition and embodiment, perhaps best summarized by Gallagher
(2005, 2007). It was originally Husserl who, over one hundred years ago, first
criticized the intellectualist perspective on the understanding of other minds
as a matter of inference or analogy from the knowledge of one's own mind. As
summarized by Gallagher (2007:286): "For Husserl, understanding another
person is not a matter of intellectual inference, but a matter of sensory
activations that are unified in or by the animate organism or lived body that is
perceiving another animate organism." This perspective on intersubjectivity was
further elaborated by Merleau-Ponty (1962), in particular through the notion of
the corporeal schema, which serves as a normal means of knowing other
bodies (Merleau-Ponty 2003:218). This was originally stated long before
neuroscience corroborated the role of the body in understanding others through various
mirror neuron systems (Rizzolatti et al. 1996; Gallese et al. 2004; Arbib 2005;
Zlatev in press), which link the perception of another person and the subject's
own proprioception and action. Gallagher's (2005, 2007) major contribution to
this discussion has been to elaborate the distinction between body schema and
body image, where the first is pre-conscious and serves as a precondition and
backdrop for intentionality, while the latter is a (sometimes conscious) system
of perceptions, attitudes, beliefs and dispositions pertaining to one's own body
(Gallagher 2005:37). While one may still question this distinction, and require
a finer division within the latter (the perceptions of the body are quite different
from the beliefs and attitudes towards it), I will in general accept it, and rely on
it in the discussion of the first two levels of intersubjectivity.
Within this embodied and action-oriented perspective on the understanding
of others as well as the self, bodily mimesis becomes clearly relevant. A central
question in relating the mimesis hierarchy and intersubjectivity is whether the
respective level serves as a precondition and a causal factor for the development
of corresponding skills of intersubjectivity. Or is it rather that independently
reached insights into the minds of others make increasingly complex forms of
bodily mimesis possible? The problem for the latter scenario is that one would
need to account for the emergence of theory of mind skills (or modules)
independently, which remains a problem for both evolutionary and child
psychology (cf. Hutto 2008, this volume). Furthermore, from a Vygotskyan
perspective stating the priority of the inter-personal to the intra-personal in cultural
development (see Zlatev et al. this volume), one would expect bodily mimesis
to bootstrap (in ontogeny) and provide selection pressures for (in phylogeny)
the development of more refined skills in mind reading. At the same time, as
with the chicken-and-egg relationship between intersubjectivity and language,
we will see that the causality is not unidirectional, and the relationship between
mimesis and intersubjectivity may be more pertinently described in terms of
co-evolution.
3.1 Proto-mimesis

According to the definition of bodily mimesis we can regard some of the most
basic forms of intersubjectivity as proto-mimetic to the extent that they consist
of interpersonal interactions that involve cross-modal mapping between
proprioception and the (visual) perception of others, but lack the characteristics of volition
and differentiation. In terms of the distinction between body schema and body
image (Gallagher 2005) mentioned above, proto-mimesis can be said to involve
(above all) the body schema, which is largely innate (in the sense of being present
at birth) and pre-conscious, rather than the body image, which is gradually
constructed with experience and accessible to consciousness. Proto-mimetic forms of
intersubjectivity do not require a conceptual differentiation between self and
other, which is necessary for establishing a correspondence relation between them.
This is not to say that the young infant lives in a completely undifferentiated world
in which there is no awareness of self whatsoever, as pointed out by Stern (1985).
Nevertheless, even a modern developmental psychologist who emphasizes the
role of the awareness of others' attention and the presence of affective
self-consciousness in the first months of life points out that "older infants reveal a greater
focus on the self and the younger ones reveal a more immersed, less detached focus
on the other" (Reddy 2003:401). This more immersed, less detached quality
of the earliest forms of intersubjectivity motivates the classification of them as
proto-mimetic.
Can this analysis be extended to the (early) interpersonal relations among
apes? Neonatal mirroring has also been observed, and appears to be common,
in chimpanzees (Myowa-Yamakoshi 2001; Myowa-Yamakoshi et al. 2004). Since
this is typically attributed to a form of identification with the person imitated
when children are concerned (Meltzoff and Gopnik 1993; Gallagher 2005), it can
be viewed as evidence that at least chimpanzees, and possibly other apes and
non-human primates, possess the capacity for basic proto-mimetic intersubjectivity.
The function of such mirroring can be related to what is possibly the most basic
form of intersubjectivity, both ontogenetically and phylogenetically: the ability
to share emotions, or empathy (Einfühlung). As a proto-mimetic,
non-representational capacity, this is attested in early infancy and sometimes referred to as
interaffectivity (Stern 1985). The well-known experiments described by Trevarthen
(1992) show that parent-infant interactions in the first few months take the form
of a reciprocal rhythmic dance, and that frustration follows if this attunement
is disrupted (notice the musical metaphors). The suggestion is that emotions
such as joy and suffering are perceived directly, possibly involving mirror-neuron
structures similar to those involved in action recognition and imitation, rather
than through inferences to underlying states (Gallese et al. 2004).
Preston and de Waal (2002) have argued persuasively that as a basic biological
mechanism involving the linkage of perception and action, empathy is available
to most if not all mammal species. Defining empathy as "any process where the
attended perception of the object's state generates a state in the subject that is
more applicable to the object's state or situation than to the subject's own prior
state or situation" (ibid:4), they see a clear evolutionary motivation for its emergence in
the ability to recognize and understand the behavior of con-specifics. It is an open
question how much of such matching between the visually perceived body of the
other and the proprioceptively perceived body of oneself is domain-general and
thus can be expected to be general across species and how much is specialized in
the form of species-specific communicative signals such as facial expressions. It is
characteristic that signals such as the famous play-face expression of great apes
(an evolutionary precursor to the human smile) typically carry emotional rather
than referential meaning.
A second mode of intersubjectivity that appears to be of a proto-mimetic
nature, at least in human children, is attention. Reddy (2000, 2003, 2005) has
argued that prior to awareness of the other's attention to an external object and
much prior to joint attention appearing around 12 months (see below), children
show "an awareness of others as attending beings, as well as an awareness of self
as an object of others' attention" (Reddy 2003:357), displayed in phenomena such
as eye-contact, intense smiling, coyness, calling vocalizations, showing-off etc.
Since awareness of self (at this stage) is largely proprioceptive, while awareness of
the attention of others (in seeing children) is mostly based on vision, this satisfies
the first criterion for bodily mimesis. Reddy's claim is that such dyadic (though
not dyadic-mimetic) interactions underpin later developments in intersubjectivity.
Evidence for this is the observation that autistic children show difficulties even
with such simple interpersonal engagements, suggesting that "whatever is going
on in dyadic attentional engagements may indeed be critical, not just as a source
of information and experience about attentional behavior, or as a scaffold for the
subsequent development of awareness of attention, but also as evidence of
awareness of attention" (Reddy 2005:95).
Until recently it has not been clear whether such awareness of another's
attention exists in the interaction between infant apes and their mothers, but in a
recent study, Bard et al. (2005) report that the rates of mutual gaze between infants
and their mothers are virtually the same in 3-month-old human children and
3-month-old chimpanzees: 18–20 and 17 times per hour, respectively (though
humans tend to engage in much longer bouts of mutual gaze). Furthermore, the
authors noticed a cultural difference between the apes at the Primate Research
Institute, Japan and those at Yerkes National Primate Research Center, USA, with
the ones in Japan engaging in mutual gaze at much higher rates (22 vs. 12 times per
hour), while the ape mothers in the USA cradled their infants more often (71% vs.
40% of the total time). Intriguingly, a similar inverse correlation between visual and
tactile contact has been observed in human societies, with traditional cultures
favoring touch and Western ones gaze: "With reduced physical contact found in
Western societies, mutual engagement shifts to the visual system, arguably a more
evolutionarily derived pattern" (Bard et al. 2005:621). Since apes do not seem to
differ from humans in the capacity to perform this shift, and possibly even transmit
it culturally to their descendants, this supports the conclusions from the studies on
neonatal imitation that the difference between the species on the proto-mimetic
level is not of a qualitative nature.

3.2 Dyadic mimesis

In the previous discussion, I proposed that proto-mimetic intersubjectivity,
without a clear differentiation between self and other, is based on the (mostly)
unattended body schema and similarly unattended mechanisms for body copying.
On the other hand, dyadic mimetic intersubjectivity is based on the conscious
control of the movements of one's body and attention to their correspondence to
the body of another, whereby one can imagine what the other experiences on the
basis of one's own experiences in similar circumstances. In the terminology of
Gallagher (2005), I propose that the role that bodily mimesis proper plays for the
development of intersubjectivity implies not the body schema but the body image.
While the body schema and the body image normally interact, Gallagher (2005)
shows how in certain pathologies they can be disassociated. Below, I will describe
how understanding others' emotions, attention and intentions can be seen as
intimately related to dyadic mimesis.
6. In a recently completed study, however, we found remarkable differences in both
rates and durations of mutual gaze between ape and human mother-infant dyads, with
five dyads per group, and somewhat older ages for the infants than those in the
study of Bard et al. (2005): 5–8 months. The rates for the apes (three chimpanzees,
one bonobo and one gorilla) were on average 2/hour, while those for the five human
infants (observed in Lund, Sweden) were as many as 35/hour on average (Zlatev 2008).
There are many methodological issues that need to be addressed, but these findings
suggest the need for caution before concluding that there are few differences
between ape and human proto-mimesis (and thus, primary intersubjectivity).

Whereas (simple) empathy is proto-mimetic, what Preston and de Waal
(2002) call cognitive empathy requires a differentiation between subject and object
where "the subject is thought to use perspective-taking processes to imagine or
project into the place of the object" (ibid:18). Evidence that this is not an isolated
phenomenon, but shows a more advanced level of intersubjectivity, is the fact that
cognitive empathy "appears to emerge developmentally and phylogenetically with
other 'markers of mind' [...] including perspective taking [...], mirror
self-recognition [...], deception, and tool-use" (ibid:18). Research concerning
cognitive empathy in apes has focused on their consolation behavior, which is
well attested in at least chimpanzees, but has not been found in monkey species
(de Waal and Aureli 1996) or any other mammalian species. Consolation is
cognitively more complex than simple empathy since the consoling individual not
only feels that somebody else experiences a particular negative emotion, but also
intends to help relieve this, implying an ability to imagine the more positive
emotional state.
This supports the interpretation that cognitive empathy involves a more
sophisticated representational capacity than what is necessary for simple empathy.
Since dyadic mimesis involves both the ability to identify with the other, and at
the same time to differentiate between self and other, a natural hypothesis is that
it is dyadic mimesis, implicated in e.g. imitation (Zlatev et al. 2005a), that scaffolds
the development of such representational capacity (cf. Hutto this volume, for a
similar proposal).
Since dyadic mimesis allows one to place oneself in the shoes of others, it also
gives the opportunity to understand what someone else is attending to. Such
second-order attention is well attested among great apes (Hare et al. 2000). When
two individuals become aware that both are attending to the same object, what
results is shared attention. This goes a good way towards the construction of a
consensual reality, but does not quite reach it. To make a given object X fully
intersubjective between you and me, I would need not only to see that you see X
(second-order attention, see Figure 1a), but also to see that you see that I see X
(third-order attention, see Figure 1b), and vice versa, which is one interpretation
of what it means to engage in joint attention. Full joint attention thus involves
third-order mentality and possibly because of this (see below) appears to be
beyond the cognitive capacities of apes (Tomasello 1999). It also goes beyond dyadic
mimesis, so I will leave this capacity for the time being, but return to it in the
following subsection.
7. The terms "shared attention" and "joint attention" have, unfortunately, not been
standardized in the literature and are often used interchangeably. One exception is
Emery (2000), who however uses the terms in nearly the reverse sense to that adopted
here, with "shared attention" being the more high-order phenomenon.

Figure 1a. Shared attention: Second-order attention: I see that you see X (and vice
versa, though only the second-order arrow for one of the participants is shown)

Figure 1b. Joint attention: Third-order attention: I see that you see that I see X (and
vice versa, though only the third-order arrow for one of the participants is shown)

Concerning the understanding of another's intentions, it was the received view
until the end of the last century that apes cannot do this (e.g. Tomasello 1999).
Recently, however, there has been mounting evidence that (at least) "[c]himpanzees
understand psychological states – the question is which ones and to what extent",
which is the title of Tomasello, Call and Hare (2003). A wealth of experiments
supports this claim. For example, a subordinate and a dominant chimpanzee
compete for food placed on the subordinate's side of two barriers, so that
in some cases only the subordinate, but not the dominant, can see the food and
monitor the visual access of her competitor. The results showed that the
subordinates preferentially retrieved the food that dominants could not see (and
had not seen in the past), implying that chimpanzees are aware of the perceptual
states of con-specifics. Together with awareness of the competitor's goal (i.e. to
obtain the food) this allows predicting the other's actions and acting accordingly. A
dyadic-mimetic interpretation of these facts is that such an understanding can be
obtained through the projection of one's own perceptual and motivational state
onto the other ("I would get the food if I were in her place!") at the same time as
distinguishing between the self and the other, and does not require explicit
(propositional) reasoning or inference. There is mounting evidence that even
non-enculturated and language-naïve apes are capable of such mimetic enactments.
At the same time, it has not been shown that (non-enculturated) apes are
capable of understanding another's mental states about their own mental states,
which would involve, as with joint attention, third-order mentality. In natural
settings, some cases of deception may be interpreted as involving third-order
mentality, but do not require this. For example, de Waal (1982) describes
the chimpanzee Yeroen who has had a fight with Nikkie, and continues to fake a
limp only when in the presence of Nikkie, apparently in an act of wishing to
provoke Nikkie's empathy: "I wish to make you see that I hurt". The more
parsimonious explanation, however, is that Yeroen has learned from previous
experience that he is not bothered by Nikkie when he is hurt: "He may have learned
from incidents in the past in which he is seriously wounded that his rival was less
hard on him during periods when he was (of necessity) limping" (ibid:35–36), so he
mimes the appropriate behavior. Here we have a clear correspondence between
dyadic mimesis and second-order intersubjectivity. Notice that Yeroen's limping
was not a form of intentional communication: obviously he did not wish Nikkie
to understand that he was faking a limp. If he did, that would be a case of
triadic mimesis.
The conclusion that can be drawn from these various examples is that wild
apes, as well as those who are exposed to different degrees of human contact
(captive, nursery-raised and laboratory-trained apes) but are not raised in
"something like a human cultural environment" and thus enculturated (Call and
Tomasello 1996:372), can indeed understand second-order mentality. However,
such apes do not seem able to master third-order mentality (in the domains of
emotion, attention or intention), in which their own mental state needs to be
either intentionally communicated (in collaboration) or hidden (in competition).
This corresponds well with the capacity of apes for dyadic mimesis, but their
relative difficulty with triadic mimesis, as argued below.

8. While it has still not been conclusively shown that apes are capable of such
mental projection, and it is conceivable that the evidence can be explained in a
more behaviouristic manner involving learning generalizations over other
individuals' behaviour in relation to food, the mental explanation is (a) ultimately
more parsimonious (cf. Tomasello and Call 2006) and (b) consistent with the
performance of human children in roughly comparable stages of development.

3.3 Triadic mimesis

In the case of triadic mimesis, there is not only an understanding of the
representational relation between one's bodily motion and the object, action or
event it corresponds to, but an understanding that such a representational relation
can be used communicatively. This requires some understanding that the
representation (sign) has the same meaning for the addressee as for the sender. This
involves at least second-order mentality, which was shown above to correlate with
dyadic mimesis. But "having the same meaning" is a reflexive notion and implies at
least some degree of third-order mentality (see Itkonen this volume). Consider the
simple example of what knowing the meaning of the word "cat" implies:
(1) I know that "cat" means 'a small furry animal that meows'.
(2) I expect you know that "cat" means 'a small furry animal that meows'.
(3) I expect that you know that I know that "cat" means 'a small furry animal that
meows'.

While it is possible for intentional communication to begin without a full
realization of (3), it is practically inevitable that discursive experience (including
failures in communication) will promote the development of third-order mentality.
Therefore it is possible that triadic mimesis was one of the major driving forces
behind the development of intersubjectivity in hominid evolution. Unlike competing
hypotheses related to Machiavellian intelligence (Byrne and Whiten 1988),
this puts the focus on cooperation rather than competition (see also Brinck and
Gärdenfors 2003; Tomasello et al. 2005). A prediction from this hypothesis is that
enculturated apes (and these have all been taught at least some degree of sign
use) will develop higher-level skills of intersubjectivity. There is support for this
prediction. In summarizing some 200 studies of the role of human influence, Call
and Tomasello (1996) conclude that "[t]he sociocognitive domains in which humans
seem to have the highest effect on apes are intentional communication and
social learning" (ibid:391).
As stated earlier, wild apes do not seem to be capable of engaging in full,
third-order joint attention. Furthermore, as Tomasello (1999:21) points out, wild
apes do not (a) point to objects; (b) hold up objects to show them to others; (c) take
someone along to a place to show them something; (d) actively offer something to
someone; and (e) intentionally teach other individuals new behaviors. Tomasello's
original account of these absences was based on the claim that apes are unable
to understand another's intentions. Given the more recent evidence, this
explanation is no longer tenable, and indeed Tomasello et al. (2005) suggest instead
that the crucial difference between apes and humans involves the motivation to
participate in joint collaborative engagements, and the lack of this motivation
prevents apes from constructing dialogical cognitive representations.
The explanation I propose is similar but more specific: non-enculturated apes
fail to master triadic mimesis, and related to that, the ability to engage in
third-order mentality. The motivational difference between apes and humans appealed
to by Tomasello et al. (2005) cannot be the full explanation since enculturated apes
such as Koko, Kanzi and Chantek manage at least (a), and apparently the
communicative skills (b–e) listed above as well (Miles 1999), even if in restricted
forms. This seems to imply that the human cultural environments of the enculturated
apes have taught them the basics of intentional, sign-mediated communication, and
thereby (the roots of) third-order mentality.
How this could occur can be shown again with respect to joint attention,
which can be seen as emerging from second-order attention combined with the
recognition of another's intention concerning my attention: "I see that you see
X" (second-order attention, Figure 1a), and furthermore "I realize that you want
me to look at X" (Brinck 2001). In other words, joint attention can be brought
forth by understanding a simple form of communicative intention, combined
with already existing second-order attention. Thus, communicating the intention
to jointly attend may be said to involve the simplest kind of triadic mimesis:
whatever kind of sign is used to convey that intention (see the example in
the next paragraph) can be said to stand for that intention for both sender and
interpreter.
Without enculturation, experiments indicate that apes do not under-
stand communicative intentions. A rather typical example is an experiment by
Tomasello, Call and Gluckman (1997), where the authors in different ways in-
dicated for both chimpanzees and two- to three-year-old children which out of
three containers contained a reward: by pointing to the correct container; by
placing a marker on top of the correct container; and by holding up a replica of the
correct container. Tomasello (1999:102) summarizes the results of the experiment
as follows:

Children already knew about pointing, but they did not know about using markers
and replicas as communicative signs. They nevertheless used these novel signs
very effectively to find the reward. In contrast, no ape was able to do this for any
of the communicative signs that they did not know before the experiment. One
explanation of these results is that the apes were not able to understand that the
human beings had intentions toward their own attentional states. The apes thus
treated the communicative attempts of the human as discriminative cues on par
with all other types of discriminative cues that have to be laboriously learned
over repeated experiences. The children, in contrast, treated each communicative
attempt as an expression of the adult's intention to direct their attention in ways
relevant to the current situation. [my emphasis]

9. Consisting of what Wittgenstein (1953) called the "forms of life", which provide
the necessary context for the emergence and functioning of intentional
communication and language.

In other words, while the children clearly understood the communicative
intentions of the experimenter, the apes did not. This interpretation is supported by
a similar experiment designed to test false beliefs (Call and Tomasello 1999), in
which the enculturated and language-taught orangutan Chantek clearly performed
differently from all the other apes in understanding a human communicator's
signals. Even though this was not the goal of the experiment, and Chantek did not
score better than the other apes in the false beliefs task, his much better
performance could be explained by considering that he understood the signals as
communicative signs (in this case indexes), rather than as discriminative cues.
Finally, we can consider the case of captive apes living in a zoo and thus
involved with at least some degree of interaction with human culture. In their study
of spontaneous gestural communication in a group of gorillas in the San Francisco
Zoo, Tanner and Byrne (1996, 1999) found a wealth of gestures used by several
members of the group, in particular by the adult male Kubie, some of which
seemed to be used in a communicative way so that:

[w]hether the receiving partner was a human or another ape, the signaling ape
made sure that visual contact was established (except for tactile close gestures),
and seemed to understand both the other's potential actions and what the partner
might, in turn, understand from his (the signaler's) performance of gestures.
(Tanner and Byrne 1999:231, [my emphasis])

We can conclude that triadic mimetic intersubjectivity, involving understanding
not only of con-specifics' intentions, but their communicative intentions, and
consequently a degree of third-order mentality, appears to be difficult but not
completely beyond the cognitive potential of apes. To realize this potential, however,
apes need an environment that is rich in opportunities for developing triadic
mimesis, i.e. a particular form of enculturation. Thus triadic mimesis may be said to
be within apes' Zone of Proximal Development (ZPD), albeit in its periphery.10 If
enculturation provides the ZPD for present-day apes, it is reasonable to suppose
that it did the same for some particularly social group of hominids through a form
of self-domestication, giving rise to a bootstrapping spiral of sign use and
intersubjectivity. In the terms of Donald (2001), triadic mimesis must have been
within the common ancestor's zone of proximal evolution, and is therefore a likely
candidate for constituting the missing link in human cognitive evolution.

10. ZPD is the notion introduced by Vygotsky (1978) to refer to skills that children
could acquire with the help of adults, but not alone.

3.4 Post-mimesis

What differentiates post-mimetic, or symbolic, cognition from mimesis is the use
of fully conventional signs, interrelated within a system (Deacon 1997; Zlatev 2003).
The most obvious example of post-mimesis, involving all the previous features but
also symbolicity, is a conventional, institutionalized signed language such as ASL
(Stokoe 1960) or Swedish Sign Language (Ahlgren 2003). What is the relation
between acquiring such a system and intersubjectivity? Prior to addressing this
question, let us make the distinction between Post-mimesis1 (protolanguage) and
Post-mimesis2 (language), pointed out in Section 2, where only the latter has (ex-
tensive) systematicity. A case can be made for apes acquiring the first but hardly
the second.

3.4.1 Post-mimesis1: Protolanguage


Evidence from four of the most successful projects involving the teaching of
language to great apes, namely the chimpanzee (Pan troglodytes) Washoe (Fouts 1972,
1973), the gorillas Koko and Michael (Patterson 1978, 1980), the bonobos
(Pan paniscus) Kanzi and Panbanisha (Savage-Rumbaugh and Lewin 1994;
Savage-Rumbaugh et al. 1998) and the orangutan Chantek (Miles 1990), has shown
that some of the characteristics of language are within the grasp of our nearest
non-human relatives. As with children, a precondition for the success of these
projects has been a cultural environment rich with intersubjectivity and a variety
of activities to stimulate communication (Miles 1990). The ape language literature
contains rather convincing evidence that apes can:

- comprehend the referential (representational) function of spoken words, ASL
signs and visual lexigrams, and combinations of these;
- use the sign-tokens in the absence of their referents, i.e. displacement
(Hockett 1960);
- acquire a considerable vocabulary of words/signs, according to some
measurements exceeding 600 signs, but even according to the most conservative
criteria no less than 140 signs;
- understand novel combinations of spoken or signed words;
- produce novel combinations of signs.

The following have also been reported, but are considerably less well
documented:

- apes can regard the acquired signs as conventional-normative (consensual),
to the point of correcting their teachers if the latter do not use these
appropriately;
- apes can use language for a number of different functions (speech acts),
including labeling, answering, expressing emotion, arguing and insulting;
- apes can use language not only for communication, but for thinking (private
speech).

It is therefore possible to agree with Miles (1999:204) that all great apes have "the
intelligence for a rudimentary, referential, generalizable, imitative, displaceable,
symbol system", but with an emphasis on "rudimentary". It has, for example, not
yet been clearly demonstrated whether the spoken or signed utterances of apes
conform to consistent principles of grammatical organization. Greenfield and
Savage-Rumbaugh (1990, 1991) describe two ordering rules in the two-symbol
combinations of Kanzi, but the data show at best a weak statistical correlation
between preferred order and semantic (communicative) function. The most
plausible conclusion to the prolonged ape language debate therefore seems to be a
tie between the extreme proponents and opponents: apes such as Koko and Kanzi
can be said to have acquired a form of protolanguage, which differs both from
mimesis, due to conventionality, and from full language, due to a lack of
systematicity which, for its part, is necessary for the production of narratives.
But can it even be truthfully said that Koko and Kanzi have acquired
semantic conventions? A convention (Lewis 1969; Clark 1996) or a norm (Itkonen
1978) exists as a form of common knowledge among the members of the group
that share the convention. A common explication of "common knowledge" is
that it consists of third-order knowledge: "I know that you know that I know X"
(Itkonen 1978, this volume). If this knowledge must be in explicit propositional
form, then it is unlikely that we can attribute it to the language-taught apes,
making it dubious whether we could even call their communicative acquisitions
protolanguage.
However, is it even warranted to make this requirement when it comes to
children? While I earlier argued that triadic mimesis is connected to third-order
mentality, it is not necessary that the understanding on all three orders is explicit
enough to be a matter of belief, i.e. a propositional representation that is actively
held to be true. Consider again the three orders of knowing the conventional
meaning of the word "cat", given as (3) above and repeated for convenience:
(3) I expect that you know that I know that "cat" means 'a small furry animal that
meows'.

The highest order thought, my expectation that you know that I know, is not a
belief for the 4-year-old child, since it is taken for granted, without pondering on
whether it is true or not. For younger children, even the second order thought is
unlikely to be propositional, as evidenced by their inability to understand beliefs
proper, false or otherwise. Perhaps it is best to call the most basic form of shared
cultural knowledge a sharing of expectations: we both expect each of us to behave
in a certain way given certain conditions (e.g. a red light) and are, however dimly,
both aware that this expectation is mutual, and thereby binding. Given this, (3)
can be reformulated in a Wittgensteinian manner into (4):
(4) I expect that we are both using "cat" to mean 'a small furry animal that
meows'.

If we are prepared to attribute a degree of semantic knowledge to 2–3-year-old
children in terms such as (4), then I doubt whether we have good,
non-anthropocentric reasons not to do likewise with Chantek and Kanzi. Furthermore,
as mentioned earlier, Chantek's experiences with protolanguage obviously
bootstrapped his understanding of communicative intentions and third-order
mentality, even if he, like all apes so far, was not able to pass a false-belief task
(Call and Tomasello 1999).

3.4.2 Post-mimesis2: Language


Irrespective of modality (spoken, signed or written), language is characterized
by a form of combinatorialness that is unprecedented in animal communication.
Recent studies of the spontaneous emergence of Nicaraguan Sign Language (NSL)
during the past 25 years show that signed languages have their origin in (triadic)
mimesis, but quickly acquire the properties of conventionality and systematicity.
Senghas, Kita and Özyürek (2004:1791) compared the co-speech gestures of
Nicaraguan speakers of Spanish with the signing of three cohorts, or generations,
of learners of NSL and could document some aspects of this transition in detail:
The movements of the hands and body in the sign language are clearly derived
from a gestural source. Nonetheless, the analyses reveal a qualitative difference
between gesturing and signing. In gesture, manner and path were integrated by
expressing them simultaneously and holistically, the way they occur in the
motion [event] itself. Despite this analogue, holistic nature of the gesturing that
surrounded them, the first cohort of children, who started building NSL in the late
1970s, evidently introduced the possibility of dissecting out manner and path
and assembling them into a sequence of elemental units. As second and third
cohorts learned the language in the mid 1980s and 1990s, they rapidly made this
segmented, sequenced construction the preferred means of expressing motion
events. NSL thus quickly acquired the discrete, combinatorial nature that is a
hallmark of language. [my emphasis]

Given their mimetic-gestural origin, signed languages have a much greater degree
of iconicity than spoken languages, and it has been proposed that this plays a role
in their faster acquisition by (deaf) children (Brown 1977). Recent studies have
questioned this, however, since only a minority of the signs of signed languages
have transparent iconic meanings, and in a study of 22 children acquiring ASL it
was shown that of the 44 different signs produced by the children before the age
of 13 months, 36% were classified as iconic, 30% as metonymic, and 34% as
arbitrary (Bonvillan and Patterson 1999:253).11 In this study, the authors compared
the rate and pattern of acquisition of ASL by children and that of two gorillas,
Koko and Michael. Despite certain differences (the children's acquisition was,
unsurprisingly, faster), it was shown that "the similarities in early development
across the species outweigh the differences" (ibid:260). Thus, it can be concluded
that gorillas, and by inference other great apes, not only can acquire the
basics of a post-mimetic symbolic system such as ASL, but that they do this in a
similar way. Interestingly, however, the gorillas seemed to rely somewhat more on
the iconicity of the signs in comparison with the children, so that the distribution
of their first 46 signs was somewhat different from the one reported above: 42%
iconic, 32% metonymic and 26% arbitrary, while for the first 10 signs this difference
was even clearer: 60% iconic, 20% metonymic and 20% arbitrary. This suggests
that the apes relied to a greater degree on triadic mimesis than the children in
their acquisition of the sign language, which also would explain why the children
quickly progressed beyond the initial level of vocabulary acquisition to learn the
systematic character, i.e. the grammar, of the language, while the apes stagnated
on a simple, protolanguage level.
With the rapid development of grammar and vocabulary around the age of
4, most children also become capable of understanding that others have or lack
knowledge or have false beliefs (e.g., Perner 1991; Mitchell 1997), implying a
metarepresentational capacity. It appears that these two developments are closely
connected, and that acquiring a language, spoken or signed, is a major causal fac-
tor for developing a fully-fledged theory of mind. At least four different sides to
language (use) combine to promote metarepresentational capacity:

11. Metonymic signs are those that involve some degree of iconicity between the sign and the
referent, but the tie between the sign and its meaning is not readily apparent: one would be
unlikely to guess the meaning of a metonymic sign simply seeing it produced (Bonvillian and
Patterson 1999:252).

- Language is a conventional symbolic system, and as such its mastery implies
  third-order mentality, which would carry with it training in the understanding
  of others' beliefs.
- Two specific (universal) features of human languages are (a) mental predicates
  such as "think", "believe" and "know" and (b) sentential complement
  constructions such as "say that X". If one can meaningfully formulate sentences
  such as "I know that you think that X", then one should be able to think
  the corresponding thought.
- Not just the semantic/grammatical structure of language, but its use in
  discourse would promote the understanding of others as mental agents: "There
  are at least three kinds of discourse, each of which requires [children] to
  take the perspective of another person in a way that goes beyond the
  perspective-taking inherent in comprehending individual linguistic symbols and
  constructions" (Tomasello 1999:173): disagreements, repairs/explanations
  and meta-discourse.
- Closely related to the above is the narrative practice hypothesis: that with
  linguistic proficiency (usually) comes first apprenticeship and then various
  degrees of mastery in understanding and producing narratives, through
  which children become familiar with both the core structure of folk psychology
  and its norm-governed possibilities for using it in practice (Hutto 2008;
  Gallagher and Hutto, this volume).

As pointed out in the introduction, there is accumulating evidence for a strong
connection from language to the understanding of beliefs and folk psychology
in the case of children (e.g. Peterson and Siegal 1995; de Villiers and Pyers
1997; Astington and Jenkins 1999; Hale and Tager-Flusberg 2003; Lohmann and
Tomasello 2003). There is also negative evidence for apes, enculturated or not: as
mentioned, Call and Tomasello (1999) used a non-verbal false belief task with
chimpanzees and orangutans as well as with human children. While the children's
performance on verbal and nonverbal false belief tasks was highly correlated,
supporting the hypothesis of a possible causal connection, none of the apes, including
Chantek, could pass the nonverbal false belief task, even though they succeeded
in all of the control trials, indicating mastery of the general task demands. A
prediction from the present analysis would be that if Chantek, or any of the other
"language apes" that have been the subject of so much controversy, could progress
in their language development from protolanguage to (systematic and narrative)
language, they would also be able to pass false belief tasks. It is indicative that no
such evidence has so far been offered.

Table 1. The mimesis hierarchy, intersubjectivity skills and types of mentality


| Level | Intersubjectivity skills | Type of mentality |
|---|---|---|
| 1. Proto-mimesis | neonatal imitation; (simple) empathy; mutual attention | 1st order: lack of complete differentiation between self and other |
| 2. Dyadic mimesis | cognitive empathy; shared attention; understanding others' intentions (in competitive contexts) | 2nd order: understanding the other through projection (identification, but differentiation) |
| 3. Triadic mimesis | joint attention; having and understanding communicative intentions | 3rd order (attention and intentions) |
| 4. Post-mimesis1: protolanguage | semantic conventions | 3rd order (expectations) |
| 5. Post-mimesis2: language | (false) belief understanding | 3rd order and higher (beliefs) |

4. Summary and conclusions

In this chapter I have argued that there is a close connection between the five levels
of the evolutionary and developmental model referred to as the mimesis hierarchy
and corresponding skills in intersubjectivity. There is furthermore a connection
between the five levels and what we can call the type of mentality involved,
bearing in mind that mentality refers to various kinds of states and processes
of consciousness, and not only to propositional attitudes. These correlations are
summarized in Table 1.
Proto-mimesis is crucially implicated in mutual attention and the awareness
of others' feelings, through a species-general capacity for empathy that has
possibly been further developed in the ultra-social species Homo sapiens. Dyadic
mimesis leads to the ability to map between one's own body and that of others
in a more detached, differentiated manner, and in this way to understand others'
emotions (i.e. cognitive empathy), shared attention and even intentions through
a (conscious) process of projection: "what would I see/feel/wish if I were you".
Contrary to earlier claims, newer evidence and analyses show that apes
do not have much difficulty with this level and that they have the capacity for
second-order mentality.
One of the main claims of this chapter is that the crucial step in the evolution
and development of human intersubjectivity involves triadic mimesis, implying
having and understanding others' communicative intentions. I have argued that
this requires third-order mentality: "I want you to do X (e.g. share attention on
an object) by recognizing my intention that you do this" from the sender's
perspective and "I understand that you want me to do X" from the recipient's. I
also pointed out that this need not (and does not at this stage) involve beliefs and
propositions. Triadic mimesis is clearly difficult for apes to attain, especially in
natural conditions. However, through enculturation and especially through
extensive sign use, some understanding of communicative intentions seems to be
within the reach of apes' Zone of Proximal Development. Evidence for this is
the relative mastery of joint attention by enculturants, and as argued this can be
seen to originate in (dyadic mimetic) second-order attention combined with the
understanding of the other's intention that I attend.
Post-mimesis1 or protolanguage implies some understanding of semantic
conventions, which I suggested can emerge as shared expectations of common us-
age, with little if any explicit third-order (propositional) knowledge. The long-term
studies of language-taught enculturated apes suggest that this level as well, with
much persistence, is at least in part accessible to our nearest animal cousins.
Post-mimesis2, which is identical to language as we know it, has on top of
everything else the command of a conventional/normative system for communi-
cation and thought. Arguably it is first with this level that the real payoff of using
the same system for both meta-functions comes into play, giving us the cognitive
benefits of (logical) reasoning, inference, long-term planning etc. that we take
pride in as a species. It gives us, but no other creature on our planet, a
metarepresentational capacity, allowing (at least) second-order beliefs, e.g. "I think
that you know (or don't)".
In summary, bodily mimesis in its proto, dyadic and triadic forms can be
argued to be a (and possibly the) major factor in the evolution of intersubjectiv-
ity, with higher mimetic levels bringing along with them more advanced forms
of intersubjectivity such as joint attention and third-order mentality. Sign use
itself was suggested to be a driving force in the development of an understand-
ing of third-order mentality, and the performance of enculturants shows that this
achievement is within the reach of apes, albeit in special conditions. Therefore
one can conclude that (mimetic) sign use was possibly within the zone of proxi-
mal evolution (Donald 2001) of the common ape-human ancestor, considerably
more so than language, characterized by full conventionality and systematicity.
To return to the chicken-and-egg question at the beginning of this chapter,
the intersubjective skills in the first three levels of the mimesis hierarchy (see
Table 1) are indeed pre-linguistic according to the analysis offered in this chapter,
and serve as a ground for language in both evolution and ontogenetic development.
However, since they are not "theory of mind modules", but social skills
arising through face-to-face and body-to-body interactions, it would be incorrect
to say that intersubjectivity per se is a prerequisite for language. It is rather the
mimetic "first communions" (Hutto this volume), of which the various skills of
intersubjectivity are a natural part, that prepare the way for language. The
emergence of the latter marks a major transition, or even two such transitions, to
conventionality and to systematicity, and it is possible that the two are necessarily linked
(Deacon 2003). The understanding of (false) beliefs and folk psychological
reasoning are therefore post-mimetic forms of intersubjectivity, since they are based
on language, either spoken or signed. This analysis does not contradict the claims
of those who, like Bloom (2000), argue that the acquisition of language presupposes
"theory of mind" skills such as joint attention and communicative intentions.
What it does contradict is classing such skills as "theory of mind" modules or
competencies, since they are triadic mimetic phenomena that are far from being
"theoretical".
Finally, I should point out that dividing intersubjectivity along different
evolutionary/developmental levels, as suggested here, could help resolve other
controversies as well, such as those voiced by Sinha and Rodríguez (this volume) on the
relationship between intersubjectivity and common knowledge. Instead of setting
the two in opposition, it is quite possible to analyze common knowledge as an
advanced, post-mimetic and language-dependent form of intersubjectivity, which
I take to be the intention of Itkonen (1978, this volume).

Acknowledgments

This chapter originates from joint work with Tomas Persson and Peter Gärdenfors,
which gave rise to the analysis presented by Zlatev, Persson and Gärdenfors
(2005a, 2005b). Göran Sonesson, Ingar Brinck, Chris Sinha, Esa Itkonen and Mats
Andrén have always been very helpful with feedback on this and related
topics. The Lund University project Language, Gesture and Pictures in Semiotic
Development and the EU project Stages in the Evolution and Development of Sign
Use (SEDSU) provided the funding and the interdisciplinary framework for the
type of cognitive semiotic investigations that this chapter has dealt with.

References

Ahlgren, I. 2003. Teckenspråk. In Sveriges officiella minoritetsspråk. Finska, meänkieli,
samiska, romani, jiddisch och teckenspråk. En kort presentation. Stockholm: Nordstedt.
Arbib, M. 2003. The evolving mirror system: A neural basis for language readiness. In
Language Evolution, M. Christiansen and S. Kirby (eds.), 182-200. Oxford: Oxford University
Press.
Arbib, M. 2005. From monkey-like action recognition to human language: An evolutionary
framework for neurolinguistics. Behavioral and Brain Sciences 28/2: 105-124.
Astington, J.W. and Jenkins, J.M. 1999. A longitudinal study of the relation between language
and theory-of-mind development. Developmental Psychology 35/5: 1311-1320.
Baldwin, D. 1991. Infants' contributions to the achievement of joint reference. Child
Development 62: 875-890.
Baldwin, D. 1993. Infants' ability to consult the speaker for clues to word reference. Journal of
Child Language 20: 395-418.
Bard, K.A., Myowa-Yamakoshi, M., Tomonaga, M., Tanaka, M., Costall, A. and Matsuzawa, T.
2005. Group differences in the mutual gaze of chimpanzees. Developmental Psychology
41: 616-624.
Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Bates, E., Camaioni, L. and Volterra, V. 1975. Performatives prior to speech. Merrill-Palmer
Quarterly 21: 205-226.
Bickerton, D. 2003. Symbol and structure: A comprehensive framework for language
evolution. In Language Evolution, M. Christiansen and S. Kirby (eds.), 77-93. Oxford: Oxford
University Press.
Bloom, P. 2000. How Children Learn the Meaning of Words. Cambridge, MA: MIT Press.
Bonvillian, J.D. and Patterson, F.G. 1999. Early sign-language acquisition: Comparison between
children and gorillas. In The Mentalities of Gorillas and Orangutans: Comparative
Perspectives, S.T. Parker, R.W. Mitchell and H.L. Miles (eds.), 240-264. Cambridge: Cambridge
University Press.
Brinck, I. 2001. Attention and the evolution of intentional communication. Pragmatics and
Cognition 9/2: 255-272.
Brinck, I. 2003. The pragmatics of imperative and declarative pointing. Cognitive Science
Quarterly 3/4: 429-446.
Brinck, I. and Gärdenfors, P. 2003. Co-operation and communication in apes and humans.
Mind and Language 18/5: 484-501.
Brown, R. 1977. Why are signed languages easier to learn than spoken languages? In
Proceedings of the National Symposium on Sign Language Research and Training, W. Stokoe (ed.),
9-24. Silver Spring, MD: National Association of the Deaf.
Byrne, R. and Whiten, A. 1988. Machiavellian Intelligence: Social Expertise and the Evolution of
Intellect in Monkeys, Apes and Humans. Oxford: Oxford University Press.
Call, J. and Tomasello, M. 1996. The effect of humans in the cognitive development of apes. In
Reaching into Thought: The Minds of Great Apes, A. Russon, K. Bard and S. Parker (eds.),
371-403. Cambridge: Cambridge University Press.
Call, J. and Tomasello, M. 1999. A nonverbal false belief task: The performance of children and
great apes. Child Development 70/2: 381-395.
Clark, H. 1996. Using Language. Cambridge: Cambridge University Press.
de Villiers, J. and Pyers, J. 1997. Complementing cognition: The relationship between language
and theory of mind. In Proceedings of the 21st Annual Boston University Conference on
Language Development. Somerville, MA: Cascadilla Press.
de Waal, F.B.M. 1982. Chimpanzee Politics. London: Cape.
de Waal, F.B.M. and Aureli, F. 1996. Consolation, Reconciliation and a Possible Cognitive
Difference Between Macaques and Chimpanzees. Cambridge: Cambridge University Press.
Deacon, T. 1997. The Symbolic Species: The Co-Evolution of Language and the Brain. New York:
Norton.
Deacon, T. 2003. Universal grammar and semiotic constraints. In Language Evolution, M.
Christiansen and S. Kirby (eds.), 111-139. Oxford: Oxford University Press.
Donald, M. 1991. Origins of the Modern Mind. Three Stages in the Evolution of Culture and
Cognition. Harvard: Harvard University Press.
Donald, M. 1998. Mimesis and the Executive Suite: Missing links in language evolution. In
Approaches to the Evolution of Language, J. Hurford, M. Studdert-Kennedy and C. Knight
(eds.), 44-67. Cambridge: Cambridge University Press.
Donald, M. 2001. A Mind So Rare. The Evolution of Human Consciousness. New York: Norton.
Emery, N.J. 2000. The eyes have it: the neuroethology, function and evolution of social gaze.
Neuroscience and Biobehavioural Reviews 24: 581-604.
Fouts, R.S. 1972. Use of guidance in teaching sign language to a chimpanzee (Pan troglodytes).
Journal of Comparative and Physiological Psychology 80/3: 515-22.
Fouts, R.S. 1973. Acquisition and testing of gestural signs in four young chimpanzees. Science
180: 978-980.
Gallagher, S. 2005. How the Body Shapes the Mind. Oxford: Oxford University Press.
Gallagher, S. 2007. Phenomenological and experimental contributions to understanding
embodied experience. In Body, Language and Mind. Vol 1. Embodiment, T. Ziemke, J. Zlatev
and R. Frank (eds.), 271-293. Berlin: Mouton.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Gallese, V., Keysers, C. and Rizzolatti, G. 2004. A unifying view of the basis of social
cognition. Trends in Cognitive Sciences 8/9: 396-403.
Greenfield, P.M. and Savage-Rumbaugh, E.S. 1990. Grammatical combination in Pan
paniscus: Processes of learning and invention in the evolution and development of language.
In Language and Intelligence in Monkeys and Apes, S.T. Parker and K. Gibson (eds.),
540-578. Cambridge: Cambridge University Press.
Greenfield, P.M. and Savage-Rumbaugh, E.S. 1991. Imitation, grammatical development, and
the invention of protogrammar. In Biological and Behavioral Determinants of Language
Development, N. Krasnegor, D. Rumbaugh, M. Studdert-Kennedy and R. Schiefelbusch
(eds.), 235-258. Hillsdale, NJ: Erlbaum.
Grice, P. 1957. Meaning. Philosophical Review 66: 377-88.
Hale, C.M. and Tager-Flusberg, H. 2003. The influence of language on theory of mind: A
training study. Developmental Science 6/3: 346-359.
Hare, B., Call, J. and Tomasello, M. 2001. Do chimpanzees know what conspecifics know?
Animal Behaviour 61: 139-51.
Hare, B., Call, J., Agnetta, B. and Tomasello, M. 2000. Chimpanzees know what conspecifics do
and do not see. Animal Behaviour 59: 771-85.
Hockett, C.F. 1960. The origin of speech. Scientific American 23: 89-96.
Hobson, R.P. and Hobson, J.A. this volume. Engaging, sharing, knowing: Some lessons from
research in autism.
Hutto, D.D. 2008. Folk Psychological Narratives: The Socio-cultural Basis of Understanding
Reasons. Cambridge, MA: MIT Press.
Hutto, D.D. this volume. First communions: Mimetic sharing without theory of mind.
Itkonen, E. 1978. Grammatical Theory and Metascience. Amsterdam: Benjamins.
Itkonen, E. this volume. The central role of normativity for language and linguistics.
Lewis, D.K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University
Press.
Lohmann, H. and Tomasello, M. 2003. The role of language in the development of false belief
understanding: A training study. Child Development 74/4: 1130-1144.
Meltzoff, A. and Gopnik, A. 1993. The role of imitation in understanding other persons and
developing a theory of mind. In Understanding other Minds: Perspectives from Autism,
S. Baron-Cohen, H. Tager-Flusberg and D.J. Cohen (eds.), 335-366. Oxford: Oxford
University Press.
Menyuk, P. and Quill, K. 1985. Semantic problems in autistic children. In Communication
Problems in Autism, E. Schopler and G. Mesibov (eds.), 127-145. New York: Plenum Press.
Merleau-Ponty, M. 1962 [1945]. Phenomenology of Perception. Routledge and Kegan Paul.
Merleau-Ponty, M. 2003. La Nature: Notes, Cours du Collège de France. Trans. Robert Vallier,
Northwestern University Press.
Miles, H.L. 1990. The cognitive foundations for reference in a signing orangutan. In
Language and Intelligence in Monkeys and Apes, S.T. Parker and K.R. Gibson (eds.), 511-539.
Cambridge: Cambridge University Press.
Miles, H.L. 1999. Symbolic communication with and by great apes. In The Mentalities of
Gorillas and Orangutans, S. Taylor Parker, R.W. Mitchell and H.L. Miles (eds.), 197-210.
Cambridge: Cambridge University Press.
Mitchell, P. 1997. Introduction to Theory of Mind: Children, Autism and Apes. London: Arnold.
Moore, C. and Dunham, P.J. (eds). 1995. Joint Attention: Its Origins and Role in Development.
Hillsdale, NJ: Lawrence Erlbaum.
Myowa-Yamakoshi, M. 2001. Evolutionary foundation and development of imitation. In
Primate Origins of Human Cognition and Behavior, T. Matsuzawa (ed.), 349-367. Dordrecht:
Springer.
Myowa-Yamakoshi, M., Tomonaga, M., Tanaka, M. and Matsuzawa, T. 2004. Imitation in
neonatal chimpanzees (Pan troglodytes). Developmental Science 7/4: 437-42.
Nelson, K. 1985. Making Sense. The Acquisition of Shared Meaning. London: Academic Press.
Nelson, K. 1996. Language in Cognitive Development. The Emergence of the Mediated Mind.
Cambridge: Cambridge University Press.
Nelson, K. and Shaw, L.K. 2002. Developing a socially shared symbolic system. In Language,
Literacy and Cognitive Development, J. Byrnes and E. Amsel (eds), 27-57. Hillsdale, NJ:
Lawrence Erlbaum.
Patterson, F. 1978. The gestures of a gorilla: Language acquisition in another pongid. Brain
and Language 5: 72-97.
Patterson, F. 1980. Innovative use of language in a gorilla: A case study. In Children's Language
Vol 2, K. Nelson (ed.), 497-561. New York: Gardner Press.
Perner, J. 1991. Understanding the Representational Mind. Cambridge, MA: MIT Press.
Perner, J., Leekam, S. and Wimmer, H. 1987. Three-year-olds' difficulty with false belief: The
case for a conceptual deficit. British Journal of Developmental Psychology 5: 125-137.
Peterson, C.C. and Siegal, M. 1995. Deafness, conversation and the theory of mind. Journal of
Child Psychology and Psychiatry and Allied Disciplines 36: 459-474.
Piaget, J. 1945. La Formation du Symbole Chez l'enfant. Delachaux et Niestlé.
Plunkett, K. 1998. Language and connectionism. Language and Cognitive Processes 13:
105-127.
Preston, S. and de Waal, F.B.M. 2002. Empathy: Its ultimate and proximal bases. Behavioral
and Brain Sciences 25: 1-72.
Reddy, V. 2000. Coyness in early infancy. Developmental Science 3/2: 186-192.
Reddy, V. 2003. On being an object of attention: Implications of self-other-consciousness.
Trends in Cognitive Science 7/9: 397-402.
Reddy, V. 2005. Before the third element: Understanding attention to self. In Joint Attention,
N. Klein (ed.), 85-109. Oxford: Oxford University Press.
Rizzolatti, G. and Arbib, M. 1998. Language within our grasp. Trends in Neurosciences 21:
188-194.
Rizzolatti, G., Fadiga, L., Gallese, V. and Fogassi, L. 1996. Premotor cortex and the recognition
of motor actions. Cognitive Brain Research 3: 131-141.
Saussure, F. de 1916. Cours de Linguistique Générale [Course in General Linguistics]. Paris:
Payot.
Savage-Rumbaugh, S., Shanker, S. and Taylor, T. 1998. Apes, Language and the Human Mind.
Oxford: Oxford University Press.
Savage-Rumbaugh, S. and Lewin, R. 1994. Kanzi: The Ape at the Brink of the Human Mind. New
York: John Wiley.
Searle, J. 1992. The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Senghas, A., Kita, S. and Özyürek, A. 2004. Children creating core properties of language:
Evidence from an emerging sign language in Nicaragua. Science 305: 1779-1782.
Sinha, C. and Rodríguez, C. this volume. Language and the signifying object: From convention
to imagination.
Sonesson, G. 2007. From the meaning of embodiment to the embodiment of meaning: A
study in phenomenological semiotics. In Body, Language and Mind. Vol 1. Embodiment,
T. Ziemke, J. Zlatev and R. Frank (eds.), 85-127. Berlin: Mouton de Gruyter.
Stern, D. 1985. The Interpersonal World of the Infant: A View from Psychoanalysis and
Developmental Psychology. New York: Basic Books.
Stokoe, W. 1960. Sign language structure: An outline of the visual communication system of
the American deaf. Studies in Linguistics, Occasional Paper 8, University of Buffalo, New
York.
Tanner, J.E. and Byrne, R.W. 1996. Representation of action through iconic gesture in a captive
lowland gorilla. Current Anthropology 37/1: 162-73.
Tanner, J.E. and Byrne, R.W. 1999. The development of spontaneous gestural communication
in a group of zoo-living lowland gorillas. In The Mentalities of Gorillas and Orangutans:
Comparative Perspectives, S.T. Parker, R.W. Mitchell and H.L. Miles (eds.), 211-239.
Cambridge: Cambridge University Press.
Terrace, H.S., Petitto, L.A., Sanders, R.J. and Bever, T.G. 1981. Science 211/4477: 87-88.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard
University Press.
Tomasello, M. 2003. Constructing a Language: A Usage-based Theory of Language Acquisition.
Cambridge, MA: Harvard University Press.
Tomasello, M., Call, J. and Gluckman, A. 1997. Comprehension of novel communicative signs
by apes and human children. Child Development 68: 1067-1080.
Tomasello, M., Call, J. and Hare, B. 2003. Chimpanzees understand psychological states: the
question is which ones and to what extent. Trends in Cognitive Sciences 7/4: 153-156.
Tomasello, M., Carpenter, M., Call, J., Behne, T. and Moll, H. 2005. Understanding and sharing
intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675-735.
Tomasello, M. and Call, J. 2006. Do chimpanzees know what others see or only what they
are looking at? In Rational Animals, S.L. Hurley and M. Nudds (eds), 371-384. Oxford:
Oxford University Press.
Trevarthen, C. 1979. Communication and cooperation in early infancy: A description of
primary intersubjectivity. In Before Speech, M. Bullowa (ed.), 321-347. Cambridge:
Cambridge University Press.
Trevarthen, C. 1992. An infant's motives for speaking and thinking in the culture. In The
Dialogical Alternative, H. Wold (ed.), 99-137. Oslo: Scandinavian University Press.
Vygotsky, L.S. 1962. Thought and Language. Cambridge, MA: MIT Press.
Vygotsky, L.S. 1978. Mind in Society. The Development of Higher Psychological Processes.
Cambridge, MA: Harvard University Press.
Wittgenstein, L. 1953. Philosophical Investigations. London: Basil Blackwell.
Woll, B. and Kyle, J. 2004. Sign language. Encyclopedia of Language and Linguistics. Oxford:
Elsevier.
Wray, A. 2000. Holistic utterances in protolanguage: the link from primates to humans. In The
Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form,
C. Knight, M. Studdert-Kennedy and J. Hurford (eds.), 285-302. Cambridge: Cambridge
University Press.
Zlatev, J. 1997. Situated Embodiment. Studies in the Emergence of Spatial Meaning. Stockholm:
Gotab.
Zlatev, J. 2002. Mimesis: The missing link between signals and symbols in phylogeny and
ontogeny. In Mimesis, Sign and the Evolution of Language, A. Pajunen (ed.), 93-122.
University of Turku Press.
Zlatev, J. 2003. Meaning = Life (+ Culture). An outline of a unified biocultural theory of
meaning. Evolution of Communication 4/2: 253-296.
Zlatev, J. 2005. What's in a schema? Bodily mimesis and the grounding of language. In From
Perception to Meaning: Image Schemas in Cognitive Linguistics, B. Hampe (ed.), 323-342.
Berlin: Mouton de Gruyter.
Zlatev, J. 2007a. Language, embodiment and mimesis. In Body, Language and Mind. Vol 1.
Embodiment, T. Ziemke, J. Zlatev and R. Frank (eds.), 297-337. Berlin: Mouton de Gruyter.
Zlatev, J. 2007b. Intersubjectivity, mimetic schemas and the emergence of language.
Intellectica 2007/2-3 (46-47): 123-152.
Zlatev, J. 2008. SEDSU research related to intersubjectivity and conventions. Deliverable 19.
EU-FP6 Project: N 012984, Stages in the Evolution and Development of Sign Use,
Nest-2003-Path-3: What it means to be human?
Zlatev, J. in press. From proto-mimesis to language: Evidence from primatology and social
neuroscience. Journal of Physiology, Paris.
Zlatev, J. forthcoming. Autism as an impairment in bodily mimesis.
Zlatev, J., Brinck, I. and Andrén, M. 2008. Stages in the development of perceptual
intersubjectivity. In Enacting Intersubjectivity, F. Morganti, A. Carassa and G. Riva (eds), 117-132.
Amsterdam: IOS Press.
Zlatev, J., Persson, T. and Gärdenfors, P. 2005a. Bodily mimesis as the missing link in human
cognitive evolution. Lund University Cognitive Studies 121.
Zlatev, J., Persson, T. and Gärdenfors, P. 2005b. Triadic bodily mimesis is the difference.
Commentary on Tomasello et al. (2005). Behavioral and Brain Sciences 28: 675-735.
Zlatev, J., Racine, T., Sinha, C. and Itkonen, E. this volume. Intersubjectivity: What makes us
human?
chapter 11

First communions
Mimetic sharing without theory of mind

Daniel D. Hutto

It is widely held that the gradual development of metarepresentational Theory
of Mind (ToM) abilities constituted at least one important hominid upgrade.
Are such abilities really needed to explain hominid (i) tool-making, (ii) social
cohesion, or even (iii) basic interpretative and language formation/learning
capabilities? I propose an alternative explanation of what underlies these
sophisticated capacities: the Mimetic Ability Hypothesis (MAH). MAH claims that
a vastly increased capacity for recreative imagination best explains the kinds of
sophisticated intersubjective engagements of which hominids would have been
capable and that these constituted an important basis for the development of
complex language. This proposal puts the idea of the evolution of ToM devices
under considerable strain.

How did humans bridge the tremendous gap between
symbolic thought and the nonsymbolic forms of
intelligence that still dominate the rest of the animal
kingdom?
Merlin Donald, Origins of the Modern Mind

1. The missing cognitive link

Many today claim that full-fledged theory of mind (ToM) abilities, specifically
those requiring mastery of the concept of belief, must have emerged at a point in
our pre-history sometime after the human line broke from that of the chimpanzees
(Call and Tomasello 1999, 2005; Papineau 2003; Povinelli and Vonk 2003;
Tomasello, Call and Hare 2003a; Sterelny 2003). This is thought to have occurred
during the million or so years preceding modern recorded history, the late Stone
Age or Pleistocene. This period is referred to by evolutionary psychologists as the
Environment of Evolutionary Adaptiveness (Dupré 2001:21).
The big question is just how much later full-fledged folk psychological
abilities arrived on the scene, and why. I claim that they are very late-developing,
socio-culturally based and not necessarily universal to human thinking (see
Gallagher and Hutto this volume; Hutto 2004, 2007a and b, 2008). In contrast,
those who defend the existence of ToM modules (ToMMs) typically argue that
very many important activities of our hominid forerunners necessarily depended
on the possession of mature ToM abilities. Hence, they posit the formation of
dedicated mechanisms which gifted our ancestors with such capacities and which, being
built in by biological evolution, would have been inherited by our species.
How might we decide between these possibilities? Cognitive archaeology,
the attempt to understand ancestral minds by drawing on insights of psychology
applied to remnants of pre-history, is a highly speculative business. Not only is
the archaeological record gappy, with only fragmentary material evidence having
been preserved (and much of it yet to be discovered), there are no live subjects
to test. This means we do not know, with any certainty, which kinds of activities
our ancestors engaged in and what abilities they may have had. And, as shown by
the debates over the cognitive capacities of non-human primates, even when we
have live subjects to examine there is scope for competing interpretations about
how precisely to characterise and explain the basis of social-psychological
abilities (Call and Tomasello 2005; Povinelli and Vonk 2004). By comparison, the task
of deciding between rival conjectures about prehistoric cognition is trickier still.
Matters are helped somewhat by an emerging consensus about the level, if not
the nature, of the cognitive capacities of chimpanzees. Placed alongside what we
know about the abilities of modern humans both infants and adults this yields
at least a very rough sense of the cognitive distance that hominids must have
covered: i.e. in thinking about the common ancestor we shared with chimpanzees
we know roughly where our forerunners must have started from and where they
ended up; even if details of the precise route taken remain obscure. Despite its
sketchiness, the archaeological record gives us a general picture, though to be sure
a changeable and contestable one, of the large features of the terrain they covered
and the likely timing of their specific movements. However imperfect, this is the
evidence against which we must test the plausibility of proposals about our ancestors'
likely cognitive powers and what may have driven them.
It is beyond doubt that there was a cognitive change of considerable mag-
nitude (or a series of such) that took place over the period spanning from the
emergence of a common ape-human ancestor approximately 6 million years ago
to the appearance of Homo sapiens (see Figure 1). If we assume a neo-Darwinian
perspective it is likely that such changes will have happened gradually, presum-
ably under a variety of selection pressures.
Recognising these limitations, in the remainder of this chapter I review the ex-
isting evidence and challenge the assumptions that lend prima facie support to the
familiar and widely held view that the gradual development of sophisticated mind-
reading devices must have constituted at least one important hominid upgrade.

Figure 1. A Simplified Evolutionary Tree of the Hominids (reprinted from How Homo
Became Sapiens: On the Evolution of Thinking. Gärdenfors, P. p. 7, Oxford University
Press, Copyright 2003. Used with permission from P. Gärdenfors)

2. Imitation and social learning

Going back to the very first hominids, there is no reason to think that their lifestyles
offered challenges of a significantly different kind from those faced by the
apes; certainly, no drastic change occurred in this respect until long after the
passing of Australopithecus afarensis and even after the arrival of Homo habilis.
With the latter's appearance we see the first major spurt in brain size (which
increased roughly 1.5 times in cranial capacity compared to that of the austra-
lopithecines). Indeed, the fossil record tells a story of just two such instances of
encephalisation in human pre-history. The first, and most remarkable, roughly
coincides with the emergence of simple Oldowan tool manufacture, a craft supervised
by the first members of the Homo line (see Figure 1). Although the
tools made during this period would have been extremely basic in many respects
by today's standards, their fashioning would have constituted an impressive
achievement, a genuine innovation when compared with what had come before
(or rather what hadn't).
The stone knapping techniques involved in the fashioning of such tools would
have required a good sensitivity to fractural dynamics and strong hand-eye co-
ordination, capacities that far outstrip those required for making simple repeti-
tive actions (Mithen 1996, 2000a; Wynn 2000). An off-line imaginative ability to
re-enact and practice complex routines would also have been needed, not only for
the technical proficiency of fashioning the final products but also for the collection
and preparation of materials. The manipulation of images (perceptual re-enactments)
looks likely to have funded these aspects of the early tool-makers' craft.
Clearly, not even weak ToM abilities would have been needed for the individ-
ual acts of making such tools, but they might have played a crucial part in enabling
the social learning upon which the tool industries themselves were founded. For
more than an individual's potential to fashion such tools would have been needed
to keep the practice alive; such crafts had to be maintained over time. Could it be
that ToM abilities might have played a critical pedagogical role in ensuring this? In
assessing this idea, let us simply accept the consensus view that the Paleolithic re-
cord suggests very strong social learning of such skills (Mithen 2000b:496). Very
well, but what is the exact connection with ToM abilities? Mithen suggests that:
it seems most probable that these technical skills and traditions were inten-
tionally taught from generation to generation, or acquired by passive watching
and active imitation. A modern-like theory of mind appears essential to either
task. Instructed learning requires that both the teacher and the novice take account
of what is in each other's mind.
 (Mithen 2000b:496, emphasis added, 2002, see also Baron-Cohen 1999:263)

Despite this rather bold statement, the idea that passive watching and active imi-
tation implies ToM abilities of any sort is surely false. Demonstrating this is of
particular importance because there is a strong independent reason to suppose
that hominids must have had impressive imitative abilities. Humans are natural
mimics and our basic abilities in this regard seem to be inherited, though they
may be elaborated and extended by epigenesis (see Sinha 2004). Human neonates,
as is well known, engage in facial imitation even at the tender age of thirty-two
hours old (indeed it has been claimed that they are capable of this when they
are less than an hour old) (Meltzoff and Moore 1977; Meltzoff and Moore 1994;
Gopnik and Meltzoff 1997:131).
Young chimpanzees have shown similar abilities (see Myowa-Yamakoshi et al.
2004). Yet the claim is not that humans alone are natural-born mimics; it is that
we are also mimics of the first rank. Recent work on social responding carefully
distinguishes a number of different forms, including: stimulus enhancement, goal
emulation, response priming and imitation proper (Billard and Arbib 2002:344;

Footnote: Donald puts the point beautifully: "Innovative tool use could have occurred countless
thousands of times without resulting in an established toolmaking industry, unless the individual
who invented the tool could remember and re-enact or reproduce the operations involved and
then communicate them to others" (Donald 1991:179).

Rizzolatti, Craighero and Fadiga 2002:52; Hurley 2005). Stimulus enhancement
and response priming involve the triggering of an action that is already within the
repertoire of the individual, while goal emulation can lead to a new action leading
to a desired goal but without copying the means of the model. True or complex
imitation, by way of contrast, stands out in that it requires the capacity to copy
both the novel ends and means of another's complex action.
When compared with humans, apes are much less good at complex forms of
imitation (see Donald 2005:285-6; Jones 2005). The fact is that it remains a matter
of some dispute whether they are capable of true imitation at all. Monkeys are
able to copy simple actions, such as a movement sequence tied to a particular goal,
but they cannot reproduce complex ones involving hierarchically structured rep-
ertoires. They can copy bodily movements but without adoption of specific goal
structures with multiple sub-steps. Chimpanzees too may only be capable of limited
forms of imitation involving uncomplicated movements, their performances
only becoming reliable after considerable training. Exactly what level of ability
chimpanzees really have in this regard remains a somewhat open question (Zlatev,
Persson and Gärdenfors 2005; Whiten, Horner and Marshall-Pescini 2005). But
what is not in doubt is that their abilities stand in stark contrast to those of young
human infants who are able to masterfully copy complex and novel movements
and actions, even those segmented into discrete ordered parts, with no training
and little effort (Meltzoff and Moore 1977). Putting these thoughts together, it
seems hard to deny that distinctively human imitative abilities are inherited from
the hominids, but also considerably extend them.
I take this as established common ground. But, pace Mithen, there is simply
no reason to think that imitative abilities in any way entail ToM abilities. To see
this, it helps to have a clear picture of just what imitation involves. Bermúdez
neatly characterises the problem that infants must solve in order to imitate faces:
Facial imitation involves matching a seen gesture with an unseen gesture, since
in normal circumstances one is aware of one's own face only haptically and proprioceptively.
If successful facial imitation is to take place, a visual awareness of
someone else's face must be apprehended so it can be reproduced on one's own
face. (Bermúdez 1998:125)

If we model what is going on in the infant on the way a suitably well informed adult
might attempt to solve this sort of problem, using a set of explicit propositional
instructions, then their ability to converge on precisely the right gesture to be imitated
will be regarded as "the product of inference-like processes [that] are not
merely reflexive" (Gopnik and Meltzoff 1997:130). And it would seem to follow
that "Very young infants represent a variety of aspects of human action, they can
make inferences on the basis of these representations, they think of themselves
and others as fundamentally sharing the same psychological states" (Gopnik and
Meltzoff 1997:133). If so, we must presuppose that infants are aware of at least
some sort of substantive self/other contrast from birth, and this might imply
some sort of ToM capacity (Gopnik 2004:22). But this line of thought raises a
number of difficult questions. What is the character of this neonatal understand-
ing of the self-other contrast? What is the precise content of the representations
that these infants are allegedly using, and what is their origin? And, crucially,
what account can be given of the sub-personal mechanisms that make use of these
representations in order to effect the appropriate manipulation of infants' faces
and bodies?
In offering a straight choice between inferentialist accounts and merely reflex-
ive ones, theory theorists, such as Gopnik and Meltzoff, have missed a trick. Re-
search into Mirror Neuron Systems (MNSs) holds out the promise of a better way
of understanding the mechanics of simian and human neonatal abilities to imitate
buccal and facial expressions and other gestures (Gallese 2003; Rizzolatti 2005).
Even though research in this area is still at an early stage, it is widely agreed
that a mechanism with the characteristics of the mirror system appears to have
the potentiality to give a neurophysiological, mechanistic explanation of imita-
tion (see Rizzolatti et al. 2002:55). This is an instance in which we have some
reason to believe that the promissory note for future explanations might actually
get cashed.
The fact that MNSs of apes are not as sophisticated as the human variety could
explain their limited capacity for complex forms of imitation; the differences are
empirically well-established, having been demonstrated by various brain imaging
studies (Knoblich and Jordan 2002:115; Heiser et al. 2003; Arbib 2005). Also, more
importantly, in the human case the potential fit with the basic type of inherited
mimetic abilities to be explained is plausibly of the right level. It is easy enough
to imagine that our early hominid ancestors might have started their mimetic careers
with abilities akin to those of "young children [who] spontaneously imitate
adults in a mirror-like fashion" (Wohlschläger and Bekkering 2002:102, emphasis
added). Thus, just as children only begin to make the appropriate adjustments for
cross-lateral differences when imitating others as they get older, so too hominid

Footnote: For these reasons the theory-theory approach compares negatively with the account that
Gallagher advances, which makes appeal to body schemas. He claims that the imitating subject
"depends on a complex background of embodied processes, a body-schema system involving visual,
proprioceptive, and vestibular information ... This intermodal intra-corporeal communication
then, is the basis for an inter-corporeal communication" (Gallagher 2005:76). Although
Gallagher suggests that infants may be capable of experiencing a difference between self and
other, he argues that the concept of the self starts out closer to an embodied sense than to a
cognitive or psychological understanding (Gallagher 2005:79).

mimetic capacities may have become similarly enriched over evolutionary time.
Looking to non-representational, enactive accounts for an explanation of imita-
tive abilities is surely a better bet than positing modules with the relevant declara-
tive knowledge to do such work (see Hutto 2008).
Sterelny (2003) identifies and deftly defuses the assumption that can make it
look as if a ToM might be needed for these basic kinds of imitative task:
The link between imitation and theory of mind depends on the supposition that
imitation involves a translation between points of view: the mimic represents
something like the model's motor pattern as seen by an onlooker, and turns it into
a representation of a motor pattern as seen by the agent himself. But that is not
the only possibility ... If the mimic represents the model's behaviour functionally
(pick up the rock in the grasping hand, hold the nut facing away, place it
on smooth hard surface) there is no need to transform between points of view.
 (Sterelny 2003:64)

There are good reasons to opt for the simpler functional interpretation of this
process, appealing to mirror neuron systems or resonance pattern research in order
to understand its mechanics. Although I have said that the mirror neuron systems
of humans are much more sophisticated than those of other primates, we
must not suppose that the basic acts of imitation which they sponsor have any of
the standard features of mentalistically-based simulations per se. Certainly, they
do not implicate the kinds of simulation procedures that involve the manipula-
tion and attribution of propositional attitudes (see Goldman 2005:82; Gallagher
2007). Here it pays to remember that Gallese and Goldman, who were the first
to claim that the discovery of mirror neurons might lend respectable empirical
backing to simulation theory (ST), only ever took the evidence to show that there
were primitive precursors that might be related to explicit, mentalistic forms of

Footnote: Interestingly, those who suffer from autism have difficulty in replicating the manner and
style of another's response (see Hobson 2007; Hobson and Hobson this volume).
Footnote: To have a content-involving thought it is not enough that an organism is merely intentionally
directed at a possible situation or state of affairs, for the latter might be understood
in purely extensional terms. I therefore distinguish intentional attitudes from propositional
attitudes, reserving the latter title for those attitudes that are content-involving. This is not just
a terminological stipulation, though I recognize that some would claim that attitudes of both
sorts are directed at propositions, albeit in different ways. In my view intentional attitudes are
not directed at propositions (only possible situations), yet they exhibit intentional directedness
all the same. The reader is free to assume that my use of terms is decided by fiat. The main point
is that, within the class of intentionally-directed attitudes we can distinguish between those of
the extensional and intensional varieties. For further detail about and justification of this distinction
see Hutto 2008, Chs. 3-5.

simulation (Gallese and Goldman 1998:498). As such, it is clear that the kind
of responsiveness associated with these precursor abilities cannot be identified
with or understood in terms of ToM capacities of either the theory theory or
the simulation theory variety. The bottom line is that there is simply no need or
warrant to postulate any inferential or theoretical activity on the part of these imi-
tators. And it is mimetic abilities of this sort that suffice to explain how technical
skills are acquired and developed by attending either to one's own or another's action
routines (whether selectively or otherwise) so that these can be recalled and
re-enacted. In this context, it is important to take note of something that Donald
has taken pains to underscore: "The process that generates these action-patterns
relies on a principle of perceptual resemblance; accordingly I have labeled the skill
'mimesis' or 'mimetic skill'" (Donald 1999:145).
Worse still, the truth is that Mithen's claim that our ancestors would have
needed ToM abilities for the social transmission of technical skills does not hold
up even if we imagine that basic acts of imitation do involve representing the
contents of others' minds. Thus, even if we suppose that the minds of ancient
tool-makers were filled with sets of instructions about how to fashion their artefacts,
representing these would have been of little use to the trainee. A clutch of
rules, here imagined as a series of conditional statements, is the wrong sort of
medium for technical training and instruction of the kind needed to learn basic
tool manufacture. Quite the opposite; it is often the case when learning a practi-
cal craft that mastering a set of explicit rules is a positive hindrance. This fact will
be salient to anyone who has tried to build a do-it-yourself product using only a
pictureless set of instructions, with no blueprint for guidance. Call this the Ikea
constraint.
Technical training is not a matter of passing on declarative 'knowledge that', but
of engendering a kind of 'know-how'. Novices typically get the knack by direct
hands-on practice. And where this is not possible, they must attend to what the
other does and attempt to re-enact the appropriate steps using their visuo-mo-
toric imagination in a cross-modal way. Consider what would have been needed
for acquiring the skill of fashioning the Levallois flake, focusing on the diagram
provided in Figure 2.

Footnote: The fashioning of the Levallois flake was a highly sophisticated prehistoric tool-making
technique that originally dates back to the Lower Palaeolithic and which was retained into the
Upper Palaeolithic and beyond. Its sophistication makes it a useful test case for deciding if the
transmission of tool-making skills must have required the exercise of theory-of-mind capaci-
ties. My proposal is that such activities look better suited for explanation in terms of hominid
capacities for imaginative re-enactment. And there are yet more deflationary proposals about
the origins of such flakes afoot. Should those turn out to be correct they too would undermine

Figure 2. Technique for fashioning the Levallois flake (reproduced from
http://anthro.palomar.edu/homo2/archaic_culture.htm. Used with permission from Dennis O'Neil)

The need for such first-hand and hands-on training is felt even today, as when
surgeons are given the chance to 'see one done' before performing operations in
theatre. It is therefore hardly surprising that verbal instruction, even when it is
readily available as an accompaniment, appears to play a minor role in craft ap-
prenticeship, even amongst modern humans (Donald 1991:213; Wynn 1991).
Mimesis of the kind discussed is perfectly suited to enable the non-genetic copy-
ing required for the passing down of technical skills through the generations by
imitative means. Social training of this kind takes the form of showing, not saying
(of imagining, not propositional thinking). This is consistent with the fact that the
mimetic skills in question develop over time; this happened in phylogeny and
is likely to be recapitulated in ontogeny too, in an epigenetic manner: imitative
learners might become more selective and discriminating in the sorts of routines
that they choose to mimic (see Harris and Want 2005; Sinha 2004).

3. The mimetic ability hypothesis

If ToMMs are neither implicated in the maintenance of tool industries nor the
basis of imitative capacities, perhaps they were needed to fund the sophisticated

(Footnote, continued) the claim that theory-of-mind capacities must have sponsored the social learning that made
tool-making industries possible.

social skills of the hominids. This seems most unlikely. H. habilis looks to have
been living in groups only marginally larger in size than that of australopithe-
cines, and both would have kept within relatively tight geographical boundaries
(Mithen 1996). There is no compelling reason to think that the social circumstances
of the very early hominids would have changed enough to require more
powerful tools for engaging in or monitoring social dynamics of a qualitatively
different sort than those afforded to apes. Very much in line with current thinking
about the abilities of our simian relatives in this regard, Mithen, who has done
more than most in thinking about the likely stages of ToM development, surmises
that the Oldowan tool makers would have had an equivalent theory of mind
ability to that found within chimpanzees today. As an alternative they would have
been extremely clever behaviourists (Mithen 2000b:500).
Things changed decisively with the coming of Homo ergaster/erectus in the
Lower Paleolithic period. Their arrival was accompanied by a remarkable new
way of life, which some regard as constituting the first Hominid Revolution. Unlike
those of its predecessors, the movements of H. ergaster/erectus, coming out
of Africa and spreading across Europe and Asia, were unprecedented in extent. During
this period the quality of tools improved quite dramatically, so much so that ar-
chaeologists require months of training and practice to become good at creating
Acheulian tools (Donald 1991:179). This could be explained by an increase in
mimetic ability, which would have also conferred advantages with respect to other
technical crafts such as shelter construction.
But such skills would have also permitted new and more complex forms of
social coordination. Many animals, but particularly those who form cohesive so-
cial units, faithfully produce and respond to characteristic expressive behaviours
of others, normally conspecifics (Allen and Saidel 1998). They can signal shifts in
emotional temperament or mood and otherwise indicate their readiness to engage
in characteristic kinds of action. For example, the baring of teeth, the arching of
backs and the lowering of heads can be early warnings that the other is preparing
to fight, or mate, or retreat, and so on. Being able to faithfully produce and respond
to such recognisable behaviours makes basic social coordination possible.
The mimetic ability with which hominids appear to have been gifted would have
been qualitatively unlike these other animal signal systems, not just in its special

Footnote: There is current debate about whether the early African H. ergaster (meaning 'workman')
or the later Asian H. erectus was the direct ancestor of modern humans. Indeed, there is debate
over whether or not the former is merely a sub-species of the latter. Either way, they will have
established the Acheulian stone industry before their offspring will have left Africa to become
H. heidelbergensis, our last common ancestor shared with the Neanderthals.

mirroring character but in being securely under voluntary control. Although it is
likely that our ancestors would have used the full range of facial, vocal and postural
gestures, at least to some limited extent, manual gestures would probably
have dominated (Donald 1991; Arbib 2005). It is likely that the tree-based living of
their simian forefathers would have prepared the early hominids with prodigious
dexterous freedom and control, a freedom which the shift to bipedalism would
have allowed them to capitalise upon (Corballis 2003; Lieberman 2000:151-153).
Also, being self-cueing and self-regulating, the hominid ability to imitate would
have constituted a major step beyond the more stereotypically circumscribed pat-
terns of interaction that characterise the intersubjective engagements of other so-
cial creatures; those that depend mainly on inherited routines alone to structure
the basic form of their engagements. A flexibility conferred by mimetic skills in
conjunction with recreative imaginative abilities for practice and rehearsal would
have vastly increased the developmental possibilities for social expression and
engagement (see Hutto 2007d; Sinha 2004; Zlatev this volume).
This openness would have introduced new challenges; coordinating intersub-
jective interactions in stable and effective ways would require taming or regu-
lating these newfound capacities for freedom of expression, at least within local
communities. It is very plausible that this was achieved by the development of
a kind of mimetic culture. Donald has convincingly argued that the establish-
ment of such would have funded the emergence of games, rites, and well-defined
norms of a kind unlike anything found in simian societies. A kind of mutual
miming, which Donald calls 'reciprocal mimesis', is the plausible basis of non-linguistic
conventions (Donald 1991:6). Such interactions could have acted as
powerful social glue.
At the very least, the development of mimetic abilities which sponsored the
emergence of a unique hominid culture is a credible explanation of changed liv-
ing patterns and augmented technical and social abilities of H. ergaster/erectus,
without the need to postulate that these hominids were in command of any-
thing like a modern language. Certainly, a capacity for reciprocal miming and

Footnote: It seems safe to assume that such abilities were under the voluntary control of the hominids.
Whether some other animals have some degree of control over their vocalisations is still an
open question (Allen and Saidel 1998; Hauser and Marler 1993a, 1993b; Marler and Evans 1995;
see also Corballis 2003:202).
Footnote: The development of domain-general mimetic abilities obviates the need to postulate "the
emergence of an evolved mechanism for identifying, memorizing and reasoning about social
norms, together with a powerful motivation to comply with such norms. And with norms and
norm-based motivation added to the human phenotype, the stage would be set for much that
is distinctive in human cultures" (Carruthers 2003:75).

the establishment of a mimetic culture could have played a central part in an
impressive list of important activities and practices such as childrearing, coordinated
hunting and gathering, food sharing, and defining community-recognised
social ranks and statuses. Mimetic abilities look well-suited to explain norm-governed
social interactions of the sort needed for the remarkable and wide-ranging
achievements of H. ergaster/erectus. Yet all of this would have been available in
the absence of language (Donald 1991:174). For, as Donald makes quite clear,
"Language is not necessary for the development of complex social roles and rules,
but mimesis is essential" (Donald 1991:175).
A steady increase along this cognitive trajectory might explain why, although
the second period of significant encephalisation that occurred with the advent of
Homo sapiens was still a long way off, hominid brain sizes were increasing slightly
all the while. And this needs explaining since having larger brains came at a heavy
price. They are costly to run and feeding them on small stomachs requires a high-quality
diet, one that is not easy to acquire. Big brains will have made other
demands too, especially for bipeds. Not only do they require more energy, they
make birthing difficult (Lovejoy 1980). This has other consequences. Having bigger-brained
offspring is problematic for those who walk on two legs, for it meant
that babies had to be pushed through rather narrow birth canals. The solution to
this problem, having immature offspring, saddled hominids with all the burdens
of dealing with prolonged periods of childhood (Locke and Bogin 2006). This
is not seen in other primates but is pronounced in humans. H. ergaster/erectus
would have required dedicated practices of pedagogy and childcare. Thus, even
though it is not possible to draw direct conclusions about specific modes of cogni-
tion based on brain size, we can conclude that there must have been some major
trade-off (or trade-offs) for all of these changes.
It seems plausible that expansion of domain-general (not general purpose)
mimetic abilities, which may have reached a first plateau with H. habilis, may
have been at least partly responsible for the growth in their neural volume. The
further enhancement of these abilities could have been the source not only of the
more advanced technical skills exhibited by H. ergaster/erectus but also the basis
for their dramatically different kinds of social engagements. The technical and
social advantages conferred by imaginative-imitative abilities would have had a
ratcheting-up effect, independently spurring on and reinforcing their selection.
This might explain why, despite the seemingly great achievements of hominids
during the reign of H. ergaster/erectus, there is an extended period of steady but
unremarkable brain growth which lasted until the arrival of the early humans.

Footnote: For a discussion of the running costs of brains see Aiello and Wheeler (1995).

This is surely consistent with the hypothesis that with H. ergaster/erectus mimetic
capacities had reached an apex, coming into full swing for the first time. To give
this proposal a name, let us call it the Mimetic Ability Hypothesis (or MAH).
As presented here the MAH has considerable scope for development. It makes
no strong commitments as to the exact level of hominid mimetic abilities at the
different stages nor does it say anything very precise about the timings of their
emergence. The core claim of MAH does not require that we decide such issues.
It only claims, weakly, that mimetic abilities (and not ToMMs) can potentially
account for the most important technical and social feats of our immediate an-
cestors. To make this case in full would require going into too much detail for
a short chapter. A slightly more developed version can be found in Hutto 2008,
Chs. 11 and 12. But in support of the weak claim, in what follows, I concentrate on
critically assessing and rejecting some of the more prominent reasons that have
been offered for thinking that theory-of-mind abilities must have been necessary
for the hominids.

4. Why else ToMMs?

Once the MAH is articulated it puts great pressure on claims that our ancient
ancestors must have used ToMMs in order to get by in their daily routines. In
this regard, it is worth noting that the claim that mindreading devices would
have been necessary for hominids gets much of its credibility from equivoca-
tion about the level of abilities we are seeking to understand. Thus it has been
argued that ToM abilities would have been needed in order to share a plan or
a goal, as required to develop and implement sophisticated hunting tactics or
to erect various constructions (Baron-Cohen 1999:264). Yet this thought must
be weighed up against the fact that even wild chimpanzees and other group ani-
mals are quite capable of coordinating their hunting efforts, despite a manifest
lack of mature ToM abilities (Boesch and Boesch-Achermann 2000; Brinck and
Gärdenfors 2003). At the very least, such facts encourage taking extreme caution
when drawing inferences about the degree of mentalising capacity that might
have been needed by our ancestors.
Some have exercised it. For example, Mithen speculates that at most only
a desire-based psychology would have been needed in order to account for the
kinds of behaviours that would have been witnessed from the time of H. ergas-
ter/erectus to the rise of archaic humans. The trouble for those who postulate in-
nate mindreading devices is that it is easy enough to understand a purely desire-
based psychology in terms of an appropriate capacity for unprincipled embodied
engagements, the having of a kind of intentional attitude psychology (as discussed
in Hutto 2006a, 2008).
In this light, Dunbar's claim that more would have been needed at this intermediate
phase of hominid evolution becomes particularly important. He has
offered special reasons for thinking that nothing short of full metarepresenta-
tional ToM abilities would have been required. For, he argues, it is only by having
such abilities that those hominids could have managed to live in the large
groups to which they were accustomed. Dunbar has independently established
that there is a direct correlation between neocortical volume in primates and the
maximal size of their workable social groups (Dunbar 1992, 1993). Extrapolating
from this data, and estimating the brain size of H. ergaster/erectus, he suggests
that these hominids may have been operating in social groups with as many as
150 members. There are good reasons to believe this might have been so. Groups
of this size would have afforded greater protection and improved capacities to
defend and access resources and new opportunities for predation, but at a price.
For, upon reaching a critical ceiling they would have become hard to manage. The
maintenance of social cohesion certainly would not have been possible using the
same methods as their predecessors.
A crucial factor in keeping a handle on this more complex social matrix
would have been to keep tabs on personal relationships, both one's own, as established
with specific individuals, as well as monitoring the third-party alliances
and interactions of others. A quantitative increase in domain-general capacities
for re-identifying particulars and stronger working memory, not unlike that of
apes, would be sufficient to explain this achievement. But keeping track of a social
space and one's place in it is one thing; securing and maintaining one's relationships
within it is quite another. Monkeys and apes manage the latter through
one-to-one physical grooming, but this requires direct contact and interaction
with a restricted number of others. While effective, this method is a thief of time
and, in any case, once groups get too large and physically disperse, such intensive
interactions would have been impossible.
Using cranial comparisons with apes and indexing these to their social habits
as a base-line, Dunbar calculates that the budget of time required for physical
grooming would have reached pressure point in groups of the size that H. ergas-
ter/erectus would have been operating in (see also Mithen 1996:111). If so, these
hominids would have been forced to find fresh methods to substitute for personal
grooming. Dunbar's proposal is that they switched to linguistic as opposed to
tactile means for achieving this: our early ancestors would have had to learn to
gossip (Dunbar 2003, 2004).
If we assume that these linguistic exchanges involved the conveyance of,
and conversations about, propositional attitudes then mature ToM abilities are
entailed. In arguing that we should accept this, Dunbar assumes the truth of an
intention-based semantics and a particular version of the communicative view
of language. Crudely, on this view, the primary role of language is to clothe the
pre-existing private thoughts of conversationalists: it provides the conventional
forms that serve as the public medium for the sharing of these. Comprehension
and production are understood in translational terms, involving the appropriate
coding of outputs from and the decoding of inputs to the language of thought of
each participant. This has been aptly named the 'inner process model' (Rowlands
2003:76–77). It casts natural language signs in the role of mere facilitators, lacking
any intrinsic representative power of their own.
If one accepts this picture of the function of language, then metarepresen-
tational devices would have been necessary pre-adaptations for such public ex-
changes. They would be crucially implicated in the translational processes since a
hearer's grasp of a speaker's meaning would depend on deciphering the speaker's
communicative intention, understood as their sincere assertion; i.e. they would
be giving expression to what they believe (Grice 1989). If public communication
is, at root, an attempt by hearers to grasp what individual speakers have in
mind (i.e. the content of what they intend to assert by their utterances), then
anyone engaged in such activity must be presumed to have intact metarepresen-
tational ToM abilities (and the relevant mechanisms to support these). In this way,
folk psychological abilities would have played a vital role in ensuring temporal
cohesion of large dispersed groups (Dunbar 2000:250). Dunbar is explicit about
the relevant implications:
Theory of mind is probably essential for language, not so much because it is in-
volved in the production of speech per se but because it provides a mechanism
that both enables speakers to ensure that their message has got through and allows
hearers to figure out what the speaker's message actually is (subtext and all).
 (Dunbar 2003:224, emphasis added)

Although in some quarters the communicative view of language still has the status
of being the received view, it deserves to be treated with scepticism (Gauker 2003).
In its strong form it rests on the assumption that pre-verbal individuals would
have been capable of propositional thought prior to developing a public medium
appropriate for the expression of such thoughts. Yet if my arguments against the
very idea of content hold good, this idea is thrown into serious doubt and the
communicative conception of language along with it (see Hutto 2006b, 2007c).10

10. The argument developed here only targets those versions of the communicative view of lan-
guage that presuppose the existence of a meaning-conferring Language of Thought. It may be
possible to accept a Gricean analysis of mature language, while rejecting the claim that beliefs

These considerations alone suffice to make Dunbar's conclusions precarious,
to say the least. But his proposal is implausible in any case, if we accept widely
held views about the arrival dates of a symbolic language with the sort of com-
plexity that would have been needed for the reliable formation, encoding and
public expression of content-involving propositional attitudes. Since they lacked
vocal tracts of an equivalent kind to that of modern humans, the anatomy necessary
to produce the full range of human speech was absent in H. [ergaster/]erectus
and certain, if not all, Neanderthals (Lieberman 2000:136). It might be thought
that this was only a barrier to the expression of content-involving propositional
attitudes. But, as I argue elsewhere, if there could be no such attitudes prior to the
establishment of public languages with stable meanings, then there would have
been no thought contents for these early hominid speakers to express in any case
(see Hutto 2007c).
In all, if we assume, in line with standard thinking, that the kinds of linguistic
practices needed to support discursive conversational exchange were still a long
way off, it is much more likely that grooming-at-distance was achieved by the
making of pleasant but meaningless noises (Bickerton 2003:79). This may have
involved the exchanging of familiar idiosyncratic calls and would have required a
reasonable vocal control of the sort we find in the duetting used by chimpanzees,
gorillas and baboons; this is used by these animals to keep in touch with each
other when they are out of one another's sight. Signature calls used in this way
may have served as a vocal analogue of the repeated actions normally involved
in manual grooming. These performances would be, in key respects, like infant-
directed speech. This is a more plausible hypothesis than Dunbar's if we suppose,
as seems likely, that, at this stage, hominids would only have had the capacity to
engage in non-conversational exchanges which had none of the properties of
conversation (Corballis 2003:203).
Building directly on Dunbar's work, Dautenhahn (2002) has proposed that the
capacity to produce and comprehend narratives, taking the form of conversational
gossip, may have evolved in order to resolve the dilemma of living in larger social
groups mentioned earlier: 'the evolutionary origin of communicating in stories co-evolved
with the increasing social dynamics among our human ancestors, in particular
the necessity to communicate about third-party relationships' (Dautenhahn
2002:103–104, 2001:252). Stated in this way, Dautenhahn's proposal seems to

would have been a prerequisite for it. In other words, linguistic utterances have representational
content, the conventional meanings of which are mutually known, but in both ontogeny and
phylogeny, language use does not get its life from belief-based understandings; rather it en-
ables the establishment of such understandings, in time (see Zlatev 2007, this volume; cf. Eilan
2005:13–14).

inherit all of the problems that haunt Dunbar's account. But, in fact, these evaporate
if we distinguish between two importantly different kinds of narrative, those of a
purely dramatic re-enactive sort and those which are linguistically based.11
A mimetic culture could have plausibly sponsored the former, calling on established
canonical forms, roles and figures in doing so. These would have been
the obvious precursors to oral myth and story, but they cannot be identified with
such. There is a more modest rendering of Dautenhahn's scenario, according to
which the dramatic re-enactments of social happenings would have taken the
form of stories that were literally played out. These would have had a recognisable
pre-narrative format and structure of the sort that would make these embodied
stories ripe for verbal rendering (Dautenhahn 2001). Thus, as Donald suggests:
if hominids could comprehend and remember a complex event, such as the
killing of an animal or the manufacture of a tool, they should have been capable
of re-enacting such events, individually or in groups, once mimetic capacity was
established. (Donald 1999:146)

Regular re-enactments of events of special significance may have eventually become
deeply ingrained in the social fabric, thus supporting the establishment
of common customs and habits. Established dramatic re-enactments and non-
linguistic conventions would have been a powerful substitute means of ensur-
ing social cohesion, supplanting or at least supplementing the physical grooming
of individuals. Mimetic interactions of this kind would have helped to solidify
within-group identities, obviating the need for more direct and physically tax-
ing forms of one-to-one social maintenance.12 Either way, if only non-linguistic
grooming methods were used then there is no need to suppose that the early
hominids would have needed (mature) ToMs.
Similar considerations defeat the claim that 'Pretending necessarily requires
a theory of mind' (Baron-Cohen 1999:265, emphasis added). For if the MAH is
even possibly true then a capacity for quite sophisticated forms of pretence (those
powerful enough to have fuelled a robust public theatre) surely did not rest on

11. We can thus re-name Dautenhahn's Narrative Intelligence Hypothesis, which is a direct descendant
of Dunbar's, the Pre-Narrative Intelligence Hypothesis (PIH) to avoid confusion. This
recognises that there are two aspects of children's narrative activity which are too often treated
in mutual isolation: the discursive exposition of narratives in storytelling and their enactments
in pretend play (see Richner and Nicolopoulou 2001:408).
12. This may explain why, being in the same form as the dramatic re-enactments of our ancient
ancestors, 'Children's first narrative productions occur in action, in episodes of symbolic play by
groups of peers, accompanied by, rather than solely through, language. Play is an important
developmental source of narrative' (Nelson 2003:28).

having a ToM.13 Here it helps to recall the recent evidence that has been effectively
marshaled against Leslie's claim that this sort of pretence involves metarepresentation
(cf. Leslie 1987, 1994; Berguno and Bowler 2004).
Moreover, there is a clear connection between the kinds of imaginative re-en-
actments that a mimetic culture would have sponsored and the basis for the kind
of narrative competency young children first exhibit (see Hutto 2006h). Social
dramas are, of course, the very stuff of many narratives: such engagements set
the broad parameters for deciding which events are interesting; these provide the
subject matter for much narration. Without question, stories could have been told
about the actions of our ancient ancestors. Certainly, their lives would have been
dramatic enough. Nevertheless, they, lacking the appropriate medium and established
practice, were in no position to tell such stories. We must be clear about
the order of appearance: narratives could not have been related orally or conver-
sationally in the early stages of our pre-history on the assumption that there can
be no narrative without narration, a point sometimes overlooked by those who
see human life in terms of narratives untold or waiting to be told (Lamarque
2004:394).14 And assuming that H. ergaster/erectus lacked a (sufficiently complex)
language, the resources for conducting discursive conversation and telling stories
would have been missing (Dunn 1991). In place of such discourse, a blossoming
capacity for mimesis may have taken up the slack.

5. First communions

Even if they lacked language it is widely supposed that the hominids as far back
as H. ergaster/erectus would have used a kind of proto-language, one which had
some but not all the features of a full-fledged language. Specifically, if we follow
Bickerton, such a language would have consisted of grammarless utterances,

13. Carruthers puts a very late date on the development of a capacity for imaginative pretence,
suggesting it only emerged at some point between 30,000 and 60,000 years ago, thus restricting it to H.
sapiens sapiens. He supposes that its onset, along with the amassing of cross-domain knowledge
which would have already been made possible through the medium of public language, resulted
in an avalanche of creative thinking, cultural artefacts and novel modus vivendi for our ancient
ancestors (Carruthers 1998:115). He finds support for this hypothesis in the putative fact that
if earlier hominids, such as H. ergaster/erectus, had exercised their imaginations then we would
have seen a much greater impact of this in the patterns of their life. But Carruthers may be
looking for the wrong sort of evidence, for instead of appearing in the form of symbolic art, the
recreative imagination may have left its mark visibly on tool-making industries and invisibly on
a mimetic culture.
14. Or more succinctly: 'a story must be told, it is not found' (Lamarque 2004:394).

rather like the one- and two-word offerings of two-year-olds, being comparable
to pidgins. Thus it might be thought that metarepresentational abilities must be
implicated in the formation of such a basic proto-language and its subsequent
acquisition. The forging and learning of a public lexicon is based on a capacity for
co-reference, and on the received view children require some type of theory of
mind. They need to have an idea of what the parent is thinking and how his/her
mind differs from their own to be so very good at associating the correct utterance
with the correct reference (Mithen 2000b:497). If so, perhaps metarepresenta-
tional mechanisms played an instrumental part in the development of the very
first public languages. The assumption here is that the formation and learning of
a public lexicon rests on the most sophisticated kind of pre-verbal intersubjective
engagement: joint attention. And the idea is that joint attention looks hard to
account for without invoking ToM abilities of some kind or other.
Certainly, the capacity of human infants of around 1 year to attend jointly
with others is the most sophisticated form of non-verbal intersubjective engage-
ment. It involves participants not only attending to the same objects at the same
time; it also requires that they mutually recognise that their attending has a common
focus, that is, that they both attend to one another's attending to one another and
some worldly point of interest. Intersubjective interactions of this kind are thus
of a quite different order from more mirror-like forms of reciprocal imitation.
The capacity builds upon the uniquely human capacities for declarative pointing,
social referencing and, as will become clear momentarily, imaginative perspec-
tive shifting.15
Typically the scene is set for such engagements when one or other of the par-
ticipants draws attention to some object or feature in the local surround, using the
sort of strategies familiar to children once they have learnt to point declaratively,
at around 9–12 months old (Tomasello 2003). Gaze monitoring, checking back
and other social referencing techniques are then used to ensure that the commu-
nicative triangle has been established and that it is maintained. In this way, joint
attention involves attending not just to the object of the other's interest but also
to the intention behind the behaviour (yet without entertaining higher order
thoughts about it) (Brinck 2004:196).
If we focus on this aspect of the phenomenon it might be assumed that it
requires making some kind of simulative leap. Simulation is often presented as a
kind of imaginative attempt to adopt another's perspective on events, a coming

15. Joint attention of this kind has never been shown to be mastered by apes, not even adult
apes. Indeed, the gulf between their abilities and ours in this domain is so wide that it suggests
that to the extent that they can attend jointly at all it is an evolutionarily different version
(Gómez 2005:80; see also Leekam 2005:225).

to see how the world looks from the other's perspective. It is as if I had climbed
in behind your eyes, while at the same time recognising that I am not you. But
there are problems even for those defenders of simulation theory, such as Gordon
(1995, 1996), who only invokes a first-personal transformation as opposed to a
projective version of the theory. For, unless such an act of simulation is assumed
to take place within a context in which the simulators are somehow already in-
dependently aware that the other in question has a different point of view, the
heuristic would be an utterly hopeless means of getting to grips with the perspective
of others; at best it would be a means of becoming the other while losing one's
sense of self (see also Gallagher 2007). In joint attention, by contrast, what is required
is both identification and recognition of difference (Hobson 2007). It involves
having the experience of acting in and attending to a shared world, alongside oth-
ers in response to public objects. This is quite different from the experience of
identifying with others or of merely acting on the world in response to objects.
In such non-verbal engagements I see what the other is attending to, I see
that they are attending to it, and I see that they are attending to my attending to
both the object and to my attending (see Zlatev this volume). Only in this way is
the object recognised as a common focal point. The mechanisms that underpin
the mutual connectedness that enable identification and common focus are likely
to have their ancient roots in mirror neuron systems. Interestingly, key elements
of the human mirror neuron system for grasping are found in Broca's area, an
area of the brain that relates to language production and comprehension. This
has inspired researchers to speculate that there may be interesting, possibly quite
tight, connections between this kind of intentional attunement and basic lexicon
forming abilities (Rizzolatti and Arbib 1998; Billard and Arbib 2002; Arbib 2005).
For example, Arbib (2005) suggests that this part of the brain evolved atop of
those implicated in more basic modes of intersubjective engagement, precisely
because it once subserved a manual, gesture-based form of communication which
he calls proto-sign. It is plausible that resonance systems of this kind may have
played a pivotal role in enabling humans to enjoy a shared world and develop a
common language for describing it; ultimately, the intentional attunements they
subserve may be the basis for the parity between speakers and hearers that enables
first communions (Arbib 2003, 2005; Heiser, Iacoboni, Maeda, Marcus and
Mazziotta 2003).16 These lines of research are especially attractive given the important
link between joint attention and word learning.

16. It seems likely therefore that the first non-syntactic proto-language would have been manually-based.
If so, it would follow that language evolution took a circuitous route, i.e. speech-dominated
language would not have emerged directly from primate call systems. Some claim
that the latter is the more straightforward explanation (Dunbar 2003). That is certainly so, but

Yet, as noted above, to jointly attend requires more than mere identification.
It also requires being able to see the other as other. This fact can make it look as
if such triangulation must be in debt to the services of some kind of inference-
based ToM abilities, of the TT or ST sort. To attend to the other's perceptual take
in the appropriate way can certainly seem to require a capacity for conceptually
distinguishing between self and other, and indeed of making inferences from
me to you based on assumptions of similarity by some means. Characterised in
this way, joint attention might be thought to implicate full-fledged mindreading
abilities involving propositional attitudes. But it ain't necessarily so. As always,
there are richer and leaner interpretations, even when it comes to making sense
of what is perhaps the most sophisticated form of non-verbal intersubjectivity
(Carpendale and Lewis 2004).
There is little doubt that joint attention depends on having a multi-tasking
ability to shift one's perspective across distinct axes, focusing, at different moments,
on the common focal point, the other's attention to the same, and also on
the other's attending to one's own attending. It is quite plausible that such interactions
therefore call on recreative imaginative perspective shifting capacities. But
these should not be confused with mindreading activity of the folk psychological
sort. As those who have done the most to explicate their nature make clear, they
are non-propositional and non-simulative (Currie and Ravenscroft 2003:96).
Seeing another's seeing (whether directed at the world or at my own seeing) does
not involve representing the other's cognitive take on things; it only involves imagining
their perceptual one (see Gallagher and Hutto, this volume). If this is right,
even the most sophisticated form of non-verbal intersubjective engagement does
not involve the manipulation or attribution of propositional attitudes.
We need appeal to nothing more than capacities for intentional attunement
and recreative imagination to explain pre-linguistic joint attention. Most impor-
tantly, such acts do not require their participants to make full fledged proposi-
tional attitude ascriptions. Infantile forms of attending to the attending of others
cannot be explicated in terms of metarepresentational understanding, if that un-
derstanding is only operative at a later stage, after the concept of belief has been
mastered (Moses 2001; Wellman and Phillips 2001; Woodward, Sommerville and
Guajardo 2001). We have little choice but to conclude that non-verbal acts of joint
attention are only based in responsiveness to intentional attitudes, not propositional
ones. Infantile capacities to identify and respond to intentions-in-acting are
quite distinct from the understanding of intentions that rest on having mastered
the concepts of desire and belief (and how these interrelate to form reasons).

there is simply no a priori reason to suppose that straightforward explanations are always the
best ones.

Baron-Cohen claims that 'understanding that words refer presumes the concept
of intention or goal' (Baron-Cohen 1999:267). Yet, as just argued, there is
every reason to think that even if this is true in some sense it does not imply a
capacity for metarepresentation. Put otherwise, even if one were to insist that
some concept of intention is needed it could not be one that is equivalent to
the mature folk psychological variety. It is much more plausible that even in the
normal case, where basic language learning is supported by children and adults
mutually engaging in pre-linguistic acts of joint attention, these only involve a
mutual responsiveness to one another's intentional attitudes.
Not only that, but the argument suffers in any case since having joint atten-
tional abilities is not even necessary for learning the basics of a language. Con-
sider that:
Children with autism show us just how useless a language capacity is without a
theory of mind. Strip out a theory of mind from language use and you have an
individual who might have some syntax, the ability to build a vocabulary and a
semantic system, but what would be missing from their language use and comprehension
is pragmatics ... Language without a theory of mind is not of course
entirely useless. It allows literal communication, acquisition of information from
others, requesting, ordering, etc.
 (Baron-Cohen 1999:266, emphasis added;
 second quotation from footnote on p. 267)

The very existence of such linguistic abilities in individuals with autism serves as
a kind of existence proof; it demonstrates the falsity of the claim that it is neces-
sary to have ToM abilities in order to learn words. As the above quotation makes
amply clear, at least some individuals with autism are surely capable of learning
and using language competently to some extent, despite their ToM deficiencies.
Presumably they are able to achieve this because, even though they cannot jointly
attend with others, they are supported by veteran language users who can. With-
out the normal forms of feedback and checking, their teachers must make even
stronger assumptions than usual about what it is that the autistic child is attend-
ing to when the relevant associations are being forged. It is their job to ensure, to
the best of their abilities, that the initiate is making the appropriate connections
between items of reference and local labels (see Hutto 1999:133–4, 2000:315).
Indeed this line of argument is made more implausible still when it is observed
that full-fledged ToM abilities are not only unnecessary; to use them during basic
word-learning situations would be downright unhelpful. Doing so would make
it difficult even to establish basic referential triangles in the first place. Ironically,
this fact is most evident in the cases of primitive lexicon forging and learning; it
has been noted that:

Young children's deficits in understanding intentional diversity may actually benefit
word learning. As many theorists have noted, the task of interpreting novel
words presents a complex inductive problem because, logically speaking, any
given word could mean any of an infinite set of possible things (Markman 1989;
Quine 1960). (Sabbagh and Baldwin 2005:172)17

It is clear that without a basic agreement in how to respond to things, symbol
grounding and word-learning would not get off the ground. But it could be argued
that something akin to the tendency for default attribution, in which children
use the simple heuristic of attributing beliefs to others based on what they
take to be the case, is at play here (Leslie, Friedman and German 2004). This
would obviate the need to decide between the countless possible ways of construing
the other's communicative intentions. Perhaps, so the thought goes, hominids
had full-fledged ToM abilities after all, but these were only used at their lowest
setting during acts of lexicon formation. But, for obvious reasons, this is a very
poor strategy for arguing that hominids would have needed sophisticated ToM
abilities. If anything, it shows that even if they had them they surely did not need
them. After all, the question is not whether it is in some sense logically possible to
imagine that hominids had sophisticated ToMs; we are trying to assess if there is
any reason to think they actually had them.
Rather than supposing that non-linguistic joint attention rests on prior facility
with referential symbols, it is much more plausible that it (scaffolded by imitative
abilities and the conventions of a mimetic culture) enabled the fashioning
of the very first symbols. This would have permitted a kind of objective thinking,
an awareness of sharing a world with others, of a kind that, when further
supplemented with imaginative and mimetic abilities, could have been a bridge
from perceptually-based forms of intersubjective interaction to amodal symbolic
thinking (see Arbib 2003, 2005; Sinha 2004; Hutto 2006c). Being able to learn to
use genuinely referring symbols is dependent upon our capacity to have a shared
awareness of things, not the other way around.
There are various proposals about the stages of this process on the market.
It is plausible that the most rudimentary forms of mime exploited shared mimetic
schemas, those that tap deeply into what lies at the base of our readily familiar
embodied activities, the kind of activities that can be enacted and re-enacted
(Zlatev 2005, 2007). These would have served as the initial points of connection
with others, drawing on the common ways of acting in the world such as running,

17. For those with full-fledged interpretational abilities the early indicational phase is initially
extremely idiosyncratic: one or only a small number of interlocutors can understand what the
child is indicating, i.e., grasp what aspect of the environment it is to which the child is drawing
attention (Rowlands 1999:196).

hitting, kicking, and so on. Mimetic schemas are more fundamental than their
abstract cousins, image schemas (see Hampe 2005). Still, even the latter can be
thought of as deriving from what is common to basic, embodied ways of respond-
ing in relation to certain normally encountered situations and activities. These
yield certain familiar contrasts such as UP-DOWN, IN-OUT, FRONT-BACK,
LIGHT-DARK, WARM-COLD, MALE-FEMALE (Lakoff and Johnson 1980:58).
Dichotomies of this kind have universal resonance because they feature in every-
day ways of acting, reacting and interacting with the world and others. Moreover,
it is plausible that material artefacts (tools, buildings, furniture, etc.), which can
be the focus of joint attention, act as intersubjective anchors, since the activities
they afford are non-arbitrary in important ways and these can be made canonical
through convention (see Sinha 2005:1542–1543).
Indeed, our tendency to make use of mimesis has stayed with us even after
the establishment of symbolic language. For example, we typically adorn purely
linguistic speech acts with gestures even when these are of no use to the hearer
either for adding expression or for helping to establish significance, such as when
one gestures whilst speaking on the telephone (Corballis 2003). This deeply in-
grained way of connecting with others through mimesis looks to have stayed with
us as something more than an inert cognitive vestige. Thus it has been convinc-
ingly argued that to a large and interesting extent, we only feel we have satisfacto-
rily understood or grasped something once we retreat to embodied schemas of
some kind or other.
However we ultimately choose to make sense of basic mimetic activities, it is
from this sort of starting point that it is possible to sketch the gradual stages of
likely linguistic development in hominids. For once the practice of jointly attend-
ing was well established it is plausible that a basic capacity for mime would have
developed into a more rudimentary form of communication, one involving a kind
of reference to common focal points (where these might be happenings, actions
or objects) even in their absence. In such cases, communicators would need to
bring the relevant objects before the other's mind by some means; the referents
would have to be in some way invoked. It is plausible that early mimes might
have achieved this by drawing on bonds of visuo-motor associations of a sort that
would have been familiar to all participants. In doing so they would have had to
tap into associations holding between certain worldly things (objects of potential
co-attention) and certain salient aspects of mimetic acts which resemble these or
would successfully remind the other of them. For example, this might be done by
using highly stylised gestures, such as wriggling one's arm in slithering fashion to
mimic the movements of a snake.18

18. This is a description of what Zlatev calls triadic mimesis (Zlatev this volume).

Mimetic acts of this kind are not signals since they involve prior communi-
cative intent. They are not attempts to effect more straightforward coordinations
such as initiating imitation (i.e. getting the other to use their arms in a similar
way) or to directly cue a certain kind of action routine on the part of the other
(i.e. by inspiring characteristic responses that the presence of snakes normally
calls forth). To use one's arm so as to invoke thoughts of snakes is not an imperative
act but an attempt at intentional communication, albeit of a crude and unstable
kind. That this is possible is evident from the fact that games such as charades
exist, but, of course, this sort of game has important structural supports that the
imagined mimetic acts just described would have lacked (or would have lacked
in the first instance).
It should be clear that even a rudimentary capacity to use and appreciate
mime would have brought unheralded degrees of freedom and new possibilities
for communication. In key respects, its advent would have made the character of
our ancestors' first communicative efforts, quite literally, dramatically different
from the sorts of signals used for coordination by other animals, even those of
our closest living cousins.19 Mimetic communication requires that others make
the appropriate connections; they must recognise the significance of the commu-
nicative act. And, lacking established conventions, early mimetic acts would have
depended on strong associations and resemblances, and these could hardly be
relied upon. It is not easy to communicate by means of pantomime, even when
using additional supports. To be sure it is a hit and miss affair: definitely more
miss than hit; that is, unless the activity is structured and supplemented in
important ways. Communicating by pantomime using only non-linguistic,
resemblance-based modes of quasi-reference is a weak and highly ambiguous way of
communicating. Failures at successful indication would have been a spur to fall
in line with publicly established norms. This is a matter of negotiating and
adjusting one's methods of communication to suit a public standard, as prompted by
requests for clarification. Both participants forge a common understanding by
recasting the communicative offerings in line with conventional requirements.
This would have been a crucial step on the road from contextual, indicational
communications to true predicative symbolic use.

19. Apes, for example, rarely use declarative as opposed to imperative gestures and only those
with extensive human contact do so at all. Hence, it has been speculated that although apes
can master the referential triangle in their interactions with humans for instrumental purposes
when they are raised in humanlike cultural environments, they still do not attain humanlike
social motivations for sharing experience with other intentional beings (Tomasello and Call
1997:393).

6. Summary and conclusion

The principal aim of this chapter has been to remove a kind of aspect-blindness
that is prevalent in much research on intersubjectivity, namely the idea that our basic
social dealings necessarily rest on inherited theory-of-mind or mentalizing
capacities. My strategy has been to cast doubt on the standard story about when
such capacities were acquired in prehistory by challenging the familiar idea that
hominids must have had mature theory-of-mind abilities in order to have (i) en-
gaged in advanced tool-making, (ii) enjoyed social cohesion and (iii) formed and
learned language.
I have argued that close scrutiny of the available evidence gives no reason for
believing this to be true. An alternative explanation, which I call the Mimetic Ability
Hypothesis (or MAH), appeals to growing recreative imaginative abilities, which
funded impressive technical skills and activities. It appears to have better prospects of
accounting for the sophisticated social engagements of the hominids, even those
implicated in their capacity to form and learn symbolic language.
Much more needs to be said concerning the MAH with respect to the exact
character and level of hominid mimetic abilities, e.g. when and why they will have
emerged and the kinds of activities they will have made possible at the various
stages of hominid development. Making such refinements to the core thesis goes
beyond the ambitions of this chapter, but hopefully the sketch provided suffices
to demonstrate the value of this research programme and the fact that mimetic
abilities (and not ToMMs) could, at least potentially, account for the most
important technical and social feats of our immediate ancestors.
Most importantly, when the two main proposals are compared side by side,
the abductive virtues of the MAH become evident and the suggestion that mod-
ern humans must have inherited mature mindreading devices from our nearest
ancestors looks like a weak and somewhat incredible hypothesis. Even in sketch,
the mere availability of the MAH reveals that we have no overriding reason to
suppose that hominids must have had a sophisticated capacity for folk psycho-
logical understanding.

References

Aiello, L.C. and Wheeler, P. 1995. The expensive tissue hypothesis. Current Anthropology 36:
184–193.
Allen, C. and Saidel, E. 1998. The evolution of reference. In The Evolution of Mind, D.D. Cummins
and C. Allen (eds.), 183–203. Oxford: Oxford.
Arbib, M. 2002. The mirror system, imitation, and the evolution of language. In Imitation in
Animals and Artifacts, K. Dautenhahn and C.L. Nehaniv (eds.), 229–280. Cambridge, MA:
MIT Press.
Arbib, M. 2003. The evolving mirror system. In Language Evolution, M. Christiansen and
S. Kirby (eds.), 182–200. Oxford: Oxford University Press.
Arbib, M. 2005. From monkey-like action recognition to human language: An evolutionary
framework for neurolinguistics. Behavioral and Brain Sciences 28 (2): 105–124.
Arbib, M., Billard, A., Iacoboni, M. and Oztop, E. 2000. Synthetic brain imaging: Grasping,
Mirror Neurons and Imitation. Neural Networks 13: 975–997.
Baron-Cohen, S. 1999. The evolution of Theory of Mind. In The Descent of Mind: Psychological
Perspectives on Hominid Evolution, M.C. Corballis and S.E.G. Lea (eds.), 261–277.
Oxford: Oxford University Press.
Berguno, G. and Bowler, D. 2004. Understanding pretence and understanding action. British
Journal of Developmental Psychology 22: 531–544.
Bermúdez, J. 1998. The Paradox of Self-Consciousness. Cambridge, MA: MIT Press.
Bickerton, D. 2003. Symbol and structure. In Language Evolution, M. Christiansen and
S. Kirby (eds.), 77–93. Oxford: Oxford University Press.
Billard, A. and Arbib, M. 2002. Mirror neurons and the neural basis for learning by imitation:
Computational modelling. In Mirror Neurons and the Evolution of Brain and Language,
M. Stamenov and V. Gallese (eds.), 345–352. Amsterdam/Philadelphia: John Benjamins.
Boesch, C. and Boesch-Achermann, H. 2000. Chimpanzees of the Tai Forest: Behavioural Ecology
and Evolution. Oxford: Oxford University Press.
Brinck, I. 2004. Joint attention, triangulation and radical interpretation. Dialectica 58:
179–205.
Brinck, I. and Gärdenfors, P. 2003. Co-operation and communication in apes and humans.
Mind and Language 18: 484–501.
Call, J. and Tomasello, M. 1999. A nonverbal false belief task: The performance of children and
great apes. Child Development 70: 381–395.
Call, J. and Tomasello, M. 2005. What chimpanzees know about seeing, revisited: An explanation
of the third kind. In Joint Attention: Communication and Other Minds, N. Eilan,
C. Hoerl, T. McCormack and J. Roessler (eds.), 43–64. Oxford: Oxford University Press.
Carpendale, J.I.M. and Lewis, C. 2004. Constructing an understanding of the mind: The development
of children's social understanding within social interaction. Behavioral and Brain
Sciences 27 (1): 79–151.
Carruthers, P. 1998. Thinking in language? Evolution and a modularist possibility. In Language
and Thought: Interdisciplinary Themes, P. Carruthers and J. Boucher (eds.), 94–119.
Cambridge: Cambridge University Press.
Carruthers, P. 2003. Moderately massive modularity. In Minds and Persons, A. O'Hear (ed.).
Cambridge: Cambridge University Press.
Corballis, M.C. 2003. Gestural origins of language. In Language Evolution, M. Christiansen
and S. Kirby (eds.), 201–218. Oxford: Oxford University Press.
Currie, G. and Ravenscroft, I. 2003. Recreative Minds. Oxford: Oxford University Press.
Dautenhahn, K. 2001. The narrative intelligence hypothesis: In search of the transactional
format of narratives in humans and other social animals. In Proceedings of the Fourth
International Cognitive Technology Conference: CT 2001: Instruments of Mind, M. Beynon,
C.L. Nehaniv and K. Dautenhahn (eds.), 248–266. Berlin, Heidelberg: Springer-Verlag.
Dautenhahn, K. 2002. The origins of narrative. International Journal of Cognition and Technology
1: 97–123.
Donald, M. 1991. Origins of the Modern Mind: Three Stages in the Evolution of Culture and Cognition.
Cambridge, MA: Harvard University Press.
Donald, M. 1999. Preconditions for the evolution of protolanguages. In The Descent of Mind:
Psychological Perspectives on Hominid Evolution, M.C. Corballis and S.E.G. Lea (eds.),
138–154. Oxford: Oxford University Press.
Donald, M. 2005. Imitation and mimesis. In Perspectives on Imitation: From Cognitive Neuroscience
to Social Science, S.L. Hurley and N. Chater (eds.), 283–300. Cambridge, MA:
MIT Press.
Dunbar, R.I.M. 1992. Neocortex size as a constraint on group size in primates. Journal of Human
Evolution 20: 469–493.
Dunbar, R.I.M. 1993. Coevolution of neocortical size, group size and language in humans.
Behavioral and Brain Sciences 16: 681–735.
Dunbar, R.I.M. 2000. On the origin of the human mind. In Evolution and the Human Mind:
Modularity, Language and Meta-Cognition, P. Carruthers and A. Chamberlain (eds.),
238–253. Cambridge: Cambridge University Press.
Dunbar, R.I.M. 2003. The origin and subsequent evolution of language. In Language Evolution,
M. Christiansen and S. Kirby (eds.), 217–234. Oxford: Oxford University Press.
Dunbar, R.I.M. 2004. Grooming, Gossip and the Evolution of Language. London: Faber and
Faber.
Dunn, J. 1991. Understanding others: Evidence from naturalistic studies of children. In Natural
Theories of Mind, A. Whiten (ed.), 51–61. Oxford: Blackwell.
Dupré, J. 2001. Human Nature and the Limits of Science. Oxford: Oxford University Press.
Eilan, N. 2005. Joint attention, communication and mind. In Joint Attention: Communication
and Other Minds, N. Eilan, C. Hoerl, T. McCormack and J. Roessler (eds.), 1–33. Oxford:
Oxford University Press.
Gallagher, S. 2005. How the Body Shapes the Mind. Oxford: Oxford University Press.
Gallagher, S. 2007. Logical and phenomenological arguments against simulation theory. In Folk
Psychology Re-Assessed, D.D. Hutto and M. Ratcliffe (eds.), 63–77. Dordrecht: Springer.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Gallese, V. 2003. The manifold nature of interpersonal relations: The quest for a common
mechanism. Philosophical Transactions of the Royal Society of London 358: 517–528.
Gallese, V. and Goldman, A. 1998. Mirror neurons and the simulation theory of mind-reading.
Trends in Cognitive Sciences 2: 493–501.
Gärdenfors, P. 2003. How Homo Became Sapiens: On the Evolution of Thinking. Oxford: Oxford
University Press.
Gauker, C. 2003. Words Without Meaning. Cambridge, MA: MIT Press.
Gómez, J.C. 2005. Joint attention and the notion of subject: Insights from apes, normal children
and children with autism. In Joint Attention: Communication and Other Minds, N. Eilan,
C. Hoerl, T. McCormack and J. Roessler (eds.), 65–84. Oxford: Oxford University Press.
Gopnik, A. 2004. Finding our inner scientist. Daedalus 133 (1): 21–28.
Gopnik, A. and Meltzoff, A.N. 1997. Words, Thoughts, and Theories. Cambridge, MA: MIT
Press.
Goldman, A.I. 2005. Imitation, mind reading and simulation. In Perspectives on Imitation:
From Neuroscience to Social Science. Volume 2: Imitation, Human Development and Culture,
S.L. Hurley and N. Chater (eds.), 79–93. Cambridge, MA: MIT Press.
Gordon, R.M. 1995. Simulation without introspection or inference from me to you. In Mental
Simulation, M. Davies and T. Stone (eds.), 53–67. Oxford: Blackwell.
Gordon, R.M. 1996. Radical simulationism. In Theories of Theories of Mind, P. Carruthers and
P. Smith (eds.), 11–21. Cambridge: Cambridge University Press.
Grice, P. 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Harris, P. and Want, S. 2005. On learning what not to do: The emergence of selective imitation
in tool use by young children. In Perspectives on Imitation: From Neuroscience to Social
Science. Volume 2: Imitation, Human Development and Culture, S.L. Hurley and N. Chater
(eds.), 149–162. Cambridge, MA: MIT Press.
Heiser, M., Iacoboni, M., Maeda, F., Marcus, J. and Mazziotta, J.C. 2003. The essential role of
Broca's area in imitation. European Journal of Neuroscience 17: 1123–1128.
Hobson, P. 2007. We share, therefore we think. In Folk Psychology Re-Assessed, D.D. Hutto and
M. Ratcliffe (eds.), 41–61. Dordrecht: Springer.
Hobson, R.P. and Hobson, J.A. this volume. Engaging, sharing, knowing: Some lessons from
research in autism.
Hurley, S.L. 2005. The Shared Circuits Hypothesis: A Unified Functional Architecture for
Control, Imitation, and Simulation. In Perspectives on Imitation: From Neuroscience to Social
Science. Volume 1: Mechanisms of Imitation and Imitation in Animals, S.L. Hurley and
N. Chater (eds.), 177–194. Cambridge, MA: MIT Press.
Hutto, D.D. 1999. The Presence of Mind. Amsterdam: John Benjamins.
Hutto, D.D. 2000. Beyond Physicalism. Amsterdam: John Benjamins.
Hutto, D.D. 2003/2006. Wittgenstein and the End of Philosophy: Neither Theory Nor Therapy.
Basingstoke: Palgrave Macmillan.
Hutto, D.D. 2004. The limits of spectatorial folk psychology. Mind and Language 19: 548–573.
Hutto, D.D. 2006a. Unprincipled engagements: Emotional experience, expression and response.
In Radical Enactivism: Intentionality, Phenomenology, and Narrative, R. Menary
(ed.), 13–38. Amsterdam: John Benjamins.
Hutto, D.D. 2006b. Against passive intellectualism: Reply to Crane. In Radical Enactivism:
Intentionality, Phenomenology, and Narrative, R. Menary (ed.), 121–149. Amsterdam: John
Benjamins.
Hutto, D.D. 2006c. Four Herculean labours: Reply to Hobson. In Radical Enactivism: Intentionality,
Phenomenology, and Narrative, R. Menary (ed.), 185–221. Amsterdam: John
Benjamins.
Hutto, D.D. 2007a. Folk psychology without theory or simulation. In Folk Psychology Re-Assessed,
D.D. Hutto and M. Ratcliffe (eds.), 115–135. Dordrecht: Springer.
Hutto, D.D. 2007b. The narrative practice hypothesis. In Narrative and Understanding Persons,
D.D. Hutto (ed.), 43–68. Royal Institute of Philosophy Supplement. Cambridge:
Cambridge University Press.
Hutto, D.D. 2008. Folk Psychological Narratives: The Socio-Cultural Basis of Understanding Reasons.
Cambridge, MA: MIT Press.
Jones, S. 2005. Why don't apes ape more? In Perspectives on Imitation: From Neuroscience
to Social Science. Volume 1: Mechanism of Imitation and Imitation in Animals, S.L. Hurley
and N. Chater (eds.), 297–301. Cambridge, MA: MIT Press.
Knoblich, G. and Jordan, J.S. 2002. The mirror system and joint action. In Mirror Neurons and
the Evolution of Brain and Language, M. Stamenov and V. Gallese (eds.), 115–124.
Amsterdam/Philadelphia: John Benjamins.
Lakoff, G. and Johnson, M. 1980. Metaphors We Live By. Chicago: Chicago University Press.
Lamarque, P. 2004. On not expecting too much from narrative. Mind and Language 19:
393–408.
Leekam, S. 2005. Why do children with autism have a joint attention impairment? In Joint Attention:
Communication and Other Minds, N. Eilan, C. Hoerl, T. McCormack and J. Roessler
(eds.), 65–84. Oxford: Oxford University Press.
Leslie, A.M. 1987. Pretense and representation: The origins of Theory of Mind. Psychological
Review 94: 412–426.
Leslie, A.M. 1994. Pretending and believing: Issues in the theory of ToMM. Cognition 50:
211–238.
Leslie, A.M., Friedman, O. and German, T.P. 2004. Core mechanisms in Theory of Mind.
Trends in Cognitive Sciences 8: 528–533.
Lieberman, P. 2000. Human Language and Our Reptilian Brain: The Subcortical Basis of Speech,
Syntax and Thought. Cambridge, MA: Harvard University Press.
Locke, J. and Bogin, B. 2006. Language and life-history: A new perspective on the development
and evolution of human language. Behavioral and Brain Sciences 29: 259–325.
Lovejoy, C.O. 1980. Hominid origins: The role of bipedalism. American Journal of Physical
Anthropology 52: 250.
Markman, E.M. 1989. Categorization and Naming in Children. Cambridge, MA: MIT Press.
Marler, P. and Evans, C. 1995. Bird calls: Just emotional displays or something more? Ibis
138: 26–33.
Meltzoff, A.N. and Moore, M.K. 1977. Imitation of facial and manual gestures by human neonates.
Science 198: 75–78.
Meltzoff, A.N. and Moore, M.K. 1994. Imitation, memory and the representation of persons. Infant
Behaviour and Development 17: 83–99.
Mithen, S. 1996. The Pre-History of the Mind: A Search for the Origins of Art, Religion and Science.
London: Thames and Hudson.
Mithen, S. 2000a. Paleoanthropological perspectives on the Theory of Mind. In Understanding
Other Minds, S. Baron-Cohen, H. Tager-Flusberg and D. Cohen (eds.), 488–502. Oxford:
Oxford University Press.
Mithen, S. 2000b. Mind, brain and material culture: An archeological perspective. In Evolution
and the Human Mind: Modularity, Language and Meta-Cognition, P. Carruthers and
A. Chamberlain (eds.), 207–217. Cambridge: Cambridge University Press.
Mithen, S. 2002. Human evolution and the cognitive basis of science. In The Cognitive Basis
of Science, P. Carruthers, S. Stich and M. Siegal (eds.), 23–40. Cambridge: Cambridge University
Press.
Mithen, S. 2005. The Singing Neanderthals: The Origins of Music, Language, Mind and Body.
London: Weidenfeld and Nicolson.
Moses, L.J. 2001. Some thoughts on ascribing complex intentional concepts to young children.
In Intentions and Intentionality, B. Malle, L.J. Moses and D.A. Baldwin (eds.), 69–84.
Cambridge, MA: MIT Press.
Myowa-Yamakoshi, M., Tomonaga, M., Tanaka, M. and Matsuzawa, T. 2004. Imitation in neonatal
chimpanzees (Pan troglodytes). Developmental Science 7 (4): 437–42.
Nelson, K. 2003. Narrative and the emergence of a consciousness of self. In Narrative and
Consciousness, G.D. Fireman, T.E.J. McVay and O. Flanagan (eds.), 17–36. Oxford: Oxford
University Press.
Papineau, D. 2003. The Roots of Reason: Philosophical Essays on Rationality, Evolution and Probability.
Oxford: Oxford University Press.
Povinelli, D.J. and Vonk, J. 2003. Chimpanzee minds: Suspiciously human? Trends in Cognitive
Sciences 7: 157–160.
Povinelli, D.J. and Vonk, J. 2004. We don't need a microscope to explore the chimpanzee's
mind. Mind and Language 19: 1–28.
Quine, W.V. 1960. Word and Object. Cambridge, MA: MIT Press.
Richner, E.S. and Nicolopoulou, A. 2001. The narrative construction of differing conceptions
of the person in the development of young children's social understanding. Early Education
and Development 12: 393–432.
Rizzolatti, G. 2005. The mirror neuron system and imitation. In Perspectives on Imitation:
From Neuroscience to Social Science, S.L. Hurley and N. Chater (eds.), 55–76. Cambridge,
MA: MIT Press.
Rizzolatti, G. and Arbib, M. 1998. Language within our grasp. Trends in Neurosciences 21:
188–194.
Rizzolatti, G., Craighero, L. and Fadiga, L. 2002. The mirror neuron system in humans. In
Mirror Neurons and the Evolution of Brain and Language, M. Stamenov and V. Gallese
(eds.), 37–62. Amsterdam/Philadelphia: John Benjamins.
Rowlands, M. 1999. The Body in Mind. Cambridge: Cambridge University Press.
Rowlands, M. 2003. Externalism: Putting the Mind and World Back Together. Chesham:
Acumen.
Sabbagh, M.A. and Baldwin, D.A. 2005. Understanding the role of communicative intentions
in word learning. In Joint Attention: Communication and Other Minds, N. Eilan, C. Hoerl,
T. McCormack and J. Roessler (eds.), 165–84. Oxford: Oxford University Press.
Sinha, C. 2004. The evolution of language: From signals to symbols to system. In Evolution
of Communication Systems: A Comparative Approach, D.K. Oller and U. Griebel (eds.),
217–235. Cambridge, MA: MIT Press.
Sterelny, K. 2003. Thought in a Hostile World. Oxford: Blackwell.
Tomasello, M. 2003. On the different origins of symbols and grammar. In Language Evolution,
M. Christiansen and S. Kirby (eds.), 94–110. Oxford: Oxford University Press.
Tomasello, M. and Call, J. 1997. Primate Cognition. New York: Oxford University Press.
Tomasello, M., Call, J. and Hare, B. 2003a. Chimpanzees understand psychological states – the
question is which ones and to what extent. Trends in Cognitive Sciences 7: 153–156.
Tomasello, M., Call, J. and Hare, B. 2003b. Chimpanzees versus humans: It's not that simple.
Trends in Cognitive Sciences 7: 239–240.
Wellman, H. and Phillips, A. 2001. Developing intentional understandings. In Intentions and
Intentionality, B. Malle, L.J. Moses and D.A. Baldwin (eds.), 125–148. Cambridge, MA:
MIT Press.
Whiten, A., Horner, V. and Marshall-Pescini, S. 2005. Selective imitation in child and chimpanzee:
A window on the construal of others' actions. In Perspectives on Imitation: From
Neuroscience to Social Science. Volume 1: Mechanism of Imitation and Imitation in Animals,
S.L. Hurley and N. Chater (eds.), 263–283. Cambridge, MA: MIT Press.
Wittgenstein, L. 1953. Philosophical Investigations. Oxford: Basil Blackwell.
Wittgenstein, L. 1983. Remarks on the Foundations of Mathematics. Oxford: Blackwell.
Wohlschläger, A. and Bekkering, H. 2002. The role of objects in imitation. In Mirror Neurons
and the Evolution of Brain and Language, M. Stamenov and V. Gallese (eds.), 101–113.
Amsterdam/Philadelphia: John Benjamins.
Woodward, A.L., Sommerville, J.A. and Guajardo, J.J. 2001. How infants make sense of intentional
action. In Intentions and Intentionality, B. Malle, L.J. Moses and D.A. Baldwin
(eds.), 149–170. Cambridge, MA: MIT Press.
Wynn, T. 1991. Tools, grammar and the archeology of cognition. Cambridge Archeological
Journal 1: 191–206.
Wynn, T. 2000. Symmetry and the evolution of the modular linguistic mind. In Evolution and
the Human Mind: Modularity, Language and Meta-Cognition, P. Carruthers and A. Chamberlain
(eds.), 113–139. Cambridge: Cambridge University Press.
Zlatev, J. 2005. What's in a schema? Bodily mimesis and the grounding of language. In From
Perception to Meaning: Image Schemas in Cognitive Linguistics, B. Hampe (ed.), 313–342.
Berlin: Mouton de Gruyter.
Zlatev, J. 2007. Language, embodiment and mimesis. In Body, Language and Mind. Vol 1.
Embodiment, T. Ziemke, J. Zlatev and R. Roz Frank (eds.), 297–337. Berlin: Mouton de
Gruyter.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
Zlatev, J., Persson, T. and Gärdenfors, P. 2005. Bodily mimesis as the missing link in human
cognitive evolution. Lund University Cognitive Studies 121.
Part III

Language

Chapter 12

The central role of normativity in language and linguistics

Esa Itkonen

Any natural language consists of rules which are inherently social and norma-
tive. It is the purpose of this chapter to establish the truth of this claim and to
show that it is significant or non-trivial. The argument is based on the ineluctable
place of normativity in any consistent account of language, as shown by
Wittgenstein's private-language argument. Furthermore, the chapter discusses
the relation between semantics and pragmatics and elucidates the ontology of
the social, showing that normativity implies a particular form of intersubjec-
tivity: common knowledge. Finally, I spell out ramifications of the argument for
the empirical study of language within diachronic linguistics, psycholinguistics
and linguistic typology. I conclude by pointing to the possible sources of the
anti-normative bias in much of theoretical linguistics.

1. Introduction

The word "language" can of course be used in many different senses, but it is
reasonable to assume that one sense may be primary. Thus, when we speak of e.g.
English, what kind of entity is it that we mean by this word? More specifically,
is this entity social or non-social (in the sense of individual-psychological)? The
common-sense answer is that it is a social entity. It goes without saying (or so it
seems) that e.g. a dictionary of English is about something that is common to or
shared by all speakers of English, and whatever has these characteristics must be
social by definition. But the scientific answer (e.g. Chomsky 1965) is generally
taken to be that linguistics is part of cognitive psychology, which entails that e.g.
English is, at least primarily, an individual-psychological (and not a social) entity.
I argue in this chapter that, on this particular issue, common-sense is right and
science is wrong, due to the irreducible place of normativity in any consistent
effort to explain the nature of language, as shown by Wittgenstein's private-language
argument (Section 2). This has implications for the relation between semantics
and pragmatics, which I touch upon in Section 3. Furthermore, I elucidate the
ontology of the social, showing that normativity implies a special form of
intersubjectivity, namely common knowledge, with implications for the theme of this
volume. In Section 5, I spell out ramifications of the argument for empirical studies
within diachronic linguistics, psycholinguistics and linguistic typology. Finally, I
conclude by pointing to the source of the anti-normative bias in much of theoreti-
cal linguistics.

2. The private-language argument

The primarily social nature of language can be shown in different ways. I have
always preferred to rely on Wittgenstein's so-called private-language argument,
or PLA for short. PLA has spawned a huge number of publications, among which
Saunders and Henze (1967) still stands out. Considered with all its ramifications,
PLA is anything but simple. A minimalist version of it will be presented in what
follows (but see also Itkonen 1978:91–113, 2003b:120–125).
PLA directs itself against the dominant tradition of Western philosophy, a
tradition equally represented by Descartes, Hume, and Kant. According to this
(Cartesian) tradition, public things and qualities are reducible to subjective expe-
riences, which constitute the rock bottom of knowledge. Moreover, knowledge
of other minds is supposed to be gained on the basis of the argument from anal-
ogy: When I perceive that bodies (constructed out of my sense-impressions and)
resembling mine behave under similar circumstances in the same way as my body
does, I may infer with a high degree of probability that these bodies are possessed
by minds which think and feel in ways similar to mine.
To start with, the incoherence of the Cartesian position may be demonstrated
by a simple conceptual argument. The Cartesian ego, expressed as "I" or "me", is
supposed to be prior to other persons. But, just as there can be no "left" without "right",
there can be no "I" without "you" and "we": "If as a matter of logic you exclude other
people's having something, it loses its sense to say that you have it" (Wittgenstein
1958: §398).
More elaborately, the Cartesian position may be reformulated in linguistic
terms as follows. Since knowledge of the intersubjective or public world is
supposed to be based on subjective or private experiences, the ordinary intersubjective
or public language must (or could) have been preceded by a subjective or
private language. Such a language is private in the twofold sense that it refers to
subjective experiences and its rules are known to one person only.
Wittgenstein (1958: §§243–277 and passim) argues that if a person constructs
a private language and consciously tries to follow its (private) rules, he cannot
know whether or not he has made a mistake. Because the notions of language and
rule presuppose the possibility of making a mistake, i.e. an aspect of the normativity
of language, there can be no private language: "The test of whether a man's
actions are the application of a rule is ... whether it makes sense to distinguish between
a right and a wrong way of doing things in connection with what he does"
(Winch 1958:58).
Presented in outline, PLA goes as follows. Suppose that I am at this very mo-
ment going to (consciously) use some word X of my own private language. My use
of X, i.e. what I mean (or intend to mean) by X, is based on my particular memory
of how I have decided to use X, or how I have used X in the past. Maybe I wish
to check this memory to make sure that I am not mistaken. But the only check
I can rely on is the same memory; and of course it is no independent or genuine
check; in fact it is no check (or basis for testing) at all. Therefore any private
rule-application that seems correct to me will be correct, which means that the
notion of a private rule-application, and thus of a private language, dissolves (cf.
the Winch-quotation above). Written documents, for instance, do not get me out
of this circle, because now the question arises whether I remember correctly the
meanings of the written private words. (Notice that on this reading of PLA the
exact nature of the referent (thing or sense-impression?) is no longer of decisive
importance.) Kenny (1973:192–193) presents this argument exceptionally well.
Genuine checks are provided only by other people's memories, and more generally
by their intuitions about the correct use of (public) language. Of course,
there is no guarantee that these are always trustworthy. But at least they offer
the possibility of genuine testing; and possible testing is certainly preferable to
impossible testing (represented by exclusive recourse to my own memories or
intuitions). This is nothing but the requirement of objectivity (in the sense of in-
tersubjectivity), which is the cornerstone of scientific thinking.
Some readers may still remain unconvinced. Therefore, to further clarify the
issue let us deal with a concrete counter-argument which has kept reappearing in
essentially the same form from the mid-50s onwards. Suppose that I formulate a
private rule according to which what is now called "blue" ought to be called "mlue"
by me. I paint a blue patch on a piece of paper and write "mlue" under it, and on
future occasions I will use this device to make sure that I am indeed following the
rule correctly. Have I not proved that the notion of a private rule is a viable one?
The answer is "No", and here are some of the reasons why. Taken together, the
blue patch and the word "mlue" constitute a picture. When composing this picture,
I may have thought that its meaning is self-evident, i.e. that it can be interpreted
in one way only. But this is wrong. One of Wittgenstein's basic insights is that
every picture or image can be interpreted in an infinite number of ways (and this
is also true of mental images; cf. Blackburn 1984:45–50; Heil 1992:25–30). On
the next occasion when I look at the picture, I may mistakenly think that the rule
was meant to be 'not to say mlue when seeing something blue'; or I may think that
the blue patch was meant to remind me that I should check whether in any of the
world's languages blue is called 'mlue'; and so on.
In other words, the human memory is notoriously fallible. It would be preposterous
to assume that I am the only person in the world whose memory happens
to be absolutely infallible. Now, what is true of memory is true of intellectual
capacities more generally. Human beings may succumb to any kind of aphasia,
delusion or insanity. Today, with the well-documented spread of Alzheimer's disease,
this has become a near-certainty: every one of us, unless released by a timely
death, will become (more or less) insane. Let us keep this in mind when we now
return to the explication of PLA.
Realizing the ever-present possibility of multiple interpretations, I may now
wish to secure the unambiguous meaning of the picture by adding an explicit
written instruction. If I use my own private language, the instruction will look
something like this: 'zmosh # glaark * mlue'. But nothing can guarantee that I will
remember the meanings of these private words correctly, and if I attempt to avert
this danger by further amplifying the instruction, infinite regress will ensue. If, on
the other hand, I use English, the instruction will look like this: 'I ought to say
mlue whenever I see this colour!' But now I am cheating because my supposedly
private language is based on a public one. More importantly, however, this does
not help me at all, because now any of the forms of human frailty alluded to above
may attack me, either one by one or jointly. Perhaps I am colour-blind, but just
do not know it; or perhaps I have become insane and think that, when looking at
a blue patch, I am looking at my face in a mirror; or perhaps the moment when
I lose the mastery of English has already arrived (but I just do not know it) and I
either fail to understand the instruction or think that it says that I should go wash
my teeth; and so on. The upshot is that my rule-following behavior needs check-
ing by others. This is not fool-proof either. (Perhaps everyone is insane.) But at
least it provides the possibility of genuine checking, which my private memory
and understanding cannot provide.
Wittgenstein assumes that in language, like in any other social institution, we
are, or may become, conscious of the rules we either follow or break. Attempts to
dispose of PLA are often based on redefining 'private language' as unconscious
psychological structure, which makes it self-evidently true that everybody has his
or her private language. But the redefinition is unjustified, in the first place. Just
as well one might decide to call the internal structure of individual atoms their
'private languages'.

[Footnote: Barresi and Moore's (this volume) Intentional Relations Theory can be thought of as an
empirical equivalent of PLA. In their requirement that, for a psychological concept to come
about, both the first-person, inner aspect and the third-person, outer aspect are equally
needed, they reproduce the insight that "[A]n inner process stands in need of outward criteria"
(Wittgenstein 1958: §580).]

A useful up-to-date explication of what it is to follow a rule (of language)
is provided by the doctrine of response-dependency or response-authorization
(cf. Pettit 1996: 195–204; Itkonen 1997: 58–60, 2003b: 126–130, 165–168;
Haukioja 2000).
Nothing of what precedes entails in any way that language is exclusively social
in character. Language has of course both a psychological and a biological aspect
or, if you like, substratum. What the preceding discussion is meant to establish is
that language is primarily a social entity.

3. Semantics and pragmatics

Rules or norms do not just lie inertly there; rather, they only exist as rules or norms
of acting. The social view of language, outlined above, suggests that the meaning
of a linguistic expression is identical with its (conventionalized) use: "Look at
the sentence as an instrument, and at its sense as its employment" (Wittgenstein
1958: §421). Here as elsewhere, the form of an instrument is a means to achieve
different goals. Language use, i.e. speaking, is part of the same general means-
ends hierarchy as are all human actions and activities.
Both linguistic meaning and its study are called 'semantics'. More precisely,
semantics is that part of (the study of) meaning which deals with meanings of
words and sentences at the (general) level of the conventional linguistic system,
and not at the (concrete) level of single acts of speaking. However, the actionist
nature of language is present already in semantics. As a semantic entity, a sentence
like 'I will come to see you at midnight' encodes an act of asserting. The acts of
requesting and asking are encoded in imperative and interrogative sentences.
Language is not just action, but also interaction. In the case of requests and
questions (codified as corresponding imperative and interrogative sentences) this
is self-evident because they can only be conceptualized as being directed to some-
one different from the speaker himself. But the same is also true of assertions,
codified as corresponding declarative sentences, as Sibawaihi, the founder of Arab
linguistics, was perceptive enough to realize:

This is how we speak, even if the listener does not ask loud, because what you say
follows the extent of the question he might pose if he were to ask you.
(cf. Itkonen 1991: 155–156)

[Footnote: The interdependence of normativity and consciousness has been explored in an illuminating
way by Zlatev (2007). Zlatev's (this volume) definition of post-mimesis (language) as being
essentially conventional-normative is also consistent with the present argument.]

The same insight was achieved e.g. by Russell (1967 [1940]:24):


In adult life, all speech ... is, in intention, in the imperative mood. When it seems
to be a mere statement, it should be prefaced by the words 'know that'. We know
many things and assert only some of them; those that we assert are those that we
desire our hearers to know.

Pragmatics is that part of the study of meaning which deals with how the general
meaning determined by the linguistic system becomes concrete or specific in single,
either real or imaginary acts of speaking. This requires taking contextual information
into account. In semantics, as noted above, the sentence 'I will come to
see you at midnight' has just the meaning of an assertion. (Which assertion? This
is evident from the lexical content and the grammatical structure.) In pragmatics,
the same sentence (once uttered) becomes, depending on the context, either
a promise (= Romeo is speaking to Juliet) or a threat (= a vampire is speaking
to his future victim). This, in my view, is the relationship between semantics and
pragmatics in a nutshell. It coincides with de Saussure's (1962 [1916]) classic distinction
'langue' vs. 'parole' (see Section 5.1 below).
It may seem natural to assume that pragmatics, concentrating on individual
performance, pertains to psychology. In my view, however, pragmatics too is of
social character. First, the performance is not individual but inter-individual, i.e. it
necessarily takes place between speaker and hearer. Second, this inter-individual
performance is publicly observable, and derives its identity from being (common-
ly) understood as a joint result of convention and context; just think of the Romeo
vs. vampire contrast (cf. Leech 1983; Verschueren 1999). The truth of this state-
ment remains unaffected by the fact that psychological explanations may of course
be provided for any type of behavior (including linguistic interaction).
In sum, semantics is the study of context-independent meaning whereas prag-
matics is the study of context-dependent meaning. This context-independent vs.
context-dependent distinction was captured by Paul (1880 [1975]:Chapter 4) by
means of his terminological dichotomy usuelle vs. okkasionelle Bedeutung (= usual
vs. occasional meaning). Sometimes it has been claimed that the (inter)actionist
nature of language becomes evident only in pragmatics. We have just seen that
such a view is mistaken. At the level of semantics any sentence encodes a frozen
action, and it is the task of pragmatics to melt it (cf. Itkonen 1983: 152–164). It is
also clear that the acts of referring and predicating belong already to semantics,
and not just to (discourse) pragmatics.
The relation between semantics and pragmatics is dynamic in the sense that
when context-dependent meanings recur, they may conventionalize and thus become
part of the linguistic system. This kind of ascent from speech (parole, okkasionelle
Bedeutung) to language (langue, usuelle Bedeutung) is in general characteristic
of language change (cf. Section 5.1 below).
Having defended the social view of meaning (and of language in general), I
may add a few words on why I find its opposite, i.e. the psychologistic view of meaning,
less convincing. To be sure, psychologism may mean many different things,
and in what follows I shall briefly deal only with one version of this doctrine.
It is not uncommon to see meaning equated either with (unconscious) schema
or with (conscious) mental image. First, let us assume that meanings are schemas.
These are hypothetical entities: we do not know what they are, but only presume
what they might be; and they may even be non-existent. (Implausible as this may
sound, it is certainly possible.) In contrast, we do know the meanings of words
like 'midnight' and of sentences like 'I will come to see you at midnight'; it makes no
sense at all to assume that they are non-existent. Therefore meanings cannot be
schemas. It needs to be added immediately that we know the meanings of words
and sentences only at the pre-theoretical level, i.e. we know them merely as data.
We do not know how they should be theoretically analyzed (cf. Section 4.4).
Second, let us assume that meanings are mental images. These are subjective
or vary from one person to the next whereas meanings are intersubjective. (For
instance, the sentence 'I will come to see you at midnight' has only one meaning
in the English language, not as many meanings as there are speakers of English.)
Moreover, mental images may be non-existent. Even for a single speaker, there
seems to be no mental image (or set of mental images) systematically and reliably
connected e.g. with the word 'if'. But if we accept the equation 'meaning = use',
the meaning of 'if' ceases to be a problem. It is enough to state (or list) its different
uses: the transition from cause to effect (= 'If it is raining during the night, the
streets will be wet in the morning') or from effect to cause (= 'If the streets are wet
in the morning, it has been raining during the night'), and so on. (But notice again
that knowing the different uses of 'if' does not entail knowing how they should be
theoretically described.)

[Footnote: Thus, I am in broad agreement with the argument presented by Verhagen (this volume) that
semantics (conventional meaning) is not only a matter of denotation, but also includes argumentative
aspects. I am less willing, however, to agree that these aspects constitute the core of
lexical and grammatical meaning.]

In addition to these specific arguments against viewing meanings as schemas
or mental images, we should heed the more general or philosophical admonition
voiced by Wittgenstein (cf. above): Pictures or images (including schemas)
are never enough. They must always come equipped with instructions about how
they are meant to be interpreted.
When the psychologistic conception of meaning amounts to equating mean-
ing not with any specific mental image, but with subjective experience in general,
it seems to be based on the following type of fallacy:
In order to get clear about the meaning of the word 'think' we watch ourselves
while we think; what we observe will be what the word means! But this concept
is not used like that. (It would be as if without knowing how to play chess, I were
to try and make out what the word 'mate' meant by close observation of the last
move of some game of chess.) (Wittgenstein 1958: §316)

Accepting the equation 'meaning = use' has both a clarifying and a liberating effect.
An enormous amount of time and energy has been wasted on trying to solve the
problem of how meaning exists. But no one is or need be worried about how
the use of a hammer or of a computer exists.

4. The ontology of the social

4.1 Physical and social reality

The ontology of social entities is fundamentally different from the ontology of
physical entities:
There existed electrical storms and thunder long before there were human beings
to form concepts of them or to establish that there was any connection between
them. But it does not make sense to suppose that human beings might have been
issuing commands and obeying them before they came to form the concept of
command and obedience. (Winch 1958:125)

The concept of command is such as to be accessible to consciousness: commands
exist only insofar as they are recognizable as, or known to be, what they are. This
type of knowledge must be shared by all those who issue commands (and either
obey or disobey them). In what follows it will be called 'common knowledge'. This
provides us with a preliminary definition of 'social': Social entities (unlike physical
entities) exist if, and only if, they are commonly known to exist. For instance,
money ceases to exist, i.e. it is just pieces of metal and paper, as soon as people no
longer know that it exists (qua money).
This definition has some interesting consequences. Because a language like
English exists if, and only if, it is commonly known to exist, it follows, among
other things, that the correctness of correct sentences is a social fact, as elucidated
by the following equivalence:
(1) The sentence 'John is easy to please' is a correct sentence (of English) iff the
sentence 'John is easy to please' is commonly known to be a correct sentence.

The formulation (1) is equivalent to the following formulation:

(2) The sentence "'John is easy to please' is a correct sentence" is true iff the sentence
"'John is easy to please' is a correct sentence" is commonly known to be true.

The sentence (2) instantiates the Tarskian T-sentence, which is of the following
general form (cf. Itkonen 1983:112):
(3) X is true iff p

Here p represents the truth condition of X. According to the received view, the
truth value and the truth condition are two different things: we always know the
truth condition of X, i.e. p, and we analyze it in a step-wise fashion, but this hap-
pens independently of whether we know X to be true or false. As far as physical
facts are concerned, it is indeed the case that while we do know the truth condi-
tion of X, we do not know the truth value of X. Now, the example (2) refutes the
received view as applied to social facts, because it shows that, in this crucial do-
main, it is impossible to know the truth condition of X without knowing the truth
value of X (for discussion, cf. Itkonen 1983: 129–135). Thus, at the level of social
facts, the T-sentence has the following form:
(4) X is true iff X is (commonly) known to be true.
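
The contrast between the general schema (3) and its social-fact counterpart (4) can be written out schematically. This is only a notational sketch: the predicate symbols True and CK (read: 'it is commonly known that') are labels introduced here for illustration and are not part of the original formulation.

```latex
% Requires amsmath. True(.) and CK(.) are illustrative labels, not the author's notation.
\begin{align*}
\text{(3)}\quad & \mathrm{True}(X) \leftrightarrow p \\
\text{(4)}\quad & \mathrm{True}(X) \leftrightarrow \mathrm{CK}\bigl(\mathrm{True}(X)\bigr)
\end{align*}
```

In (3) the right-hand side p is stated without any mention of the truth of X, whereas in (4) the right-hand side itself contains True(X). This is one way of picturing the claim that, for social facts, the truth condition cannot be known independently of the truth value.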

A declarative sentence X is used to make a statement (or assertion). In logical
semantics, the truth condition of X is equated with the meaning of X. This view
is too restrictive, but it is certainly the case that knowing the truth condition of
X is part of knowing the meaning of X. We have just seen that, in connection
with social facts, knowing the truth condition (and, more generally, the meaning)
of X entails knowing the truth value of X. But why should we think, in the
first place, that we know the meanings of the words and sentences that we utter?
Wittgenstein (1969: §370) suggests the answer: "I should stand before an abyss if
I wanted so much as to try doubting their meanings..."

4.2 The nature of common knowledge

What does it mean to say that a social entity like the English language is an object
of common knowledge? One way to answer this question, due to Lewis (1969), is
to say that X is an object of common knowledge if, and only if, the three condi-
tions given in (5) are true of X and of (practically) any two members of a com-
munity (where both A and B stand for each of the two):
(5) A knows-1 X
A knows-2 that B knows-1 X
A knows-3 that B knows-2 that A knows-1 X

As abstruse as such a formulation may seem at first, it is quite easy to show that
three-level knowledge of this kind necessarily occurs in all institutional encounters.
Suppose I want to cash a check in a bank. The only reason why, when approaching
the counter, I do not make soothing gestures or shout 'I know what to
do, you don't have to tell me!', is that I possess the relevant three-level knowledge:
Not only do I know-1 what to do; and not only do I know-2 that the teller knows-1
what to do; but I also know-3 that the teller knows-2 that I know-1 what to do.
This type of third-order mentality is also discussed and exemplified by Zlatev
(this volume).
From the logical point of view, there is no way to stop the infinite regress of
different knowledge-levels (= I know that he knows that I know that he knows...).
From the practical point of view, however, this is not a problem. People do not
generally go beyond three- or four-level knowledge. Some people are able to do
this; but nobody masters e.g. ten-level knowledge.
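
The three conditions in (5) can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the tuple encoding of 'knows', the function names `knows`, `level`, `render` and `is_common_knowledge`, and the idea of checking membership in a given set of attributed knowledge statements are all introduced here and belong neither to Lewis's account nor to the present one. The default cut-off at depth 3 mirrors the practical limit just mentioned, not a logical one, and the check quantifies over all pairs rather than 'practically' all.

```python
from itertools import permutations


def knows(agent, content):
    """Encode the statement 'agent knows content' as a nested tuple (hypothetical encoding)."""
    return ("knows", agent, content)


def level(n, a, b, x):
    """Build the level-n statement of (5) from a's point of view.

    level 1: a knows x
    level 2: a knows that b knows x
    level 3: a knows that b knows that a knows x
    """
    # Knowers alternate a, b, a, ... starting from the outermost one.
    chain = [a if i % 2 == 0 else b for i in range(n)]
    statement = x
    # Wrap from the innermost knower outwards.
    for knower in reversed(chain):
        statement = knows(knower, statement)
    return statement


def render(statement):
    """Turn a nested statement into a readable string."""
    if isinstance(statement, tuple) and statement[0] == "knows":
        return f"{statement[1]} knows that {render(statement[2])}"
    return str(statement)


def is_common_knowledge(x, community, attributed, depth=3):
    """Check that every level 1..depth holds for every ordered pair of members.

    `attributed` is a set of statements that members are taken to know;
    how such attributions are justified is left open in this sketch.
    """
    return all(
        level(n, a, b, x) in attributed
        for a, b in permutations(community, 2)
        for n in range(1, depth + 1)
    )


if __name__ == "__main__":
    members = ["A", "B"]
    attributed = {
        level(n, a, b, "X")
        for a, b in permutations(members, 2)
        for n in range(1, 4)
    }
    print(render(level(3, "A", "B", "X")))
    print(is_common_knowledge("X", members, attributed))
```

Raising `depth` shows the regress directly: each further level wraps one more 'knows that' around the previous statement, and nothing in the construction itself forces a stop.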
The explication of 'social' in terms of many-level knowledge has sometimes
been regarded as entailing some sort of philosophical idealism. Our example of
check-cashing behavior should dispel this misunderstanding. The relevant common
knowledge is embodied not just in people's behavior, but also in such physical
artifacts as the bank building, its furniture, the clerk's implements, and so on.
Sinha (1988) rightly emphasizes the importance of taking into account the material
grounding of institutions (including language).
Our example is apt to illuminate another often-misunderstood aspect of common
knowledge. My attitude vis-à-vis the bank teller is not invalidated if it later
turns out that at the moment of our mutual encounter he happened, for instance,
to be either unconscious or suffering from an attack of insanity, which means
that he did not, as a matter of psychological fact, possess the requisite three-level
knowledge about me. A's three-level knowledge about B is not about what B
knows in fact, but what A is entitled to expect B to know: Given the surroundings,
I was entitled to expect that the bank teller whom I was approaching knew his
business, i.e. had the requisite three-level knowledge about me. Hence, common
knowledge turns out to contain a crucial normative element. It is a rational reconstruction
of sociality, not a psychological description of what actually goes on in
people's heads in each and every case:
For in most social situations, if not in all, there is an element of rationality. ... I
refer to the possibility of adopting, in the social sciences, what may be called the
method of logical or rational construction, or perhaps the 'zero method'. ... The
zero method of constructing rational models is not a psychological but rather a
logical method.
(Popper 1957: 140–141, 158; for discussion, see Itkonen 2003b: 131–135)

Common knowledge is usually conceived of as being generalized, and conventionalized
(cf. Section 5.1), out of single instances of (non-normative) third-order
mentality, as described by Zlatev (this volume). But, in many accounts, the notion
is not thought to contain an ineluctable normative element. The original version
of the concept given by Lewis (1969) can be criticized for having ignored precisely
this fact (cf. Itkonen 1978: 182–186). In sum, the social world (explicated by
means of the notion of common knowledge) is permeated by normativity considerations
through and through:
It is perhaps the basic insight of Winch (1958) that we need criteria, whose use is
governed by rules [= norms], to identify entities as same or different, and that as
regards social entities, such criteria are internal to them. (Itkonen 1978:185)

Clark (1996) too considers a language as an object of common knowledge, and
he claims (pp. 75–77), more precisely, that a language qua commonly known is a
set of conventions. This agrees perfectly with my view (even if I prefer the term
'norm'). The conventions include those for lexical entries and those for grammatical
rules, i.e. norms for pairing (morphemic and lexical) forms with meanings
and those for combining meaningful forms into phrases and sentences, as I
would say.
Common knowledge (like knowledge in general) must have a basis. In the
simplest case, the common knowledge of a fact is based on its intersubjectively
observable existence. For instance, the common knowledge that it is raining right
now is based on the fact that (as everybody can see) it is raining right now. But
remember that a physical fact, unlike a social fact, can exist, and typically does
exist, even if it is not commonly known to exist.
What is the basis for linguistic common knowledge, e.g. for (2) in Section 4.1?
It cannot be pinpointed as easily as it can in the case of commonly known physical
facts. It is not a particular happening, like someone uttering 'John is easy to
please' and no one protesting its incorrectness. (To be sure, linguistic common
knowledge must not in general conflict with such particular happenings.) The
basis for common knowledge about the (in)correctness of sentences is diffuse,
in the sense that it is constituted just by general facts about coming to master a
language and by the concomitant common knowledge about those facts. In this
respect linguistic common knowledge is just one instantiation of institutional
common knowledge in general. The most important difference vis-à-vis common
knowledge about physical facts resides in that the basis for linguistic common
knowledge, though undeniably existent, cannot be used to strengthen or justify
that which it is a basis for:
And here the strange thing is that when I am quite certain of how the words are
used, have no doubt about it, I can still give no grounds for my way of going on.
If I tried I could give a thousand, but none as certain as the very thing they were
supposed to be ground for. (Wittgenstein 1969: §§306–307)

4.3 A solution to the controversy between individualism and collectivism

The definition of social ontology given in Section 4.2 dissolves rather than solves
a long-standing controversy within the philosophy of the social sciences. One
side has argued that there is an ontological level of social institutions distinct
from the level of individual persons. The other side has argued that there is
nothing but individual persons (cf. O'Neill 1973). Now we can see that they are
both right. Indeed, there is nothing but individual persons; but what we have
is not just an aggregate of individual persons endowed with arbitrary mental
states and distributed in a random order; rather, we have individual persons endowed
with quite specific mental attributes (namely many-level states of knowledge)
placed in a quite definite structure or pattern (namely that characteristic
of common knowledge). It is this structure that constitutes the ontological level
of social phenomena.
As an analogy, consider the distinction between a single line and a net. On the
one hand, it can be argued that a net consists of nothing but lines, which means
that the line is ontologically primary vis-à-vis the net. On the other hand, the net
is not just a random heap of lines, but a quite specific structure or pattern of lines.
When the lines constitute a net-like structure, then and only then there is this
all-important difference that it is possible to catch fish with a net, but not with a
line. This difference is important enough to be called ontological; and it shows
how increasing complexity makes a new ontological level emerge out of an ontologically
simpler level. It could also be argued that in (dis)solving the controversy
between individualism and collectivism, we eo ipso show that the contrast between
'psychological' and 'social', which was taken for granted in much of the previous discussion,
is more apparent than real. In so doing, we have been forced to revise the
meanings of these two words, i.e. 'psychological' and 'social', to some extent.
The preceding discussion suggests that the metaphor of social network
should be taken seriously. The same analogy may also illustrate the distinction
between (subjective) intuition and (intersubjective) norm, which may at first seem
a little puzzling.
Institutions are about norms. Norms are learned on the basis of observation,
but once they are known, they can no longer be just a matter of observation be-
cause they are made use of to judge whether an observed (or imagined) action is
correct or not:
The correctness of a performance is not among its perceptual characteristics; it
cannot be, since it is a relation between the performance and an adopted rule
[= norm], a relation which is more fully expressed by the statement that the
performance conforms to the adopted rule. (Körner 1960: 117)

The subjective (non-observational) knowledge of norms is called 'intuition'. It is
a general truth, labeled 'Hume's guillotine', that knowledge of norms (i.e. of what
ought to be done) cannot be reduced to observation (of what is done).
In the definition of common knowledge, it is the first level, i.e. 'A knows-1
X', which corresponds to that standard type of (subjective) linguistic intuition
which is used in gathering the data that constitute the basis for grammar-writing:
'A knows that y is a correct sentence'. The second and third levels are also
of intuitional character; but more importantly, they bring out the interactional
nature of language or of social facts in general. Moreover, there is also theoretical
understanding about the three-level knowledge as a whole: Although I am just
one knot in the social network, i.e. a single person qua member of an institution,
whose knowledge and action constitute just a small contribution to its existence,
it is nevertheless possible for me to reflect on the institution as a whole.
The social world, understood as an object of common knowledge, is co-extensional
with Popper's (1972) 'world-3', though without the latter's Platonist
overtones. The ineluctably interactional nature of all social facts was beautifully
captured by Marx and Engels (1973 [1846]: 37):
Es zeigt sich hier, dass die Individuen allerdings einander machen, physisch und
geistig, aber nicht sich machen. (= So we see that in a physical and spiritual sense
individuals make each other, but do not make themselves.)

4.4 Normativity in language

The fundamental distinction between linguistics and any genuine natural science
consists in the fact that the subject matter of the former is inherently normative
whereas the subject matter of the latter is inherently non-normative. Now the no-
tion of normativity needs to be explicated more narrowly.
First of all, we have to establish the distinction between a rule-sentence such
as (6), which describes a rule (or norm), and an empirical hypothesis such as (7),
which describes an (assumed) regularity.
(6) In English, the definite article (i.e. 'the') precedes the noun (e.g. 'man').
(7) All ravens are black.

The difference between (6) and (7) consists in the fact that (7) can be (and in
fact has been) falsified by spatiotemporal occurrences, namely non-black ravens,
whereas (6) is not, and cannot be, falsified. The utterance of a sentence like (8) does
not falsify (6). Why? Because this sentence is incorrect. Nor does the utterance
of a sentence like (9) falsify (6). Why? Because this sentence is correct. Thus, (6)
is unfalsifiable (on the basis of spatiotemporal occurrences).
(8) *Man the came in.
(9) The man came in.
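
The asymmetry between (6) and (7) can be illustrated with a small sketch. It is a toy under loudly stated assumptions: the one-word lexicon `NOUNS`, the function names `counterexamples`, `conforms_to_rule_6` and `judge`, and the crude token test (every 'the' must be immediately followed by a listed noun) are all hypothetical simplifications introduced here, and they ignore adjectives, proper names and everything else a real grammar would need. The point is only structural: observations are tested against the hypothesis, whereas utterances are judged by the rule.

```python
def counterexamples(observations, predicate):
    """Return the observations that falsify 'everything observed satisfies predicate'."""
    return [item for item in observations if not predicate(item)]


# Hypothetical toy lexicon; a real grammar would need far more than this.
NOUNS = {"man"}


def conforms_to_rule_6(utterance):
    """Toy version of rule-sentence (6): each 'the' is directly followed by a noun."""
    tokens = utterance.lower().rstrip(".").split()
    return all(
        i + 1 < len(tokens) and tokens[i + 1] in NOUNS
        for i, token in enumerate(tokens)
        if token == "the"
    )


def judge(utterance):
    """Use the rule as a criterion: the utterance is judged, the rule is not tested."""
    return "correct" if conforms_to_rule_6(utterance) else "incorrect"


if __name__ == "__main__":
    # Empirical hypothesis (7): a single non-black raven counts against it.
    observed_ravens = ["black", "black", "white"]
    print(counterexamples(observed_ravens, lambda colour: colour == "black"))

    # Rule-sentence (6): neither utterance counts against the rule.
    for sentence in ["Man the came in.", "The man came in."]:
        print(sentence, "->", judge(sentence))
```

In the first half a deviant observation is returned as evidence against the generalization; in the second half a deviant utterance is simply classified as incorrect, and the rule remains what it was.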

The difference between rule-sentences and empirical hypotheses has been occa-
sionally recognized in the philosophy of the social sciences, e.g. by Ryan (1970),
who, to be sure, fails to distinguish between rules (= object of description) and
rule-sentences (= description):
A causal generalization has only one task to fulfil, namely telling us what will
and will not happen under particular conditions, irregularities are thus falsifying
counter-examples to the causal law. But rules [i.e. rule-sentences] are not falsifiable
in any simple way, except of course that it may be false to say that there is
a rule, and breaches of a rule are errors on the part of those whose behavior is
governed by it. (p. 141; emphasis added)

In general, however, the distinction at issue has remained in some sort of methodological
limbo. On the one hand, one may be willing to admit that perhaps, just
perhaps, there may indeed exist something that resembles this distinction. On
the other hand, one refuses to draw any methodological consequences from the
(possible) existence of this distinction.
What is at issue here is the normativity of language: sentences are normative
(i.e. correct or incorrect) entities whereas birds are not (or, at least, not in the same
sense as sentences are). The normativity of language is ignored in traditional philosophy
of language, as shown by the fact that the distinction between sentences
and (e.g.) birds is ignored. On the face of it, this is a curious fact, because philosophy
of language is brimming with talk about rules of language. In practice, however,
no examples of these rules are ever given. Because the discussion is carried out at
such a high level of generality, the distinction between sentences and (e.g.) birds
is destined to remain hidden. Among philosophers of language, to be sure, there
are some laudable exceptions, for instance Cavell (1971a [1958], 1971b [1962]).
In reality, the meanings of words are all based on corresponding rules: there
are rules which determine that 'three' designates a number, i.e. 3, and not a plant,
whereas 'tree' designates a plant and not a number; and so on for all words of all
languages. These rules attach meanings to forms. And then there are rules that determine
how meaningful forms have to be combined. One rule of this kind is described
by our rule-sentence (6). Other such rules deal with facts of government
(= rection) and agreement. It is correct to say 'I confided in him' and incorrect to
say 'I confided from him'; it is correct to say 'I am upset' and incorrect to say 'You am
upset'; and so on. As noted before, Clark (1996) assumes the existence of two corresponding
types of rules. For any rule it is possible to construct a corresponding
rule-sentence. The status of rules may be clarified by the following remarks:
The problem for the grammarian is to construct a description ... for the enormous
mass of unquestionable data concerning the linguistic intuition of the native
speaker (often himself). (Chomsky 1965: 20; emphasis added)

Few users of language know much in any systematic way about their language,
though obviously they can discover any number of odd bits of correct information
simply through self-observation. (Hockett 1968: 63; emphasis added)

Because of their trivial or pre-theoretical character, rules and corresponding rule-sentences
possess no linguistic (or scientific) interest whatever. However, their
philosophical (or metascientific) significance is enormous. They show that, contrary
to what is the case in the natural sciences, the basic data of grammatical
description are not particular entities (= single spatio-temporal occurrences), but
general entities (= norms) described, in principle, by general and unfalsifiable
sentences. This insight constitutes the core of response-dependency (mentioned
in Section 1.1).
The standard reaction to what precedes is to say that if the rules/norms of
language are known in an unfalsifiable way, or with certainty, there is nothing
left for the grammarian or linguist to do. But consider the case of Panini (c. 400
BC), the greatest grammarian of all (Dixon 2002:145). At the pre-theoretical
level, his contemporaries knew Sanskrit just as well as he did. But only he was able
to construct the grammar that was to bear his name. Thus, once the data are in,
everything still remains to be done. Similarly, Chomsky and Hockett clearly imply
that there is a job for them to do, whatever odd bits of correct and indubitable
information the average speaker may possess about his language.
The same point can be made by briefly returning to the notion of truth con-
dition. As Wittgenstein so eloquently put it, we stand before an abyss if we start
to doubt whether or not we know the meanings of the words and sentences that
we use. But of course we know them only at the pre-theoretical level. We know
that "John is easy to please" is a correct English sentence (unlike e.g. "*John is easy
from please") and that it means the opposite of "John is difficult to please", but we
do not know the best theoretical description of this (or any other) sentence. Any
theoretical description is falsifiable by definition. But falsification in grammatical
description is not what it is in the natural sciences.
There are many other standard objections against the distinction between
rule-sentence (= A) and empirical hypothesis (= B), for instance:

– If English were different, A would be falsified.
– In English (as it is now), A is verified and any other formulation of the same
  facts is falsified.
– The definite article does not (always) precede the noun (just think of Ivan
  The Terrible).
– Maybe A is not falsifiable by simple observation, but neither are scientific
  theories.
– The terms 'definite article' and 'noun' are theoretical, not pre-theoretical.
– A and B are formulated in dissimilar ways.
– Not all rules of English are of the same type as the one described by A.
– The existence of the rule described by A is a contingent and not a necessary
  fact.
– A is not an analytical sentence.
– English has also statistical and experimental aspects not captured by A-type
  sentences.

Such and similar objections have been brought together and answered in Itkonen
(2003b:Chaps 3, 6, 7); see also Section 5 below.
It should also be pointed out that the mere existence of the normativity of
language is enough to refute all varieties of physicalism (or naturalism), i.e. of
the view that physical data is all there is. If you argue for this view, you must do
so in the language of physics (and/or philosophy); and the language you use is not
physical (or naturalistic), but normative.

4.5 Correctness vs. rationality

In typical linguistic behavior, rational actions are performed by uttering correct
sentences. It is quite possible, however, to perform irrational actions by uttering
correct sentences, and to perform rational actions by uttering incorrect sentences,
which shows that the dimensions of correctness and rationality are independent
from each other.
Since the use of language exemplifies the general means–ends hierarchy, as
noted in Section 3, it is amenable to so-called rational explanation, which is a
general explanatory model for human (and even animal) behavior:
To explain an action as an action is to show that it is rational. This involves show-
ing that on the basis of the goals and beliefs of the person concerned the action
was the means he believed to be the most likely to achieve the goal.
 (Newton-Smith 1981:241)

Even irrational behavior can be explained, if at all, only by means of rational ex-
planation, namely by exposing the reason why it was performed. This involves
coming to understand how behavior that is irrational in fact came to seem rational
to the agent. The transition from goals to means followed by the carrying-out
of the means, as codified in rational explanation, can be seen as the causal force
that brings about linguistic behavior investigated in such distinct linguistic sub-
disciplines as psycholinguistics, sociolinguistics, and diachronic linguistics (cf.
Itkonen 1983).
Using language must consist of the continuous making of linguistic choices,
consciously or unconsciously, for language-internal (i.e. structural) and/or
language-external reasons. (Verschueren 1999:55–56; emphasis added)

This innocuous-looking statement, once its implications are spelled out, justifies
the use of rational explanation.

5. Normativity and beyond: Language change, language psychology and typology

5.1 Language change: The need for statistics

Language change entails that old norms (or rules) are replaced by new ones.
Comparative Indo-European linguistics started with the idea of grammaticaliza-
tion. Thus, Franz Bopp claimed in 1816 that, for instance, the endings of Sanskrit
verbs had originally been full personal pronouns (cf. Arens 1969:177). To give
another example, let us consider the Modern French constructions venir de + INF
and aller + INF. Originally these had the concrete local meanings 'come from INF'
and 'go INF'. Then in some contexts these constructions were reanalyzed as also
having the temporal meanings 'recent past' and 'near future'. At first, these meanings
were more or less accidental or 'pragmatic'; but later they became conventionalized
or 'semantic'. (As noted in Section 3, this 'pragmatic vs. semantic' distinction is
just a reformulation of Paul's (1975 [1880]) distinction between okkasionelle and
usuelle Bedeutung.) That new conventions or norms had emerged was evident
as soon as the temporal meanings were extended to contexts where the old
concrete and non-temporal meanings are impossible, as shown in (10) and (11).
(10) Il vient de mourir ('he has just died' < 'he comes from dying')
(11) Il va s'éveiller ('he will wake up' < 'he goes wake up')

The mechanism of grammaticalization (= reanalysis-cum-extension) is discussed
e.g. by Itkonen (2002). It is a curious fact that while in theoretical linguistics much
attention has been devoted to the notion of conventionalization, the logically pri-
mary notion of convention (or normativity) has remained practically unknown.
The (typical) linguist takes the existence of language for granted. He is not
competent by training to answer the phylogenetic question concerning the origin
of language. Nor is it his business to reconstruct the process through which norms
may have emerged out of an attempt to coordinate originally non-normative ac-
tions (cf. Lewis 1969). This does not mean, however, that these are not worthwhile
questions to be asked in an interdisciplinary framework.
Traditionally, grammarians have been relying on self-invented example sen-
tences, which means that traditional synchronic linguistics has been based on intu-
itional data (for extensive documentation, see Itkonen 1991). The use of intuitional
data unites such otherwise dissimilar approaches as generativism (= Chomsky 1965;
Jackendoff 1994), cognitive linguistics (= Lakoff 1987; Langacker 1987), and con-
struction grammar (= Goldberg 1995; Croft 2001). The reliance on intuitional data
is fully justified in so-called clear cases (exclusively focused upon by the six lin-
guists just mentioned), but elsewhere one has to resort to observation of actually
occurring utterances, which entails the use of statistics.
Norms of language may be more or less binding, i.e. they may determine the
correctness of expressions or sentences either in a discrete (either-or) way or in
a non-discrete (more-or-less) way. In most languages, for instance, the norms of
word order are non-discrete while the norms of affixal morphology are discrete.
The norms of word meaning are open, in the sense that there is a discrete core
surrounded by a non-discrete periphery: "It is only in the normal cases that the
use of a word is clearly prescribed..." (Wittgenstein 1958:142).
Even when the norms are discrete, the (normative) behavior they subsume
is non-discrete, which is another way of saying that they may be broken (either
deliberately or inadvertently). A much discussed example is the t/d deletion in
today's English (cf. Hudson 1997). The (discrete) norms determine the phonological
form of the noun mist ('fog'), the past tense left of the verb to leave, and
the past tense missed of the verb to miss. But in actual practice, the word-final t/d
may or may not be present, and in these three cases it is typically retained in the
following proportion: 50% – 65% – 80%. There is the experience of this statistical
pattern (based on observation), in addition to the (intuitive) knowledge of the
above-mentioned discrete rules. This duality can be captured by assuming that
what a discrete norm determines is a prototype: while a prototype is defined by its
typical features, any of these may be overridden in exceptional cases. The impor-
tant thing is that this duality must not be explained away. In particular, it would
be wrong to try to reduce the discrete norm to the corresponding non-discrete
and statistical behavior. This follows from the fact, mentioned in Section 4.3, that
'ought' cannot be reduced to 'is'.
When the percentage of the norm-following behavior drops below 50%, at
the latest, we are witnessing a diachronic process which turns a discrete norm
into a non-discrete one and, in general, ultimately leads to its disappearance. This
amounts to a change of the prototype, which in turn equals a language change.
This can be a lengthy process. For instance, in one hundred years the correct
pronunciation of today's mist may actually be [mis]. To give a less speculative
example, it took some 300 years (i.e. between 1450 and 1750) for the construction
exemplified by (12) to be replaced by the construction exemplified by (13) as part
of the emergence of the auxiliary system of Modern English.
(12) Saw he the dragon?
(13) Did he see the dragon?

First, the latter structure was totally incorrect, and in the end it came to be totally
correct. In between, there was a gradual shift that can be described only in statisti-
cal terms (cf. Hudson 1997). In other words, language change is a prime example
of less-than-clear cases.
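
The duality discussed above, between a discrete norm and the statistical behavior it subsumes, can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: the three retention rates are the figures cited in the text for mist, left and missed, whereas the names RETENTION, CHANGE_THRESHOLD, norm_status and report are hypothetical, and the sharp 50% cut-off is a simplification of the phrase "below 50%, at the latest".

```python
# Retention rates of word-final t/d as cited in the text (cf. Hudson 1997).
RETENTION = {"mist": 0.50, "left": 0.65, "missed": 0.80}

# Hypothetical cut-off, simplifying the text's "below 50%, at the latest".
CHANGE_THRESHOLD = 0.50


def norm_status(retention_rate: float) -> str:
    """Classify a discrete norm by how often behavior actually follows it."""
    if not 0.0 <= retention_rate <= 1.0:
        raise ValueError("retention_rate must lie between 0 and 1")
    if retention_rate < CHANGE_THRESHOLD:
        return "norm in change"
    return "norm in force"


def report(rates):
    # The norm (word-final t/d belongs to the form) is the same for every word;
    # only the observed frequency of norm-following behavior differs.
    return {word: norm_status(rate) for word, rate in rates.items()}


if __name__ == "__main__":
    for word, status in report(RETENTION).items():
        print(f"{word}: retained {RETENTION[word]:.0%} of the time, {status}")
```

The norm is stated once and stays the same for all three words, while the observed frequencies vary; the sketch keeps the two apart instead of deriving the norm from the frequencies.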
It is easy to see that Saussure's terminological distinction between langue and
parole captures the following dichotomy: on the one hand, language as a system
of norms accessible to conscious intuition; on the other, actual spatio-temporally
specifiable linguistic behavior that is accessible to observation.

5.2 Language and the psychology of language: The need for experimentation

"La langue est une institution sociale" (Saussure 1962 [1916]:33). It is a general
fact that an institution or, more generally, any rule-system S can be described or
formalized in many different ways. This means that different people may view S
from different perspectives and with different descriptive goals in mind. Thus,
there is no a priori reason to assume that the description of S must aim at cap-
turing the way that S has been internalized by those who have learned it. For
instance, it is possible to describe S so as to achieve either a maximal degree of
operational efficiency or a maximal degree of logical simplicity. The types of de-
scriptions of S that result from adopting either one of these two perspectives will
differ from each other, just as they will both differ from the type of description of
S that sets the psychology of the users of S as its goal:
But what would that grand success [of sequence-extrapolating algorithms] teach
us about human perception, pattern recognition, theory formation, theory revision,
and esthetics? Nothing – nothing at all.
This ... brings out the vastness of the gulf that can separate different research
projects that on the surface seem to belong to the same field. ... Today's wonderfully
powerful chess programs, for instance, have not taught us anything about
general intelligence – not even about the intelligence of a human chessplayer!
Well, I take it back. Computer programs have taught us something about
how human chessplayers play – namely, how they do not play. And much the
same can be said for the vast majority of artificial-intelligence programs.
 (Hofstadter 1995:52–53)

This is a very clear formulation of the fact that there is a difference between a
description of S, or D1, and a description of the psychology of S (= P-S), or D2.
Thus, D1 and D2 refer to, and describe, two distinct entities, namely S and P-S.
The understanding of this distinction has been made needlessly difficult by ambiguous
terminology. On the one hand, P-S is often called 'knowledge of S'. On
the other hand, S is by definition 'commonly known'. This creates the wrong
impression that there is no difference between S and P-S nor, consequently,
between D1 and D2.
For the sake of clarification, consider the following analogy. If I describe the
moon as I see it with the aid of a telescope, it is still the moon that I describe, and
not my vision (enhanced by the telescope). If I genuinely wish to concentrate on
my vision, and not on the moon, then I have moved from astronomy to the psy-
chology of vision. Exactly the same remarks apply to the distinction between D1
and D2, as Hofstadter so well demonstrates. It is only D2 which aims at psychological
reality whereas D1 has other desiderata (e.g. efficiency or simplicity).
Once you have grasped this distinction, you realize that it applies practically
everywhere. For instance, there is a difference between geometry and the percep-
tion of geometrical figures and shapes (cf. Itkonen 1983:13). In just the same
way, there is a difference between formal logic and psychology of logic (cf. Itkonen
2003a:Chap. XV). In linguistics, the matters may at first seem less clear. Therefore
it is good immediately to point out that there are quite uncontroversial cases of
non-psychological grammatical descriptions. For instance, it is a fact, pointed out
by Paul Kiparsky (p.c.), that Panini's grammar does not strive after psychological
reality. Similarly, in arguing against the view that linguistics is psychology, Katz
(1981) operates with the concept of optimal grammar:
[There should be no] constraints that impose a ceiling on the abstractness of
grammars by tying them down to one or another particular [i.e. physical or
psychological] reality (p. 52).
A grammar G is an optimal grammar for the language L, if ... G ... implies
every true evidence statement about L ... and there is no grammar simpler than G ...
(p. 67; emphasis added).
[O]n the most natural definition, an optimal grammar is a system of rules
that predicts each grammatical property and relation of every sentence in the
language and for which there is no simpler (or otherwise methodologically better)
such predictively successful theory. (Katz 1985:201; original emphasis deleted)

However, Katz's references to optimal grammar remain rather unconvincing,
because he is unable to exemplify this concept. Therefore it is important to
emphasize that, within the world history of linguistics, this concept has already been
exemplified rather well, namely by Panini's grammar:
[Panini's grammar] is the most comprehensive generative grammar written so far
(Kiparsky 1979:18). Modern linguistics acknowledges [Panini's grammar] as the
most complete generative grammar of any language yet written, and continues to
adopt technical ideas from it. (Kiparsky 1993:2912)

The same laudatory view of Panini's grammar has been both documented and
argued for in Itkonen (1991:Chap. 2, esp. pp. 68–70). In the present context it is
important to understand that, in addition to being the best generative grammar,
Panini's grammar is, by Kiparsky's own admission (cf. above), also a non-psychological
grammar, which means that it is indeed a serious candidate for being
the Katz-type optimal grammar.
The notion of non-psychological or autonomous linguistics has been analyzed
in Itkonen (1978) and Kac (1992). Katz (1981, 1985) gives it a Platonist
interpretation, but there is really no reason to do so:
The properties Katz assigns to abstract objects appear all to be possessed by the
kind of conventions of mutual knowledge that Esa Itkonen argues are constitutive
of linguistic rules (Itkonen 1978; not cited in Katz 1981). (Pateman 1987:52)

While language is identical with a system of (social) norms, psychology of language
is identical with the structures and processes involved in speech understanding
and production as well as in the mental storage of linguistic units. In
Itkonen (1983) this distinction was conceptualized as holding between (social)
norms and (individual-psychological) internalizations-of-norms. It is in connec-
tion with the latter that the need for experimentation arises. This can be illustrated
by means of what is probably the most famous example in recent decades.
The standard theory of generative linguistics, as expounded in Chomsky
(1965), made use of a descriptive apparatus consisting of transformations that
convert deep structures into surface structures. This is one possible method of
presenting intuition-based data in a systematic way; indeed, it was already used by
Apollonius Dyscolus, who wrote the oldest extant syntactic treatise of the Western
tradition (cf. Itkonen 1991:206–211). But is it also psychologically adequate?
And how can this question be answered, in the first place?
Experimentation provides the answer. If transformations are psychologically
real processes, they must take time to be performed. Hence, the hypothesis is that
there are longer reaction times connected with producing and/or understanding
sentences that involve more (rather than fewer) transformations. Experimental
data give this verdict: "[T]he hypothesis that the operations that the subjects
performed were grammatical transformations is actually disconfirmed by the data"
(Fodor et al. 1974:241).
That it is perfectly legitimate to use transformations in grammatical description
(= autonomous linguistics) in spite of their psychological non-reality shows
that, in Hofstadter's (1995) words, there is a "gulf" that separates intuition-based
autonomous linguistics from experimental psycholinguistics. More precisely, the
data of the former is of pre-experimental character; it is a precondition for the data
of the latter: "One cannot make experiments if there are not some things that one
does not doubt" (Wittgenstein 1969:337).
The existence of pre-experimental linguistic knowledge has occasionally been
acknowledged: "It is pointless to run an experiment which shows that if something
is a pencil, appropriately motivated English speakers will call it 'pencil'.
Anyone who knows English knows that already" (Fodor et al. 1974:399–400).
"This type of experiment would be a slightly absurd exercise, with the results a
foregone conclusion" (Wason and Johnson-Laird 1972:78). However, the larger
implications have remained unexplored and poorly understood.
The ambiguity of non-psychological vs. psychological study of language is
well illustrated by the notion of analogy. On the one hand, analogy may be just
a convenient descriptive device for presenting the data. On the other, analogy
may be meant to capture the actual structure-cum-process that brings linguistic
behavior about (cf. Itkonen 2005a).

5.3 The nature of typological linguistics

Up to now we have come across three distinct types of linguistic data, namely
intuitional, observational, and (observational-)experimental. The two latter types
deal with frequencies of spatio-temporal occurrences and thus require a statistical
mode of description. This division of labor between different linguistic subdisci-
plines was already set forth in Itkonen (1977) and (1980).
What is the status of typological linguistics from the present perspective? An
in-depth analysis of the reference grammars of ten more or less exotic languages
reveals a general lack of any statistical means of description (cf. Itkonen 2005b).
This shows that, once again, we are dealing with intuitional data. In many cases,
however, what we have is not the intuition of a (field) linguist, who, while writing
his grammar, may still be in the process of learning the language to be described,
but the intuition of his informant(s). In other words, we are dealing with elicita-
tion. Haiman (1980:xi) gives an eloquent account of this method:
I will always remember Kamani Kutane for his thought experiments: given a min-
imally contrasting pair of sentences, he would construct elaborate background
stories which would be appropriate for only one of these sentences. Eventually I
would understand one of these, and we could move on. It was by means of such
continued thought experiments that he was able to make clear to me the meaning
of that most mysterious of all Hua forms, the gerund -gasi.

As shown by this quotation, and as argued in Itkonen (2004), the study of exotic
languages is based on empathy as a form of intersubjectivity, or in Collingwood's
(1946:218) words our capacity of rethinking the same thought which created
the situation we are investigating, and thus coming to understand this situation.
But once we have become aware of empathy in this context, we realize that we
have been using it all the time. For instance, we can explain the grammaticalization
of the constructions venir de + INF and aller + INF in the way we do (cf. Section
5.1), only because we understand the processes of reanalysis and extension
that are involved here; and we understand them, because we can re-enact them,
i.e. we realize that we could have done the same thing. On reflection, this turns out
to be an application of the model of rational explanation (cf. Section 4.5).

6. Conclusion: The roots of the anti-normative bias in theoretical linguistics

Considering everything that has been said so far, one naturally wonders: Why
has there been such a pronounced inclination to ignore the ineluctably normative
character of language? There are many reasons, of which I mention here only two.
First, there is sheer intellectual laziness:
[It is wrong] to consider the salient features of an object as representative of its
totality. In this way the evident concreteness of the sound of words leads one to
ignore the extent to which use, however intangible, is necessary to word-hood.
 (Friedman 1975:94, emphasis added; discussed in Itkonen 1978:182–183)

Notice that it is the same, or very similar, fallacy that underlies the entire Carte-
sian tradition mentioned in Section 2. This is the Cartesian argument in outline:
I see, and hence I know, that this thing in front of me is a burning candle; but I do
not see anyone else in the room; thus when I know what I know about the thing in
front of me, I am alone; therefore my knowledge is not social but subjective; and
what is true of my knowledge here and now is true of every type of knowledge.
Once this argument has been spelled out, one cannot help marvelling how simple,
and simple-minded, it really is.
Second, there is the temptation to replace the (normative) 'correct vs. incorrect'
distinction by the (non-normative) 'possible vs. impossible' distinction. Thus,
Jackendoff (1994:49–50) claims that, unlike a sentence like "Harry thinks Beth is a
genius", a sentence like "Amy nine ate peanuts" is not a possible sentence of English.
However, it is not only the case that this is a possible sentence of English. We see
with our own eyes that it is also an actual sentence of English, namely incorrect
English. It must be actual because (an exemplification of) it occurs in space and
time (cf. Dretske 1974:24–25; Itkonen 2003b:142–144).
But why should it be tempting, in the first place, to replace normative by non-normative?
Because of the prestige enjoyed by the natural sciences. The data of
physics is inherently non-normative. From this, it has been wrongly inferred that
the data of linguistics too must be non-normative, come what may.

Note. The discovery of mirror neurons seems to have revitalized the notion of empathy, as shown
in detail by Barresi and Moore (this volume).

Is there, then, no normativity in the natural sciences? Of course there is. Just
think of protophysics, which investigates the set of norms for measuring space,
time, and mass (cf. Böhme 1976). But protophysics is not physics: "It is one thing
to describe methods of measurement, and another to obtain and state results of
measurement" (Wittgenstein 1958:242). As argued in Itkonen (1978:42–48) and
elsewhere, protophysics is in a certain sense a methodological equivalent of
autonomous linguistics. Still, this is an imperfect analogy because what protophysics
deals with are norms of researchers, not of research objects.
In sum, I have argued in this chapter that normativity is indispensable for the
existence of language, and that it has been, often without self-awareness, pivotal
for linguistics from its very dawn. To remain blind to this obvious fact, a strong
bias has indeed been needed.

Acknowledgments

I wish to thank Jordan Zlatev for comments and for his help in editing an earlier
version of this chapter.

References

Arens, H. 1969. Sprachwissenschaft: Der Gang ihrer Entwicklung von der Antike bis zur
Gegenwart. Band I. Frankfurt a/M: Athenäum.
Barresi, J. and Moore, C. this volume. The neuroscience of social understanding.
Blackburn, S. 1984. Spreading the Word. Oxford: Oxford University Press.
Böhme, G. (ed.) 1976. Protophysik. Frankfurt a/M: Suhrkamp.
Cavell, S. 1971a [1958]. Must we mean what we say? In Philosophy and Linguistics. C. Lyas
(ed.), 131–165. London: Macmillan.
Cavell, S. 1971b [1962]. The availability of Wittgenstein's later philosophy. In Philosophy and
Linguistics. C. Lyas (ed.), 166–189. London: Macmillan.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Clark, H. 1996. Using Language. Cambridge: Cambridge University Press.
Collingwood, R.G. 1946. The Idea of History. Oxford: Clarendon Press.
Croft, W. 2001. Radical Construction Grammar. Oxford: Oxford University Press.
Dixon, R.M.W. 2002. Australian Languages. Cambridge: Cambridge University Press.
Dretske, F. 1974. Explanation in Linguistics. In Explaining Linguistic Phenomena. D. Cohen
(ed.), 21–41. Washington, D.C.: Hemisphere Publishing Company.
Fodor, J., Bever, T. and Garrett, M. 1974. Psychology of Language. Cambridge, MA: The MIT
Press.
Friedman, H.R. 1975. The ontic status of linguistic entities. Foundations of Language 13(1):
73–94.
Goldberg, A. 1995. Constructions. Chicago: University of Chicago Press.
Haiman, J. 1980. Hua: A Papuan Language of the Eastern Highlands of New Guinea.
Amsterdam: Benjamins.
Haukioja, J. 2000. Grammaticality, response-dependency, and the ontology of linguistic
objects. Nordic Journal of Linguistics 23: 3–25.
Heil, J. 1992. The Nature of True Minds. Cambridge: Cambridge University Press.
Hockett, C.F. 1968. The State of the Art. The Hague: Mouton.
Hofstadter, D. 1995. Fluid Concepts and Creative Analogies. London: Penguin Books.
Hudson, R. 1997. Inherent variability and linguistic theory. Cognitive Linguistics 8(1): 73–108.
Itkonen, E. 1977. The relation between grammar and sociolinguistics. Forum Linguisticum
I/3: 238–254.
Itkonen, E. 1978. Grammatical Theory and Metascience. Amsterdam: Benjamins.
Itkonen, E. 1980. Qualitative vs. quantitative analysis in linguistics. In Evidence and
argumentation in linguistics. T.A. Perry (ed.), 334–366. Berlin: de Gruyter.
Itkonen, E. 1983. Causality in Linguistic Theory. London: Croom Helm.
Itkonen, E. 1991. Universal History of Linguistics: India, China, Arabia, Europe. Amsterdam:
Benjamins.
Itkonen, E. 1997. The social ontology of linguistic meaning. SKY: The Yearbook of the
Linguistic Association of Finland: 49–80.
Itkonen, E. 2002. Grammaticalization as an analogue of hypothetico-deductive thinking. In
New Reflections on Grammaticalization. I. Wischer and G. Diewald (eds.), 413–422.
Amsterdam: Benjamins.
Itkonen, E. 2003a. Methods of Formalization beside and inside both Autonomous and
non-Autonomous Linguistics. University of Turku: Publications in General Linguistics 6.
Itkonen, E. 2003b. What is Language? A study in the Philosophy of Linguistics. University of
Turku: Publications in General Linguistics 8.
Itkonen, E. 2004. Typological explanation and iconicity. Logos and Language V/1: 21–33.
Itkonen, E. 2005a. Analogy as Structure and Process: Approaches in Linguistics, Cognitive
Psychology and Philosophy of Science. Amsterdam: Benjamins.
Itkonen, E. 2005b. Ten non-European Languages: An Aid to the Typologist. University of Turku:
Publications in General Linguistics 9.
Jackendoff, R. 1994. Patterns in the Mind. New York: Basic Books.
Kac, M. 1992. Grammars and Grammaticality. Amsterdam: Benjamins.
Katz, J. 1981. Language and Other Abstract Objects. Oxford: Blackwell.
Katz, J. 1985. An outline of Platonist Grammar. In The Philosophy of Linguistics. J. Katz (ed.),
172–203. Oxford: Oxford University Press.
Kenny, A. 1975. Wittgenstein. London: Penguin Books.
Kiparsky, P. 1979. Panini as a Variationist. Cambridge, MA: The MIT Press.
Kiparsky, P. 1993. Paninian linguistics. In The Encyclopedia of Language and Linguistics,
Vol. 1(6). R.E. Asher (ed.), 2918–2923. Oxford: Pergamon Press.
Körner, S. 1960. The Philosophy of Mathematics. London: Hutchinson.
Lakoff, G. 1987. Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Langacker, R. 1987. Foundations of Cognitive Grammar, Vol. I: Theoretical Perspectives.
Stanford, CA: Stanford University Press.
Leech, G. 1983. Principles of Pragmatics. London: Longman.
Lewis, D. 1969. Convention. Cambridge, MA: Harvard University Press.
Marx, K. and Engels, F. 1973 [1846]. Die deutsche Ideologie. Werke, Band 3. Berlin: Dietz
Verlag.
Newton-Smith, W.H. 1981. The Rationality of Science. London: Routledge.
O'Neill, J. (ed.) 1973. Modes of Individualism and Collectivism. London: Heinemann.
Pateman, T. 1987. Language in Mind and Language in Society. Oxford: Clarendon Press.
Paul, H. 1975 [1880]. Prinzipien der Sprachgeschichte. Tübingen: Niemeyer.
Pettit, P. 1996. The Common Mind, 2nd ed. Oxford: Oxford University Press.
Popper, K. 1957. The Poverty of Historicism. London: Routledge.
Popper, K. 1972. Objective Knowledge. Oxford: Oxford University Press.
Russell, B. 1967 [1940]. An Inquiry into Meaning and Truth. Pelican Books.
Saunders, J.T. and Henze, D.F. 1967. The Private-Language Problem. New York: Random
House.
Saussure, F. de. 1962 [1916]. Cours de linguistique générale. Paris: Payot.
Sinha, C. 1988. Language and Representation: A Socio-Naturalistic Approach to Human
Development. New York: Harvester.
Verhagen, A. this volume. Intersubjectivity and the architecture of the language system.
Verschueren, J. 1999. Understanding Pragmatics. London: Arnold.
Wason, P.C. and Johnson-Laird, P. 1972. Psychology of Reasoning. Cambridge, MA: Harvard
University Press.
Winch, P. 1958. The Idea of a Social Science. London: Routledge.
Wittgenstein, L. 1958. Philosophical Investigations, 2nd ed. Oxford: Blackwell.
Wittgenstein, L. 1969. On Certainty. Oxford: Blackwell.
Zlatev, J. 2007. Language, embodiment and mimesis. In Body, Language and Mind, Vol. I:
Embodiment. T. Ziemke, J. Zlatev and R. Frank (eds.), 297–337. Berlin: Mouton de Gruyter.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 13

Intersubjectivity and the architecture of the language system

Arie Verhagen

Certain lexical and grammatical units encode aspects of intersubjective coordination.
On the basis of discourse connectives, and especially of negation and
complementation, linguistic communication is argued to be inherently argumentative,
a matter of influencing other people's attitudes and beliefs. Intersubjectivity
is built into the very structure of grammar, and systematic properties of
grammar show that mutual influencing, rather than just sharing information or
joint attention, is at the heart of human language. Because of that, language can
on the one hand be seen as a special case of animal communication systems,
which basically involve management and assessment of other organisms, notably
conspecifics. On the other hand, an important difference is precisely that
this management and assessment is indirect, presupposing shared knowledge,
and aimed at other minds.

1. Introduction

Human languages have several features that are candidates for the status of distinctive
characteristic in comparison to communication systems of other animals
(Hockett 1958). Some of these have a special connection to the concept of intersubjectivity,
understood as the mutual sharing of experiential-conceptual content
between subjects of experience. Thus, the basically conventional character of the
relation between (observable) form and (unobservable) function in the symbols of
human languages presupposes intersubjectivity: conventions are mutually shared
solutions to coordination problems, rules that are followed because of the expectation
that others will follow them and because one knows that others expect one
to follow them (Lewis 1969; Keller 1998; Itkonen, this volume, traces the origins
of this insight to Wittgenstein's famous argument against private language and
argues that it entails that linguistic phenomena are inherently normative in a way
that does not allow a reduction to strictly physical phenomena). Being mutually
shared is at the core of any definition of intersubjectivity (Zlatev, this volume),
so that linguistic symbols, being conventional, are necessarily intersubjectively
grounded. The way linguistic conventions emerge, change and are maintained
thus provides a special window on human intersubjectivity.
Another such feature is referentiality: the systematic use of a signal to make
another individual pay attention to a specific phenomenon in the world. Systematic
links between a signal and the external world have been shown to exist in
other animal communication systems, but their character seems to differ systematically
from that of linguistic symbols. With respect to the famous case of the
different alarm calls of vervet monkeys (Cheney and Seyfarth 1990), distinct for
leopards, eagles, and snakes, Tomasello (2003:10) comments:
It seems as if the caller is directing the attention of others to something they
do not perceive [...], that is the calls would seem to be symbolic (referential).
But several additional facts argue against this interpretation. First, there is basically
no sign that vervet monkeys attempt to manipulate the attentional or mental
states of conspecifics in any other domain of their lives. Thus, vervets also have
different grunts that [...] mainly serve to regulate dyadic social interactions not
involving outside entities, such as grooming, playing, fighting, sex, and travel.
Second, predator-specific alarm calls turn out to be fairly widespread in the animal
kingdom. They are used by a number of species, from ground squirrels to
domestic chickens, that must deal with multiple predators requiring different
types of escape response (Owings and Morton, 1998), but no one considers them
to be symbolic or referential in a human-like way.

The special kind of referentiality found in human communication by means of
language thus also seems to be intimately tied up with intersubjectivity. It crucially
involves a triadic relationship of sharing attention for an outside object with another
individual. A similar comment applies to the discovery of individual vocal
signatures, not dependent on voice characteristics of the caller, used by bottlenose
dolphins (Janik, Sayigh and Wells 2006). These have been compared to names in
human languages, because of the fact that they identify an individual uniquely
through the shape of the signal (not by voice characteristics), i.e. a dyadic relationship,
in this case between language and the world.
But humans use their own names only when introducing themselves
to strangers; it is others who use names to address a specific individual and to
talk about such an individual to others. In contrast, about half of a dolphin's
whistles in the wild consist of signature whistles (Janik et al. 2006:8295), while
in human language use, first person pronouns like English "I", i.e. the same form
for different individuals, belong to the most frequent words (e.g. De Jong 1979
for spoken Dutch). Thus, the use of deictic elements such as "I" as well as that of
names in human language is dependent on understanding so-called role-reversal,
which involves a triadic relationship of joint attention for an object (Tomasello
1999:103-107). Taking the common core of intersubjectivity to consist precisely
of sharing (of some mental content) and joint attention as a paradigmatic
instance (Zlatev, this volume), intersubjectivity is precisely what distinguishes the
dolphins' signature calls from the way human names work, and from human ways
of referring to oneself.
Still, detailed studies of animal communicative behavior are highly relevant to
understanding human behavior and human language, if only because they help to
unwrap initially holistic concepts such as referentiality, names, and the like into
different aspects, some of which have clear parallels in the animal kingdom (in
the cases mentioned: picking out a specific category of phenomena in the world,
or unique identification of individuals). In that way they contribute to linking human
language to other phenomena in the natural world, to the prospect of a more
complete understanding of language, a cultural phenomenon, as grounded in
biology, and thereby to linking culture to nature.
Now how about the concept of intersubjectivity itself? Can we distinguish
different aspects of this phenomenon too, such that at least some of them can
insightfully be regarded as comparable to aspects of animal communication? In
this chapter, I want to argue that we can, and in fact that we should, given a proper
understanding of crucial components of meaning and grammar.

2. Is intersubjectivity something completely different?

Tomasello (2003:12) lists the following points of difference between language and
communicative signals of other primate species:

1. Language is socially learned and transmitted culturally.
2. Linguistic signals are conventional, i.e. understood intersubjectively (cf.
Section 1).
3. Linguistic signals are not used dyadically to regulate social interaction directly,
but rather they are used in utterances referentially (triadically) to direct
the attentional and mental states of others to outside entities (ibid.).
4. Linguistic signals are sometimes used declaratively, simply to inform other
persons of something, with no expectation of an overt behavioral response
(ibid.).
5. Linguistic signals are fundamentally perspectival in the sense that a person
may refer to the same entity as "dog", "animal", "pet", or "pest", or to the same event
as "running", "fleeing", "moving", or "surviving" depending on her communicative
goal with respect to the listener's attentional states (ibid.).

1. At this point, Tomasello refers to Dunbar (1996), who puts forward the hypothesis that
language originated in the process of gossip, the sharing of information for purposes of social
bonding.

Properties (2), (3) and (4) necessarily involve intersubjectivity. Property (5) may
function, as Tomasello indicates, in an intersubjective way, but it certainly need
not: different construals of the same entity or event may also be useful for a single
individual's interaction with the world, as different categorizations (e.g. as "pet" or
as "pest") invite different types of action. Property (2), conventionality, has already
been discussed; intersubjectivity here provides the foundation for the way linguistic
signals function in a community. It is in properties (3) and (4) that intersubjectivity
enters into the character of the messages conveyed by linguistic signals
themselves. Moreover, the two (directing someone's attention to an outside entity,
and informing someone of something) are obviously closely connected.
With respect to these two features, Tomasello construes the specific character
of human language in opposition to that of communicative signals of other primates;
language involves joint attention and sharing information, whereas animal
communication is dyadic, and consists of inducing behavior, such as an escape
response in conspecifics. Owings and Morton (1998), to whom Tomasello refers
in this connection, have developed this idea in great detail for many species using
vocal communication. They describe their approach themselves in a programmatic
way as follows:
This book provides a discussion of animal vocal communication that avoids human-centered
concepts and approaches, and instead links communication to
fundamental biological processes. [...] Animals use signals in self-interested
efforts to manage the behavior of other individuals, and they do so by exploiting
the active assessment processes of other individuals. [...] Communication
reflects the fundamental processes of regulating and assessing the behavior of
others, not of exchanging information. (Owings and Morton 1998:i)

Consider the vervet monkeys' alarm calls mentioned by Tomasello (cf. above).
Even if the call is species-specific, there is no reason to say that its meaning consists
of reference to the predator (the individual, or the category). The meaning of the
call is to induce predator-specific escape responses. The way Owings and Morton
characterize animal communication presupposes that exchange of information
does constitute the basic function of human communication by means of language,
and as we have seen, Tomasello also construes some of the crucial differences
between animal communication and human language in this way.

2. In cognitive linguistics, this is known as the fundamental phenomenon of construal
(Langacker 1987). For an overview, see Verhagen (2007).
3. Feature 1 does not imply intersubjectivity. For example, elements and structure of birdsong
are culturally transmitted (transferred by learning, observation, memorization and copying of
behavior; cf. Hultsch and Todt 2004), and intense interaction facilitates this learning and the
quality of the result, but it certainly does not presuppose any mutual sharing of memory or
experience.

But what if human language is also fundamentally a matter of regulating and
assessing others, with exchange of information being secondary? No doubt, the
descriptive power of human languages greatly exceeds that of animal communication
systems (as far as we know), but that does not yet imply that linguistic
meaning primarily consists in descriptive information and that regulatory effects
are derivative; in principle, it may still be the other way around. Precisely this latter
position is a crucial part of the conceptual framework developed in Verhagen
(2005). It is this idea that I will develop and demonstrate further in this chapter.
The evidence I will be considering consists of systematic characteristics of linguistic
elements, especially from the domain of grammar, i.e. words and constructions
that provide the scaffolding for sentences and discourse.

3. Argumentativity: Concepts and methods

3.1 Argumentativity and conventional meaning

When one individual produces a linguistic utterance for another one, and this other
individual understands it, the result is in systematic ways always more than the
participants jointly focusing on the same object of conceptualization in the same
way. It also consists in inducing, and engaging in, inferential reasoning. Normal
language use is never just informative, but always argumentative, in the terminology
of Anscombre and Ducrot (1989). Engaging in verbal communication comes
down to, for the speaker/writer, an attempt to influence someone else's thoughts,
attitudes, and sometimes immediate behavior, even when a speaker simply says
"Over there" in response to a Wh-question like "Where is the bus stop?" (cf. below, end
of this section). For the addressee it involves finding out what kind of influence it
is that the speaker/writer tries to exert, and deciding to go along or not. In terms
of intersubjectivity: the process of verbal communication involves partially shared
and partially divergent experiential-conceptual content, that communicating subjects
attempt to coordinate on by means of (the speaker) attempting to influence
the other's inferences and (the addressee) assessing such attempts.

4. Behavioral biologists may to some extent differ on the question whether a notion of information
has any role at all to play in explaining animal communication, but such differences
are relatively marginal. Thus, although Bradbury and Vehrenkamp (2000) do not agree entirely
with Owings and Morton (1998), their initial statement also reads: "It is widely agreed that animal
signals modulate decision making by receivers of the signals" (Bradbury and Vehrenkamp
2000:259, referring to the seminal work of Dawkins and Krebs 1978).

In itself, this is not incompatible with an information view of linguistic meaning.
The constant, conventional function of ordinary words and constructions
might consist in the information they provide, with rhetorical effects coming on
top of that, depending on the context, and thus being variable. However, Anscombre
and Ducrot argue for the opposite position, which is therefore sometimes
characterized as a theory of "argumentativity in the language system". The default
condition for ordinary expressions, in this view, is that they provide an argument
for some conclusion, and this argumentative orientation is what is constant in the
function of the expression, while its information value is more variable.
For example, in a commentary to the Dutch national Budget for the year 2001
(the most favorable one in many years), government officials from the Ministry of
Finance wrote that there was a prospect of a "negative deficit", thereby indicating
that there were more reasons than ever to control the budget. A criterion of adequacy
for a semantic theory is that it should explain why the effect of this expression
on addressees is systematically different from that of the expression "surplus",
despite the fact that this is truth-functionally equivalent. The point is that the word
"deficit" is conventionally associated with warning, i.e. counts as an argument to cut
spending. The use of the word "negative" does not reverse this argumentative status.
On the contrary, it strengthens the point because it adds its own rhetorical force,
which points in the same direction as "deficit" (that of warning). As with "deficit", this
must be considered an inherent part of the meaning of the word.

5. When pronominal reference to the roles of speaker/writer and hearer/addressee is called
for, I will adopt the practice of using feminine forms (she, her) for the former, and masculine
ones (he, his) for the latter.
6. This idea of argumentativity in natural language is to an important extent in agreement
with the basic position adopted by Levinson (2000): most inferences associated with an expression,
even if they are defeasible, are conventional, and not computed on-the-fly, contrary to
certain traditional and newer approaches in (Gricean) pragmatics, notably Relevance Theory (cf.
Sperber & Wilson 1986); i.e. they cannot be conceived of, in a cognitively realistic approach to
semantics and pragmatics, as conversational implicatures that are derived on the basis of some
strictly truth conditional content plus knowledge of the context. Rather, such inferences are derived
because of the use of the expression itself. Thus, the argumentativity approach I advocate
here and Levinson's concept of generalized implicatures are in agreement about the idea that
strength is a normal and crucial part of semantics, and also with Itkonen's (this volume) notion
of conventional meaning as "frozen action". An important difference between the argumentativity
approach and these other pragmatic-semantic approaches to meaning in natural languages
resides in the distinction, in the argumentativity approach, between orientation and strength.
As we will see in the remainder of this chapter, precisely this distinction allows for a general
treatment of some seemingly distinct phenomena. More generally, the argumentativity approach
comprises, in a single conceptual framework, a number of notions that have been developed
independently of each other in different fields of pragmatic research (cf. also Note 8).

It is the conventional meanings of these words that allowed the writers of
this text to use them in attempting to regulate the attitudes of their readership
in a way that is in their interests. If this effect were something that comes on top
of the informational value of the utterance (inferred by readers in context, after
having computed the information), then it is impossible to explain why there is
a systematic difference in signal value between "negative deficit" and "surplus": the
information value, which in this view is the starting point for readers' inferences,
is not different, so readers should be able to reach the same conclusions in both
cases. But in fact, the inferences to be drawn from these two expressions are each
other's opposites. What we can now observe is that the information value of the
term "deficit" is more variable than its argumentative value: in combination with
the term "negative", it is compatible with situations that can also be described as
"surplus", but even in such a combination, what remains constant is the argumentative
value of the signal.
Or consider a very simple sentence as in (1) (Ducrot 1996:42).

(1) There are seats in this room.

What are the properties of situations in which the utterance of (1) is appropriate?
At first sight, the argumentative character of (1) may not be apparent, and one
might think that understanding the utterance just consists of knowing how to
check it against reality: are there seats in the room or not? Here it is crucial to take
into account that understanding an utterance at least includes knowing how it fits
into the ongoing discourse, i.e. how it relates to preceding and following utterances.
People do not communicate by means of isolated sentences, but by means of
discourse consisting of multiple utterances that enter into specific relations with
each other, such as question-answer, cause-consequence, problem-solution, and
the like. There are even special classes of elements that provide instructions on
how to connect pieces of discourse, i.e. anaphora, and especially: different kinds
of conjunctions ("and", "but", "because", etc.) and connecting adverbs and adverbial
phrases ("so", "as well", "yet", etc.), jointly: discourse connectives. So when investigating
the meaning of an utterance containing the word "seat", as in (1), we should not
only look at how it relates to the/some world, but also at the kinds of discourse
that it fits in a coherent way, and the kinds that it does not fit well.

7. This is not to say that the argumentative value of an expression can never be reversed, but
this requires the use of special elements (e.g. a negative argumentative operator like "barely"); cf.
Section 4.

So consider what happens when the utterance following (1) is something like
"They are uncomfortable". How to connect this to (1)? The obvious way is to use a
contrastive conjunction like "but". Something like "and moreover" would be highly
incongruous. Schematically (# indicating lack of coherence):
(2) There are seats in this room.
a. But they are uncomfortable.
b. #And moreover, they are uncomfortable.

The reverse is the case if the next utterance is "They are comfortable":

(3) There are seats in this room.
a. #But they are comfortable.
b. And moreover, they are comfortable.

What (2) shows is that (1) as such induces an addressee to make positive inferences
about the degree of comfort provided in this room. This is apparent from
the need to use the contrastive conjunction "but" when the next utterance cancels
this inference (because of "uncomfortable"), and from the strangeness of the additive
connective in (2b). Saying that the seats are uncomfortable is not adding a
simple piece of information to the information about the presence of the seats.
When "comfortable" is used (rather than "uncomfortable"), the pattern is reversed, as
shown in (3): here the inference induced by (1) is reinforced, so the additive connective
in (3b) is appropriate, and the contrastive one in (3a) is not.
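
The diagnostic in (2) and (3) can be restated as a small sketch. It is only an illustration of the reasoning, not part of the original analysis: the orientation values (+1 for "supports the inference that the room offers comfort", -1 for "undermines it") are hand-assigned assumptions, and the function name `coherent` is hypothetical.

```python
# Toy model of the connective diagnostic in examples (2) and (3).
# Orientation values are hand-assigned assumptions for illustration only.
ORIENTATION = {
    "There are seats in this room.": +1,  # induces a positive inference about comfort
    "They are uncomfortable.": -1,        # cancels that inference
    "They are comfortable.": +1,          # reinforces that inference
}


def coherent(first: str, connective: str, second: str) -> bool:
    """Return True if the connective fits the orientations of both utterances."""
    same = ORIENTATION[first] == ORIENTATION[second]
    if connective == "but":
        # Contrastive connective: expects opposite orientations.
        return not same
    if connective == "and moreover":
        # Additive connective: expects the same orientation.
        return same
    raise ValueError(f"unknown connective: {connective}")
```

On these assumptions, `coherent("There are seats in this room.", "but", "They are uncomfortable.")` holds, as in (2a), while the same pair joined by "and moreover" does not, as in (2b).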
Thus, an utterance like (1) counts as an attempt by the speaker to convince the
addressee of some point that goes beyond the information provided. Moreover,
this is part of the conventional function of the expression in (1). One simply does
not know the meaning of "seat" if one can only distinguish objects as belonging to
the class or not, but does not know that it licenses this kind of inference. In view
of this, we may say that the meaning of the word is its contribution to the argumentative
value of utterances in which it occurs.
On this basis, it can also be seen in what sense even an apparently simple
piece of information such as the answer "Over there" to the Wh-question "Where
is the bus stop?" is argumentative, too (cf. the beginning of this section). Observe
the use of the contrastive connective in "Over there, but the last bus has already
left" or "Over there, but the line has been temporarily re-routed". Clearly, it is a matter
of convention that providing information about the location of the bus stop
counts as inducing the addressee to make certain inferences, probably about the
possibility to take a bus at that location in the near future (the question will also
have been taken as having such a desire as its background).
The normal situation for linguistic meanings seems to be that their argumentative
value is tied to a particular way of construing a situation or some aspect
of it. Having only some conventional rhetorical strength constitutes a rather restricted
type of language (Verhagen 2005:18); it may be found in elements such
as words for a greeting ("Hello") or an apology ("Sorry"). Notice that these cases show
that the expression of a positive attitude towards the addressee has the status of
an interpretation. Using such expressions counts as a greeting or an apology, irrespective
of the actual attitude of the speaker with respect to the addressee or the
issue at hand, although, naturally, an inference about the speaker's mental state is
often justified. In the same vein, the meaning of "That's great!", for example as uttered
in response to an interlocutor's announcement of a job offer, is not primarily
an expression of the speaker's attitude, but a signal to the addressee that the
speaker acknowledges the addressee's (right to a) positive evaluation. Again, this
normally licenses an inference about the speaker's actual mental state, but it is not
the primary meaning from which the rhetorical value is derived. Rather, it is the
other way around: we infer the speaker's personal mental state from the argumentative
value of an expression.

3.2 The concept of topos

In exactly what way is the relevant argumentative dimension determined? In order
to see how this works, recall that the inferential load of utterances is crucially
involved in the way they relate to each other in connected discourse. Discourse
consists of chains of inferential steps, including the possibility to reject one or
more steps, and change direction. Consider the exchange in (4).
(4) A: Do you think our son will pass his courses, this quarter?
B: Well, he passed those of Winter Quarter.

In a purely information-oriented perspective, B's utterance should be said not
to address the question posed by A. So why can this be a coherent piece of discourse?
The reason is, again, that every utterance is taken as orienting the addressee
towards certain conclusions by invoking some mutually shared model in
which the object of conceptualization figures, a "topos" in Anscombre and Ducrot's
terminology. In our culture it is a rule, mutually known to the members of the
culture, that passing some test normally licenses the inference that one will be
able to pass other tests as well; in other words, the topos is that if someone passed
a test, it is more likely that he will be able to pass other tests than that he will not.
Notice the use of terms like "normally" and "more likely" in the formulation of this
rule: it is a kind of default rule, not a universally valid one.
Given such a topos, it is valid to infer from the statement "He passed his
courses" that he is probably capable of successfully performing certain tasks, like
taking courses of this kind. In this way, B's utterance can count as a coherent,
in principle positive, answer to A's question. That is, creating an argumentative
connection is what appears to make a set of utterances into a coherent discourse.
Again, an addressee takes an utterance not (just) as an instruction to construe an
object of conceptualization in a particular way, but as an instruction to engage
in a reasoning process, and to draw certain conclusions; it is typically not just attending
to the same object, but understanding what the speaker/writer is getting
at (what she wants you to infer), that counts as successful communication. And
understanding what it is that your interlocutor wants you to infer constitutes a
move from a relatively indirect relation between coordinating minds (through
shared attention for some object), to a more direct one; as we shall see, it is this
more direct inter-subjects connection that certain grammatical constructions
operate on.
The predicative use of ordinary adjectives, e.g. about size or quantity, also
provides good illustrations (cf. Pander Maat 2006). Saying that someone is tall,
in this view, does not primarily provide information about that person's height,
but counts as a recommendation of some kind (depending on the topos being
activated), e.g. to select him for the basketball team, or not to select him as a
jockey. Notice that a person being called "tall" in the jockey-selection situation may
be shorter than a person rejected for the basketball team because he was "short".
Again, the constant value of the terms is in their argumentative orientation, not
(just) in their information value.
Of course, we are also getting some information about the world from the
utterances, just like we are able to get information out of the expression "negative
deficit". In this case, knowing what the relevant topos is (e.g. the taller someone is,
the better the chance that he will make a good basketball player), and knowing
something about the average height of persons in general and basketball players
in particular, we can make certain guesses about the range of possible sizes for the
person involved. But that is not primary in the conventional knowledge activated
by the word "tall". Activation of a scale of height that allows inferences about a
person's actual height is dependent on knowledge of the relevant argumentative
scale, not the other way around.

8. A topos is thus a component of the common ground in the sense of Clark (1996) (cf.
Verhagen 2005:7-16); cf. also Sinha (1999).

3.3 Argumentative orientation and argumentative strength

So far I have argued for the inherent argumentativity of language on the basis
of phenomena in the domain of the lexicon. Other phenomena in this domain
include several kinds of speech act verbs (see also Section 5), evaluative adverbs
such as "hopefully" and "unfortunately", and connectives like "so" and "although". But arguably
the most striking evidence for the fundamental argumentative character of
human communicative intersubjectivity comes from the fact that it pervades core
parts of grammar. That is what I will turn to in the remainder of this chapter.10
The methodology I will be using in order to demonstrate the precise argumentative
character of certain parts of grammar is based on the appropriateness
or inappropriateness of discourse connectives that was introduced in the previous
section, cf. examples (2) and (3). These are taken as diagnostic cues for the argumentative
value of the utterances being connected. An important distinction that
can be elucidated in this way is that between argumentative orientation and argumentative
strength. Consider the relation between the expressions "a small chance"
and "little chance". These may well refer to the same percentage of probability, for
example 20%, but their roles in orienting an addressee to certain conclusions are
systematically different. In their import, they are exactly opposite, as can be demonstrated
with (5) and (6). Suppose someone is considering whether or not to
perform a surgical operation on a patient who is in a serious condition; then it is
coherent for this person to say (5a), but not (5b).
(5) There is a small chance that the operation will be successful.
a. So let's give it a try.
b. #So let's not take the risk.

9. This does not necessarily mean that such an informational component has to be so indirect
for all words in a language, i.e. that it could never be conventional; languages are more flexible
than that. For example, Anscombre and Ducrot (1989) propose the interesting hypothesis that
numerical expressions in natural languages should be considered a special device, a kind of
operator to remove the default argumentative orientation of ordinary expressions. Saying that
someone's height is 1.75 meter does not inherently display the argumentative orientation of
saying that someone is tall or short. Using precise numeral specifications is obviously more artificial
and elaborate than using words like "tall", "short", "fast", "slow", etc., which testifies to the default
condition of the linguistic meaning of everyday expressions being inherently argumentative.
10. These are based on the analyses in chapters 2 and 3 of Verhagen (2005), respectively.

What this shows is that saying "There is a small chance" orients an addressee to the
same conclusions as the positive statement "There is a chance". On the other hand,
(6) exhibits the mirror pattern: (6a) is not coherent, but (6b) is.
(6) There is little chance that the operation will be successful.
a. #So let's give it a try.
b. So let's not take the risk.

Saying "There is little chance" orients an addressee towards the same conclusions
as the negative statement "There is no chance". Notice that it makes no difference
what the actual percentage of the chance of success is. Whatever turns out to be
the case, "a small chance" basically orients the addressee to the same general kind of
conclusions as "a chance", while "little chance" orients one to the same sorts of conclusions
as "no chance".11
The expressions do not by themselves indicate positive vs. negative recommendations.
Suppose the context is not that of a surgeon wondering whether or
not to perform an operation, but of a policeman wondering whether or not to
interrogate a seriously injured victim of a shooting, who is waiting to be operated on.
In that situation, it may very well be coherent to say "There is little chance that the
operation will be successful. So let's give it a try." (cf. example (6a)), employing a topos
of the kind "The more important certain information is, the more acceptable
it is to take risks in obtaining it". In that sense, the pragmatic import of the expression
"little chance" is context dependent. But the significant point is that its effect is
still the same, in this context, as that of the expression "no chance", and the reverse
of the effect of "a small chance". In this context, it would precisely be coherent to say
"There is a small chance that the operation will be successful. So let's not take the risk."
(cf. example (5b)). Thus the conventional, context-independent linguistic meaning
of an expression of the type "little X" is to reverse the orientation of the inferences
associated with the predicate X (with less strength than "no"; cf. below), whatever
topos is being employed. The equally conventional context-independent meaning
of "a small X" is to maintain the orientation of the inferences associated with the
predicate X, while their strength is less than with an unmodified assertion.
Thus a generalization can be made over negation and expressions like little chance in
terms of argumentative orientation: their use has the function of directing the addressee
to infer that certain conclusions are invalid. The difference must be characterized in
terms of argumentative strength. Straightforward negation has maximal argumentative
strength; its use relates a specific situation (the chance of success here and now) to a
shared inferential model of a type of situations (the more chance of success, the more
reason to operate) without any qualification: given the topos, the situation in the world
provides the strongest possible argument for invalidating the conclusion go ahead with the
operation. This is the locus of the difference with little X. The latter shares its
argumentative orientation with negation (the chances of success are not optimal), but
presents it as weaker: it is qualified, since the situation in the world comprises a
feature that might be construed (in other circumstances, or by another person) as favoring
an operation. Similarly, a small X shares its argumentative orientation with unmodified
predication, but presents it as weaker. This is summarized in Table 1.

11. Therefore, as with the meaning of the expression negative deficit, there does not seem
to be much prospect for deriving the difference between the distinct intersubjective
functions of these expressions from a descriptive difference without somehow introducing
the argumentative orientations in the derivation, i.e. in a non-circular way. The
argumentative difference must itself be taken as part of the linguistic meaning of these
expressions.

Table 1. Argumentative orientation and strength of [operator]-chance

                  Orientation   Strength
a chance          +             High
a small chance    +             Low
no chance         -             High
little chance     -             Low

The second column does not represent a binary distinction, but a scale on which
expressions can also occupy in-between positions. For instance, the strength of no chance
is arguably maximal, while that of a chance may easily be surpassed by that of, for
example, every chance in the world.
Operators with less than maximal strength leave room for discussion and negotiation.
As we have seen before, an utterance that cancels inferences associated with the previous
one must be marked with a contrastive connective like but. This also applies to the cases
we are considering here; for example, (7a) produces a coherent discourse, unlike (6a):
(7) There is little chance that the operation will be successful.
a. But let's give it a try.
b. #But let's not take the risk.

But when the strength of the negative operator is maximal, there is no room for
canceling the inference that it is not worth trying; uttering (8) always amounts to
inconsistency:
(8) #There is no chance that the operation will be successful. But let's give it a try.

A difference like this is a natural consequence of differences in strength, and of
the scales involved having definite boundaries; it does not undo the parallel in
argumentative orientation. Distinguishing argumentative strength from orientation
precisely allows us to formulate in a natural way what straightforward negation has
in common with other expressions that at first sight may not look like negation but
nevertheless behave in highly similar ways, as we shall see in the next section.
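
The distinction between orientation and strength drawn in this section can be made
concrete in a small sketch. The Python fragment below is an illustration only and not part
of the analysis itself: the numeric strength values, the reduction of a topos to a
polarity, and all names (Operator, coherent, SURGEON, POLICEMAN, GO_AHEAD, REFRAIN) are
assumptions introduced here. It encodes Table 1 and checks the coherence judgments in
(6) to (8).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    orientation: int  # +1 maintains the orientation of the predicate, -1 reverses it
    strength: float   # 1.0 is maximal (not defeasible); lower values leave room for "but"


# Orientation follows Table 1; the numbers for strength are placeholders (assumption).
OPERATORS = {
    "a chance": Operator(+1, 0.8),
    "a small chance": Operator(+1, 0.3),
    "no chance": Operator(-1, 1.0),
    "little chance": Operator(-1, 0.3),
}

# A topos is reduced here to a polarity (assumption): +1 stands for "the more chance of
# success, the more reason to go ahead" (surgeon context); -1 stands for the policeman
# context, where a lower chance of success is a reason to go ahead.
SURGEON, POLICEMAN = +1, -1
GO_AHEAD, REFRAIN = +1, -1


def coherent(operator: str, connective: str, conclusion: int, topos: int = SURGEON) -> bool:
    """Return True if '<operator> ... <connective> <conclusion>' is a coherent sequence."""
    op = OPERATORS[operator]
    supported = op.orientation * topos  # the conclusion the first utterance argues for
    if connective == "so":
        return conclusion == supported
    if connective == "but":
        # "but" cancels the invited inference, which requires less than maximal strength
        return conclusion == -supported and op.strength < 1.0
    raise ValueError(f"unknown connective: {connective}")
```

Under these assumptions, coherent("little chance", "so", GO_AHEAD) is false in the surgeon
context, as in (6a), but true with topos=POLICEMAN, while coherent("no chance", "but",
GO_AHEAD) is false in the surgeon context, as in (8).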

4. The negation system as an intersubjective coordination system

The claim is that both straightforward negation and the construction little X belong
to a larger system of expressions that share an effect on the argumentative orientation
of the utterances they are part of (though they may differ in their strength).12
To the degree that we can substantiate the claim that this grammatical system must
be characterized in this way, we have provided evidence that the nature of
intersubjectivity, as built into the linguistic system, is basically argumentative.
Consider the following set of expressions containing the expression let alone:
(9) He didn't pass Statistics-1, let alone Statistics-2.
(10) ??He didn't pass Statistics-2, let alone Statistics-1.
(11) *He passed Statistics-1, let alone Statistics-2.
(12) *He passed Statistics-2, let alone Statistics-1.

In view of these, let alone appears to connect two elements that are ordered on
a scale in a specific way, witness the problematic status of (10): presumably,
Statistics-2 is harder to pass than Statistics-1. Moreover, let alone is also a negative
polarity item: since neither (11) nor (12) is fine, it appears to require the presence
of a negation operator in the first clause (cf. Fillmore, Kay and O'Connor 1988).
Next, notice that almost X has a negative entailment, whereas barely does not:
(13) He almost passed → He did not pass
(14) He barely passed → He passed

Nevertheless, almost cannot license the use of let alone, while barely can:
(15) *He almost passed Statistics-1, let alone Statistics-2.
(16) He barely passed Statistics-1, let alone Statistics-2.

12. It is especially because of this kind of systematicity that a linguistic analysis provides a
powerful window on the mind. The significance of this point is easily overlooked by proponents of
the information view, cf. Hinzen & Van Lambalgen (2008) and Verhagen (2008).

Table 2. Argumentative orientation and strength of [operator]-passed

                 Orientation   Strength
passed           +             high
almost passed    +             low
did not pass     -             high
barely passed    -             low

From an information-oriented point of view, this is a riddle. An explanation can
be based on the insight that barely reverses the argumentative orientation of an
utterance (as negation does), while almost preserves it (despite the facts, so to
speak), as can be seen in the following set of connected utterances:
(17) He passed Statistics-1. So there is hope that he may make it through the first
year.
(18) #He didn't pass Statistics-1. So there is hope that he may make it through the
first year.
(19) He almost passed Statistics-1. So there is hope that he may make it through
the first year.
(20) #He barely passed Statistics-1. So there is hope that he may make it through
the first year.

Naturally, as an instrument for reversing the argumentative orientation of an
utterance, barely is weaker than straightforward negation, just as almost is a
weaker positive argument for a conclusion than a statement without any hedges.
Schematically, this is shown in Table 2.
And just as we saw in the previous section, the weaker argumentative operators
allow room for discussion, changing direction in the subsequent discourse
(using a contrastive connective), while the strongest one is not defeasible:
(21) He barely passed the test. But anyway, he did.
(22) #He did not pass the test. But anyway, he did.

Recognizing barely and almost as operators and let alone as a connector at the
argumentative level allows for a unified explanation of the phenomena mentioned
above, including the problem of barely licensing the use of let alone. What these
elements have in common, and what determines the similarity of their grammatical
properties, is the fact that their conventional meaning primarily functions at
the level of the argumentative value of utterances, not on that of informational
content.13 Their common features in terms of intersubjectivity can be depicted as
in Figure 1 (cf. Verhagen 2005:57).

Figure 1. A schematic view of the intersubjective nature of the negation system
(see the text for discussion)

The use of an element from the negation system, e.g. not or barely, sets up a
configuration of two perspectives (mental spaces in terms of Fauconnier 1994),
the first of which is that of the person responsible for the utterance (including the
negative element), which contrasts in a particular way with a projected second
perspective (by default: the addressee's). The speaker/writer envisages that the
addressee might entertain a thought q, for example, that there is hope for their son's
making it through the first year. This is represented as ?q in Space2. She furthermore
believes that she shares the knowledge of a certain cultural model with the
addressee, for example, that passing a statistics course in our culture normally
provides some ground for the conclusion that one can also pass other sorts of courses.
This is represented by the topos P→Q in both Space1 and Space2. Both the use of
not p and that of barely p invalidate q (given the topos), inviting the addressee to
consider ¬q more justified than q, at least at this point in the discourse. What this
explication shows is that intersubjectivity is built into the very semantics of natural
language negation elements: they involve multiple distinct perspectives that are to
be coordinated in a particular way, and they presuppose shared knowledge that can
be invoked as a basis for inferential processes. The fact that it is this argumentative
character that determines the coherence of a grammatical subsystem (in this case:
negation) is in turn strong evidence for the fundamental role of argumentativity in
human communicative intersubjectivity. In fact, this evidence is strengthened by the
fact that similar properties are found in other subsystems of grammar.

13. Cf. Verhagen (2005, Ch. 2) for a discussion of other phenomena relating to negation that
can be illuminatingly (re)analyzed in such a perspective, such as the difference between
sentential and morphological negation (by means of prefixes such as un-), and the reason why not
impossible is not functionally equivalent to possible, despite the logical equivalence of these
expressions.
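
The riddle discussed in this section, namely that barely licenses let alone although it
entails a positive fact, can be illustrated with a second sketch. Again this is only an
illustration under stated assumptions: the dictionary PASS_OPERATORS, the function names,
and the reduction of entailment to a Boolean are hypothetical simplifications, with
orientation values taken from Table 2, entailments from (13) and (14), and acceptability
judgments from (9), (11), (15) and (16).

```python
# Each operator: (argumentative orientation, does the clause entail that he passed?)
# Orientation follows Table 2; entailments follow (13) and (14).
PASS_OPERATORS = {
    "passed": (+1, True),
    "almost passed": (+1, False),
    "did not pass": (-1, False),
    "barely passed": (-1, True),
}

# Acceptability of "He <operator> Statistics-1, let alone Statistics-2",
# as judged in (11), (15), (9) and (16) respectively.
OBSERVED = {
    "passed": False,
    "almost passed": False,
    "did not pass": True,
    "barely passed": True,
}


def licenses_by_orientation(operator: str) -> bool:
    """Argumentative account: let alone needs a negatively oriented first clause."""
    orientation, _ = PASS_OPERATORS[operator]
    return orientation == -1


def licenses_by_information(operator: str) -> bool:
    """Information-based account (for contrast): let alone needs a negative entailment."""
    _, entails_passing = PASS_OPERATORS[operator]
    return not entails_passing


def mispredictions(account) -> list[str]:
    """List the operators for which an account disagrees with the observed judgments."""
    return [op for op, ok in OBSERVED.items() if account(op) != ok]
```

In this toy encoding, the orientation-based account matches all four judgments, whereas
the entailment-based account goes wrong on exactly almost passed and barely passed, which
is the pattern the text describes as a riddle for the information view.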

5. Complementation as an intersubjective coordination system

Traditionally, a sentence containing a clausal complement such as (23) is viewed
as presenting one event or situation as part of another one, i.e. as some (special,
structural) combination of two pieces of information about the world.
(23) The envoy reported that the money had been delivered.

The situations being connected are seen as basically characterized by the verbs,
i.e. report and deliver in (23). Thus, the structure of (23) is considered to be
essentially the same as that of the simplex sentence The envoy reported something, with
the slot of something being filled by another clause in (23) itself: this subordinate
clause fills the direct object slot in the main clause.
However, children do not learn such complex syntactic constructs by combining
two descriptions of events, i.e. by combining simplex clauses (Diessel and
Tomasello 2001, cf. also Tomasello 2003). Rather, they start, at about 3 years of
age, to add certain markers of subjective perspective like I think and you know
to simple clauses of types they have already been producing before. Diessel and
Tomasello show that, at least for the children, these are not complex structures
that contain two propositions, but single-proposition utterances, the content of
which is expressed completely by what would in a traditional structural analysis
be regarded as the subordinate clause. It is only over a relatively long period of
increasing linguistic experience that children gradually learn to use more verbs,
sometimes also in the past tense and/or with third-person subjects, which in the
end results in the emergence of a more general complementation construction
that allows adults to say and understand things like (23).
These facts cast very serious doubt on the validity of the traditional analysis
for complementation in general, especially since a large part of complementation
in spontaneous conversation of adults also consists of the elementary,
single-proposition type with added perspective-marking that is first acquired by
children. According to Thompson (2002), this portion amounts to as much as 80%.
It would of course be preferable to have an analysis that acknowledges the basic
character of the perspective-marking function, and somehow incorporates cases
with third person, past tense main clauses as special cases.

So let us reconsider the issue of the precise relationship between expressions
such as (24) and (25):
(24) I think it is raining.
(25) John thinks it is raining.

According to two respectable and long-standing traditions, originating in
linguistics (Benveniste 1958) and philosophy (Austin 1962, Searle 1969), these
expressions belong to two completely different categories of utterances: either
subjective/performative, or objective/descriptive, respectively. In modern linguistics,
the performative/descriptive distinction is best known. In this view, the use of the
following two sentences exhibits an important qualitative difference:
(26) I promise that I'll have the car up in front at 2 o'clock.
(27) John promised that he'll have the car up in front at 2 o'clock.

The point is, according to Austin, that it makes no sense to characterize (26) in
terms of truth conditions, i.e. to treat it as a description of an act of promising. In
uttering (26), one performs an act of promising, and the performance of an act
can be felicitous or infelicitous, but not true or false. An utterance such as (27),
on the other hand, constitutes a description of an act of promising, and thus its
semantics can be characterized in terms of truth conditions. Accordingly, the two
sentences belong to two wholly distinct categories of speech acts: (26) constitutes
a commissive one, (27) a constative one. Benveniste (1958) had already classified
speech act formulas like (26) and first-person present-tense uses of verbs of
cognition like (24) together as subjective utterances, and (25) and (27) as objective
ones. More recently, what is basically the same insight has also been formulated
by others, e.g. Nuyts (2001) and Diessel and Tomasello (2001:103/4). According
to Nuyts, expressions like It is probable that... and I think are used performatively
(in the sense that the speaker performs an epistemic evaluation by uttering these
expressions), while third person ones (Mary thinks that...) are used descriptively:
the speaker reports on someone else's epistemic evaluation of a state of affairs
without there being any explicit indication as to whether the speaker personally
subscribes (i.e., is committed) to the veracity of the evaluation or not (Nuyts
2001:385).
But from a grammatical and especially a functional point of view, such a
dichotomy is unsatisfactory, as it implies a rather serious discrepancy between
structure and function: Why are such dissimilar functions expressed in similar
structures, i.e. complementation constructions? Also, the basis for children's
gradual extension of the use of complementation, observed by Diessel and Tomasello,
remains a mystery, because under this analysis, it involves an abrupt shift from
one category of communicative functions to a wholly different one.
However, the argumentative perspective developed above precisely allows for
a natural unification of these phenomena. Consider the way (26) and (27) function
in the context of (28).
(28) A: Can I be in Amsterdam before the match starts?
B1: I promise that I'll have the car up in front at 2 o'clock. [=(26)]
B2: John promised that he'll have the car up in front at 2 o'clock. [=(27)]

Both can count as an affirmative answer. Both can felicitously be followed by the
explicit reassurance So don't worry (notice the use of So). That is, both saying I
promise that X and saying John promised that X count as arguments for an
addressee to strengthen the assumption that X will happen; they have the same
argumentative orientation.14 The difference is one of strength rather than
argumentative orientation. Whereas the argumentative strength of the first-person, present
tense utterance is maximal, the strength of the third person, past tense utterance is
less, as the cognitive coordination between author and addressee is indirect, via
the onstage perspective of a third person; but it still functions to coordinate the
perspectives of speaker and addressee, just like a first person utterance.
According to this analysis, a difference between first person, present tense
matrix clauses and third person ones should be that the invited inference is defeasible
in the latter case, but not in the former, which has maximal strength. This is
borne out, for both speech act verbs, witness (29a) and (29b), and verbs of
cognition, witness (30a) and (30b):
(29) a. John promised that he'll have the car up in front at 2 o'clock. But he
might have forgotten the route to your new home.
b. #I promise that I'll have the car up in front at 2 o'clock. But I might forget
the route to your new home.
(30) a. John believes that the mission has been successful. But in fact, it has
failed.
b. #I believe that the operation has been successful. But in fact, it has
failed.

14. In Verhagen (1995), it is argued that it is precisely this constant argumentative orientation
of the report of someone promising something that provides the basis for the development of
the epistemic/evidential use of promise as in The debate promises to be interesting; in such cases
the verb only functions as a speaker-oriented marker of argumentative orientation, and does
not designate an act of promising.

The fact that the contrastive conjunction but has to be used in (29a) and (30a)
once again illustrates that the first sentence by itself has the argumentative
orientation that I ascribed to it. The difference between performative/subjective and
constative/objective use of verbs of communication and cognition turns out to be
exactly parallel to that between maximally and less strong argumentative operators,
observed in Section 4. Consider the parallel of the difference between the a
and b cases in (29) and (30) with the difference between (21) and (22), repeated
here for convenience:
(21) He barely passed the test. But anyway, he did.
(22) #He did not pass the test. But anyway, he did.

This parallel confirms the idea that third person matrix clauses of complementation
constructions differ only in strength from first person ones, not in kind. In
this analysis, the difference between these two types of uses appears to be not
categorical, as they have the same argumentative orientation, but a matter of degree:
they differ only in argumentative strength.
This functional unification of first and third person matrix clauses of
complementation constructions makes the discrepancy between structure and function
inherent in the traditional approach disappear. The picture of the acquisition of
complementation constructions also becomes more coherent: it starts with learning
to add explicit markings of perspectives to utterances; initially these are
completely grounded in the speech situation (I, you, present tense), they are
formulaic, and have maximal strength; with experience, the child learns to understand
and produce more and more indirect and general perspective markings, allowing
for more nuances and for defeasibility.
As with the system of negation, we find that the conventional meaning of
complementation primarily functions at the level of the argumentative strength of
utterances, not on that of informational value. Utterances that instantiate
complementation do not consist of structural combinations of pieces of information, but
of a constructed representation of some situation, structurally embedded in a
perspective indicator (or more than one) that serves, sometimes in conjunction with
other elements, to coordinate cognitive processes of speaker and addressee.15

15. And as with negation, this view is instrumental in solving a number of other long-standing
problems of grammatical analysis (cf. Verhagen 2005, Chapter 3). Among these are the issue of
the precise grammatical analysis of sentences like The danger is that things will get out of hand
(is the complement clause the subject or the predicate of the entire sentence?), the precise status
of complements in sentences with copular predicates, like He is afraid that things will get out of
hand (such predicates do not take direct objects, cf. *He is afraid a disaster), and the analysis
of Wh-questions like Who do you think pays the rent?, in which the question word seems to be
extracted from its own clause (for the latter phenomenon, see also Verhagen 2006).

6. Discussion and conclusion

The evolution of cognition is to a considerable extent a story of subsequent
generations of organisms interacting more and more indirectly with their environment
(Dennett 1995:370-400). The capacity for categorization and insight into causality
allow an individual to act on the basis of prediction, i.e. selection of the
hypothetically best course of action from two or more alternatives, without having to
interact with the environment immediately; the potential advantages are obvious.
The evolution of intersubjectivity fits into this picture: it provides a step beyond
categorization and causality in that it allows individuals, among other things, to
act on the basis of predicting what another individual will do, viz. by mentally
putting oneself in the other's position and considering what one would do oneself.
The evolution of communication is a special case of increasing indirectness:
it allows organisms to influence other organisms (conspecifics and others) by
means of signals, without physical engagement and all its hazards. The evolution
of language constitutes a further step of increasing indirectness, as a linguistic
signal does not have a single directive nature, but one that is variable, since
its contribution to the argumentative character of an utterance is dependent on
the relevant topos, a shared cultural model; moreover, by means of argumentative
operators, the argumentative orientation of a signal may be reversed, and its
strength may be modified. This kind of indirectness presupposes a form of
intersubjectivity: mutual recognition of the inferences the other is capable of making,
given the shared model.
This high degree of indirectness and this dependence on shared models provide
room, in fact a basis, for referentiality: what different topoi associated with a
signal have in common may be identified with the signal's referent, the concept of
some aspect of a phenomenon in the world that activates a topos. The difference
with the functional referentiality of animal calls like the vervet monkeys'
predator-specific alarm calls (cf. Section 2) is that there is no one-to-one link between
a specific category of phenomena and a particular type of response, but a
one-to-many relationship (in modern, adult human beings). But this increased flexibility
does not imply that human language is essentially informative or referential, and
no longer a system for mutual management and assessment of senders and receivers
of signals. In fact, as we have seen, mutual influencing is built into the very
structure of grammar.
Much of language use is only indirectly aimed at a behavioral response, and
primarily a matter of manipulating the mental states and processes of others
(although such a manipulation may well constitute a particular way of eliciting
a behavioral response). But a considerable portion is also conventionally aimed
at immediate effects, such as making a request, asking a question, or issuing a
warning, i.e. typically non-assertive speech acts. The latter point and its
significance have also been noticed by Owings and Morton (1998). Discussing relations
between communication and cognition in animals, they conclude that animal
knowledge structures are fundamentally pragmatic, i.e. about what to do about
objects, events, and states [...]. According to this approach, signals are not
statements of fact, that can be judged to be true or false, but are efforts to produce
certain effects (Owings and Morton 1998:211). They then notice a parallel with
speech act theory, which is also specifically concerned with utterances that cannot
be judged to be true or false, the discovery of which provided the original
motivation for Austin's (1962) proposals. As is well known (cf. Section 5),
utterances of the type I promise to help you with your homework, Take your hands
off of me!, I now pronounce you husband and wife are not to be understood as
informative; they are not descriptions of states of affairs. Rather, their utterance
constitutes the performance of an act; they are efforts to accomplish goals. To be
sure, certain systematic conditions (called preparatory conditions in the speech
act literature) have to be satisfied in order for these utterances to count as a
promise, a command, and a legitimate wedding. Owings and Morton notice that a
dedicated proponent of the information view might try to construe these as the
actual meaning of the expressions:
Are these correlates, an information advocate would ask, not the information
made available by these signals? No; this question confuses time frames of
causation. Such correlates validate the cues as useful to assessing individuals [...], but
are not the immediate cause of the assessing individual's reaction to the
statement. Note that [this] proposal turns the usual view of the role of information on
its head. The correlates of signals (information) are not immediate causes of the
behavior of targets of signals, they are instead long-term validators of the signal's
utility. (Owings and Morton 1998:211)

As we have seen in Section 5, the argumentative view of intersubjectivity in the
language system allows for a substantial generalization of speech act theory.
Standard speech act theory restricts its domain of application to non-assertive
utterances, which simply cannot be understood as truth-functional. Nuyts (2001)
has extended this domain to include epistemic verbs and predicates, but even
then it is still limited to first person present tense use. But we have seen that the
argumentative orientation of third person uses of such verbs is the same as that
of first person ones. As instruments for influencing, they differ in strength, not
in kind. More generally, assertive utterances which are in principle analyzable as
truth-functional, such as There is going to be a negative deficit, There are seats in
this room, John is tall (cf. Section 2) are also attempts to accomplish something, of
the same type as standard speech acts (like warning, reassurance, or advice).
In view of this generalization, we may broaden Owings and Morton's view on
the secondary character of referentiality in animal communication systems, to
include human language: real-world correlates of signals are long-term validators
of argumentative cues. Human language may certainly be said to allow for
distinguishing innumerably more distinctive long-term validators of such cues
than known animal communication systems. But that in itself does not turn human
language into a system of information exchange rather than a system for
mutual influencing. Since topoi are shared, they may usually remain implicit, and
establishing joint attention suffices for a sender to get a receiver to make the
desired inferences. But the way discourse units are systematically connected to each
other, especially by means of linguistic elements, and systematic properties of the
grammatical systems of negation and complementation reveal that the argumentative
character of language is still basic.
The parallel between the argumentative view of language and the modern
ethological view of animal communication will now be obvious. It concerns the
fact that human linguistic communication is primarily also a matter of influencing
one another, by exploiting the cognitive capacities of others. Human language
is more involved than (most?) animal communication with influencing mental
states, with consequences for long term behavior, rather than with immediate
behavioral effects, but this is a matter of degree. Since the intersubjective
coordination of mental states and attitudes, according to this view, is a special form
of mutual influencing, it is not to be put into opposition to the basic character of
animal communication; rather, it is its specifically human variant.16

Acknowledgements

I would like to thank Jordan Zlatev, as well as Chris Sinha, for insightful remarks
and questions about a previous version of this chapter.

16. Besides involving a special kind of mutual influencing, human language also exhibits a
number of other special features as mentioned in the beginning of this chapter. Some of these have
their basis in intersubjectivity, such as the symbolic character of language (cf. Sinha 2004), its
conventionality and normativity (Itkonen, this volume; Zlatev, this volume), and the
exceptionally large size of the inventory of signals (due to, primarily, duality of patterning; cf.
Martinet 1949; Hockett 1958), which allows for the emergence and survival of signals for other
purposes than the immediate elicitation of a behavioral response.

References

Anscombre, J.-C. and Ducrot, O. 1989. Argumentativity and informativity. In From Metaphysics
to Rhetoric, M. Meyer (ed.), 71-87. Dordrecht, etc.: Kluwer Academic Publishers.
Austin, J.L. 1962. How To Do Things With Words. Oxford: Oxford University Press.
Benveniste, É. 1971. Problems in General Linguistics. Translated by M.E. Meek. Coral Gables:
University of Miami Press.
Bradbury, J.W. and Vehrencamp, S.L. 2000. Economic models of animal communication.
Animal Behaviour 59: 259-268.
Cheney, D.L., and Seyfarth, R.M. 1990. How Monkeys See the World. Inside the Mind of Another
Species. Chicago: University of Chicago Press.
Clark, H.H. 1996. Using language. Cambridge: Cambridge University Press.
Dawkins, R. and Krebs, J.R. 1978. Animal signals: information or manipulation? In
Behavioural Ecology: An evolutionary approach, J.R. Krebs and N.B. Davies (eds.), 282-309.
Oxford: Blackwell Scientific Publications.
De Jong, E. (ed.). 1979. Spreektaal. Woordfrequenties in gesproken Nederlands. Utrecht: Bohn,
Scheltema and Holkema.
Dennett, D.C. 1995. Darwin's Dangerous Idea. Evolution and the Meanings of Life. New York:
Simon and Schuster.
Diessel, H. and Tomasello, M. 2001. The acquisition of finite complement clauses in English:
A corpus-based analysis. Cognitive Linguistics 12: 97-141.
Ducrot, O. 1996. Slovenian Lectures/Conférences Slovènes. Argumentative Semantics/Sémantique
argumentative. Editor/Éditeur I. Ž. Žagar. Ljubljana: ISH Inštitut za humanistične
študije Ljubljana.
Dunbar, R.I.M. 1996. Grooming, Gossip and the Evolution of Language. Cambridge, MA: Har-
vard University Press.
Fauconnier, G. 1994. Mental Spaces. Aspects of Meaning Construction in Natural Language.
Cambridge: Cambridge University Press. [First edition 1985, Cambridge, MA: The MIT
Press.]
Fillmore, C.J., Kay, P., and O'Connor, M.C. 1988. Regularity and idiomaticity in grammatical
constructions: The case of let alone. Language 64: 501–538.
Hinzen, W. and Van Lambalgen, M. 2008. Explaining intersubjectivity: A comment on Arie
Verhagen, Constructions of Intersubjectivity. Cognitive Linguistics 19: 107–124.
Hockett, C.F. 1958. A Course in Modern Linguistics. New York: The Macmillan Company.
Hultsch, H. and Todt, D. 2004. Learning to sing. In Nature's Music. The Science of Birdsong,
P. Marler and H. Slabbekoorn (eds.), 80–107. San Diego, CA/London: Elsevier.
Itkonen, E. this volume. The central role of normativity for language and linguistics.
Janik, V.M., Sayigh, L.S. and Wells, R.S. 2006. Signature whistle shape conveys identity
information to bottlenose dolphins. Proceedings of the National Academy of Sciences 103:
8293–8297.
Keller, R. 1998. A Theory of Linguistic Signs. Oxford: Oxford University Press.
Langacker, R.W. 1987. Foundations of Cognitive Grammar. Volume I: Theoretical Prerequisites.
Stanford, CA: Stanford University Press.
Levinson, S.C. 2000. Presumptive Meanings. The Theory of Generalized Conversational
Implicature. Cambridge, MA: The MIT Press.
Lewis, D.K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University
Press.
Martinet, A. 1949. La double articulation linguistique. Travaux du Cercle Linguistique de
Copenhague V (Recherches Structurales 1949), 30–37.
Nuyts, J. 2001. Subjectivity as an evidential dimension in epistemic modal expressions.
Journal of Pragmatics 33: 383–400.
Owings, D.H. and Morton, E.S. 1998. Animal Vocal Communication: A New Approach. Cam-
bridge: Cambridge University Press.
Pander Maat, H.L.W. 2006. Subjectification in gradable adjectives. In Subjectification: Various
Paths to Subjectivity, A. Athanasiadou, C. Canakis and B. Cornillie et al. (eds.), 279–320.
Berlin/New York: Mouton de Gruyter.
Searle, J.R. 1969. Speech Acts. An Essay in the Philosophy of Language. Cambridge: Cambridge
University Press.
Sinha, C. 1999. Situated selves. In Learning Sites: Social and technological resources for
learning, J. Bliss, R. Säljö and P. Light (eds.), 32–46. Oxford: Pergamon.
Sinha, C. 2004. The evolution of language: from signals to symbols to system. In Evolution of
Communication Systems. A Comparative Approach, D. Kimbrough Oller and U. Griebel
(eds.), 217–235. Cambridge, MA/London: The MIT Press.
Sperber, D. and Wilson, D. 1986. Relevance. Communication and Cognition. Oxford: Basil
Blackwell.
Thompson, S.A. 2002. Object complements and conversation: Towards a realistic account.
Studies in Language 26: 125–164.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA/London: Har-
vard University Press.
Tomasello, M. 2003. Constructing a Language. A Usage-Based Theory of Language Acquisition.
Cambridge, MA/London: Harvard University Press.
Verhagen, A. 1995. Subjectification, syntax, and communication. In Subjectivity and
Subjectivisation: Linguistic Perspectives, D. Stein and S. Wright (eds.), 103–128. Cambridge:
Cambridge University Press.
Verhagen, A. 2005. Constructions of Intersubjectivity. Discourse, Syntax, and Cognition. Oxford:
Oxford University Press.
Verhagen, A. 2006. On subjectivity and "long distance Wh-movement". In Subjectification:
Various Paths to Subjectivity, A. Athanasiadou, C. Canakis and B. Cornillie et al. (eds.),
323–346. Berlin/New York: Mouton de Gruyter.
Verhagen, A. 2007. Construal and perspectivisation. In Handbook of Cognitive Linguistics, D.
Geeraerts and H. Cuyckens (eds.), 48–81. Oxford: Oxford University Press.
Verhagen, A. 2008. Intersubjectivity and explanation in linguistics: A reply to Hinzen and van
Lambalgen. Cognitive Linguistics 19: 125–143.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
chapter 14

Intersubjectivity in interpreted interactions


The interpreter's role in co-constructing meaning

Terry Janzen and Barbara Shaffer

Introducing an interpreter into a discourse event affects the very nature of the
interchange because in addition to the interlocutors' intersubjective approach to
each other, the interpreter necessarily bases her interpretation on assumptions
she makes about each of the interlocutors' shared and non-shared knowledge.
Recently, many American Sign Language (ASL)-English interpreters have
espoused what have been termed "expansions", claimed to be grammatically
required in ASL. But ASL has no such explicitness requirement; instead the
interpreter must attend to the intersubjective domain of discourse interaction
in order to attempt to more accurately represent what is in the minds of the
interlocutors. This chapter examines triadic intersubjectivity in interpreted
discourse and the role that contextualization plays in managing others' shared
and non-shared knowledge.

[C]onversation is highly contextualized, filled with
subtle cues at all levels marking the relation of utterances
to contexts of prior discourse, to situational and
cultural contexts, to contexts of social relations between
speech event participants, and even to the mutual cognitive
context within which the dialogic interaction is
embedded.
John Du Bois (2003:52)

1. Introduction

When we engage in discourse, various information types are encoded in the linguistic
structures we use. For example, discourse may be understood to include a combination
of information that is already known to the interlocutors and information
that is new. A certain amount of known information is required because it
grounds what is new, while the new information is generally the point of the discourse.
A balance of these two information types keeps the discourse from being
either overly redundant or disconnected (Givón 1984). However, no two people
bring the exact same knowledge base or consciousness to the discourse event, so
that when they express ideas in turn, they continually negotiate this balance of
variables so as to best be understood by the other. This therefore constitutes an
intersubjective view of co-constructed discourse.
Because our focus is the interaction between adult language users, we are not
concerned with intersubjectivity within an ontogenetic or developmental context,
even though the emergence of mediated cognition described by Vygotsky (1978)
and others is a rich area for investigation, and one that has greatly informed our
discussion. Instead we take our cue from Per Linell (1995), who views the in-
tersubjective relationship between speakers as a continuous, collective process,
where interactors mutually check understandings. What is said and understood
gets continually updated on a turn-by-turn basis; each contribution to a dialogue
displays (or can display) some understanding or reaction to the prior contribu-
tion (Linell 1995:193). Linell also believes that there must be some meta-level
management of interaction and understanding (p. 183), consistent with Susswein
and Racine's (this volume) view that humans (but not other animals) possess
an understanding that they attend to others, have intentions, and want things.
Once speakers have a linguistic system in place, that system is by its very nature
intersubjective, involving such elements as persuasion and conveying points of
view ("argumentativity", in Verhagen's (2005, this volume) terms) and, in our view,
the on-going negotiation of meaning.
How does this negotiation play out? How do speakers gauge that something
they are conveying is understood as it is intended and whether the linguistic ex-
pression includes enough of the right information to be comprehended by the ad-
dressee? Du Bois (2003:52) characterizes face-to-face conversational discourse as
pervasive, spontaneous, interactional, and contextualized. We will argue that
interpreted discourse is by nature a face-to-face enterprise, that the intersubjective
views on the objects of discourse and on the discourse itself taken by interlocutors
extend to the third-party interpreter, that the interpreter brings her own views
and intentions to the discourse context, that these necessarily impact the direction
and content of the discourse, and that the interpreter's view does not always
correspond to those of the primary interlocutors engaging in the discourse.
Contextualization is always present in discourse as a necessary means to make
our interactions coherent, both within the immediate discourse context and over
the course of time. Contextualizing, where the speaker decides on-line the extent
to which she must contextualize, is an inherently intersubjective action, taking
place regardless of which language is being used, because of the negotiated nature
of discourse. Further, it occurs whether the discourse participants share the same
language or are attempting to communicate across a language boundary with, say,
an interpreter. Contextualizing in discourse may be something that all language
speakers do, but how they do it will depend on the discourse conventions of the
particular language, and in turn on relevant aspects of the grammar that have
evolved for that particular community of speakers. A question remains, however,
as to what is required by the grammar as opposed to what is optional for partici-
pants when meaning is being negotiated. In on-line discourse where meaning is
co-constructed, interlocutors navigate based on pragmatic factors and assump-
tions, choosing from lexical and grammatical options to construct utterances they
believe will signal their intended sense. On the other hand, if something is required
by the grammar, the speaker or signer has no choice but to use it.
For American Sign Language (ASL)-English interpreters, learning variable fea-
tures of discourse and their corresponding grammar is no simple task. For many
reasons interpreters experience inadequacies in their management of discourse
due to insufficient language training, a lack of ready strategies, or inexperience.
Nonetheless, the interpreter's goal is to manage the information exchange and discourse
packaging in not one, but two languages, hopefully with some finesse.
In this paper, we suggest that the description of ASL expansions in Lawrence
(1995) is misrepresentative of ASL grammar and of how ASL compares to other
languages in terms of possible discourse strategies. The features that Lawrence
describes are present to some degree in many languages, if not universally, in-
cluding English. They are not formulaic as required grammatical expressions,
but their use will instead be prompted by pragmatic principles having to do with
negotiating the information exchange. If these principles are not understood by
the interpreter there is the likelihood that she will make erroneous assumptions
about shared knowledge and intentions on the part of consumers, and subsequently
frame the information in ways that do not match the consumers' discourse
expectations (Janzen and Shaffer 2003; Shaffer and Janzen 2004). A more
appropriate approach to information packaging may be what interpretation and
translation theorists also have referred to as contextualization (e.g. Gile 1995)
where contextualizing information is supplied by the interpreter based on situ-
ational factors rather than on assuming that the language requires something
to be phrased in a certain way. Above all, coherent, discourse-appropriate in-
terpretation necessitates that the interpreter have numerous language and over-
all discourse strategies within easy reach; any single strategy may work well in
one circumstance, but fail in another. Thus managing these triadic interaction
tasks, the activity of interpreting, tells us much about the very complex nature
of intersubjectivity for adult speakers.
In Section 2 below we review the notion of "expansions" in the literature
and point out an early precursor to this idea. In Section 3 we introduce the
role of pragmatics in interpreted discourse and discuss how an understanding
of discourse pragmatics and grammar impacts the interpreter's text. Section 4
focuses on contextualization in discourse as interlocutors negotiate meaning with
one another, and in Section 5 we illustrate these principles with some examples.
In Section 6 we summarize our discussion and draw some conclusions regarding
the interpreter's position in the complex intersubjective nature of three-party
interpreted discourse.

2. Expansions as grammar

In an attempt to better understand what constitutes native signers' ASL, Lawrence
(1995) identifies seven ASL elements that she refers to as "expansions". Lawrence
labels these as contrasting, faceting, reiteration, utilizing 3D space, explaining by
examples, couching or nesting, and describe-then do. Here a description of two of
these expansions should suffice to illustrate Lawrence's claims.
When contrasting, two ideas are juxtaposed, for example a positive and neg-
ative statement, to emphasize what is being asserted (what something is versus
what it is not). Lawrence (1995:208) gives as one example:
(1) English: Lenin's tomb is austere.
    nod topic neg
    ASL: fs LENIN GRAVE PLAIN. FANCY // NOT.
    'Lenin's tomb is plain, not fancy.'

For the English word austere, Lawrence suggests that the ASL construction would
need two contrasting phrases.

1. A persistent difficulty with authors' representation of ASL on the page is that of differing
methods of glossing, since ASL does not have a written system or standard means of notation.
Also, not all authors provide a key to their transcriptions. In (1) we understand that "fs"
means that something is fingerspelled. It is fairly common to find facial gestures represented by
overlines, with "nod" meaning a concurrent positive head nod, "topic" (or "top") representing
facial topic marking, and "neg" meaning a negative head nod. ASL signs are given in upper case
English words.
2. Our transliteration.
3. While Lawrence says that she is describing signers' regular use of ASL, her examples appear
to be ASL translations of English words. This is problematic because attempting to find
cross-language equivalents confounds the issue of what might occur in a single language.

When couching or nesting, backgrounded information is added to clarify
an idea. Humphrey and Alcorn (2001) suggest that couching/nesting is used
to provide information in an introductory expansion or set up to ensure the
listener has the schema or frame required to understand the upcoming discourse
(2001:9.18). Inherent in this description is the premise that the use of some term
in ASL cannot stand on its own. Humphrey and Alcorn (2001:9.18) offer the
following example for allergy:
(2) MEDICINE-TAKE OR CREAM RUB-ON-SKIN OR FOOD EAT-FINISH-
ITCH ALL OVER OR STOMACH UPSET OR HARD BREATH A-L-L-E-
R-G-Y

Humphrey and Alcorn mean that the (fingerspelled) word A-L-L-E-R-G-Y is ob-
ligatorily nested within the example-rich phrasing that precedes it. Their in-
tention is that without the expansion the word will not be understood by the
addressee.
Lawrence attributes expansions to the grammar of ASL specifically, meaning
that in particular grammatical contexts, an expansion is required:
In analyzing ASL discourse, it seems there are specific applications of language
use and language phrasing in ASL that do not occur in spoken English. These
unique applications are what I call EXPANSION. Although the word EXPAN-
SION has many meanings in English, I chose this term because it is descriptive of
what happens in native ASL signing. (Lawrence 1995:207, italics ours)

When Lawrence specifies that expansions take place in native ASL signing she
is suggesting that second language signers (i.e., many interpreters) are not using
ASL structures in a way that first language ASL signers are. Lawrence sees expan-
sions as a requirement of ASL in opposition to English, which suggests that, even
though she speaks only about discourse and not grammar, they are grammatical
constructions that must appear in ASL if language use is to be native-like:
Isolating the features of EXPANSION may, in fact, give the ASL student and ulti-
mately, the student of interpretation, the facility to produce a more natural form
of ASL discourse that is not only more accurate, but also allows the Deaf con-
sumer the ease of understanding the message in a more native-like form.
 (Lawrence 1995:213)

The suggestion that expansions are required by the grammar of ASL is evident in
Humphrey and Alcorn (2001), who claim that ASL demands a more explicit level
of information coding than does English. They propose that:
an English presentation or exchange of information tends to deal with the specific
issue at hand, avoiding a great deal of elaboration or detail. Thus unless the
speaker is engaged in story-telling, acting, or some special form of discourse, it is
likely s/he will not provide the rich variety of detailed and descriptive information
required by ASL. (Humphrey and Alcorn 2001:9.10; italics ours)

4. Upper case letters separated by hyphens indicate that the word is fingerspelled.
5. There is a lexical word in ASL for allergy that might well suffice.

Aside from Humphrey and Alcorn's disregard for different linguistic features associated
with various genres of English discourse (over and above the so-called
special forms they mention) and the Gricean maxim of quantity, that is, "Make
your contribution as informative as is required (for the current purposes of the
exchange)" (Grice 1975:45), their claim is that English and ASL are fundamentally
different in regard to expansion requirements. They do concede that English
sometimes uses similar techniques to convey information (Humphrey and Alcorn
2001:9.13) but include no elaboration of how this might take place. Instead
their claim is that expansion devices are characteristic of ASL structure, and
therefore must be inserted into ASL as the target language during English to ASL
interpretation.
We propose instead that if general principles of information exchange along
with more specific principles of discourse structure in each language are under-
stood, interpreters will better know how and when to contextualize propositions in
either direction so that information is transferred across languages accurately without
compromising the intent. It may be, however, that these expansion techniques are
put into practice because the interpreter has made certain assumptions on the part
of the discourse participants she is interpreting for, assumptions which are perhaps
unjustified. Conversely, she may have no pragmatic motivation at all, but is merely
attempting to compensate for her own language and discourse inadequacies.
Whatever the reason, it appears that ASL-English interpreters have come to
believe that expansions are explicit grammatical features of ASL, meaning that
in specific grammatical contexts, an expansion is the correct grammatical way to
make an ASL sentence. This in fact is rarely, if ever, the case and yet the practice
has become an expectation of proper grammar usage without regard to the com-
plex pragmatic interaction of the primary participants in their discourse context.
Instead, the interpreter is deciding what information should be filled in based on
a prescribed notion of grammar rather than on cues from the setting, disregarding
entirely the intersubjective negotiation that the principal interlocutors are
attempting to engage in.
As noted, both Lawrence (1995) and Humphrey and Alcorn (2001) state
that the level of detail given in ASL discourse is much higher than that found in
English discourse. As an example of incorporating expansion techniques in ASL,
Humphrey and Alcorn cite an ASL narrative from a Deaf signer in which the
narrator's mother had planned to visit but had to cancel. Humphrey and Alcorn
give the structure of the ASL story in a rough English approximation, but attempt
to retain the level of detail contained in the original ASL version. Part of this
narrative is as follows:
After weeding the vegetables, she stood up and evidently tripped on a garden
rake that had been left prongs-up on the sidewalk and broke her hip. (role-shift:
standing up, not seeing rake prongs tripping, sailing through the air with a
panicked look on the face, landing on right hip with a great look of pain).
 (Humphrey and Alcorn 2001:9.21)

The ASL narrative is represented by a total of 291 English words in three para-
graphs that include phrases and sentences that Humphrey and Alcorn analyze as
examples of various expansions. They then suggest an appropriate English inter-
pretation of the narrative that is 66 words in length and includes none of the detail
of the original ASL, just the main point along with only those phrases directly
related to this point. They make two comments of interest. First, regarding the level
of detail, "ASL typically requires this degree of information [in the original
version] in order to be linguistically correct" (2001:9.22; italics theirs) and second,
in contrast to the lack of detail in the English interpretation, that "the absence of
copious contextual information prevents a Deaf listener [addressee or watcher]
from comprehending the point being communicated" (2001:9.23).
What do these comments represent? First, this example and the authors' discussion
of it illustrate the claim that expansions are part of the required grammar
of ASL, implying that they must be included or the passage will be incomprehen-
sible to the addressee. Second, if the truncated English version is recommended
as the appropriate interpretation from ASL into English, what does the interpreter
determine to be extraneous in the original text? The implication is that much of
the detail in the ASL version is not appropriate in the equivalent English text.
Third, if the two passages are considered as functionally equivalent, then in the
case of interpreting from English to ASL, how would the interpreter decide on
the specific details to add into the target text, especially given that she was not
present to witness the aunt's accident? If such details were added to the ASL target
text (presumably because the grammar requires it), wouldn't the addressee
sense that the interpreter really did not know if the facts expressed by the details
were actually true? Fourth, the discrepancy in the source and target texts in this
example contradicts a statement that Humphrey and Alcorn have made, noted
above in Section 1, that in the genre of story-telling (or acting or other special
kinds of discourse), English speakers are in fact expected to use a similar degree
of detail inclusion.

6. By role-shift the authors mean that the signer has taken the perspective or role of the
character and presents the action from that point of view.
7. This will be taken up once again in Section 4, where we address the appropriateness of the
interpreter framing information from her own perspective, which may not reflect the level of
shared information between the event participants.

The observations we have made above show that some views held regarding
the grammars of ASL and English, and what is necessary to interpret between the
two, focus on requirements of form and not on negotiated aspects of interactions
among speakers and signers. Such a how-to approach to interpretation has a
solid history in the field; in the next section we discuss an earlier item-by-item
approach that equally disregards the notion of shared knowledge.

2.1 An early perspective on the lexicon in ASL-English interpretation

It is evident that even as the field of signed language interpreting began to formal-
ize in the 1960s there were perceived non-equivalencies in the lexicons of English
and ASL. Authors such as Quigley and Youngs (1965) attempted to compensate
for this by suggesting interpretations for lists of English words, claiming that these
English words had no ASL equivalent lexical item and that fingerspelling the English
word was an option reserved for the above average Deaf ASL signer. For
low-verbal deaf persons (1965:37), an alternate approach was needed, referred
to as paraphrasing. According to Quigley and Youngs, "[i]n interpreting for
low-verbal deaf persons, the interpreter paraphrases, rephrases, defines, simplifies,
and attempts to give the literal sense or conceptual essence of idiomatic
expressions … The use of analogy, parallelism, and examples are helpful in this type of
interpreting" (1965:39). Some examples given are relatively simple, and may very
well serve as good paraphrases in the target. The practice of paraphrasing in this
manner is a widespread strategy in any interpreting when the target word is not
known by the interpreter or doesn't exist (e.g., Gile 1995:198 lists explaining or
paraphrasing as a reconstruction tactic). As English to ASL examples in the
medical arena, Quigley and Youngs suggest the English word contagious be conveyed
in ASL as EASY SPREAD SICKNESS (1965:72) and far sighted as CAN SEE FAR,
CAN'T SEE NEAR (1965:73). These in fact may be reasonable paraphrases that
do not stray significantly from the original (although note that each constitutes a
differently structured target phrasing which may or may not need to be taken into
account in the overall text).

8. While the term low-verbal has often been used to mean persons with minimal language
altogether, Quigley and Youngs define it in their context as follows: "In as much as there are
relatively few deaf persons who have absolutely no verbal ability, the term low verbal will be
used to imply that mastery of the English language is either markedly deficient or totally
absent on a functional level in ordinary conversation. These low-verbal deaf persons cannot
understand or make themselves understood without the services of an interpreter in dealing
with hearing people who are not fluent in the language of signs" (1965:37).

Other examples stretch the meaning of paraphrasing. Quigley and Youngs'
ASL suggestion for cataract is THIN WHITE INSIDE EYE, COVER PART USE
TO SEE, SLOWLY GET WORSE, CAN'T SEE, MUST REMOVE (1965:72). Dope
becomes INJECTION, BECOME HABIT, CAN'T STOP, DAMAGE BODY, MIND
BECOMES CRAZY (1965:73). Well beyond paraphrases, these examples explain,
then take the condition to a concluding state, and finally dispense some advice or
opinion. A doctor may be using the term in a sentence like "you have early signs
of a cataract", which does not (yet) imply the inevitability of impending surgery.
Equally, not all uses of the word dope entail addiction, damage, and insanity.
Even though the original intent was undoubtedly meant as helpful for inter-
preting students, the formulaic nature of these translations assumed a single way
to sign the item no matter what the situation or who the recipient was, disregarding
the intersubjective principles of potentially shared context and shared linguistic
knowledge or awareness (Linell 1995). A further problem with Quigley and
Youngs' approach to transfer between the lexicons of English and ASL is that it
assumes the absence of certain (and numerous) lexical items in ASL, when in fact
what is not being considered is the possibility that English and ASL do not con-
struct words using identical morphological and lexical processes (Janzen 2005b).
The current practices of expanding and compressing (Finton and Smith
2004) have not resolved these issues because they are equally formulaic and once
again do not take situational factors into account. Interpreters fall prey to for-
mulaic work when they make assumptions about what the languages are like
and what one group of language users will or will not be able to understand
based on these assumptions. Thus, the premise that ASL requires expansions and English
requires compressions dictates the form of the target text without regard for the
participants and their linguistic and extralinguistic knowledge stores and com-
municative goals.

3. Pragmatics and discourse structure

Clearly, the interpreter's objective is to construct a target message that is
grammatical, and thus she must know and use the grammar that is available in the
target language. In one sense, the interpreter is limited by the grammar of the target
language. Gile (1995) refers to this as Linguistically Induced Information, which
in the source text, Gile comments, is of little importance to the interpreter because
she cannot likely transfer the grammar of the source language to the target. It is
more critical in terms of the target language, however, because the interpreter
cannot avoid the target language grammar when constructing her target text. This
is highly problematic when the grammars of the source and target languages differ
substantially, which we believe is the case for English and ASL. However, the
interpreter is also limited by her knowledge of the grammars: if this knowledge is
incomplete, her ability to choose from a set of grammatical possibilities is limited.
But discourse among interlocutors is more than just grammar. Discourse
participants manipulate their repertoire of structural elements to achieve optimum
communicativeness in interactional contexts. To do this, interlocutors
construct meaning that depends extensively on situational factors as well as on
linking what is communicated at a given moment to knowledge that is carried
forward from past events and experiences. Communicative competence can be
defined in interactional terms as the knowledge of linguistic and related com-
municative conventions that speakers must have to create and sustain conver-
sational cooperation, and thus involves both grammar and contextualization
(Gumperz 1982:209). Thus beyond grammatical knowledge, discourse is always
situated pragmatically, meaning that utterances must always be contextualized
to some extent. Gumperz suggests that discourse participants constantly include
contextualization cues to aid in building and maintaining communicative
cooperation: "Roughly speaking, a contextualization cue is any feature of linguistic
form that contributes to the signaling of contextual presuppositions. Such cues
may have a number of such linguistic realizations depending on the historically
given linguistic repertoire of the participants" (Gumperz 1982:131). Fox (1994)
points out that linguistic expressions underspecify meaning, so that a lexical item
or construction only truly suggests a meaning when it is linked to some specific
context, and that meaning building is by default cooperative (cf. Seleskovitch
1976 on the overspecification of dictionary entries but the underspecification
of words in context: words are tokens of meaning that the recipient is left to
construct, in the final round). Itkonen's (this volume) view on shared meaning
is consistent with those of Fox and Seleskovitch in that while mental images (of
events, for example) are subjective and vary from person to person, meanings
are intersubjective because they are determined by the immediate cooperative
exchange. Contextualization cues accompany propositions (i.e., the lexical mate-
rial and associated grammar) so as to allow interlocutors to retrieve contextu-
alizing information and thereby assess the communicative intent of utterances
(Gumperz 1995). In fact, Verhagen (2005:10) suggests that the function of discourse
is not just to be informative, but to engage in cognitive coordination,
that is, to influence another person's thoughts, attitudes and behaviours, and for
the addressee to ascertain what kind of influence the speaker might intend and
whether or not to go along with it.
Meaning, then, is not something objective found in the words and construc-
tions of language, to be discovered and conveyed, but is co-constructed between
discourse participants in an immediate social context (Wilcox and Shaffer 2005).
This is the case for any interlocutor's communication attempts (Coates 1995), and
it is no less true for interpreted discourse. Thus, as Wilcox and Shaffer state, the
interpreter cannot convey someone's meaning, but must co-construct meaning
along with the receiver in a specific situation for some specific purpose. This is
the essence of dialogic discourse (Linell 1997; Wadensjö 1998), in fact, and in
the case of interpreted discourse the interpreter and the receiver are co-participants
in the co-construction of meaning. The significance of this is that a co-constructed
target text will not be successful if the interpreter's output is formulaic,
that is, based on producing grammar without regard to the co-constructor's
contextualized participation in terms of their situational context, past knowledge and
experience, and relationship with the originator of the source text. The interpreter
must balance grammar and contextualization cues in co-constructing a meaning-based
text. This is reflected in Enkvist's (1991) position:
Texts, so we may assume, are ordered, linearized, in ways which somehow optimize
the order in which a speaker or writer wishes to eliminate the receptor's
uncertainties and expose to him the states of affairs that define the particular
scenario or text world he wants to communicate. But as linearization patterns
must be affected both by the syntactic constraints and by the cultural and rhe-
torical traditions associated with the specific language and text type which are
involved, a given linearization pattern cannot always be transferred unchanged
from source to target language. This lack of direct transfer is, then, caused partly
by syntax, but also partly by cultural differences and rhetorical traditions linked
to types of discourse and texts. (Enkvist 1991:6)

Enkvist identifies three elements of texts that affect the transfer of material or
linearization patterns from source to target: syntax (or grammar), cultural dif-
ferences and rhetorical traditions. Interpreters may expect discrepancies between
the source and target languages in each of these three areas, and thus must be
prepared to encounter mismatches while interpreting.
Regarding dissimilarities in grammar, ASL-English interpreters are faced
with two quite distinctly structured languages. For example, a significant but fun-
damental difference is word order. English is a fixed SVO (subject-verb-object)
word-order language (Comrie 1981) whereas ASL has variable word order in the
sense that more ordering options are open to the ASL signer. When the verb is
final (e.g., SOV), there is a greater tendency for the verb to be morphologically
complex (Liddell 2003). ASL verb structures that are highly complex morphologi-
cally may appear as single words that comprise a whole sentence with no overt
subject or object nominals (Brentari 2002). As well, third person pronouns in
ASL are not gender specific whereas they are in English (Liddell 2003). The use
of space is prominent in the articulation of ASL such that numerous relationships
between subject, object and verbal action necessarily include a spatial dimension,
as do aspect marking and temporal relations (see, for example, Emmorey 2002),
whereas these spatial features are not overt in spoken English. These and numerous
additional differences between the grammars must be within the interpreter's
working repertoire, but she must also be able to distinguish between what is re-
quired in the grammar and what is a discourse strategy, given that discourse strat-
egies involve some grammatical option chosen among several along with contex-
tualization cues as appropriate.
However, Enkvist also addresses cultural differences and rhetorical traditions,
each of which can also motivate the interpreter's linguistic choices. Distinct
communities of language users exhibit cultural attributes that correspond to
their history, traditions and experiences. Two things about culture are not quite so
clear for interpreters, however. First, it is not always evident what the interpreter
should do with a cultural difference. Should the interpreter compensate for the
difference as Mindess (1999) maintains, or facilitate an exchange of cultures,
that is, allow the interlocutors to experience each other's culture (Simon 1995)?
Second, it is often not obvious how cultural elements are reflected in specific lin-
guistic constructions. How does the interpreter know that a particular linguistic
or grammatical choice fits the supposed cultural experience of the recipient? Cul-
tural differences may well be mediated by certain types of contextualization. This
is not specific to one language or another, although the linguistic cues themselves
will likely be language-specific.
9. Liddell (2003), however, concludes that the basic order in ASL is SVO, with a leftward
extra-clausal topic slot and a rightward post-clausal subject pronoun slot. Thus variability is more
complex than simply reordering S, V, and O.

Rhetorical traditions refer to the types of text structure associated with
particular genres that are frequently used within a community of speakers or
signers. An example might be the typical way in which a visitor is introduced
to members of the community. In the Deaf community a new Deaf person is
most likely introduced by telling about her school history and by mentioning
acquaintances who may be known to both people whereas a hearing person being
introduced to others in a hearing community may be identified by her profession
(Mindess 1999). Another example concerns the overall discourse structure
sometimes attributed to ASL and English as the so-called "diamond shape" of
ASL discourse (introduction of the topic or point of the discourse, elaboration,
and concluding reiteration of the topic/point) versus a "funnel shape" of discourse
for English (initial general comments narrowing to the eventual point of the
discourse) discussed by Christie et al. (1999). For the interpreter, this example
presents difficulties on a number of levels. Its application is often overgeneralized
even though it is not yet clear how viable an analysis it is nor how broadly it can
be applied to ASL and English genres. Further, these discourse structure types
may each appear in the other language; for example, the "diamond-shaped" sequence
of topic/point introduction, elaboration, and concluding repetition of the point is an
excellent teaching strategy used by many English-speaking teachers. In any case, deciding
that one or the other discourse type is appropriate in a particular register says
nothing about specific linguistic or grammatical structures that should be used;
rhetorical traditions have more to do with framing information (Gile 1995) than
with particular linguistic constructions.10
This is not to say that discourse strategies are never associated with particular
lexical items or grammatical constructions. Topic negotiation in ASL, for exam-
ple, frequently begins with items such as KNOW.THAT,11 KNOW, or UNDER-
STAND along with facial topic marking (Janzen 1998), but note that here too the
signer has lexical options whose choice depends on pragmatic factors within the
immediate discourse situation.
In sum, Lawrence (1995) and Humphrey and Alcorn (2001) discuss expan-
sion elements as required at the level of grammar when in fact they are best un-
derstood as discourse strategies. Further, even though these authors attribute ex-
pansions to the grammar of ASL, they are not language specific strategies, but
to one degree or another are all used among speakers or signers no matter what
the language might be. This can equally be applied to so-called compressions
(Finton and Smith 2004), which recently have been proposed as necessary when
interpreting from ASL into English. It is true that certain strategies are favored by
one community of language users or another, but this in no way can mean that
these are exclusive properties of a particular language itself and not others.

10. Gallagher and Hutto (this volume) suggest that speakers have a lifetime of narrative practice
and know a lot about what another speaker is talking about because of cultural and experiential
knowledge that narrative frames provide. Interpreters need this knowledge in two languages,
but may be missing the early practice Gallagher and Hutto speak about if they learned one of
their working languages later as adults.
11. Some signs in ASL are commonly glossed using more than one English word. When this is
the case we use upper-case words separated by periods.

4. Assumptions about shared knowledge

As we have stated, it appears that interpreters are making the assumption that
expansions are an a priori feature of ASL grammar. This leads to the further
assumption that expansions are an obligatory component of ASL signers' discourse.
As a result, they are being used as a kind of default without regard to
other potential interpreting strategies, nor to the discourse pragmatics surround-
ing that particular interpreted interaction. It is not always obvious what strategy
the interpreter should be using, but Janzen (2005a) makes it clear that strategizing
constitutes a contribution that the interpreter makes to the resulting text. Leeson
(2005) claims that additions to the interpreter's target text have at times been
wrongly labeled as miscues (Cokely 1992) when in fact the interpreter might
from time to time believe that the text will be more clear if a strategic addition is
made. Leeson makes the point, however, that this strategy is just one of many the
interpreter must be skilled at, and when used, it must be a conscious decision that
affects the resulting text but, the interpreter knows, is in addition to the original
text. In other words the expansion belongs to the interpreter, not to the text.

4.1 Co-constructing discourse

Coherence in communication depends on shared knowledge on numerous levels,
for example shared experience, worldview, and linguistic knowledge. On these
bases, interlocutors exchange clues to their intended meanings in the way that
they choose linguistic items and constructions to represent pieces of meaning.
It is worth reiterating here that meaning cannot be transferred directly, but must
be constructed and re-constructed by the speaker or signer and the recipient, respectively
(Wilcox and Shaffer 2005). In a sustained sense, this reflects Brinck's
(this volume) intersubjective notion of interactivity whereby agents act so as
to directly affect each other, resulting in turn-taking. Interlocutors frame their
discourse in terms of assumptions about shared knowledge, introducing pieces
of information in such a way as to maintain an appropriate level of coherence in
the interaction.
According to Chafe (1994) information will either be active, new, or semi-
active. Shared information is most obviously that which is already active in
the discourse; in other words, it is information that is within the interlocutors'
consciousness at the present moment in their interaction. New information is
information that is newly activated at some point in the conversation. Semi-active
information, however, is information that has been introduced at a previous point,
thus has been active at some previous time but is no longer the immediate focus
of the discourse. Shared information may be semi-active too in that interlocutors
approach their discourse with a certain amount of shared context, whether or not
these contextual aspects are ever expressed. Chafe suggests that how the speaker
(or signer) encodes each type of information structurally will differ, but that this
determination critically depends on the originator's assumptions about the status
of the information in the addressee's consciousness at the time. Regarding semi-active
information, the further back in time an item was mentioned, the less likely it
will be active in the consciousness of the addressee. Importantly, this suggests that
assumptions about the activeness of an item may not always be correct, considering
that each interlocutor will not likely be thinking the exact same thing about pieces
of information at all times. The speaker may be keeping something more active in
her consciousness while the addressee has not retained it as a focus, for example.
This is congruent with Givón's (1995) discussion of referential distance (RD), in
which he measures RD in terms of the number of intervening clauses between a
past mention of an item and the most current mention. The greater the number of
intervening clauses that do not include a mention, the more likely it is that the item
does not retain its active (or even semi-active) status for the addressee.
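
The measure just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that a stretch of discourse has already been segmented into clauses and that each clause is represented as the set of referents it mentions; the function name `referential_distance` and the sample data are hypothetical and are not taken from Givón (1995).

```python
from typing import List, Optional, Set


def referential_distance(clauses: List[Set[str]], referent: str) -> Optional[int]:
    """Count the clauses that intervene between the two most recent
    mentions of `referent`. Returns None if there is no earlier mention."""
    # Indices of every clause in which the referent is mentioned.
    mentions = [i for i, clause in enumerate(clauses) if referent in clause]
    if len(mentions) < 2:
        return None
    # Clauses strictly between the previous mention and the current one.
    return mentions[-1] - mentions[-2] - 1


# Hypothetical five-clause discourse; each set lists the referents mentioned.
clauses = [
    {"doctor", "patient"},
    {"patient"},
    {"nurse"},
    {"nurse", "chart"},
    {"doctor"},
]

print(referential_distance(clauses, "doctor"))  # 3
print(referential_distance(clauses, "nurse"))   # 0
print(referential_distance(clauses, "chart"))   # None
```

On this toy input, "doctor" returns after three intervening clauses, so on the reasoning above it is less likely to have remained active for the addressee than "nurse", which is mentioned in adjacent clauses.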
In co-constructing their discourse, interlocutors negotiate the relevance of
topics, facts and observations about these topics, evaluations of statements, and
numerous additional subjective elements (Scheibman 2002). Each interlocutor
brings her view of the discourse event to that event, which includes assumptions
about her addressee's knowledge store, and she shapes her contributions based on
ever-changing perceptions of the event as it progresses. Each interlocutor's perceptions,
contributions and reactions constitute interaction that is systemically
intersubjective. When discourse is mediated by an interpreter there are several
effects. Most importantly, even though the interpreter has often been conceived
of as an objective, non-participating communication facilitator, when viewed as
a co-participant co-constructing meaning, as Wadensjö (1998), Roy (2000) and
Wilcox and Shaffer (2005) suggest, it is undeniable that the interpreter also contributes
to the intersubjective relationship in the interchange. Second, the interpreter's
construction of meaning is complicated by the fact that she does not share
the same context as the primary discourse participants given that she most likely
spends the majority of her time outside the other participants' environment(s).
Third, the interpreter brings her own conceptualized view of the event to the dis-
course. And fourth, while the interpreter builds her own discourse relationships
with each of the other participants on-line, she must work to make sense of what
she perceives to be the intersubjective relationship between those participants
themselves, and further, she must attempt to linguistically represent each of their
inputs into their discussion as if it were still their own.
All of this has the potential to impose shared knowledge in an artificial way.
The interpreter necessarily filters an input utterance through her own understand-
ing of what it must mean, and chooses her target text coding based on what she
understands the target recipient to know. The question remains, however, as to
whether or not she can instead formulate her text based on what she ascertains
the speaker to be assuming about what the target recipient knows. As a regular
discourse strategy, interlocutors negotiate meaning, whereas it seems to be the
case that interpreters often expect to be able to simply take a speaker's message,
reformulate it in the target language, and deliver it perfectly. Frishberg (1990) reflects
this perspective by saying that accuracy means rendering a message from
the sender's language into the addressee's language and giving the receiver the
complete message (1990:65). Hatim and Mason (1990) argue that the interpreter
must necessarily be a negotiator of meaning too, which if true, implies a particu-
lar stance in the interaction: she must appeal to shared knowledge. As a part of
negotiation, the interpreter should be able to contextualize as needed, listed as an
interpreter's coping strategy in Gile (1995). Contextualization is necessary to
make a negotiated message coherent for the recipient, but it may be noted that
contextualizing can occur in two phases of an interpreted discourse, that is, either
the original speaker may contextualize her statements as a reflection of her beliefs
about the addressee's knowledge, or the interpreter may contextualize based on her
own beliefs about the addressee. When the latter occurs, the target text shifts away
from the originator's actual message and becomes more subjective on the part of
the interpreter. It should also be noted that not contextualizing is also a discourse
strategy. If the speaker does not contextualize, she still is considered to be negotiat-
ing, and determining that no contextualization is needed.
What then if the interpreter chooses to contextualize? First and foremost, the
interpreter must be cognizant of what stance the original speaker has taken as a
meaning negotiator, that she is now choosing a different route, and that the effect
of this route on the addressee may be different from what the speaker intended.
As the interpreter thus negotiates by contextualizing, the recipient will respond
based on the interpreter's framing, not necessarily based on the originator's framing.
Returning to the notion of expansions as required by ASL grammar, we see
that they might occur because of the assumption that there is no other way to
frame an idea rather than because of an assessment of discourse appropriateness
and careful consideration of the originator's purposeful choices. In other words,
the interpreter expands without regard to the source text or the target recipient's
needs. Because the direction of interpretation that expansions tend to occur in
is from English into ASL, it is the Deaf recipient who receives the expansion.
Stratiy (2005), who herself is an ASL-as-first-language user, claims that this prac-
tice disregards the recipient in making the assumption that she does not share
some context with her interlocutor, without any attempt at negotiation. Wadensjö
(1998) documents similar examples in her study of Russian-Swedish interpreters
in medical settings. Regarding an interpreter who has over-explained what a
source speaker must have meant, Wadensjö says that there can be a tendency to
underestimate the patient's ability to understand (which is sometimes considered
patronizing) (1998:225). The interpreter has not understood the intersubjective
dynamic between the speaker and addressee, and obstructs dynamic sharedness
in her interpreted text.
There are several effects here. First, by the interpreter believing she is required
to expand at some point in the text, she sends a message to her recipients that
they must be expanded to. Stratiy suggests that this is based on an assumption
that the Deaf recipient will not be able to understand an un-expanded struc-
ture, which views the Deaf person as deficient or lacking. Stratiy questions why
interpreters do not make the opposite assumption, that the recipient might in fact
understand, but if not, will readily negotiate that point. Second, the interpreter
can often block the discourse participants from their own meaning negotiation,
during which they would otherwise learn about each other and what is shared
and not shared between them. Finally, we would like to suggest that interpret-
ers may use expansions at times when their own knowledge of lexicon or gram-
mar fails them. Paraphrasing is a well-known and often used interpreter strategy
(Gile 1995; Leeson 2005) that allows the interpreter to accommodate lexical gaps
in the target language or lapses in recall; the problem with expansions as gram-
mar is that the interpreter believes that the expansion is the correct, and perhaps
only, grammatical structure and does not move beyond it to learn new lexical
items or grammatical phrasing options.

5. Expansions, contextualizing, and grammar

To reiterate our view on discourse negotiation in both dialogic discourse and in-
terpretation, we take the stance that contextualizing is a cooperative principle that
aids in the co-construction of shared meaning. To accomplish this, interlocutors
necessarily take advantage of existing grammar in the language they are using;
grammar both facilitates and constrains what interlocutors can do:

Selection of lexical items and grammatical decision making may be more dif-
ficult in one language than another because of differences in the variety of pos-
sible choices and in the flexibility of linguistic rules: a very wide vocabulary as
opposed to a more restricted one, flexible or rigid lexical usage, or the number of
possible escape routes in sentence structuring in case the source language leads
to an unexpected segment and forces one to reconsider one's options.
 (Gile 1995:234–235)

ASL, like any human language, has numerous grammatical features that are
useful when interlocutors contextualize as a discourse-level meaning negotia-
tion strategy. Topic-comment constructions, identified as a basic sentence type
(Ingram 1978; McIntire 1982; Janzen 1998), are sentences in which a topic con-
stituent appears first followed by a comment. The topic provides some grounding
information that situates a comment for the addressee. It is background informa-
tion the signer assumes the addressee will recognize as the ground from which to
view the comment. The comment constituent contains the actual point of the
utterance or the new information. In these constructions, the topic constituent
provides a kind of context for the main point found in the comment, and is cho-
sen because the signer believes that the addressee will recognize it as something
identifiable, that is, it is shared information.
In choosing a topic as grounding information, the signer may not always be
certain that it is in fact identifiable, thus a topic might be negotiated. Based on
cues from the addressee that an intended topic is not equally shared, the signer
will back up and fill in some contextual details to bring the addressee up to speed.
This accomplished (and acknowledged as such by the addressee), the signer may
reiterate the topic phrase she began with; now that it has shared status, it works
as a grammatical topic phrase, and the discourse moves on to some comment(s)
about that topic. Thus, in ASL, topic phrases represent an intersection of grammar
at the sentence level and contextualization at the discourse organization level, and
so may be counted as an overt marker of intersubjectivity.
In English, interlocutors also choose constructions to negotiate in similar
ways. Embedded relative clauses often serve this purpose, for example. In (3) the
relative clause "who we thought we were meeting" contextualizes "the guy": it supplies
more information that the speaker believes to be identifiable to the addressee and which
will assist in specifying who "the guy" is.

(3) The guy who we thought we were meeting couldn't come.

Another example, also from English, is the use of "that", as in (4):

(4) That plumber I called said I didn't need a new water heater yet.

"That" is used in (4) as opposed to the articles "the" or "a" as a signal to the addressee
that the noun that follows is unquestionably shared, an intertextual reference
(Hatim and Mason 1990) to a previously shared discourse event.
The interpreter's task is to recognize and make sense of contextualizing sequences
in the utterances she is attempting to interpret, and relay that intertextual
sense by choosing constructions that in essence do the same work, even if the
construction in the target language is not of the same category as that chosen
in the source. A problem that this presents is that the previous context is not
necessarily shared by that interpreter, meaning that the same intersubjective motivation
in choosing structures is not a part of the interpreter's cognitive state
(Shaffer and Jansen 2002).
Even though some of these linguistic features may fit Lawrence's rubric of
so-called expansions, it is not the case that their use should be viewed a priori as
required structurally. Rather, grammatical elements and constructions are chosen
that fit the specific situational parameters at the moment, and most importantly,
best facilitate the communicative goals of the speaker (Givón 2005). A relative
clause is not chosen by a speaker if it is deemed unnecessary; a topic in ASL is not
negotiated if there is no perceived reason to do so. Thus, even though the inter-
preter must use the grammatical items available in a given language and, as men-
tioned, is constrained by the grammar in that regard, contextualizing is clearly a
discourse-level interplay in which the interpreter is an equally subjective player.

6. Conclusions

In the intersubjective context of interpreted discourse, the interpreter's participation
in the negotiation of meaning includes an awareness of contextualizing
on the part of discourse speakers and signers, and the intentional use of contextualization
in her own target message formulation. We have noted that the
interlocutors' use of contextualization is purposeful and is part of their message
framing. Accurate interpretation must take into account such framing choices
along with message content. But secondly, the interpreter must understand that
her own contextualizing choices may shape the target text in ways that are decid-
edly different from the original text, and the source speaker is likely not aware of
these differences. In example (2) in Section 2 above regarding the interpretation of
the English word allergy, the interpreter's expanded rendition has two important
consequences. First, it now enters the domain of the target recipient's discourse
context, even though the expanded detail was never a part of the source speaker's
negotiation. Second, the interpreter is developing her own intersubjective stance
with the recipient, which is not equally shared with the source speaker. Wadensjö
(1998:234) says that some interpreters' excessive willingness to explain for one
of the parties, thus taking over the responsibility of the other party, may also turn
out to be a trouble source obstructing sharedness between the primary interlocutors
(italics added).
In the case of allergy, the interpreter expands, or over-specifies, by adding
details that were not in the source message (whether unconsciously or by design).
The choice shows the interpreter's construal of (1) what the recipient would not
understand, and (2) what allergy must mean. Unfortunately, by listing a particular
set of sources of allergic reactions the interpreter has also defined the set: what if
the doctor, when inquiring about allergies, had in mind an allergy to latex?
Nonetheless, in interpretation generally, there are numerous points in the
process where contextualizing is a beneficial interpreting strategy, and in fact, it is
a regular feature of discourse that is particularly useful in interpretation. Contex-
tualization occurs pervasively in discourse, and there is no reason to expect it not
to occur in interpretation as well. We stress, however, that contextualization must
be a part of the interpreter's conscious decision-making process. The interpreter
holds a unique position because it is her task to reformulate the intentions of one
speaker into the linguistic terms and discourse framing of an entirely different
language, and yet she must work to allow interlocutors to build their own inter-
subjective relationship. To do this, she makes myriad subjective linguistic and
framing choices, including whether to contextualize or not and, depending on the
nature of these choices, may either promote or obstruct the intentional interac-
tion between those she is interpreting for.
In this view, the interpreter never takes a neutral stance; it simply is not available
to her. Regarding the intersubjective nature of discourse, interpreters have a
dual role to fill. On one hand, as cross-language mediators, they form a discourse
relationship with the speaker of language A and another with the speaker of lan-
guage B. But on the other hand, they attempt to reconstruct the very subjective
messages between the speakers of languages A and B without significantly inter-
fering with the intersubjective relationship the two interlocutors are building with
each other. The interpreter's knowledge store will not match perfectly those of the
interlocutors she is interpreting for, so she must be cognizant of the linguistic cues
she encounters from each to help her in the mutual construction of meaning so
that even though the two interlocutors do not share the same linguistic system,
they are communicating meaningfully with each other.

Acknowledgements

First and foremost we wish to thank the late Donna Korpiniski, interpreter ex-
traordinaire, for her forward thinking and cognizant reflection on her work, and
for many fruitful discussions that have helped us formulate our own ideas pre-
sented here. Thanks also to the editors of this volume for their support, sugges-
tions and thoughtful reviews.

References

Brentari, D. 2002. Modality differences in sign language phonology and morphophonemics.
In Modality and Structure in Signed and Spoken Languages, R.P. Meier, K. Cormier and
D. Quinto-Pozos (eds.), 35–64. Cambridge: Cambridge University Press.
Brinck, I. this volume. The role of intersubjectivity in the development of intentional com-
munication.
Chafe, W. 1994. Discourse, Consciousness, and Time: The Flow and Displacement of Conscious
Experience in Speaking and Writing. Chicago: The University of Chicago Press.
Christie, K., Wilkins, D.M., Hicks McDonald, B. and Neuroth-Gimbrone, C. 1999. get-to-the-point:
Academic bilingualism and discourse in American Sign Language and written
English. In Storytelling and Conversation: Discourse in Deaf Communities, E. Winston
(ed.), 162–189. Washington, DC: Gallaudet University Press.
Coates, J. 1995. The negotiation of coherence in face-to-face interaction: Some examples from
the extreme bounds. In Coherence in Spontaneous Text, M.A. Gernsbacher and T. Givón
(eds.), 41–58. Amsterdam/Philadelphia: John Benjamins.
Cokely, D. 1992. Interpretation: A Sociolinguistic Model. Burtonsville, MD: Linstok Press.
Comrie, B. 1981. Language Universals and Linguistic Typology. Chicago: University of Chicago
Press.
Du Bois, J.W. 2003. Discourse and grammar. In The New Psychology of Language: Cognitive
and Functional Approaches to Language Structure, Volume 2, M. Tomasello (ed.), 47–87.
Mahwah, N.J.: Erlbaum.
Emmorey, K. 2002. Language, Cognition, and the Brain: Insights from Sign Language Research.
Mahwah, NJ: Lawrence Erlbaum.
Enkvist, N.E. 1991. Discourse type, text type, and cross-cultural rhetoric. In Empirical Research
in Translation and Intercultural Studies, S. Tirkkonen-Condit (ed.), 5–16. Tübingen:
Gunter Narr Verlag.
Finton, L. and Smith, R.T. 2004. The natives are restless: Using compression strategies to deliver
linguistically appropriate ASL to English interpretation. In CIT: Still Shining After
25 Years, Proceedings of the 15th National Convention, Conference of Interpreter Trainers,
E.M. Maroney (ed.), 125–143. USA: CIT.
Fox, B.A. 1994. Contextualization, indexicality, and the distributed nature of grammar. Language
Sciences 16 (1): 1–37.
Frishberg, N. 1990. Interpreting: An Introduction, Revised Edition. Silver Spring, MD: RID Pub-
lications.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Gile, D. 1995. Basic Concepts and Models for Interpreter and Translator Training. Amsterdam/
Philadelphia: John Benjamins.
Givón, T. 1984. Syntax: A Functional-Typological Introduction, Volume 1. Amsterdam/Philadelphia:
John Benjamins.
Givón, T. 1995. Coherence in text vs. coherence in mind. In Coherence in Spontaneous Text,
M.A. Gernsbacher and T. Givón (eds.), 59–115. Amsterdam/Philadelphia: John Benjamins.
Givón, T. 2005. Context as Other Minds: The Pragmatics of Sociality, Cognition and Communication.
Amsterdam/Philadelphia: John Benjamins.
Grice, H.P. 1975. Logic and conversation. In Syntax and Semantics 3: Speech Acts, P. Cole and
J. Morgan (eds.), 41–58. New York: Academic Press.
Gumperz, J.J. 1982. Discourse Strategies. Cambridge: Cambridge University Press.
Gumperz, J.J. 1995. Mutual inferencing in conversation. In Mutualities in Dialogue, I. Marková,
C.F. Graumann and K. Foppa (eds.), 101–123. Cambridge: Cambridge University Press.
Hatim, B. and Mason, I. 1990. Discourse and the Translator. London and New York: Longman.
Humphrey, J.H. and Alcorn, B.J. 2001. So You Want to Be an Interpreter? An Introduction to Sign
Language Interpreting (3rd Ed.). Amarillo, TX: H&H Publishers.
Itkonen, E. this volume. The central role of normativity for language and linguistics.
Ingram, R.M. 1978. Theme, rheme, topic, and comment in the syntax of American Sign Language.
Sign Language Studies 20: 193–218.
Janzen, T. 1998. Topicality in ASL: Information Ordering, Constituent Structure, and the Func-
tion of Topic Marking. Ph.D. Dissertation, University of New Mexico, Albuquerque, NM.
Janzen, T. 2005a. Introduction to the theory and practice of signed language interpreting. In
Topics in Signed Language Interpreting: Theory and Practice, T. Janzen (ed.), 3–24.
Amsterdam/Philadelphia: John Benjamins.
Janzen, T. 2005b. Interpretation and language use: ASL and English. In Topics in Signed Language
Interpreting: Theory and Practice, T. Janzen (ed.), 69–105. Amsterdam/Philadelphia:
John Benjamins.
Janzen, T. and Shaffer, B. 2003. Implicit versus explicit coding across two languages: Mismatches
of cognitive domains during interpretation. Paper presented at the 8th International
Cognitive Linguistics Conference. La Rioja, Spain, July 20–25, 2003.
Lawrence, S. 1995. Interpreter discourse: English to ASL expansions. In Mapping our Course:
A Collaborative Venture, Proceedings of the Tenth National Convention, Conference of Interpreter
Trainers, E.A. Winston (ed.), 205–214. United States: Conference of Interpreter
Trainers.
Leeson, L. 2005. Making the effort in simultaneous interpreting: Some considerations for
signed language interpreters. In Topics in Signed Language Interpreting: Theory and
Practice, T. Janzen (ed.), 51–68. Amsterdam/Philadelphia: John Benjamins.
Liddell, S.K. 2003. Grammar, Gesture, and Meaning in American Sign Language. Cambridge:
Cambridge University Press.
Linell, P. 1995. Troubles with mutualities: Towards a dialogical theory of misunderstanding
and miscommunication. In Mutualities in Dialogue, I. Marková, C.F. Graumann and
K. Foppa (eds.), 176–213. Cambridge: Cambridge University Press.
Linell, P. 1997. Interpreting as communication. In Conference Interpreting: Current Trends in
Research, Y. Gambier, D. Gile and C. Taylor (eds.), 49–67. Amsterdam/Philadelphia: John
Benjamins.
McIntire, M.L. 1982. Constituent order & location. Sign Language Studies 37: 345–386.
Mindess, A. 1999. Reading Between the Signs: Intercultural Communication for Sign Language
Interpreters. Yarmouth, ME: Intercultural Press, Inc.
Quigley, S.P. and Youngs, J.P. 1965. Interpreting for Deaf People. Washington, DC: U.S. Depart-
ment of Health, Education, and Welfare.
Roy, C.B. 2000. Interpreting as a Discourse Process. New York: Oxford University Press.
Scheibman, J. 2002. Point of View and Grammar: Structural Patterns of Subjectivity in American
Conversation. Amsterdam/Philadelphia: John Benjamins.
Seleskovitch, D. 1976. Interpretation, A psychological approach to translation. In Translation:
Applications and Research, R.W. Brislin (ed.), 92–116. New York: Gardner Press.
Simon, S. 1995. Delivering culture: The task of the translator. In Perspectives d'avenir en
traduction, M. Aubin (ed.), 43–56. Winnipeg: Presses universitaires de Saint-Boniface.
Shaffer, B. and Janzen, T. 2002. Topic marking: What signers know and interpreters don't.
Paper presented at the Association of Visual Language Interpreters of Canada national
conference. Halifax, Nova Scotia, July 22–26, 2002.
Shaffer, B. and Janzen, T. 2004. Contextualization in ASL-English interpretation: A question of
grammar or discourse strategy. Paper presented at the Conceptual Structures, Discourse
and Language conference. Edmonton, Alberta, October 8–10, 2004.
Stratiy, A. 2005. Best practices in interpreting: A Deaf community perspective. In Topics in
Signed Language Interpreting: Theory and Practice, T. Janzen (ed.), 231–250. Amsterdam/
Philadelphia: John Benjamins.
Susswein, N., and Racine, T.P. this volume. Sharing mental states: Causal and definitional is-
sues in intersubjectivity.
Verhagen, A. 2005. Constructions of Intersubjectivity: Discourse, Syntax, and Cognition. New
York: Oxford University Press.
Verhagen, A. this volume. Intersubjectivity and the architecture of the language system.
Vygotsky, L.S. 1978. Mind in Society: The Development of Higher Mental Processes. Cambridge,
MA: Harvard University Press.
Wadensjö, C. 1998. Interpreting as Interaction. London and New York: Longman.
Wilcox, S. and Shaffer, B. 2005. Towards a cognitive model of interpreting. In Topics in Signed
Language Interpreting: Theory and Practice, T. Janzen (ed.), 27–50. Amsterdam/Philadelphia:
John Benjamins.
chapter 15

Language and the signifying object


From convention to imagination

Chris Sinha and Cintia Rodríguez

In this chapter we argue that intersubjectivity cannot be grounded in individual
mental or representational content. Intersubjectivity, therefore, is not equivalent
to common knowledge; rather, common knowledge (indeed individual
knowledge in the true representational sense) depends upon intersubjectivity.
Intersubjectivity is the fundamental basis of what Durkheim (and Searle following
him) have called social facts, which are irreducible to (though they depend
upon) biological and individual psychological facts. Intersubjectivity is based
upon participation in joint action, and such participation also implicates the
shared material, interobjective world. Participatory engagement with signifying
objects accompanies and underpins the child's entry into the symbolic realm
of language, and makes possible the development of subjectivity and cultural
identity through participation in narrative practices.

[It] is always difficult for the psychologist to think of
anything existing in a culture ... We are, alas, wedded
to the idea that human reality exists within the limiting
boundary of the human skin! (Bruner 1966:321)

The body is our general medium for having a world ...
Sometimes the meaning aimed at cannot be achieved
by the body's natural means; it must then build itself
an instrument, and it projects thereby around itself a
cultural world. (Merleau-Ponty 1962:146)

Observation of O. at 2:4;5. Father goes to get him from
the car seat. O. keeps his eyes closed, eyelids quivering
slightly, with a slight smile. Then he opens his eyes and
says "I'm sleeping", laughing.

1. Intersubjectivity and the ontology of the social

This chapter has two primary aims. The first is to propose an account of the so-
cial nature of the shared mind. The second is to put forward arguments and evi-
dence for regarding material objects, especially artefacts, as a crucial ingredient
of intersubjectivity and its development. In this section we advance philosophical
and psychological arguments for considering the shared mind to be fundamental-
ly social. We critically assess the methodological individualism guiding the con-
strual by most philosophers and psychologists of the notion of intersubjectivity,
and propose an alternative construal of intersubjectivity which sees it as rooted in
an ontology of the social, whose methodological counterpart is the Durkheimian
concept of the social fact.
There are two fundamentally different ways to conceive of the shared mind.
The first of these is predicated upon the foundational status of individual mental
or representational content, and in particular of intentional states such as beliefs.
An intentional state is characterized, as Searle (1983) puts it, by its directedness to
whatever it is about. Intentional states can be about anything at all: object, event
or process, real or imaginary, and hence can also be directed at other intentional
states, whether those of the subject or those of another subject. I can, for example,
wish that I had thought of an idea before someone else did, or I can believe that
my next door neighbour believes in fairies, and so on. It is on this basis that
"theory theories" of intersubjectivity are constructed: intersubjectivity is considered
to be a matter of knowledge (or belief, or intentional states in general). On this
account, intersubjectivity is essentially a matter of common knowledge in the
sense of Lewis (1969) (see also Clark 1996; Itkonen this volume).
It is indisputable that normal adult human beings do indeed base much of
their social reasoning on representations of other people's mental states. There
are also good arguments for viewing social institutions such as language as objects
of common knowledge (Itkonen this volume). There are, however, at least three
objections to regarding the common knowledge account as sufficiently foun-
dational or inclusive to fully comprehend intersubjectivity. The first objection is
logical. The common knowledge account is immediately vulnerable to Hume's
Other Minds problem: How can I know that the mental content that I ascribe to
you is the mental content that you actually have, even excluding cases of mistaken
or false ascriptions? In other words, if, for example, I (correctly) think that my
neighbour believes in fairies, how do I know that whatever it is that my neighbour
believes in is what I think they believe in? To know that, I have to be sure that
what my neighbour's mental content is about is the same as what my neighbour's
mental content under my representation is about. Without this guarantee of ref-
erential intersubjectivity, there can be no common knowledge. In other words,
the common knowledge formulation of intersubjectivity presupposes, without
explaining, the intersubjectivity of reference. Another way of putting this is to say
that the common knowledge formulation presents an unsolved instance of the
Grounding problem, which requires a logically prior appeal to intersubjectivity
for its resolution (Sinha 1999).
The second objection to the common knowledge account is that intersub-
jectivity is as much about feeling as knowing. As has frequently been pointed out,
intersubjectivity is closely connected to the capacity for empathic identification.
However, the affective phenomenology of intersubjectivity extends beyond empa-
thy, in that there are some states of feeling that are constitutively intersubjective,
in the sense that they implicate the experience of the feeling of another directed
towards oneself. For example, for a couple to be in love it is necessary for each to
be in love with the other. The experience of being in love with a lover is quite dif-
ferent from the experience of being unrequitedly or disappointedly in love, for no
other reason than the intersubjectively shared nature of the former, as contrasted
with the forlornly solitary nature of the latter. Although knowledge of the other's
feelings is important in this, knowledge is not all there is to it, since this intersubjective
state also involves commitments and accountabilities, a quintessentially
normative dimension that is fundamental to intersubjectivity, to selfhood and to the
social domain in general (Shotter 1984).
The third (related) objection is that the common knowledge account of in-
tersubjectivity is disembodied; it does not take into account the intercorporeal
(Merleau-Ponty 1962) dimension of intersubjectivity, which manifests itself most
clearly in the mimetic nature of primary intersubjectivity from the earliest stages
of infancy (Zlatev this volume; Trevarthen 1979, 1998; Reddy, Hay, Murray and
Trevarthen 1997). It is in the shared experience of corporeally expressed, emotionally
rich states of the embodied mind that, as the French developmental psychologist
Henri Wallon insisted, we should seek the roots of the intersubjective
psyche (Netchine-Grynberg and Netchine 2002; Rodríguez 2006; Wallon 1984
[1925]; Zazzo 1975). We shall develop this argument further below, by demonstrating
that intersubjectivity and, by extension, institutions and conventions also
find material embodiment in artefactual objects.
Despite these briefly sketched problems afflicting the common knowledge
account of intersubjectivity, it has been deeply influential, not only in philosophy
of mind but also in developmental psychology. This is, firstly, because it accords
with the tradition of reducing all realities existing between people to theories
about what goes on inside individual minds; and secondly because it also accords
with the mentalist emphasis in classical cognitivism on the primacy of mental
representation. The ontogenetic version of this account seeks its explanation
for certain representational capacities of the adult mind (the ability to represent
representations) in the autonomous domain of representation itself. Whether
theory of mind is proposed to be a consequence of meta-representational abilities
first applied to the child's own cognitive processes, or of an ability to read
the intentions of others, the basic assumption of the paradigm remains in place:
mind is an autonomous domain, and actions are secondary to the internal and
private intentional states which they reveal.
We turn now to the second, very different, conception of the shared mind,
which has its roots in Durkheimian social theory. The object of social theory, for
Durkheim, was the domain of social facts, which he described as "a category of
facts which present very special characteristics: they consist of manners of acting,
thinking, and feeling external to the individual, which are invested with a
coercive power by virtue of which they exercise control over him" (Durkheim
1982 [1895]).
Social facts, for Durkheim, are not merely aggregates of the individual rep-
resentations of them by the subjects that are regulated or coerced by the social
facts, since for each individual subject the social fact presents itself as a part of an
out-there objective reality. The objectivity of social facts consists in the fact that they
are independent of any single individual's thoughts or will. As Jones (1986:61)
puts it, it is precisely this property of resistance to the action of individual wills
which characterizes social facts. The most basic rule of all sociological method,
Durkheim thus concluded, is to treat social facts as things. Durkheim's treatment
of social facts thus consists in, first, an ontological proposition, that social facts
are irreducible to biological or psychological facts (or structures or processes);
coupled with, second, an epistemological and methodological proposition re-
garding their treatment: as objects of a particular kind, whose determinate nature
consists in their coercion of conduct. We shall return below to the question of
what is implied by the treatment of social facts as objects.
Durkheim, it must be said, has often been criticized for the breadth and vague-
ness of his notion of social fact. A particularly problematic aspect of his theo-
ry is that, in counterposing social facts to individual conscience (or mind),
he sometimes identified the former with states of the collective conscience.
Some social psychologists (e.g. Moscovici 2000) have followed this direction in
constructing a theory of social representations, but critics have claimed that
Durkheim sympathized with a view of society as a kind of super-organic col-
lective personality. Whether or not Durkheim believed in the collective mind,
such a concept is not only scientifically untenable, it is unnecessary. We propose
that a social fact can most simply be defined as something regulating an aspect of
conduct which requires the participation (Goodwin and Goodwin 2004) of more
than one individual. This something may be a codified law, a norm, an institu-
tion, a rule in the Wittgensteinian sense, or a canon of interpretation. A natural
language, therefore, qualifies as a social fact (indeed, as a social institution, see
Itkonen this volume) under this reading of Durkheim's theory.
Let us now compare and (if possible) try to integrate Durkheim's account with
the common knowledge account. Social facts (like any other facts) are potential
objects of intentional states. Individual beliefs about social facts (like any other be-
liefs) are also potential objects of intentional states (and hence of common knowl-
edge). The efficacy of at least some social facts depends upon their being the ob-
jects of common knowledge (Itkonen this volume). However, claimed Durkheim,
the social fact itself is not the sum, average or common denominator of all the
individual beliefs of all participants. Rather, Durkheim insisted, the social fact is in
some sense prior to these individual cognitions. This is at first blush puzzling, since
the collectivity of participants (or some authority amongst them) can, in principle,
change the social fact (e.g. the rules of a game) just by so deciding.
To clarify this issue, let us compare Durkheim's view with that of Searle
(1995:1–2): "There are things that exist only because we believe them to exist. I
am thinking of things like money, property, government, and marriages ... [such]
Institutional facts are so called because they depend upon human institutions for
their existence." Durkheim, we suggest, would have agreed with the second, but
not the first, of these propositions of Searle. How can we render this difference
intelligible?
The answer, we suggest, is to view social facts as constituting an emergent,
normative ontological level existent only in the intersubjective field of joint action
regulated by norms and commitments. Intersubjectivity is then essentially a mat-
ter of co-participation in joint action structures which, by virtue of their norma-
tive regulation, are conventionalized as social and communicative practices. Social
practices, and the norms regulating them, can be objects of intentional states, in-
cluding common knowledge, but they are not reducible to the aggregates of such
states. This formulation helps us to understand the specific sense in which so-
cial facts are methodologically necessarily treated as objects; they are instances
of the objectification, or reification, as conventional form, of intersubjectivity. As
such, they are also potential epistemic objects of common knowledge, including
scientific and theoretical knowledge.
Our account of intersubjectivity, then, accords priority to co-participation,
action and practice over individual mental states, both logically and ontogenetically.
Note that this priority does not deny either the existence of individual mental
states, or the reflexive structure of common knowledge. Rather, it regards
intersubjectivity as essentially social, and logically and ontogenetically prerequisite
to common knowledge. Indeed, we have argued that intersubjectivity is the
fundamental condition of all social facts, a proposal which we suggest considerably
clarifies Durkheim's own formulations, while remaining true to his insight
into the relative autonomy of social facts from psychological facts.

1. Of course, there must always have been some (mythic) inventor of a social fact, such as
money, and at least one other participant to understand the intention behind the invention, but
once invented, the social fact acquires a relative ontological autonomy.

Our proposal can now be compared with the following argument for the existence
and role of collective intentionality advanced by Searle (1995:25–26):
The requirements of methodological individualism seem to force us to reduce
collective intentionality to individual intentionality. [However] it does not follow
from [the individual possession of intentional states] that all my mental life must
be expressed in the form of a singular noun phrase referring to me. The form that
my collective intentionality can take is simply "we intend", "we are doing so-and-so",
and the like ... the intentionality that exists in each individual head has the
form "we intend".

Searle's argument, however initially appealing, faces the problem of where this
primitive or a priori "we" comes from. The answer that we offer is that it is the
lexico-grammatical expression of intersubjectivity itself, deriving from and
grounded in joint action ("we are doing so-and-so"), regulated by the normative
social fact that makes recognizable the joint action as an instance of the practice
"so-and-so". This answer, similarly to our argument above with respect to the ontological
status of social facts, preserves Searle's recognition of the foundational
status of intersubjectivity in joint intentions, while turning his own version of it
on its head. It is not, we maintain, the intentional state "we intend X" that is constitutive
of the practice X: rather, the intentional state is derived from the shared
practice X, whose conventionalized objectification as a social fact is the object
of the intentional state.
Searle (who fails to reference Durkheim) goes on to write (ibid. p. 26):
I will henceforth use the expression "social fact" to refer to any fact involving
collective intentionality. So, for example, the fact that two people are going for
a walk together is a social fact. A special subclass of social facts are institutional
facts ... for example, the fact that this piece of paper is a twenty dollar bill is an
institutional fact.

2. This is an allusion to Karl Marx's assertion, both laudatory and critical, that he had turned
Hegel's dialectical logic the right way up. "The mystification which dialectic suffers in Hegel's
hands by no means prevents him from being the first to present its general forms of motion in a
comprehensive and conscious manner. With him it is standing on its head. It must be inverted,
in order to discover the rational kernel within the mystical shell" (Marx 1976 [1873]:103).

It will be clear by now that, from our point of view, Searle's proposal puts the cart
before the horse. We would maintain, rather, that collective intentionality is based
upon, not the source of, participation in joint action in an intersubjective field,
regulated by social facts (norms, institutions etc.).
What empirical evidence does developmental science offer for the existence
of an ontology of the social, and how might this bear upon the difference between
our account and Searle's collective intention account? The classic experiment
by Murray and Trevarthen (1986), who showed that infants were able to
distinguish between a real-time video-mediated (CCTV) image of their mothers,
and the same image recorded on videotape, unsynchronized with the infants'
own actions, is extremely illuminating. The experiment, we would argue, provides
strong evidence of the reality of the ontology of the social as such (both as social
fact and as psychological reality); and of the biologically based readiness of very
young infants to participate. The important thing about such participation, which
distinguishes it from mere coordination on the basis of the stimulus situation,
is not only the temporal sequencing and rhythm of the interaction, but also the
subjective recognition of being engaged in participation, indexed by the different
emotional reactions of the infants to the two stimulus situations.
Viewing primary intersubjectivity in terms of participation, rather than in-
tention, has important consequences for developmental theory. The implausi-
bility of attributing neonatal engagement to intentional mental contents has led
some developmentalists to neglect the significance of primary intersubjectivity,
and focus on secondary intersubjectivity (triadic joint attention) as the decisive
achievement in the development of the shared mind (Tomasello 1999). We main-
tain, in contrast, that all later forms of intersubjectivity are predicated on primary
intersubjectivity. It is, however, neither necessary nor correct to interpret the evi-
dence for primary intersubjectivity in terms of innate intentionality, whether
individual or collective. Rather, we prefer Trevarthen's more recent formulation
in terms of motives for engagement, while emphasizing that the constitution
of engagement as intersubjective is effected as much by the structuring (by the
caretaker) of participation, as by the biological predisposition and capacity of the
infant to engage. Primary intersubjectivity, on this reading, is neither merely a
psychological nor merely a biological fact, but a proto-social fact supported by
human developmental psychobiology.
Furthermore, without denying the developmental significance of the sharing
of attention and of other individual intentional states, our prioritizing of partici-
pation in joint action enables us to conceptualize, similarly to Rakoczy (2006), the
primary inter-mental dimension of intersubjectivity as being a normatively regu-
lated commitment to the activity itself (see also Shotter 1978, 1995); and prompts
us to re-examine the significance of the objects which in some sense carry or
signify such norms and conventions.

2. The object as a social-material signifier

Intersubjectivity is often conceived mentalistically, as a property of the unmediated
mind. We reject the idea that intersubjectivity is to be considered as equivalent
only to the inter-mental, in that we stress that inter-corporeality extends beyond
the body to encompass objects. Intersubjectivity is materially grounded in
embodiment, and this embodiment extends beyond the skin to encompass its
mediation by objects, or what we shall call, following Latour (1996), interobjectiv-
ity. Such mediation, we propose below, can be regarded as the ontogenetically first
manifestation of semiotic mediation.
We proposed above an account of intersubjectivity in terms of co-participa-
tion in joint action structures which, by virtue of their normative regulation, are
conventionalized as social and communicative practices. This definition excludes
actions which may be directed towards others, but which are not framed as part
of an activity governed by a norm. It also excludes solitary activities which may
be governed by norms of performance or of achievement, such as gardening or
cooking a meal, which may properly be termed social practices, but which (when
performed alone) do not involve social interaction. It includes both semiotically
mediated discursive practices such as talking and gesturing, and socially orga-
nized non-discursive practices such as co-participation in games or in physical
constructions.
Primary intersubjectivity in infancy is a mode of co-participation in which
the body of the infant is not so much the vehicle or medium of engagement,
as the very engagement itself. Primary intersubjectivity is embodied in the stron-
gest sense of the word. In semiotic terms, there is no distinction between the
bodily movement as signifier, and the signified meaning that is communicated,
between the inter-mental and the inter-corporeal. There is also, as yet, no differ-
entiation between discursive and non-discursive co-participation.
Inter-corporeal co-participation is not supplanted in development, but is
elaborated and extended by semiotic mediation, most obviously in discursive
practices employing conventionalized gesture and language. In this section, we
explore the neglected role of objects (especially artefacts) in the constitution of
intersubjectivity and subjectivity. The neglect stems not so much from a failure
to recognize that the material world is an important dimension of co-participation,
as from the tendency to downplay its semiotic status and regard it as mere
context to language. Goodwin and Goodwin (2004:222), for example, define
participation as "actions demonstrating forms of involvement performed by parties
within evolving structures of talk" [our italics], although they also recognize
the need to expand our notion of human participation in a historically built social
and material world by attending to material structure in the environment
(ibid. p. 239). Our purpose in this section is to foreground the semiotic aspect of
materiality, and the material aspect of meaning, and to analyze their role in the
development of intersubjectivity and normativity.
We owe the notion of semiotic mediation to Vygotsky, whose explanation of its
operation in cognition, and in cognitive development, focused on the internaliza-
tion of conventional signs originating in contexts of discursive practice. Although
Vygotsky attributed great importance to the formative role of language in the
emergence of inner speech and verbal thought, his employment of the concept
of semiotic mediation also encompassed the use of non-systematic signs, including
objects-as-signifiers. One of his most celebrated examples of semiotic mediation is
that of a mother tying a knot in the handkerchief of her child, to remind him of the
need to convey a message to the teacher, a social practice which was widespread,
not only in Russia, until quite late in the 20th century. Vygotsky writes:
When a human being ties a knot in her handkerchief as a reminder, she is, in
essence, constructing the process of memorizing by forcing an external object to
remind her of something; she transforms remembering into an external activity.
This fact alone is enough to demonstrate the fundamental characteristics of the
higher forms of behaviour. In the elementary form something is remembered; in
the higher form humans remember something. (Vygotsky 1978 [1930]:51)

The semiotic value of the knot is conventional, not by virtue of the knot being an
element of a sign system, but because it is normatively framed by a social practice
of reminding. It is this frame of practice which underpins the meaning signified
by the knot on any given occasion, constituting the semiotic status of the knot as
an example in miniature of what Searle (op. cit.) calls an institutional fact.
Vygotsky's knot in the handkerchief, and Searle's twenty dollar bill, are thus
both institutional facts; and both are exemplars of the material semiotic mediation
of social practices: the exchange of, respectively, information and goods.
We may note both similarities and differences between the two cases. First, the
similarities. There is no intrinsic property of the material substrate (cotton, paper
and ink) which determines the semiotic or monetary value of the token, which is
conventionally determined. Hence, the token is equivalent, for purposes of use,
to any other type-identical token, which need not be made of the same mate-
rial (a piece of string round the wrist, an electronic credit on a chip card). This
independence of semantic or monetary value from material substrate is, of course,
a fundamental property of signs, a mark, as it were, of the domain of semiosis or
signification.
Now we may take note of the differences between the knot and the twenty
dollar bill. If the monetary token is materially destroyed, the value that it signi-
fies is also destroyed, whereas if the knot is untied, the information it signifies
is not. The twenty dollar bill is "cashed out" or "used up" (for the purposes of the
user) once exchanged, since it passes from the ownership and control of the user.
However, its monetary value is preserved until it is withdrawn from circulation.
Conversely, the knot can be used again by the user to recall another, different
message, while the message signified by the knot no longer has any utility or
communicative value once it has been exchanged. Finally, while it makes sense to say
that the knot "stands for" the message, the twenty dollar bill does not "stand for"
(say) twenty one-dollar bills, but is exchangeable for or equivalent to them. All
these differences can be summed up by saying that while the knot is a sign of the
message, the twenty dollar bill is its monetary exchange value; it is self-identical to
that value. Nonetheless, although the twenty dollar bill is not a sign, its self-identity
to its monetary exchange equivalents is not physical, but social and semiotic.
Searle's account of social or institutional facts (such as money) is that they
depend upon collective agreement and knowledge that, under determinate rules,
something counts as an instance of a social object. Hence, the general form of
such rules is:
X counts as Y in context C (Searle 1995:28).

Note, here, that this definition is wider than, but subsumes:


S (a sign) stands for M (a message) in context C.

For example, we could say that Vygotsky's handkerchief counts as a sign for a
message in its context of use, and the "standing for" relationship obtains between the
handkerchief and the specific message in context C. So, on this interpretation, the
sign relationship can be expanded into:
X counts as S, and S stands for M, in context C.
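
As a notational aside, the embedding of the sign relation in the institutional
relation can be sketched with two three-place predicates. The predicate names
CountsAs and StandsFor are illustrative assumptions introduced here, not notation
taken from Searle (1995) or Vygotsky.

```latex
\begin{align*}
\text{Institutional fact:} \quad & \mathrm{CountsAs}(X, Y, C) \\
\text{Sign relation:} \quad & \mathrm{CountsAs}(X, S, C) \wedge \mathrm{StandsFor}(S, M, C)
\end{align*}
```

On this reading, the sign relation is the special case in which the status Y is
that of being a sign S, with the further requirement that S stands for some
message M in the same context C.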

The distinction between the "counts as" and the "stands for" relationship can now
be used to distinguish between the grammatical acceptability and the semantic
interpretation of a sentence (Itkonen this volume):
"James eats meat" counts as a correct sentence in English, and stands for the
proposition that James eats meat in context C.

Note, now, that it is also in virtue of the combination of its formal arrangement
and its context, that the sentence "James eats meat" counts as an assertion of the
proposition; the sentence does not stand for the assertion, rather the act of
uttering it in a particular context is that assertion, just as Searle's twenty dollar bill is
the twenty dollars, rather than standing for it. Hence, both the grammaticality and
the illocutionary force of an utterance are aspects of what the utterance counts as
(being) in its context, while its semantic interpretation is the interpretation that it
stands for, in that same context. All of this is irreducibly normative, and it is this
duality of normative structure, of "counting as" and "standing for", that underlies
the conditions on representation that are analyzed by Sinha (1988:37): "To represent
something is to cause something else to stand for it, in such a way that both
the relationship of 'standing for', and that which is intended to be represented, can
be recognized." The fact that the "standing for", or sign relation, is embedded in the
"counting as", or institutional relation, also makes it clear why language must be
viewed as primarily a social institution (Itkonen this volume).
This account might suggest, too, that the institutional "counting as" relation is
somehow cognitively simpler than the sign relation. This cannot be the case with-
out qualification, since coined money was only invented in the period of 800-600
BCE, in Greece and China, at a time when we have ample evidence of written
language. Indeed, Sohn-Rethel (1977) argues that it was the invention of coinage
which simultaneously brought into existence both generalized commodity produc-
tion and the very notion, fundamental to logic, of abstract equivalence and purely
formal identity. Sociogenetically, then, institutional semiotic forms have continued
to be historically elaborated along with symbolic forms (Sinha in press).
Ontogenetically, however, we shall argue that the normative understanding of
"counting as" precedes the development of symbolization and language. To make
this argument, we briefly cite Searle once again, who points out that: "in order
that something be a chair, it has to function as a chair; and hence, it has to be
thought of or used as a chair. Chairs are not abstract or symbolic in the way that
money and property are, but the point is the same in both cases." And the point,
of course, is normative. Let us examine more carefully the semiotics of material
artefacts. Now, anything can be used as a chair, provided it has the affordances, in
the sense of Gibson (1979), which permit it to be sat in or on. Such affordances
are part of what Searle calls the "brute" or "natural" facts, as opposed to
"institutional" or "social" facts. Is there, however, any sense in which something can be
said to properly "count as" a chair in Searle's sense of an institutional fact? The
answer, we suggest, is yes: an object counts as a chair if it is an artefact intended
and designed to be used as a chair. The physical properties of the chair are then
no longer merely "brute" facts, but socially constructed and normatively regulated
affordances, which make possible the canonical function of the chair. The
canonical functions of artefacts are therefore social facts, and the material world
of artefactual objects is not one only of "brute" facts in their physical aspect, but
also one of social meaning.

Note: This account also implies that Searle is wrong to characterize money as symbolic,
inasmuch as symbolization involves denotation or representation (Sinha 2004). We can certainly
agree, however, that the self-identical relation of "counting as" is inherently semiotic as well as
social.

We conclude that, in analogous fashion to the way that the twenty dollar bill
signifies (without standing for) its normative identity as a representation of ex-
change value, the artefactual object (such as a cup, a chair, or a computer) signifies
(also without standing for) its normative canonical function or use value. Objects,
then, not only (as with Vygotsky's handkerchief) can be signs for something else,
but, when they are artefacts, as most objects we encounter in our everyday lives
are, are also signifiers of their proper, socially standard, canonical functions in a
context of social practices.
Of course, a condition for the semiotic status of artefacts, as with any semiotic
status, is that human subjects are capable of cognitively grasping it. As Searle says,
for a chair to function as a chair, it has to be used as a chair and thought of as a
chair. When do human infants begin to display such a cognitive grasp, and where
does it come from?

3. Early object use and exchange: Canonicality and normativity

In a series of experiments, Walkerdine and Sinha (1978), Freeman, Lloyd and
Sinha (1980), Lloyd, Sinha and Freeman (1981), Freeman, Sinha and Condliffe
(1981), and Sinha (1982, 1983) investigated infants' and young children's
understanding of object function, using infant search, action imitation and acting-out
language comprehension paradigms. In an age range from 9 months to 3 years and
6 months, they found error patterns which were characterized by canonicality
effects. Infants at the end of the first year of life were more successful in A-not-B
search tasks (otherwise known as object permanence tasks) when the object was
hidden in an upright than in an inverted cup. It seems that these infants
understood that a cup is a better container when in an upright orientation than when
inverted. Slightly older infants were generally unable to imitate the placement of
a small block on the bottom of an inverted cup, preferring to turn the cup back
into an upright orientation and place the block inside the cup. In this response
strategy, the infants showed that they were "locked into" a normative apprehension
of the cup as a canonical container, which over-rode the "brute" affordance of
the flat surface of the bottom of the inverted cup. Even after this response strategy
disappeared in action imitation tasks, it re-appeared in language comprehension
tasks: for example two year olds, when asked to place a block "on" an inverted cup,
turned it to the upright position and placed the block inside it.

Note: Expressed in an older philosophical lexicon, canonicality of object function is a normative
phenomenon existing at the interface between Erste Natur and Zweite Natur.

These experiments can be interpreted as showing that, in the first place, ob-
jects are cognitively apprehended by infants, from an early age, in terms of their
socially-imposed, normative and canonical function (the object counts as a con-
tainer). In the second place, the emerging conceptualization of spatial relations
between objects is also derived as much from the canonical functional relations
which objects contract with each other as from purely perceptual-geometric in-
formation (for a discussion of the functional basis of spatial relational meaning,
see Vandeloise 1991).
Where does this understanding, on the part of the infant, of the canonical
function of objects come from? This question is important, because of the inti-
mate relationship between the physical properties of the artefact, and its socially
baptized canonical function. In contrast with, for example, the monetary token
(in which the relationship between the material from which the token is made,
and its exchange value, has historically become increasingly attenuated, arbitrary
and even, as money assumes the mantle of pure informational form, virtual), the
physical structure of traditional artefacts such as cups is not only non-arbitrary,
but essential to its fulfilment of its canonical function.
Infants' motivation to explore the physical world is well known, and it might
be hypothesized that their apprehension of object properties in terms of func-
tion derives from an untutored, spontaneous sensori-motor engagement with the
object as a purely physical entity (for example, the exploration of the cavity of a
container giving rise to the dominance of this cavity in the early pre-conceptual
representation of the object).
We have several sources of evidence that this is not so. First, while there is
evidence of understanding of containment as a physical relationship at 6 months
(Hespos and Baillargeon 2001), we were unable to detect canonicality effects in
search tasks below the age of 9 months. This may, however, be a consequence
of using a motor-involving rather than a violation-of-expectancies methodology. Second,
when the perceptual-cognitive link between canonical orientation and canoni-
cal containment function of cups was broken, by painting schematic faces ei-
ther upright or upside down on the cups, the canonicality effect in infant search
was abolished (Lloyd et al. 1981). This finding reinforces the conclusion that the
canonicality effect is dependent upon socially cued expectations about the
normative use of the object.
Even more decisive experimental evidence for the role of joint action in es-
tablishing canonical object concepts comes from the experimental design used in
Freeman et al. (1981), where the object was functionally ambiguous, consisting
of a set of stacking / nesting cubes. The child was invited by the experimenter to
play with the entire set of cubes, and the experimenter set up this pre-test game
as either a nesting or a stacking activity. After successfully completing, as joint ac-
tion, an activity of constructing either a nest of cubes, or a tower of stacked cubes,
the experimenter extracted a medium-size cube and a small cube, and conducted
either an action imitation task involving the placement of the smaller cube on top
of/inside/under the larger cube, or an acting-out language comprehension task
with instructions to place the smaller cube "in", "on" or "under" the larger cube.
The results were dramatic. After playing a nesting game, children's error patterns
showed a response bias similar to the canonicality effect manifested in the same
task using cups. In other words, there was a response preference for placing the
small cube inside the larger cube. However, this effect was abolished in the stack-
ing condition, in which there was a tendency to preferentially place the smaller
cube on top of the larger cube.
To conclude this review of experimental evidence, we emphasize that canoni-
cal function and orientation, though they are in some sense intrinsic to the
object as a material entity with determinate structure and affordances for human
action, are not essential object properties in the same way as object substance.
The stacking / nesting cubes experiment showed that the framing of the object
in terms of its normatively appropriate function and orientation can be local-
ly taught and negotiated. There is also inter-cultural variation in the canonical
orientation and function assigned to classes of objects which may be materially
identical between the cultures. For example, in the indigenous agrarian Zapotec
culture of Southern Mexico, baskets are commonly stacked, and are frequently
used as covers for foodstuffs and in children's games of catching chickens. As well
as these differences between Zapotec and Euro-American cultural practices, the
Zapotec language lexicalizes the different spatial relations that are lexically
distinguished by English "in" and "under" using a single body-part term, translatable
as the English word "stomach". Young Zapotec children differed from their
Danish counterparts not only in their response patterns in language comprehension
tasks using baskets, but also in non-linguistic action imitation tasks. The Zapotec
children clearly did not regard the relationship of what we consider to be canoni-
cal containment, and the orientation that we would regard as upright, as being
canonical (Sinha and Jensen de Lopez 2000; Jensen de Lopez 2003; Jensen de
Lopez, Hayashi and Sinha 2005).
The experimental evidence we have reviewed supports the view, then, that
it is the intersubjective structuring of the child's participation in joint action, as
much as (and indeed more than) the affordances of the object "in itself", that
enables the child, in a process of "guided reinvention" (Lock 1980), to appropriate
the norms governing object use and to achieve an object representation in
terms of canonical function. This process has a long developmental history, and
the episodes of joint action are accompanied and mediated at every stage by the
use of communicative signs by the adult participant, as is attested by the observations
reported by Rodríguez and Moro (1999, 2002, this volume; see also Moro
and Rodríguez 2005).
Throughout this developmental process, objects are invested with signifi-
cance. They become, for the child, material representations and signifiers of the
rules, norms, values, rituals, needs and goals of the entire matrix within which
they are embedded. In short, they become part of a meaningful system of signs
(Sinha 1988:204).

4. From signifying object to communicative symbol

Artefacts, we have argued, have an intrinsic meaning given by their canonical
function or use value. What an object means on any given occasion, however, is
dependent upon more than just canonical function. Not only can an artefact be
used non-canonically, as when, for example, a cup is used as a paperweight, but
there are also socially constituted meanings which are relatively (and conceptu-
ally) autonomous from the canonical use value of the object. Primary amongst
these, at least from a developmental point of view, is the meaning of the object as
an object of exchange.
Give-and-take routines develop in our culture early in the second year of life.
Such exchanges involve the super-imposition on the object of a semiotic status
which is independent of its canonical function: that of an abstract signifier, and
material embodiment, of a social relationship of exchange. Social and anthropological
researchers from Marcel Mauss (2000 [1923-24]) onwards have posited
exchange as a fundamental human universal (see Goux 1990). Object-exchange
and the participatory induction of the infant into the normative knowledge of
canonical object function are not interactively distinct in earlier triadic exchanges
(Rodríguez and Moro this volume). However, the emergence of the give-and-take
routine as a normative, reciprocal and mutually controlled format of co-participation
lays the basis, we suggest, for the differentiation of signifier from signified
that is necessary for mastery of the symbolic system of language. The object now
becomes a signifier within a field constituted by differential, reciprocal and shift-
ing subject positions: that of giver and that of recipient.
From a communicative and symbolic point of view, the ability to negotiate
these shifting subject positions constitutes a precursor of deixis. The object
semiotically mediates the constitution of the triadic interaction as implicating "I"
and "You", a decisive differentiating moment in the construction of subjectivity. It
has often been claimed that subjectivity is constructed in and through language.
This may be so, in the sense that language provides the key symbolic support for
the adoption of differential subject positions, but the subject that occupies these
positions as simultaneously an "I" and a "Me for You" is, we suggest, constituted
at the moment of entry into language through participation in the proto-institution
of object exchange. There is also a psychoanalytic dimension of investment
in this process of constitution of subjectivity, since the object signifies both power
(to give or to withhold) and desire (the object represents a wish whose fulfilment
is dependent upon the subjectivity of the Other, rather than being the immediate
goal of a simple demand).
Whether or not participation in give-and-take routines is a strict precondi-
tion for language acquisition, it is undoubtedly, in typical developmental trajec-
tories, a precursor of it. Object exchange is usually co-terminous with the early
stages of the development of language, and precedes the vocabulary explosion of
the second half of the second year of life. We hypothesize, then, that it represents
a fundamental step in the emergence of both subjectivity and the mastery of sym-
bolization. The voluntary control in object exchange of the grasp and relinquish-
ment of objects, governed by norms of communication rather than by immedi-
ate consummatory goals, prefigures the voluntary representational use of signs.
Object exchange formats introduce into the triadic structure of joint attention a
signifying element that is potentially extensible to the representational, "standing
for" function of language. It also puts in place, in schematic and skeletal form, the
perspectivally shifting dynamics of deictic identification of speaker and hearer.
Early object exchange, we submit, like the guided appropriation by pre-linguistic
infants of canonical object functions, is a neglected, fundamentally social, mate-
rially mediated aspect of the development of primary intersubjectivity towards
symbolic intersubjectivity.

5. Beyond the dyad: Imagining communities and culture

In the preceding sections, we have focussed upon interobjectivity as semiotically
mediating the development of participation by the infant and young child in joint
actions based upon intersubjective and socially shared conventional meanings.
We have also focussed upon triadic contexts of interaction, in which the object is
the third term of a semiotic triangle constituted by the interactions between two
individuals (the prototypical dyad of developmental accounts of intersubjectivity,
and the ideal-typical speaker and hearer of linguistic theory). What is missing in
our account so far is the Social Third Person: not the Object, but the community
of practice and meaning that ultimately sanctions the norms governing the
interactions between any two or more participants in their dealings with social
reality. In confronting this construct (Society, Community or however it may be
designated) we encounter a fundamental problem of the social and human sciences.
How do we reconcile the agency of human subjects, their capacity for creating
novelty, with the determining (though not strictly deterministic) structures
and processes which permit the development of the encultured and socialized
subject? In this section, we maintain our focus on the role of objects and
interobjectivity in what Fogel, Valsiner and Lyra (1997) have called the dynamics of
indeterminism in developmental and social processes, with particular reference
to the article by Smolka, de Góes and Pino in the volume of that title.
Smolka et al. (1997:160) pose the following question: "In what way is [the
development of the] sign related to the processes that generate or anchor creativity
and individual resistance, the power of violating canonical rules?" They report
and analyze an episode of socio-dramatic pretend play by a group of three 5-6
year old girls in the "house corner" of a primary school classroom, in which a
cowboy hat played a crucial role as a prop in an enacted dramatic narrative. The
hat, initially introduced into the play with an extended canonical meaning as a
fashion accessory, later became a signifier of a new identity adopted by one of the
girls as a feminine counterpart of a cowboy character who was a part of the
background common knowledge of all the girls comprising the group. Crucial both to
the investigators' interpretation of this process, and to the children's construction
of their play world, was the creative linguistic designation of the character
(signified by the hat as well as by the linguistic sign) as "Bete Carrera", a grammatical
feminization (in the Portuguese language) of the name of the male cowboy
character "Beto Carrero" recruited from common knowledge (see Sinha 2005, for a
fuller analysis).
As Smolka et al. point out, the cowboy hat, qua artefactual object, remained
throughout a hat, never used by the children as anything other than a hat. At the
same time, it became (or, rather, came to signify) more than the canonical
rules of object-usage that it embodied.
Through language, the children created Bete Carrera (Turn 7), the feminine of
Beto Carrero ... Language allows for this specific appropriation, for such a con-
struction and transformation; it allows for a performance that synthesizes old
and new modes and models of acting. Through language, it is possible to become
another, to become homo duplex or, in fact, multiplex. In this consists the
dramatic character of human experience. (Smolka, Góes and Pino 1997:161)

The hat is thus simultaneously situated at two levels of meaning. At the first level,
its canonical function is appropriated enactively by the participants (by putting it
on and taking it off). At this first level, the construal of the hat is intersubjectively
shared, non-contested and constant: the hat remains a hat. At the second level, the
hat is invested with a surplus meaning which goes beyond canonicality. At this
second level the hat comes to signify the subjective positionings and perspectives
of the individual participants within a more comprehensive, discursively constituted
and gendered frame, by means of which, say Smolka et al. (1997:161), "the
signifying aspect of the (inter)subjective actions necessarily implies immersion
in language and meaning production".
The discursive frame is one of narrativity (Gallagher and Hutto this volume),
through which, as Lightfoot (1997:174) puts it, "temporal rhythm becomes history,
and transitory meanings become forms of knowledge which linger long
enough to be toyed with". Through intersubjectively shared and constructed
narrative, the world and the identity of the subject can simultaneously be explored,
renewed and consolidated. As we emphasized earlier, this is a process in which
emotional investment plays as important a role as cognitive structure, the two
aspects being fused in what the cultural theorist Raymond Williams called
"structures of feeling". Here is what Williams (1977:128) says about temporality in
cultural activity and structures of feeling:
If the social is always past, in the sense that it is always formed, we have in-
deed to find new terms for the undeniable experience of the present: not only
the temporal present, the realization of this and this instant, but the specificity
of the present being, the inalienably physical, within which we may discern and
acknowledge institutions, formations, positions, but not always as fixed products,
defining products.

Earlier, we drew upon Durkheim's notion of social facts, emphasizing the
"already there" and "formed" exteriority and objectivity of norms and institutions.
Williams, in contrast, reminds us that it is through intersubjective agency in the
present that social life and its normative institutions are enactively re-fashioned,
permitting through the medium of shared narrative resources the construction
of both the here-and-now and face-to-face shared mind, and the imagined com-
munity of unknown others whose history and identity we share (Anderson 1991).
Indeed, it is through narrative deployment of the symbolic resources of language
that our social reality becomes simultaneously actual and virtual, constrained by
objectively existing circumstances, but pregnant with potentialities through the
investment of the present by the horizon of the future. Williams also reminds
us of the inalienably physical nature of participation and experience. Through-
out this chapter, we have emphasized the neglected but vitally important role not
only of the (inter)corporeal body, but of the (inter)objective materiality of shared
things at hand; not merely in sustaining, but in developmentally constructing
the shared mind.

References

Anderson, B. 1991. Imagined Communities. London: Verso.


Bruner, J.S. 1966. An overview. In Studies in Cognitive Growth, J.S. Bruner, R.R. Olver,
P.M. Greenfield, J.R. Hornsby and H.J. Kenney (eds.), 319-326. New York: Wiley.
Clark, H.H. 1996. Using Language. Cambridge: Cambridge University Press.
Durkheim, E. 1895. Les Règles de la méthode sociologique. Paris: Alcan. 1894a, with slight
modifications, and a preface. Tr. 1982 as The Rules of Sociological Method. In The Rules of
Sociological Method and Selected Texts on Sociology and its Method, S. Lukes (ed.), 29-163.
London and Basingstoke: Macmillan.
Fogel, A., Lyra, M. and Valsiner, J. (eds.). 1997. Dynamics and Indeterminism in Developmental
and Social Processes. Mahwah, NJ: Lawrence Erlbaum Associates.
Freeman, N., Lloyd, S. and Sinha, C. 1980. Infant search tasks reveal early concepts of
containment and canonical usage of objects. Cognition 8: 243-262.
Freeman, N., Sinha, C. and Condliffe, S. 1981. Confrontation and collaboration with young
children in language comprehension tasks. In Communication in Development, W.P. Robinson
(ed.), 63-88. London: Academic Press.
Gallagher, S. and Hutto, D.D. this volume. Understanding others through primary interaction
and narrative practice.
Gibson, J.J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Goodwin, C. and Goodwin, M.H. 2004. Participation. In A Companion to Linguistic
Anthropology, A. Duranti (ed.), 222-244. Oxford: Blackwell.
Goux, J.J. 1990. Symbolic Economies. Ithaca, NY: Cornell University Press.
Hespos, S. and Baillargeon, R. 2001. Knowledge about containment events in very young
infants. Cognition 78: 204-245.
Itkonen, E. this volume. The central role of normativity for language and linguistics.
Jensen de Lopez, K. 2003. Baskets and Body-Parts: A cross-cultural and cross-linguistic
investigation of children's development of spatial cognition and language. PhD dissertation,
University of Aarhus.
Jensen de Lopez, K., Hayashi, M. and Sinha, C. 2005. Early shaping of spatial meanings in
three languages and cultures: Linguistic or cultural relativity? In Selected Papers from the
LACUS Forum XXXI 2003: Interconnections, A. Makkai, W.J. Sullivan and A.R. Lommel
(eds.), 377-386. Houston, Texas: Linguistic Association of Canada and the United States.
Jones, R.A. 1986. Emile Durkheim: An Introduction to Four Major Works. Beverly Hills, CA:
Sage Publications.
Latour, B. 1996. On interobjectivity. Mind, Culture and Activity 3: 228-245.
Lewis, D.K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University
Press.
Lightfoot, C. 1997. Transforming the canonical cowboy: Notes on the determinacy and
indeterminacy of children's play and cultural development. In Dynamics and Indeterminism
in Developmental and Social Processes, A. Fogel, M. Lyra and J. Valsiner (eds.), 165-174.
Mahwah, NJ: Lawrence Erlbaum Associates.
Lloyd, S., Sinha, C. and Freeman, N. 1981. Spatial reference systems and the canonicality effect
in infant search. Journal of Experimental Child Psychology 32: 1-10.
Lock, A. 1980. The Guided Reinvention of Language. London: Academic Press.
Marx, K. 1976. Postface to the Second Edition, 1873. Capital vol. 1. Harmondsworth: Penguin
Books.
Mauss, M. 2000. The Gift: The Form and Reason for Exchange in Archaic Societies (Original
publication 1923-1924; transl. W.D. Halls). New York: W.W. Norton.
Merleau-Ponty, M. 1962. Phenomenology of Perception. London: Routledge and Kegan Paul.
Moro, C. and Rodríguez, C. 2005. L'objet et la construction de son usage chez le bébé. Une
approche sémiotique du développement préverbal. Berne-New York: Peter Lang.
Moscovici, S. 2000. Social Representations. Cambridge: Polity Press.
Murray, L. and Trevarthen, C. 1986. The infant's role in mother-infant communications.
Journal of Child Language 13(1): 15-29.
Netchine-Grynberg, G. and Netchine, S. 2002. Vygotski, Wallon et les mondes communs. In
Avec Vygotski, Y. Clot (ed.), 85-104. Paris: La Dispute.
Rakoczy, H. 2006. Pretend play and the development of collective intentionality. Cognitive
Systems Research 7: 113-127.
Reddy, V., Hay, D., Murray, L. and Trevarthen, C. 1997. Communication in infancy: Mutual
regulation of affect and attention. In Infant Development: Recent Advances, G. Bremner,
A. Slater and G. Butterworth (eds.), 247-273. Hove: Erlbaum Taylor and Francis Ltd.
Rodríguez, C. 2006. Del ritmo al símbolo. Los signos en el nacimiento de la inteligencia.
Barcelona: Horsori.
Rodríguez, C. and Moro, C. 1999. El mágico número tres. Cuando los niños aún no hablan.
Barcelona: Paidós.
Rodríguez, C. and Moro, C. 2002. Objeto, comunicación y símbolo. Una mirada a los primeros
usos simbólicos de los objetos. Estudios de Psicología 233: 323-33.
Rodríguez, C. and Moro, C. this volume. Coming to agreement: Object use by infants and
adults.
Searle, J. 1983. Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge
University Press.
Searle, J. 1995. The Construction of Social Reality. London: Allen Lane.
Shotter, J. 1978. The cultural context of communication studies: Theoretical and
methodological issues. In Action, Gesture and Symbol: The Emergence of Language, A. Lock
(ed.), 43–78. London: Academic Press.
Shotter, J. 1984. Social Accountability and Selfhood. Oxford: Basil Blackwell.
Shotter, J. 1995. In conversation: Joint action, shared intentionality and ethics. Theory and
Psychology 5: 49–73.
Sinha, C. 1982. Representational development and the structure of action. In Social Cognition:
Studies in the Development of Understanding, G. Butterworth and P. Light (eds.), 137–162.
Brighton: Harvester.
Sinha, C. 1983. Background knowledge, presupposition and canonicality. In Concept
Development and the Development of Word Meaning, T. Seiler and W. Wannenmacher (eds.),
269–296. Berlin: Springer-Verlag.
Sinha, C. 1988. Language and Representation: A Socio-Naturalistic Approach to Human
Development. Hemel Hempstead: Harvester-Wheatsheaf.
Sinha, C. 1999. Grounding, mapping and acts of meaning. In Cognitive Linguistics:
Foundations, Scope and Methodology, T. Janssen and G. Redeker (eds.), 223–256. Berlin: Mouton
de Gruyter.
Sinha, C. 2004. The evolution of language: From signals to symbols to system. In Evolution
of Communication Systems: A Comparative Approach, D. Kimbrough Oller and U. Griebel
(eds.), 217–235. Cambridge, MA: MIT Press.
Sinha, C. 2005. Blending out of the background: Play, props and staging in the material world.
Journal of Pragmatics 37: 1537–1554.
Sinha, C. in press. Iconology and imagination in human development. In Religious
Narrative, Cognition and Culture: Image and Word in the Mind of Narrative, A.W. Geertz and
J.S. Jensen (eds.). London: Equinox Publishing.
Sinha, C. and Jensen de López, K. 2000. Language, culture and the embodiment of spatial
cognition. Cognitive Linguistics 11: 17–41.
Smolka, A.-L., Góes, M. de and Pino, A. 1997. (In)Determinacy and the semiotic constitution
of subjectivity. In Dynamics and Indeterminism in Developmental and Social Processes, A.
Fogel, M. Lyra and J. Valsiner (eds.), 153–164. Mahwah, NJ: Lawrence Erlbaum Associates.
Sohn-Rethel, A. 1977. Intellectual and Manual Labor: A Critique of Epistemology. Atlantic
Highlands, NJ: Humanities Press.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard
University Press.
Trevarthen, C. 1979. Communication and cooperation in early infancy: A description of
primary intersubjectivity. In Before Speech: The Beginning of Interpersonal Communication,
M. Bullowa (ed.), 321–347. Cambridge: Cambridge University Press.
Trevarthen, C. 1998. The concept and foundations of infant intersubjectivity. In
Intersubjective Communication and Emotion in Early Ontogeny, S. Bråten (ed.), 15–46. Cambridge:
Cambridge University Press.
Vandeloise, C. 1991. Spatial Prepositions: A Case Study from French. Chicago: Chicago
University Press.
Vygotsky, L.S. 1978. Mind in Society: The Development of Higher Psychological Processes.
Cambridge, MA: Harvard University Press.
Walkerdine, V. and Sinha, C. 1978. The internal triangle: Language, reasoning and the social
context. In The Social Context of Language, I. Markova (ed.), 151–176. London: Wiley.
Wallon, H. 1925/1984. L'Enfant Turbulent. Etude Sur les Retards et les Anomalies du
Développement Moteur et Mental. Paris: PUF.
Williams, R. 1977. Marxism and Literature. Oxford: Oxford University Press.
Zazzo, R. 1975. Psychologie et Marxisme. La Vie et l'Oeuvre d'Henri Wallon. Paris:
Denoël/Gonthier.
Zlatev, J. this volume. The co-evolution of intersubjectivity and bodily mimesis.
Author index

A 201, 207, 215, 220, 229, 230, D


Adamson, L. 78, 97, 144, 192, 257, 263, 346 D'Entremont, B. 125, 156
194, 198 Brooks, R. 21, 24, 117, 118, 126, Damasio, A. 54
Anscombre, J.-C. 311, 312, 127 Dapretto, M. 56, 60, 61
315, 317 Bruner, J. 29, 30, 32, 33, 78, 109, Dautenhahn, K. 64, 260, 261,
Arbib, M. A. 180, 217, 219, 152, 171, 357 272
220, 221, 222, 248, 250, 255, Bühler, K. 119, 139 Dawkins, R. 311
264, 267 Butterworth, G. 155, 178, 187, de Saussure, F. 284, 297, 298
Astington, J. 29, 32, 216, 236 188, 193, 195, 196, 198, 202, de Villiers, J. 216, 217, 236
Austin, J. L. 121, 151, 172, 324, 206, 207 de Waal, F. B. M. 168, 195, 196,
328 Byrne, R. W. 168, 174, 176, 200, 207, 217, 224226, 228
229, 231, 243 Deacon, T. 219, 220, 232, 239
B Decety, J. 32, 52, 53, 58, 59, 118
Bakeman, R. 79, 97, 144, 168, C Dennett, D. 327
176, 194 Call, J. 123, 146, 167, 176, 177, Donald, M. 217, 218, 220, 231,
Baldwin, D. A. 20, 21, 22, 216, 180, 187, 195197, 200, 201, 238, 245, 248, 249, 252256,
267 205, 207, 217, 227231, 234, 261
Bard, K. 5, 6, 9, 96, 120, 121, 236, 245, 246, 252, 269 Dretske, F. 302
122, 145, 172, 176, 187, 190, 196, Camaioni, L. 125, 152, 175, 193, Du Bois, J. W. 333, 334
198, 204, 205, 206, 224, 225 220 Ducrot, O. 311313, 315, 317
Baron-Cohen, S. 23, 91, 142, Carpendale, J. 26, 91, 142, 146, Dunbar, R. 180, 258261, 264,
175, 188, 190, 193, 202, 248, 150, 158, 265 309
257, 261, 266 Carpenter, M. 142, 146, 156, Dunham, P. J. 146, 215
Barresi, J. 6, 7, 19, 40, 42, 43, 47, 158, 178, 202, 217 Dupré, J. 142, 148, 149, 151, 245
48, 60, 61, 76, 83, 96, 117, 126, Carruthers, P. 255, 262 Durkheim, E. 5, 11, 357, 360,
143, 215, 217, 218, 220, 282, 302 Chafe, W. 346, 347 361, 362, 374
Bates, E. 120123, 135, 152, 157,
Cheney, D. L. 166, 308
158, 170, 171, 172, 175, 176, 178,
Chomsky, N. 279, 293, 294, E
179, 190, 193, 198, 199, 201,
296, 300 Emmorey, K. 344
203, 205, 220, 252256
Churchland, P. 152 Enfield, N. 196
Bateson, G. ix, 5, 188, 189, 190
Bateson, W. 189 Clark, H. 233, 289, 293, 316, 358 Engels, F. 291
Benveniste, É. 324 Collingwood, R. G. 301 Enkvist, N. E. 343, 344
Bermúdez, J. L. 21, 249 Condillac, E. B. D. 180
Bickerton, D. 219, 221, 260, 262 Corballis, M. C. 96, 180, 255, F
Bloom, P. 216, 239 260, 268 Farroni, T. 125, 133
Bretherton, I. 146, 172 Costall, A. 91, 92, 94, 96, 109, Feldman, C. F. 29, 33
Brinck, I. 5, 6, 8, 115, 120122, 111 Fillmore, C. 320
128, 135, 136, 146, 152, 155, 158, Csibra, G. 125, 127 Fodor, J. 1, 300
170172, 189, 190, 191, 194, Currie, G. 34, 265 Fogel, A. 142, 153, 373
Fouts, R. S. 195, 197, 200, 205, Hopkins, W. D. 5, 6, 9, 96, Kita, S. 146, 234
232 120122, 145, 175, 187, 190197, Klin, A. 133, 138
Franco, F. 178, 193, 194, 195, 199, 201, 205 Knoblich, G. 40, 42, 64, 250
198, 207 Hubley, P. 23, 42, 77, 127, 141, Krause, M. 195, 197, 200, 205,
Frith, U. 55, 61 146, 193 207, 210
Hudson, R. 297
G Humphrey, J. H. 336339, 345 L
Gallagher, S. 37, 10, 1721, Humphrey, N. 42 Lakoff, G. 152, 268, 296
42, 48, 143, 191, 217, 221223, Hurley, S. 19, 249 Lamarque, P. 30, 37, 262
225, 246, 250, 251, 264, 265, Hutto, D. D. 3, 57, 10, 12, 17, Lawrence, S. 333338, 345, 351
345, 374 18, 25, 28, 33, 42, 48, 143, 180, Leakey, M. D. 204
Gallese, V. 19, 42, 45, 117, 217, 191, 217, 218, 220, 222, 226, Leavens, D. 5, 6, 9, 96, 120122,
222, 224, 250252 236, 239, 245, 246, 251, 255, 145147, 152, 155, 175, 176, 187,
Gärdenfors, P. 128, 217, 218, 239, 245, 246, 251, 255, Leavens, D. 5, 6, 9, 96, 120–122,
229, 247, 249, 257 374 Lee, T. 22, 37, 69, 70, 72, 87
Gardner, R. A. 176, 197, 201 Leech, G. 284
Garfinkel, H. 153 I Leeson, L. 346, 349
Gibson, E. 94 Iacoboni, M. 50, 53, 56, 60, 264 Legerstee, M. 21, 37, 97, 207
Gibson, J. J. 3, 24, 25, 96, 367 Inhelder, B. 95 Leslie, A. M. 17, 37, 41, 58, 64,
Gile, D. 335, 340, 341, 345, Itakura, S. 200, 201 91, 262, 267
348350 Itkonen, E. 1, 4, 5, 10, 11, 109, Leung, E. H. L 193, 195, 198
Givón, T. 333, 347, 351 207, 216, 218, 229, 233, 239, Levinson, S. C 312
Goldman, A. 17, 19, 42, 118, 279, 280, 283, 284, 287, 289, Lewin, R. 221, 232
251, 252 294296, 299, 300302, 307, Lewis, C. 26, 29, 37, 91, 158, 265
Gomez, J. C. 176 312, 342, 358, 361, 366, 367 Lewis, D. K. 233, 288, 289, 296,
Goodall, J. 168 307, 358
Goodwin, C. 360, 365 J Liebal, K. 169, 172174, 176,
Gopnik, A. 17, 22, 41, 48, 223, Jackendoff, R. 1, 296, 302 180, 200, 201, 207
248, 249, 250 Janzen, T. 11, 333, 335, 341, 345, Liebermann, P. 166
Gordon, R. 17, 42, 264 346, 350 Lightfoot, D. 374
Gould, J. 95 Jeannerod, M. 19, 118 Linell, P. 334, 341, 343
Grice, P. 220, 259, 338 Jenkins, J. M. 216, 236 Liszkowski, U. 156, 158, 179,
Guajardo, N. R. 29, 32, 265 Johnson, C. M. 188, 189, 190 203, 207
Johnson, M. 152, 268 Lloyd, S. 368, 369
H Johnson, M. C. 125 Lock, A. 179, 193, 371
Hacker, G. P. 144, 145, 147, Johnson, S. C. 21, 22 Locke, J. 256
150, 152 Johnson-Laird, P. 301 Lohmann, H. 217, 236
Hare, B. 226, 227 Jones, R. A. 360 Lou, H. C. 59
Harris, P. 42, 253 Jones, S. 249 Lovejoy, C. O. 256
Hobson, R. P. 6, 8, 20, 22, 42,
67, 6973, 77, 80, 83, 85, 126, M
142, 143, 207, 217, 251, 264 K MacKinnon, J. R. 168
Kac, M. 300 Maestripieri, D. 168, 183
Hobson (Meyer), J. A. 6, 8, 42,
Kanner, L. 71, 84 Malle, B. 2
67, 72, 73, 87
Karmiloff-Smith, A. 95 Mandler, J. 92, 93, 113
Hockett, C. F. 232, 293, 294,
Katz, J. 299, 300 Marler, P. 166, 167, 255
307, 329
Kendon, A. 195 Martinet, A. 329
Hofstadter, D. 298300
Keysers, C. 42, 45, 55, 117 Marx, K. 291, 362
Honderich, T. 11 Matsuzawa, T. 37, 47
Kiparsky, P. 299
Mauss, M. 371 P Rowlands, M. 259, 267


McGrew, W. C. 168, 179 Parker, T. 205, 243 Rumbaugh, D. M. 167, 168,
Meltzoff, A. N. 21, 22, 24, 37, Patterson, F. G. 176, 232, 235 176, 232
93, 94, 117, 118, 126, 127, 223, Paul, H. 284, 296, 299 Russell, C. L. 198
248, 249, 250 Perner, J. 83, 235 Russell, J. L. 120, 121, 187, 190
Menyuk, P. 221 Persson, T. 217, 218, 249 Russell, B. 284
Menzel, E. W. 176 Peterson, C. C. 29, 166, 216,
Merleau-Ponty, M. 3, 217, 222, 236 S
357, 359 Petitto, L. 187, 188 Sabater-Pi, J. 176, 195, 203
Messer, D. 100, 195 Pettit, P. 283 Samson, D. 58, 59
Miall, C. R. 54, 58 Phillips, A. 31, 265 Savage-Rumbaugh, S. 167, 168,
Miles, H. L. 176, 195, 197, 203, Phillips, W. 23 175, 176, 193, 197, 202, 203, 221,
230, 232, 233 Piaget, J. 12, 90, 9397, 170, 171, 232, 233
Mindess, A. 344, 345 193, 219, 220 Scheler, M. 3, 4
Mitani, J. C. 167, 177, 195, 206 Pika, S. 6, 9, 96, 165, 169, 172, Schutz, A. 3
Mitchell, R. W. 190, 191, 194 173, 176179, 190, 195, 200, Searle, J. 11, 324, 357, 358,
Mitchell, P. 235 201, 204, 206 361363, 365368
Moll, H. 48, 146, 217 Plooij, F. X. 168, 169, 172, 176 Seibert, J. M. 78, 84
Moore, C. 6, 7, 19, 39, 40, 42, Popper, K. 289, 291 Senghas, A. 234
43, 47, 48, 60, 61, 76, 83, 92, Povinelli, D. J. 175, 187, 190, Seyfarth, R. M. 166, 308
96, 117, 126, 143, 146, 156, 193, 200202, 245, 246 Shaffer, B. 11, 333, 335, 343, 346,
203, 207, 215, 217, 218, 220, Premack, D. 176 347, 351
282, 302 Preston, S. 217, 224, 225 Shotter, J. 109, 359, 363
Moore, D. G. 22 Prinz, W. 40, 139 Sigman, M. 71, 78
Moore, M. K. 93, 94, 248, 249 Pyers, J. 216, 217, 236 Singer, T. 54, 55
Morissette, P. 157 Sinha, C. 1, 46, 11, 98, 109,
Moro, C. 4, 5, 8, 89, 93, 94, 99, Q 137, 180, 190, 197, 207, 217, 239,
109, 371 Quigley, S. P. 340, 341 248, 253, 255, 267, 268, 288,
Morton, E. S. 166, 167, 308, 310, Quill, K. 221 316, 329, 357, 359, 367, 368,
311, 328, 329 371, 373
Moses, L. J. 127, 130, 265 R Smith, R. T. 341, 345
Racine x, 1, 5, 6, 8, 12, 68, 110, Smolka, A.-L. 373, 374
Mundy, P. 7880, 84, 125
137, 141, 142, 144, 146, 150, 155, Sommerville, J. A. 46, 59, 265
Murray, L. x, 359, 363
157, 190, 191, 201, 207, 334 Sonesson, G. 220, 221, 239
Myowa-Yamakoshi, M. 21,
Rakoczy, H. 79, 80, 128, 363 Spelke, E. 92, 114
223, 248 Reddy, V. 117, 118, 126, 137, 192, Sperber, D. 312
193, 207, 223, 224, 359
N Sterelny, K. 245, 251
Rheingold, H. L. 78, 193, 195,
Nelson, K. 26, 29, 31, 93, 98, Stern, D. 7, 8, 22, 116, 131, 132,
198, 211
218, 221, 261 219, 220, 223
Ricard, M. 157, 199
Nicolopoulou, A. 26, 31, Stokoe, W. 232
Richner, E. S. 26, 31, 38, 261
261 Rizzolatti, G. 42, 45, 50, 117, Striano, T. 79, 95, 97, 125, 126,
Nishida, T. 168, 179 217, 222, 249, 250 156, 158
Nuyts, J. 324, 328 Rochat, P. 97, 126 Sugarman, S. 78, 190, 193, 205
Rodríguez, C. 4–6, 8, 11, 89, Susswein, N. 5, 6, 8, 68, 141,
O 9294, 99, 109, 110, 357, 359, 190, 191, 201, 334
O'Neil, D. K. 194, 253, 290 371
Owings, D. H. 166, 167, 308, Roth, D. 91 T
310, 311, 328, 329 Roth, R. R. 176 Tager-Flusberg, H. 217, 236
Tanner, D. 168, 174, 176, 200, Verhagen, A. 6, 11, 285, 307, Winch, P. 281, 286, 289
231 310, 311, 315317, 320, 322, 325, Wittgenstein, L. 35, 8, 10, 23,
Tomasello, M. 47, 48, 79, 97, 326, 334, 342 38, 81, 109, 110, 144, 158, 230,
109, 123, 126128, 142, 146, Verschueren, J. 284, 295 279283, 286, 287, 290, 294,
156, 158, 165, 167, 169180, Volterra, V. 152, 179, 190, 193, 297, 300, 307
187, 188, 190, 193, 195198, 220, 240 Woodruff, G. 176
200202, 204, 205, 207, 216, von Glaserfeld, E. 171, 186 Woodward, A. L. 46, 47, 127,
217, 226231, 234, 236, 245, Vonk, J. 245, 246 137, 265
246, 263, 269, 308311, 323, Vygotsky, L. S. 35, 8, 8991, Wynn, T. 247, 253
324, 363 96, 231, 334, 365, 366, 368
Trevarthen viixi, 7, 17, 20, 23, Y
42, 68, 77, 116, 117, 126, 127, W Youngs, J. P. 340, 341
141–143, 146, 190, 192, 193, Wadensjö, C. 343, 347, 349, 352
207, 219, 220, 223, 359, 363 Wallon, H. 90, 96, 359 Z
Tutin, C. E. G. 168, 179 Watson, S. 29, 32 Zahavi, D. 3
Wellman, H. 31, 38, 265 Zazzo, R. 359
V Whiten, A. 187, 195, 197, 229, Zlatev xi, 1, 5, 6, 9, 10, 20, 21,
Valsiner, J. 1, 373 249 25, 34, 38, 42, 43, 47, 56, 96,
van Lawick-Goodall, J. 168, Wicker, B. 55, 64, 117 111, 137, 166, 178, 215218, 221,
195, 204 Wilcox, S. 343, 346, 347 222, 225, 226, 232, 249, 255,
Vandeloise, C. 369 Williams, E. 92 259, 264, 267, 268, 283, 288,
Varela, F. 1 Williams, R. 374, 345 289, 307, 309, 329, 359
Vea, J. J. 176, 186 Wilson, D. 312 Zuberbühler, K. 166, 167, 177
Subject index

A artefacts 4, 91, 268, 288, 358, body


affective 359, 364, 367369, 371 role for intersubjectivity 1,
contact 73, 84 attribution 3, 4, 10, 11, 21, 23, 51, 52, 54,
engagement 71, 72 of mental states/propositional 57, 82, 215, 219, 224, 234,
intentional relations 34, attitudes 5, 26, 27, 30, 59, 237, 238, 280, 357, 364, 375
54 (see also intentional 127, 153, 175, 190, 193, 251, body schema vs. body
relations) 265, 267 image 222, 223, 225, 250
response 203, 207 body-centered 52 (see also
autism 2, 68, 43, 44, 6062,
sharing 131, 134 first-person)
6776, 8186, 91, 92, 126, 142,
states 22, 56, 215, 223 brain 32, 49, 54, 56, 57, 60, 96,
148, 202, 221, 224, 251, 266
affordances 24, 136, 367371 117, 118, 189, 247, 250, 256,
autoscopic hallucinations 53
agent 21, 2325, 30, 4048, 52, 258, 264
53, 94, 98, 148, 151, 152, 154, damage 53, 58
175, 179, 295 B bridging modalities 45, 54
agency 54, 190, 192, 373, 375 background knowledge 26, 27
agreement beliefs C
adult-baby 8, 89, 90, 98, and desires 18, 20, 2124, canonical
100, 108111 91, 110, 265 forms/activities 261, 268
collective 267, 366 attribute/understand function 368, 369, 370,
in grammar 293 beliefs 10, 19, 27, 41, 132, 371, 374
in judgment 81 191, 215217, 222, 233, 234, uses 93, 107
alarm calls 166, 167, 308, 310, 236239, 245, 267, 295, caregiving 203, 204
327 307, 348, 358, 361 (see also Cartesian 280, 302
alignment 78, 132 causal
attribution)
allocentric representation 51, role/factor/force 5, 96, 159,
concept of belief 25, 26,
52, 60, 132 (see also third- 218, 222, 235, 236, 295
person information) 83, 234 vs. definitional issues 8, 9,
American Sign Language false belief (tasks) 25, 26, 141, 142, 147, 149, 159, 292
(ASL) 11, 232, 235, 333, 29, 39, 40, 49, 53, 58, 63, 91, precondition 141, 147, 148
335346, 348351, 353355 211, 217, 231, 235, 236 chimpanzees (Pan troglodytes)
amygdala 32, 56 bipedalism 204, 205, 206, social interactions 47, 77
analogy 221, 280, 301, 340 207, 255 gestures 168, 170, 172174,
animal communication 11, blindness 83 176179, 187, 194201,
166171, 231, 234, 307311, 329 bodily 204, 205
argumentative 11, 285, 307, intelligence 96
intentionality 21, 22 (see
311316, 322, 323, 325, vocal signals 166, 167
also intentionality)
327329, 334 neonatal mirroring 223
orientation vs. strength mimesis (see mimesis) mutual gaze 224
317321, 326 based 132, 217 consolation 226
understanding intentions cues/signals 42, 175, 179, use of objects 93, 99, 100,
226, 227, 230 196, 200, 224, 309 103, 105, 106, 108, 110, 111
deception 228 intention 10, 161, 178, 179, -normative 219, 233, 283
language 232 216, 220, 230, 231, 234, 237, (see also norm)
false belief 236 238, 239, 259, 267 (see also conventionalization 165, 178,
cingulate cortex (CC) 54, 55 intentions) 289, 296, 361, 362, 364
co-evolution 215, 218, 222 intent indicators 122, 124, conversation 27, 29, 153, 242,
cognitive empathy 217, 225, 125, 127, 131 258, 260, 262, 323, 334,
226, 237 (see also empathy) signs 166, 169, 172, 175, 179, 340, 347
cognitivism 359 188, 195, 196, 200, 212, 215, analysis 153, 154
collective intentionality 362, 224, 230, 231, 309, 310, 371, in autism 72, 75
363 (see also intentionality) 374 (see also signs) in children 25
collectivism 10, 290, 291 system 188 proto- viii, ix, 117, 132, 143
common code 40, 46 triangle 263 conversational implicature 312
common knowledge 5, 10, view of language 259 cooperation 11, 229, 342
216, 220, 233, 239, 279, 280, complementation 323, 324, correctness 10, 220, 287, 290,
286291, 298, 357359, 361, 326, 329 291, 295, 296
362, 373 compositionality 219 cross-modal mapping 117, 219,
commonsense psychology 41 conceptual 220, 223
communication 5, 171, 174, 176, categorization 92, 94, 110 culture vi, viii, 1, 93, 191, 218,
179, 219, 269, 308, 310, 327, development 49 231, 253, 309, 315, 316, 344,
328, 372 expanations (vs. causal/ 370, 371, 373
animal/primate (see animal empirical) 4, 142, 147, mimetic (see mimetic
communication) 149 culture)
infant 8, 90, 95, 97, 111, understanding 8, 49, 55, traditional 225
126, 170 58, 81 customs 261
intentional 8, 9, 115, 116, consciousness 54, 217, 219, 220,
120131, 134, 135, 137, 171, 223, 225, 237, 280, 282, 283, D
172, 174, 175, 188, 190, 192, 286, 334, 347 declarative
193, 198, 208, 218, 220, 228, and intuition 297 behaviour 204
229, 230, 269, 353 and language 10 knowledge 251, 252
linguistic/verbal 11, 180, human vi pointing 9, 122, 145, 147, 152,
307, 311, 312, 329 (see also landscape of (see landscape of 155158, 175, 176, 202, 203,
language) consciousness) 205, 218220, 263
non-verbal/gestural 68, 71, pre- 222, 223
sentence 283, 287
75, 98, 172, 175, 218, 231, self- 85, 223
signals 309
264, 266, 269 (see also visual vii
decoupling 58
gestures) contagion 56, 57, 132, 219, 220
decontextualisation 8, 135137
communicative 44, 47, 69, 71, contextualization
definitional
74, 79, 84, 89, 91, 115, 120, and ASL 11, 334336, 342,
121, 123, 126, 130, 134136, 343, 344, 348, 350352 criterion 221
171, 190, 192, 197, 198, 233, convention/conventional 96, issues 8, 142, 159
269, 317, 323, 325, 341, 351, 98, 111, 166, 216, 220, 221, question 215
361, 364, 366, 372 236, 238, 255, 261, 267269, deictic
actions/acts 52, 111, 187, 188, 284, 285, 307310, 312, 318, elements 308
189, 193, 203 359, 364, 373 identification 372
competence ix, 342 signs 196, 232, 365 (see also gestures 165, 178, 179, 193,
contexts 90, 98100, 109, sign) 218 (see also gestures)
110 symbols 97, 165 desire 120, 158, 372
and beliefs 18, 2024, 27, intentional relation 43 (see expansions


91, 110, 191, 220, 265, 315, also intentional relations) in ASL 11, 335339, 341, 345,
372 interaction/engagement 3, 346, 348, 349, 351
-based psychology 257 257 expectations 28, 30, 31, 120,
developmental mind/knowledge vii, 1, 222, 198, 234, 238, 335, 370
change x, 43, 79, 153, 172, 288, 359, 364 exteroception 219
192 person/agent 48, 68, 76 eye contact 69, 70, 78, 84, 85,
level/stage 9, 172, 239 practice/action 20, 23, 34, 97, 125, 129, 130, 131, 133, 134,
model 219, 237 52, 267 157, 171 (see also mutual gaze)
process/trajectory 82, 86, schemas 268
116, 131, 133, 371373 state 49, 55 F
psychology/theory/science stories 261 face-to-face 22, 23, 77, 122, 127,
ix, 12, 18, 20, 30, 76, 82, emotion viii, 4, 8, 20, 22, 25, 202, 238, 334, 375
84, 95, 99, 116, 119, 141, 32, 56, 57, 60, 62, 67, 79, 80, facial expressions 7, 32, 34, 117,
148150, 155, 259, 263 85, 117120, 126128, 126, 130, 198, 224, 250
diachronic 297 130132, 134, 192, 203, 207, farewells 6769
linguistics 279, 280, 295 (see 217, 223, 225, 226, 228, 233, first-person
also historical linguistics) 363 experience/knowledge 45,
differentiation 8, 100, 219, 237 contact 70 48, 58, 61
expression-content 220, expressions 22, 44, 56, 60, information 7, 41, 4346,
223, 372 224, 254 48, 5055, 60, 220
means-ends 170 intentional relation 41, 46, model/simulation 19, 20
self-other 7, 8, 223, 225, 207 49, 58 perspective 57, 61, 116, 132
discourse 11, 217, 236, 262, 285, states 142145, 166, 226, 359 (see also egocentric)
311317, 319, 321, 322, 329, empathy 3, 10, 5457, 62, 217, representation 40, 52
333338, 340352 223226, 228, 237, 301, 302, theory of mind 60, 264
disgust 55 359 (see also sympathy) (see also theory of
distributed 187, 188, 189, 190, cognitive 217, 225, 226, 237
mind)
191, 290 enactive 251, 374, 375
verbs 324, 325
dramatic 31, 373, 374 re-enactive 10, 261
fMRI 32, 54, 56, 58
re-enactments 33, 261 social perception 21
folk psychology 7, 10, 20, 25,
dualistic 189, 191 encephalisation 247, 256
29, 30, 150, 215, 236
dyadic enculturation
exchange 71 in apes 10, 155, 228, 230, 231, competence/abilities 28, 29,
interaction 44, 47, 62, 148, 236, 238, 239 35, 245, 249, 266, 270
178, 180 Environment of Evolutionary explanation 28, 29
mimesis 47, 219, 220, Adaptiveness 245 narratives 26, 2831, 33,
225229 (see also mimesis) evolution 10, 167, 189, 194, 195, 42
218, 224, 229, 232, 237, 238, reasoning 239
E 246, 251, 258, 260, 327 functional permanence 94
ecological psychology 3 of intersubjectivity 9, 221,
egocentric 135 238, 327 G
representation 5153, 60, of language 9, 166, 264, 327 gaze 7, 24, 47, 58, 70, 73, 7780,
61 (see also body- of symbolic gestures 179 85, 116, 121, 125, 126, 129,
based, first-person of triadic interactions 100 131, 134, 155157, 176, 192,
representation) of triadic mimesis 218 (see 200, 201, 207, 216,
embodied also mimesis) 225
capabilities 25 zone of proximal evolution alternation 119, 122, 125, 130,
comportment 22 232, 238 132, 188, 193, 198, 199
following 46, 119, 125, 127, hominid individual


129, 130, 132, 134, 136, evolution 10, 204, 229, 231, intentionality 362
145149 245250, 252, 254258, sharing 119, 120
mutual (see mutual gaze) 260, 262, 267, 268, 270 individualism 10, 290, 291
reading 125, 133, 134 Homo ergaster/erectus 10, 254, infants 2123, 40, 42, 46, 47,
referential 131, 133 255, 256, 257, 258, 262 52, 78, 79, 9294, 100, 101,
generative grammar/linguistics Homo habilis 247 116, 120, 122, 124131, 133,
299, 300 Homo sapiens 237, 246, 256 135137, 141149, 153, 157159,
gestures 34, 116, 119, 196, 218, 166, 168172, 178, 179, 195,
219, 249, 250, 255, 264, I 198200, 204, 205, 223225,
268, 336, 364 iconic 246, 249, 250, 263, 363, 368,
and signed language 234 gestures 176, 179, 218, 219, 369, 372
in apes 9, 165, 166, 168170, 235 (see also gestures) inferior parietal (IP) 5153,
172180, 195, 196, 198, 199, sign 220, 235 (see also sign) 57, 58
203, 231, 269 signed language 235 (see innate
in autism 70, 75, 78, 84 also signed language) intersubjectivity viii, 142
in children ix, 8, 22, 23, identification (see also intersubjectivity)
99, 102, 105109, 111, 121, of individuals/particulars theory of mind 10, 41, 257
122, 125, 129, 132, 135, 149, 258, 289, 308, 309 (see also theory of mind)
155158, 175, 178180, 193, with others 6, 8, 32, 55, 68, mappings 22, 45
194, 203 7176, 81, 82, 86, 118, 126, contagion 44 (see also
goal emulation 248, 249 217, 223, 226, 237, 264, 265, contagion)
goal-directed 359, 372 body schema 223 (see also
agents 47 with personas 34 body schema)
behaviour/action 24, 40, 45, images intentionality 363 (see also
46, 50, 127, 134, 135, 153 mental 248, 281, 285, 286, intentionality)
gaze 134 342 institutional facts 362, 365, 366
intentionality 116, 119, 135 motor x instrumental
movement 21, 46 visual 57, 93 action 80, 118, 119, 122, 127,
gradual 47, 97, 126, 127, 136, 223, imagination 10, 48, 5557, 218, 134, 203, 205, 206, 269
246, 268, 297, 324 252, 262, 265 intentionality 135 (see also
grammar 309, 311, 317, 323, 327, imitation 3, 10, 20, 44, 50, 52, intentionality)
335346, 348, 349, 350, 351 53, 56, 60, 82, 147, 179, 218, insula 5457, 60
development 235 219, 224226, 247252, intentions vi, x, 8, 10, 18, 20,
evolution 235, 262 263, 269, 368370
2225, 27, 29, 34, 102,
linguistic 291, 296, 299, in autism 68, 73, 74, 75, 80
104108, 110, 111, 119, 128,
301, 307 neonate 21, 117, 118 (see also
132, 134137, 146, 151155,
Panini 293, 299 neonatal mirroring)
grammaticalization 295, 296, 159, 170, 171, 179, 198, 217,
imperative
301 219, 225229, 231, 237, 259,
gaze 129
greetings 6769 gestures 177, 179, 180, 203, 263, 265, 266, 360, 361
269 (see also gestures) communicative (see
pointing 122, 158, 171, communicative intention)
H
hand-eye coordination 247 175177 (see also pointing) discourse 334, 335, 352
head nods/shakes 7072, 75 sentence 283, 284 joint (see joint intention)
heterochrony 204 index finger 79, 121, 156, 157, in-acting 265
historical linguistics 342 (see 158, 178, 196, 197 intentional 21, 23, 24, 27,
also diachronic linguistics) indexical 128, 219, 220 30, 38, 42, 49, 146, 169,
sign 106, 107, 220 170, 178, 180, 191, 258,
264, 267, 351, 352, 358, dyadic 44, 47, 62, 148, 178, primary (see primary
360363 180 intersubjectivity)
agent 47, 52, 53, 58, 94, 128, triadic 8, 44, 45, 47, 62, 89, secondary(see secondary
179 (see also agent) 90, 91, 97, 98, 100, 104, 110, intersubjectivity)
attitudes 33, 220, 251, 265, 111, 126, 148, 149, 335, 372 intuition 291, 293, 297, 300, 301
266 Interaction Theory 17
communication 8, 9, 115, interaffectivity 115, 131, 133, 134, J
116, 120131, 134, 135, 137, 136, 137, 220, 223 joint action 90, 95, 97101,
160, 171, 172, 174, 175, 188, interattentionality 131, 132, 134 104, 110, 111, 357, 361364, 370,
190, 192, 193, 198, 213, 218, (see also joint attention) 371, 373
220, 228230, 269, 353 (see interintentionality 131133, 135, joint attention 23, 24, 30, 46,
also communication) 137 (see also joint intention) 7880, 82, 84, 90, 97, 111, 120,
islands 47 internal states 54, 62 122, 127, 129, 132, 146, 147, 148,
relations 7, 39, 40, 41, 43 interobjectivity 357, 364, 373 153155, 215, 219, 224, 226,
48, 5054, 57, 5962 (see interpersonal 7, 30, 42, 46, 48, 228230, 237239, 263268,
also Intentional Relations 58, 72, 73, 77, 80, 85, 97, 223 307310, 329, 363, 372 (see
Theory) co-ordination 68, 71, 82 also interattentionality)
schema 43, 52, 53, 60, 61 engagement 70, 74, 76, joint communicative action 89
stance 23 8184, 224 joint intention 362 (see also
Essentially intentional understanding 76, 81 interintentionality)
behaviour 123, 124 interpretation
Intentional Relations Theory between languages 11, 335, K
(IRT) 7, 39, 4143, 48, 49, 337, 338, 339, 340, 349, kinesthetic 22, 50, 53, 219
5053, 55, 5862, 282 (see also 351, 352 know-how (vs. knowledge
intentional relations) functional 251, 360 that) 252
intentionality 9, 23, 36, 44, 47, of others 21, 29, 52, 57, 315, knowledge base 334
58, 116, 127, 128, 135, 140, 360
141, 165, 166, 193, 222, 226, rich (vs. lean) 198, 265 L
362, 363 semantic 367 landscape
bodily (see bodily inter-rater reliability 74 of action 33, 34
intentionality) intersubjective of consciousness 3234
collective (see collective language 2, 46, 912, 25,
engagement 9, 34, 68,
intentionality) 31, 42, 43, 47, 48, 84, 93,
72, 73, 7981, 83, 85, 86,
individual (see individual
141144, 146, 155, 245, 255, 96, 99, 102, 109111, 115,
intentionality)
263265 116, 135, 149, 152, 153, 155,
instrumental (see
experiences 22 166, 167, 174, 176, 179, 191,
instrumental
system 72 196, 197, 200, 203, 215218,
intentionality)
intersubjectivity 112, 40, 221, 222, 228, 230239, 245,
intention-based semantics 10,
259 41, 68, 7274, 76, 79, 255, 256, 259262, 264,
interaction 1, 3, 4, 10, 17, 2023, 8184, 116119, 123135, 266, 268, 279, 280285,
30, 32, 46, 57, 58, 71, 78, 92, 137, 141144, 149, 154, 157, 287297, 299302,
99, 103, 109, 116118, 120, 158, 172, 177, 179, 180, 190, 307312, 315, 317, 322,
122, 129, 131, 133136, 215218, 220225, 228232, 327329, 334338, 340
141143, 152, 153158, 237239, 265, 279, 280, 281, 346, 348352, 357, 358, 361,
188, 190, 192, 193, 202, 207, 301, 307310, 312, 317, 320, 364370, 372374
224, 231, 255, 258, 267, 283, 322, 323, 327329, 333335, signed (see signed language)
284, 309, 310, 333, 334, 338, 350, 357359, 361365, 372, language change 285, 297 (see
346348, 352, 363, 364, 373 373 also diachronic linguistics)
langue 284, 285, 297, 298 (see state 9, 10, 1820, 22, 24, evolutionary 224
also parole) 25, 33, 3942, 48, 49, motivational intentional
layered model 22, 219 5659, 91, 126, 131, 132, relations 53, 54 (see also
left hemisphere 52 (see also 141, 143145, 150, 151, 153, intentional relations)
brain) 165, 175, 190, 191, 203, 228, multimodal 43, 45, 51
linguistics 6, 10, 11, 12, 279, 290, 308, 309, 315, 327, 329, mutual
280, 283, 292, 295, 296, 358, 361 engagement 117119, 133,
299, 300, 302, 310, 324, 354 metacognition 50, 58 134, 207, 225
generative (see generative metarepresentational 59, 60, gaze 219, 224, 225, 240 (see
linguistics) 235, 238, 245, 258, 259, 262, also eye contact)
historical (see historical 263, 265, 266 sharing 119, 120, 124, 127,
linguistics) methodological 73, 74, 83, 84, 132, 133, 307, 310 (see also
love 4143, 49, 359 147, 191, 202, 225, 292, 303, sharing)
317, 360, 369
M individualism 12, 358, 362 N
matching mimesis 10, 215, 226, 231, 232, narrative 7, 18, 20, 25, 2734,
between self and other 19, 233, 234, 237, 252, 255, 256, 49, 191, 236, 261, 262, 338,
4045, 52, 5557, 59, 61, 62, 262, 268 339, 345, 357, 373375
117, 126, 131, 132, 224, 249 bodily 9, 214, 217225, 238 practice 17, 28, 29, 30, 236,
of direction 125 hierarchy 217219, 221, 222, 345
material grounding 288 237, 238 narrative practice
meaning viii, 9, 2224, 33, abilities 250, 252, 255257, hypothesis 7, 12, 17, 28,
34, 41, 42, 62, 80, 81, 89, 267 236, 273
90, 92, 98111, 121, 123, 130, mimetic narrativity 12, 374
142, 151155, 167, 195, 229, culture 255, 256, 261, 262, negation 307, 318323, 326, 329
233, 235, 309, 310, 357, 364, 267 negotiation of meaning 130,
schemas 267 153, 319, 334336, 340, 347
365, 368, 369, 371, 373, 374
skills 218, 253, 255 351, 370
as use 123, 158
Mimetic Ability Hypothesis neonatal mirroring 219, 220
iconic 235 (see also iconic
(MAH) 10, 245, 257, 261, (see also imitation)
signs)
270 neuroscience 6, 7, 12, 17, 19, 22,
intended 175, 177, 179
miming 220, 255 39, 40, 49, 61, 62, 138, 218, 221,
linguistic 1, 216, 229, 233,
mind 222 (see also brain)
260, 281287, 289, 291, 293,
sharing 39, 61 (see also non-linguistic conventions 255,
294, 296, 311315, 317, 318,
sharing) 261 (see also convention)
321, 326, 328, 333, 335338, understanding 39 (see also non-verbal communication (see
341343, 346352 understanding) communication)
shared 94, 98, 99, 101, 105, -reading 18, 22, 118, 272 nonverbal reference 115, 116,
106, 110, 229, 342, 349 mirror neurons 7, 17, 19, 21, 22, 120126, 128130, 135 (see also
speakers 259 39, 40, 50, 52, 60, 117119, referential)
referential (vs. 222, 223, 251, 264, 302 norms 17, 27, 28, 31, 233, 236,
emotional) 224 systems 222, 250, 264 255, 256, 269, 283, 289293,
mechanisms for reading mirror self-recognition 48, 295297, 300, 360, 361, 363,
gaze 134 219, 226 364, 371374
Medial Prefrontal Cortex monkey 40, 45, 58, 166, 226, normativity 1012, 27, 109112,
(MPFC) 5659 271 216, 219, 220, 238, 279, 280,
mental motivation ix, 26, 81, 82, 128, 281, 283, 289, 292294, 296,
agents 49, 190, 236 142, 188, 192, 193, 203, 204, 297, 302, 303, 307, 329, 330,
image 281, 285, 286, 342 227, 229, 230, 255, 269, 338, 354, 359, 361365, 367372,
space 80, 322 351, 369 375
O perspective shifting 263, 265 proprioception 7, 21, 45, 46, 53,


object phenomenology 3, 4, 7, 18–20, 219, 222–224, 250
exchange 46, 372 132, 221, 359 proto-conversation viii, ix, 117,
permanence 92–94, 110, phylogeny 61, 77, 194, 217, 222, 132, 143
113, 368 253, 259 protolanguage 219, 221–238,
-directed 39, 40, 44–47, physical grooming 258, 261 262–264
51, 147 Piagetian 95, 172, 182, 205, 208 proto-mimesis 20, 219,
okkasionelle Bedeutung 284, pointing 8, 9, 47, 97, 99, 220, 223, 225, 237 (see also
285 (see also usuelle 105–109, 111, 121, 122, 125, mimesis)
Bedeutung) 126, 129, 134, 135, 145–149, proto-sign 264
ontology 10, 279, 280, 286, 152, 154–158, 176, 186–188, psychological reality 299, 363
358, 363 190, 193–198, 201–207, 220, psychology 2, 4, 9, 12, 18, 74,
ostensive 132 230, 279, 280 76, 91, 94–96, 98, 99, 116,
actions 102 (proto)declarative (see 119, 121, 122, 170, 188, 189,
denotation 25 declarative pointing) 191, 202, 222, 246, 257, 258,
gestures 102, 107, 108, 109, (proto)imperative (see 279, 284, 295, 298, 299,
111 imperative pointing) 300, 359
signs 99–108 whole-hand points 195 commonsense (see
uses of objects 8, 101, 111 points of view 13, 25, 48, 76, commonsense
other minds 2, 11, 26, 62, 139, 251, 334 psychology)
221, 280, 307 post-mimesis 219–221, 232, folk (see folk psychology)
out-of-body experience 53 234, 237, 238, 283 (see also of language 298, 300
mimesis)
P pragmatic 24, 25, 89, 90, 93, R
pain 3, 54, 55, 57, 339 98, 103, 109–111, 151, 221, rationality 289, 295
parietal cortex 50–54, 58, 60, 296, 312, 318, 328, 335, 338, readiness to interact 129
220 345 reality vii, 83, 89, 90, 226, 286,
parole 284, 285, 297 (see also contexts 7, 17, 23, 31 300, 313, 357, 360, 373
langue) intersubjectivity 23 objective 360
participation 343, 351, 357, 360, pragmatics 97, 98, 266, 279, psychological (see
361, 363–365, 371–373 283–285, 312, 335, 346 psychological reality)
pedagogy 256 premotor cortex 50, 57, 63 social (vs. physical) 8, 89,
perception-based understanding preparatory (attention getting) 91, 94, 97, 98, 99, 103, 109,
22 behaviour 122 111, 363, 375
perceptual pretend play 25, 31, 91, 261, 373 reasons 7, 2531, 33, 150, 265
categorization 92 pre-theoretical 285, 293, 294 reciprocal miming 255
intersubjectivity 215, 244 primary intersubjectivity (see recreative imagination 245,
(see also intersubjectivity) intersubjectivity) 262, 265
re-enactments 248 primary sensory areas 54, 55 referential 9, 90, 91, 115, 116,
personal 19, 27, 31, 40, 42, private-language argument 121, 125, 127, 129, 131,
(PLA) 4, 279, 280–282
46–48, 55, 56, 73, 80, 82, 133–136, 153, 155, 165–167,
propositional
83, 222, 258, 296, 315 169, 175, 177, 180, 187, 206,
attitudes 220, 237, 251, 258,
engagement 69 207, 224, 232, 233, 266, 267,
260, 265
relatedness 67 269, 308, 309, 327, 329, 347,
reasoning 228
persons 4, 8, 18, 21, 22, 29–31, 358
form 233
33, 42, 67, 68, 76, 81, 82, 96, instructions 249 behaviour 122, 124
128, 132, 141, 142, 145, 146, 149, knowledge 238 reading 129, 130, 134
150, 152, 181, 221, 274, 280, representation 233 explicit reference 9, 76, 113,
290, 309, 316, 340 thought 234, 253, 259 139, 160, 187, 194, 195

reflective 51, 57, 81, 82, 215 and other xi, 2, 5, 8, 39–55, individual (vs. mutual) 119,
intersubjectivity 8 57, 59–62, 82, 118, 132, 223 120
understanding (vs. non- conscious xi, 35, 85, 223 looks 68, 73–75, 79
reflective/pre-reflective) feeling self 54, 56, 57 mental states 9, 42, 141, 143,
9, 40, 58, 81, 82 self/other contrast/ 144, 250
thinking 150 differentiation 225–237, mutual 119, 120, 124, 127,
representations 40, 43–48, 250, 265 132, 133, 307, 310
51–54, 56–62, 82, 83, 92, self/other-orientation sign 93, 101, 104, 106, 108, 110,
96, 99, 117, 220, 229, 233, 73–75 111, 135, 154, 182, 184, 220,
249–251, 326, 336, 358, self-other equivalence 7, 39, 229–232, 234, 235, 238, 308,
367–369 41, 43, 118, 119, 126, 129 341, 365, 366, 367, 373
dialogic cognitive 128, 230 semantic 147, 151, 186, 218, 220, signal 47, 54, 122, 123, 125, 129,
material 371 221, 233, 234, 236–238, 266, 134, 156, 169, 171, 174, 176–178,
mental 49, 359 283, 296, 366, 367 254, 308, 313, 315, 327, 328,
mimetic 218 semantics 322, 324 351
shared (see shared and pragmatics 279, signed language 216, 219, 232,
representations) 283–285, 287, 312 234, 235, 340
social 360 intention-based 10, 259 Swedish Sign Language 232
representational 3, 10, 48, 96, musical xi Nicaraguan Sign
128, 200, 203, 215, 218, 220, semiotic Language 234
226, 232, 357, 358, 359, 360, capacity 219 American Sign Language (see
mediation 4, 89, 364, 365 American Sign Language)
artifacts 91 systems 89, 90, 98–100, 102, social x, 1, 3–7, 10–12, 20,
mental states 39, 49, 127 106, 111 22, 30, 31, 32, 39–44,
meta- (see sensory-motor 20, 25 47, 49, 54, 57–62, 68–70,
metarepresentational) shared 74, 76, 77, 79–85, 89–99,
mind viii, 1, 87 attention 10, 23, 102, 144, 104, 106, 109–111, 119, 120,
non-representational 200, 187, 203, 215, 219, 226, 125–137, 141–143, 145–151,
223, 251 237, 316 (see also joint 153–158, 165–169, 171,
relation 83, 229 attention)
172, 176, 178–180, 188,
resonance systems 17, 22, 264 knowledge 11, 307, 322, 333,
190, 192–194, 198,
response priming 248, 249 335, 340, 346, 348
200–203, 207, 218, 229,
right hemisphere 32, 51, 52 (see meanings 94, 98, 99, 101,
231, 237, 238, 246, 254–258,
also brain) 106 (see also meaning)
260, 263, 269, 279, 280,
role-reversal 308 representations 17, 19, 21
rule 125, 135, 281–283, 291–294, 282–292, 300, 302, 308,
(see also representations)
298, 315, 316, 360 situations 23 309, 333, 343, 357–368,
rule-sentence 292–294 use 90, 98, 100 371–375
world 13, 78, 79, 83, 264 cognition 2, 11, 12, 17, 18, 25,
S sharing ix, xi, 1, 3, 6, 8, 32, 39, 149, 161, 169, 190, 213, 217,
secondary intersubjectivity (see 40, 42, 43, 45, 48, 55, 61, 220, 221
intersubjectivity) 62, 67, 73–84, 86, 87, cohesion 245, 258, 261, 270
second-person 8, 19, 117 115–120, 123–128, 131, 132, facts 11, 96, 287, 291, 357,
self vii, 3, 4, 21, 24, 27, 81, 83, 137, 154, 160, 161, 172, 175, 360–363, 367, 368, 374
90, 97, 99, 107, 109, 111, 179, 215, 234, 245, 250, 256, function of gaze 129, 133
117, 175, 179, 213, 218, 259, 267, 269, 308, 309, learning 179, 229, 247, 248,
222, 224, 228, 231, 237, 363 252
255, 264, 281–283, 293, affective 56, 133 ontology 290, 304
296 attention 78, 175, 308 reality (see reality)

referencing 79–82, 119, T 104, 110, 111, 126, 148, 149,


125, 127, 129–132, 145–148, taxonomic concept 6, 9, 141, 335, 372
198, 263 143, 148, 149, 158 mimesis 10, 47, 166, 215,
somatosensory 50, 54 Temporal/Parietal Junction 218–220, 228–231, 233,
spatial neglect 53 (TPJ) 51, 52, 53, 57, 58, 59 235, 237, 238, 268 (see also
speech acts 233, 268, 324, 328, theory of mind (ToM) 2, 7, 8, mimesis)
329 10, 11, 17–19, 25, 29, 32–34, truth-condition 91, 287
stage 39, 40–43, 49, 52, 57, truth value 287
in development x, 68, 130, 58–60, 82, 83, 90, 91, typological linguistics 279,
137, 148, 149, 152, 172, 180,
92, 109, 146, 175, 203, 215, 280, 295, 301
193, 205, 217–221, 224, 238,
216, 217, 222, 235,
260, 265, 371
238, 239, 245, 248–252, U
still-face 77, 192
254–259, 261–263, 266, understanding
Stimulus enhancement 248,
249 267, 360 attention 129, 145, 147, 148,
subpersonal 19, 24, 32, 34, 47, modules 2, 10, 25, 222, 238, 149, 152, 158 (see also joint
250 239, 245, 246 attention)
Superior Temporal Sulcus theory theory (TT) 7, 17–19,
(STS) 35, 50, 52, 56, 58 20, 41–43, 48, 59, 61, 62, also belief)
symbolic 97, 99, 100, 132, 191, 250–252, 265 communicative intentions
216, 220, 221, 232, 245, 269, simulation theory (ST) 10, 134, 178, 234, 238 (see
308, 357, 367 7, 17, 1820, 36, 42, 43, also communicative
art 262 48, 61, 251, 252, 264, 265, intentions)
cognition 232 272 intention 9, 128, 153, 159,
culture 1 third-order mentality 218, 226, 265, 267 (see also joint
gestures 178–180 228–231, 233, 234, 236, 238, intentions)
language 260, 268, 270, 288, 289 others 3, 7, 17, 18, 20, 24,
329, 372 third-person 19, 26, 39, 57, 128, 25, 40, 43, 47, 49, 91, 216,
play 31, 81, 83, 261 220, 282, 323 217, 222
uses of objects 8, 99, information 4, 7, 41, 43–46, usuelle Bedeutung 285, 296 (see
48–54, 56, 58–62, 220 also okkasionelle Bedeutung)
100–102, 107, 111
reference 206, 219, 221 representation 51, 52, 62
(see also allocentric) V
representations 83
stance/perspective 19, 39, vision 45, 54, 109, 133, 219,
system 221, 235, 236, 372
224, 298
thinking 86, 267 128
volition 220, 223
sympathetic imagination 55 topos 315, 316, 318, 319, 322, 327
sympathy vii, viii, ix, x, 2, 35, touch 55, 158, 172, 178, 225,
W
55, 57 (see also empathy) 260, 279
wave 69, 70, 75 (see also
synchronic linguistics 296 transduction 188, 189 gestures)
synchrony 44 Transcranial Magnetic Stimulation
systematically 73, 219, 285, 308, (TMS) 50, 55 Z
312, 317, 329 triadic Zapotec 370
systematicity 221–234, 238, interaction 8, 44, 45, 47, Zone of Proximal Development
239, 320 62, 89, 90, 91, 97, 98, 100, (ZPD) 231, 238
In the series Converging Evidence in Language and Communication Research the following
titles have been published thus far or are scheduled for publication:

12 Zlatev, Jordan, Timothy P. Racine, Chris Sinha and Esa Itkonen (eds.): The Shared Mind.
Perspectives on intersubjectivity. 2008. xiii,391pp.
11 Lewandowska-Tomaszczyk, Barbara (ed.): Asymmetric Events. 2008. xii,287pp.
10 Steen, Gerard J.: Finding Metaphor in Grammar and Usage. A methodological analysis of theory and
research. 2007. xvi,430pp.
9 Lascaratou, Chryssoula: The Language of Pain. Expression or description? 2007. xii,238pp.
8 Plümacher, Martina and Peter Holz (eds.): Speaking of Colors and Odors. 2007. vi,244pp.
7 Sharifian, Farzad and Gary B. Palmer (eds.): Applied Cultural Linguistics. Implications for second
language learning and intercultural communication. 2007. xiv,170pp.
6 Deignan, Alice: Metaphor and Corpus Linguistics. 2005. x,236pp.
5 Johansson, Sverker: Origins of Language. Constraints on hypotheses. 2005. xii,346pp.
4 Kertész, András: Cognitive Semantics and Scientific Knowledge. Case studies in the cognitive science of
science. 2004. viii,261pp.
3 Louwerse, Max and Willie van Peer (eds.): Thematics. Interdisciplinary Studies. 2002. x,448pp.
2 Albertazzi, Liliana (ed.): Meaning and Cognition. A multidisciplinary approach. 2000. vi,270pp.
1 Horie, Kaoru (ed.): Complementation. Cognitive and functional perspectives. 2000. vi,242pp.
