
Copyright 1992 by the American Psychological Association, Inc.


Psychological Review
1992, Vol. 99, No. 4, 587-604

How to Build a Baby: II. Conceptual Primitives

Jean M. Mandler

Department of Cognitive Science

University of California, San Diego
and Medical Research Council Cognitive Development Unit
London, England
A mechanism of perceptual analysis by which infants derive meaning from perceptual activity is
described. Infants use this mechanism to redescribe perceptual information into image-schematic
format. Image-schemas create conceptual structure from the spatial structure of objects and their
movements, resulting in notions such as animacy, inanimacy, agency, and containment. These
earliest meanings are nonpropositional, analogical representations grounded in the perceptual
world of the infant. In contrast with most perceptual processing, which is not analyzed in this
fashion, redescription into image-schematic format simplifies perceptual information and makes
it potentially accessible for purposes of concept formation and thought. In addition to enabling
preverbal thought, image-schemas provide a foundation for language acquisition by creating an
interface between the continuous processes of perception and the discrete nature of language.

Preparation of this article was supported in part by National Science Foundation Research Grant BNS89-19035. Thanks to Nancy Johnson, Melissa Bowerman, Annette Karmiloff-Smith, and John Morton for helpful and insightful comments on earlier drafts. Some of the ideas in this article appear in germinal form in J. M. Mandler (1991).
Correspondence concerning this article should be addressed to Jean M. Mandler, Department of Cognitive Science, University of California, San Diego, La Jolla, California 92093-0515.

When you keep putting questions to Nature and Nature keeps saying "no," it is not unreasonable to suppose that somewhere among the things you believe there is something that isn't true. (Fodor, 1981, p. 316)

One of the least understood developments in infancy is how children become able to think, that is, to go beyond perceptual categorization to form concepts. As Quinn and Eimas (1986) expressed this mystery:

The categories of infants . . . are sensory in nature, whereas the categories and concepts of adults are often far removed from the world of sensory data. . . . Nor can it be readily envisioned how the attributes of adult concepts can be constructed from the attributes innately available to infants. Nevertheless, a commonly, if often tacitly, held view is that the nontransducible attributes of adult concepts are somehow derived from the sensory primitives available to infants. Adequate (that is, testable) descriptions of the developmental process remain to be offered, however. (p. 356)

My ultimate goal is to develop such a theory and to show how the attributes of adult concepts can be derived from the primitives of infants. I believe this can only be done, however, by abandoning a crucial assumption in Quinn and Eimas's (1986) formulation: that infants only engage in sensory (or perceptual) categorization. In a previous article (J. M. Mandler, 1988), I reviewed some of the evidence for nonsensory conceptual activity in infancy and outlined a mechanism of perceptual analysis by which such conceptual processing might be achieved. The goal of the present article is to specify this mechanism in more detail and to explore the nature of the format in which the resulting notions are represented. I propose that perceptual analysis results in redescriptions of spatial structure in the form of image-schemas. These redescriptions constitute the meanings that infants use to create concepts of objects, such as animate and inanimate things, and relational concepts, such as containment and support. I further propose that image-schemas provide a level of representation intermediate between perception and language that facilitates the process of language acquisition.
Concepts such as animacy or containment are complex and unlike sensory attributes, yet they appear early in infancy. At the same time, it is not necessary that they be innately given in order to account for their early appearance. As will be seen, many of the ideas I propose in this article are not couched in the traditional framework of concept formation and are somewhat speculative in nature. However, as Fodor (1981) pointed out, theories of concept attainment have been limited to a small range of theoretical options. Perhaps it is time to try a new approach.
The view that only sensory attributes are innately given and are used in conjunction with experience to derive the nonsensory concepts of adults has been the dominant view of conceptual development since the time of the British empiricists. In more recent formulations of this view, the term perceptual is apt to be substituted for sensory, but the general approach is the same. For example, during the second half of the first year of life, infants are said to begin to form so-called basic-level categories, such as faces, dogs, or cars, that enable them to recognize new instances (e.g., Cohen & Younger, 1983; Strauss, 1979). The ability to create such equivalence classes is thought to rest on responsivity to the correlated features that make up the objects (e.g., Rosch & Mervis, 1975). It is assumed that for infants these features are perceptual, however, leaving still unsolved the question of how nonperceptual understanding of objects develops.
These early categories are called perceptual because they need not have conceptual content (or in another terminology, these categories are not yet representational). Young infants, like lower organisms, industrial vision machines, and various connectionist programs, can form perceptual prototypes by learning to abstract the central tendencies of perceptual patterns. Forming discriminable categories, however, does not imply that the organism or machine has thereby formed any theory of what the objects being categorized are or can use this information for purposes of thought. Thus, the infant (or machine or program) that can differentiate between male and female faces (Fagan & Singer, 1979) or can discriminate trucks from horses (Oakes, Madole, & Cohen, 1991) could be doing so on a purely perceptual basis and is not thereby displaying conceptual knowledge about people or animals (J. M. Mandler, 1988; Nelson, 1985). To have a concept of animal or vehicle means to have at least some rudimentary notion of what kind of thing an animal or vehicle is.¹ There is ample evidence that even 3-year-old children have such concepts (e.g., Wellman & Gelman, 1988). How does the infant advance to this new kind of understanding?
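The point that a machine or program can form perceptual prototypes without any conceptual knowledge can be illustrated with a minimal sketch. It is not drawn from the studies cited; the feature values, the category labels, and the function names (`learn_prototype`, `categorize`) are hypothetical, and a nearest-prototype rule is only one simple way of abstracting a central tendency.

```python
from statistics import fmean

def learn_prototype(exemplars):
    """Average each feature dimension across exemplars (the central tendency)."""
    return [fmean(dimension) for dimension in zip(*exemplars)]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def categorize(item, prototypes):
    """Return the label of the prototype nearest to the item."""
    return min(prototypes, key=lambda label: squared_distance(item, prototypes[label]))

# Hypothetical two-dimensional "perceptual" features; the numbers are invented.
truck_like = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]
horse_like = [[0.3, 0.8], [0.2, 0.9], [0.4, 0.7]]

prototypes = {
    "category_A": learn_prototype(truck_like),
    "category_B": learn_prototype(horse_like),
}

new_item = [0.85, 0.15]
label = categorize(new_item, prototypes)
print(label)
```

The program sorts new instances reliably, yet nothing in it represents what kind of thing a truck or a horse is: the labels are arbitrary, and the stored averages cannot be used for recall, inference, or planning.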

Piaget's Theory of Concept Formation

The most widely accepted answer to the question of how
concepts are first acquired is based on Piaget's (1952) theory of
sensorimotor development. In his theory, much of the first year
and a half of life is taken up with developing perceptual (sensorimotor) categories of objects, such as those described earlier.
Piaget recognized that these categories need not be conceptual
in nature; indeed, he posited that they are not conceptual, consisting instead of sets of perceptual-motor schemas that enable
infants to recognize a variety of objects (and events) and to act
appropriately in their presence. He saw no evidence that infants, during this period, have any conceptual representations
that would enable thought about objects. In Piaget's analysis,
concepts develop when sensorimotor schemas become "interiorized," "speeded up," and freed from ongoing perception and
action. This transformation, which depends heavily on learning
about objects through physical interactions with them, is said to
occur at the transition from the sensorimotor to the preoperational period at about 1.5 years of age.
Piaget's (1952) theory of a sensorimotor stage in infancy has
been widely accepted and seems to have escaped much of the
critical response that has been directed toward other aspects of
his theory. Nevertheless, his view of infant cognition faces a
number of theoretical and empirical difficulties (e.g., J. M.
Mandler, 1988; Sugarman, 1987). On the theoretical side, it
does not provide a satisfactory description of how sensorimotor
schemas are transformed into concepts. Piaget's theory states
only that over the course of the sensorimotor period, action-schemas gradually become speeded up and freed from their
sensorimotor limitations. The main avenue for this development is said to be imitation. Infants gradually become able to
imitate more and more complex activities that eventually become interiorized in the form of images. How images are actually created by this process is not discussed, but in Piaget's view
images constitute the first nonsensorimotor, conceptual form
of representation. Imagery allows infants to re-present objects
and events to themselves, and so it provides the foundation for
the beginning of thought. However, the claim of a purely sensorimotor form of representation for the first year and a half of
life depends on the assumption that there is no imagery during
this period. If imagery could occur in younger infants, there
would be nothing in principle to prevent them from engaging
in conceptual thought. In fact, there is no evidence that imagery is such a late-developing process. In addition, the theory
leaves unsettled how images come to have more than perceptual content. How does forming an image of an object conceptualize it? Thus, this view of concept formation begs the very
question that was posed.
There are also empirical difficulties with Piaget's (1952)
theory. Current evidence suggests that infants do not require
such an extended period of experience to learn the most basic
characteristics of objects. At least from 3 months of age they
live in a stable perceptual world, consisting of objects that are
seen as coherent, bounded things, separate from the background and behaving in predictable ways (Spelke, 1985). By 4
to 5 months they understand that objects are both solid and
permanent (Baillargeon, in press), and they have begun to differentiate causal from noncausal object motion (Leslie, 1982).
Some of this understanding may be sensorimotor in character
and not require conceptual activity of the type defined earlier.
Therefore, Piaget's theory could probably be adapted to fit this
evidence, although it does create a puzzle as to why conceptual
activity should be delayed so long.
More serious, however, is recent evidence that conceptual
representation is not long delayed, that concepts are developing
concurrently with the development of sensorimotor schemas.
For example, one of the criteria that Piaget (1952) used to demonstrate conceptual activity was recall of absent objects or
events. Recall requires the infant to re-present information not
given by current sensorimotor activity. Piaget thought such recall did not occur until about the middle of the second year.
However, recall has been demonstrated in the laboratory as
early as 8 months of age (Baillargeon, De Vos, & Graber, 1989).
Recall of past events over 24-hr delays has been shown at 9
months (Meltzoff, 1988), and two studies have provided evidence that events taking place before 10 to 11 months of age can
be recalled up to 1.5 years later (McDonough & Mandler, 1990;
Myers, Clifton, & Clarkson, 1987). Such findings indicate that
a long-lasting declarative memory system is in place by about 8
months of age; its onset may be even earlier but the relevant
research has not yet been conducted.
In addition, work on the acquisition of sign language has
provided evidence for the beginning of symbolic functioning as
early as 6 to 7 months of age (e.g., Bonvillian, Orlansky, & Novack, 1983; Meier & Newport, 1990; see J. M. Mandler, 1988,
for a discussion of the symbolic character of these early signs).
There has been some controversy over the linguistic status of
these early signs (e.g., Bates, O'Connell, & Shore, 1987; Petitto,
1987), but the relevant point here is that children of this age are
capable of using a gesture (probably associatively learned) to
refer to or express a meaning. Along with recall, this hallmark
of conceptual thought appears before infants have become
skilled in manipulating objects or have begun to locomote
through the environment. Thus, several lines of evidence indicate the presence of conceptual thought during the time when
Piaget posited it to be absent. Evidently, a great deal of conceptualization about objects and events in the world comes from
observation rather than from physical interaction with objects.²
Nevertheless, it must be determined how observation of objects
forms the basis from which more abstract conceptual knowledge about those objects is derived.

¹ This view of concepts as ideas about kinds emphasizes their theoretical nature (Carey, 1985; Keil, 1989; Medin & Wattenmaker, 1987), as
opposed to a view of concepts as distributions of features or properties.

Concept Formation by Perceptual Analysis

I have proposed that perceptual analysis is the mechanism by
which concepts are first formed (J. M. Mandler, 1988). Perceptual analysis is a process in which a given perceptual array is
attentively analyzed, and a new kind of information is abstracted. The information is new in the sense that a piece of
perceptual information is recoded into a nonperceptual form
that represents a meaning. Sometimes perceptual analysis involves comparing one object with another, leading to conceptualizing them as the same (or different) kind of thing, but
often it merely involves noticing some aspect of a stimulus that
has not been noticed before. The process is different from the
usual perceptual processing, which occurs automatically and is
typically not under the attentive control of the perceiver. Most
of the perceptual information normally encoded is neither
consciously noticed nor accessible at a later time for purposes of
thought. Perceptual analysis, on the other hand, involves the
active recoding of a subset of incoming perceptual information
into meanings that form the basis of accessible concepts.
Perceptual analysis is a simple version of what Karmiloff-Smith (1986, 1991) calls redescription of procedural information. She has concentrated on how elaborate systems of procedural information, such as the pronominal system in language,
become redescribed, eventually achieving a form that is accessible to consciousness. Perceptual analysis is a simpler kind of
redescription, one that is often concurrent with perception itself; that is, redescription does not have to take place off-line on
long established representations, as suggested by the data with
which Karmiloff-Smith has worked, but can also take place
on-line with the registration of perceptual information. In both
cases, however, information is being recoded into a different
format, and in the process some of the original information is
lost. What people consciously attend to and store in a form that
they can potentially think about involves a reduction and redescription of the huge amount of information provided by their
sensory receptors. A great deal of the perceptual information
that people process is stored in procedural (implicit) form, as
illustrated by the fact that they readily learn complex perceptual categories such as faces, but cannot state what the information is that they use; that is, the information is not explicit (J. M.
Mandler, 1988). Forming an explicit knowledge system requires a different format, what one might call a vocabulary of
meanings. The process of redescribing perceptual information
forms such a vocabulary. This vocabulary is both simpler than
the original information and, at least to some extent, is optional
as to whether or when it is formed. It is simpler because it
contains less information than is processed by the perceptual
system and is more coarsely grained. For example, representing
a zebra as striped (either by language or by a sketch) is a coarse
summary description of a specific complex grating. It is optional in the sense that it need not occur when perceptual processing goes on. It is the optionality of what one attends to that
accounts for much of the variability in conceptual development, as opposed to the nonoptional, automatic processing of
perceptual information that takes place most of the time.
Although perceptual analysis is not necessary for much of
one's traffic with the environment, adults seem to do a good
deal of it, or more precisely, engage in a process that is similar.
For example, someone may consciously notice that an acquaintance has started wearing glasses, a fact that may have been
missed for several weeks in spite of looking at this person every
day. A few years ago, during a boring seminar, I discovered that
most people's ears are at the same level as their eyes, which is
something I must have processed perceptually since infancy
but that I had not analyzed and made part of my concept of a
face in the past. The results of such analysis can be expressed in
verbal or in imaginal form and are then available for purposes
of recall, planning, making choices, and so on. It is important
to note, however, that these examples from adult experience
take place in a processing system that already has a large conceptual vocabulary that can be used to add new facts to its
accessible knowledge store. In that sense, they are somewhat
misleading vis-à-vis the more basic process of perceptual analysis that infants must carry out. Infants have to create the meanings from which the conceptual base is formed in the first
place. These meanings are unlikely themselves to be
consciously accessible, but must be redescribed once again into
imagery or words. This issue is discussed further in the section
on image-schemas.
In principle, perceptual analysis could begin quite early in
life even though it might initially be rather primitive in form.
That is, I assume that the capacity to engage in perceptual analysis is innate. However, it also seems likely that such analysis
requires the development of at least some stable perceptual
(sensorimotor) schemas. As mentioned earlier, research (e.g.,
Baillargeon, in press; Kellman, Gleitman, & Spelke, 1987;
Spelke, 1985) suggests that stable schemas of three-dimensional objects, seen as coherent, solid, and separate from the
background, are formed by 3 to 4 months of age. It is about the
same time that the first indications of perceptual analysis appear (J. M. Mandler, 1988). The measures are necessarily indirect, but are implicated in many accounts of behavior after the
first few months of life. For example, Werner and Kaplan
(1963) described the development of a contemplative attitude
between 3 and 5 months; Fox, Kagan, and Weiskopf (1979) and
Janowsky (1985) reported an increase in vicarious trial and
error (VTE) behavior, or active comparison of stimuli, between
4 and 8 months; Ruff (1986) described the "examining schema"
as already well developed by 7 months, the earliest age she
studied. Perceptual analysis can also be inferred from Piaget's
(1951) account of early imitation in which he described his
children as young as 3 to 4 months as displaying intense concentration on the models he provided.³

² It is sometimes claimed that Piaget means action in a more abstract
sense, for example, as eye-movements or even as anything that transforms reality (Sinclair, 1971, p. 134). Even if the latter abstract definition can be meaningfully included under the rubric of action, Piaget is
quite clear that the roots of conceptualization lie in infants' physical
interaction with objects (Piaget, 1951, 1952). In any case, this more
typical usage of the term action has been the predominant understanding of Piaget's theory among developmental psychologists.

A Vocabulary for Preverbal Concepts
In an earlier article (J. M. Mandler, 1988) I emphasized the
process of perceptual analysis, but had little to say about the
format of the resulting representations. If perceptual analysis
results in concept formation, there must be some vocabulary, or
set of elementary meanings, from which the concepts are composed. Whatever meanings infants derive from perceptual analysis, they seem likely to be rather global in character. Even for
adults, conscious conceptualizations about objects and events
tend to be crude and contain vastly less information than the
perceptual knowledge used for recognition. Infants' concepts
are apt to be cruder still and may not even be couched in propositional form as many adult concepts appear to be.
The problem of specifying a conceptual vocabulary is not
one that can be avoided, regardless of the nature of one's theory
of conceptual development. However, with the exception of
Leslie (1988), who discussed the primitives involved in the first
causal concepts, about the only researchers working on the
problem are those in the area of language acquisition who talk
about preverbal conceptualizations or semantic primitives. It
has long been assumed that the notions of objecthood, agency,
actionality, and location expressed in the earliest speech are
drawn from nonlinguistic representations (e.g., Bloom, 1970;
Bowerman, 1973; Brown, 1973; see also Sinclair, 1971). However, the precise nature of these preverbal representations has
not yet been addressed. One of the rare discussions of what
these representations might be like is found in Slobin (1985),
who suggested that young language learners not only have concepts of objects and actions but also make use of sets of more
abstract relational notions. He discussed these notions in terms
of prototypical scenes that highlight various components and
the paths that they take.4
If language acquisition depends in part on preverbal concepts, they must already be present by 10 months when children
begin to acquire their first spoken words and use them for
communication. The work on sign language in 6- to 7-month-old deaf infants suggests that the relevant conceptualizing process begins even earlier. What is needed is to find ways of characterizing the representation of these preverbal concepts. It is
not sufficient to say, as has traditionally been done within the
Piagetian framework, that a concept is formed by transforming
a sensorimotor schema; the format in which the concept is represented must be specified as well. The initial conceptual representation seems likely to be perceptually based but not in the
format that is used by the procedural workings of the perceptual systems.
For example, consider possible sources for a primitive concept of animate thing or animal. There is an excellent source in
perception to deliver some of the information needed for this
basic notion, namely, the perceptual categorization of motion.
Adults easily differentiate animate from inanimate (or mechanical) motion (Stewart, 1984). The data have not yet been
collected to determine whether infants do so as well, but it is
known that infants differentiate caused from noncaused motion (Leslie, 1982, 1988). In addition, infants are highly responsive to motion; not only do they look longer at moving stimuli
than at motionless stimuli, but various perceptual achievements are first attained when moving stimuli are used (e.g.,
object parsing: Spelke, 1985; face recognition: M. H. Johnson &
Morton, 1991). Of particular relevance, infants perceptually
differentiate the motion of people from similar but biologically
incorrect motion as early as 3 months of age (Bertenthal, in
press). This work suggested that it is likely that infants can
make the more general categorization of animate versus inanimate (mechanical) motion. I make the assumption that they
can and that the perceptual categorization of motion is one
source for dividing the world into classes of things that move in
different ways. More is needed, however. To form a concept of
animal, as opposed to a perceptual category of a particular type
of motion, the infant needs to notice and conceptualize something like the following: Those things that move in one way
start up on their own (see Gelman, 1990; Premack, 1990) and
sometimes respond to the infant from a distance, whereas those
things that move in another way do not start up on their own
and never respond to the infant from a distance. (There may be
other important bases. I have tried to use a minimum set that
would be sufficient to understand something about what kind
of thing an animal is above and beyond what it looks like.)
Such simple meanings are sufficient to constitute an early
concept of animal, a concept that is present in some form before the end of the first year of life. Several experiments have
shown that infants respond to animals in a way that cannot be
accounted for on the basis of perceptual categorization alone.
For example, Golinkoff and Halperin (1983) found that an
8-month-old produced a unique emotional response to both
real and toy animals that varied widely in shape, features, and
texture. In my laboratory we have found that 9-month-olds dishabituate to a vehicle after seeing various models of animals
(and vice versa) even when the individual animals vary greatly
in their perceptual appearance (McDonough & Mandler,
1991). (Ongoing work indicates this is true of 7-month-olds as
well.) Furthermore, 9-month-olds also dishabituate to models
of birds after seeing airplanes (and vice versa), even though, in
this case, the shapes of the airplanes and birds are similar
(McDonough & Mandler, 1991). It is usually assumed that similar shapes and features form the basis of early perceptual categorization (e.g., Cohen & Younger, 1983). If true, then perceptual
categorization is insufficient to account for these kinds of responses; some conceptual basis is required. (This issue is discussed further in the last section of this article.)
To further this account, it is necessary not only to specify the
meanings involved in an early concept of animal (such as
"moves on its own" and "responds to other objects") but also to
specify the format in which they are represented. Whatever
their exact nature, these meanings are unlikely to consist of
sensory descriptions, as the British empiricists would have had
it. "Brown" and "square" are not the sorts of attributes that lead
to understanding what an animal is. Nor are the features talked
about in the classical theory of concepts (see E. E. Smith &
Medin, 1981), such as "has wings" or "four legs," likely to lead
to conceptual understanding either, for the same reason that
feature theories have failed as an approach to understanding
semantic development (Armstrong, Gleitman, & Gleitman,
1983; Medin & Ortony, 1989; Palermo, 1986). It seems likely
that early conceptual understanding will be of a global (i.e.,
shallowly analyzed) character, with analysis of features
("brown," "has wings," etc.) a later intellectual achievement
(J. M. Mandler, Bauer, & McDonough, 1991; Nelson, 1974;
L. B. Smith, 1989a). Thus, an approach is needed that does not
require detailed featural analysis but at the same time allows
the formation of some global conceptions about what is being

³ These measures are discussed in more detail in J. M. Mandler
⁴ Slobin based some of his notions on the work of Talmy (e.g., 1983),
who was also one of the root sources for some of the ideas in this article.

Image-Schemas as Conceptual Primitives

The approach to preverbal conceptual representation that I
take here is derived from the work by cognitive linguists on
image-schemas (M. Johnson, 1987; Lakoff, 1987; Langacker,
1987; Talmy, 1983, 1985). Although not developmental psychologists, these researchers have been led by their linguistic concerns to seek the underlying basis of the concepts expressed in
language. It is claimed that one of the foundations of the conceptualizing capacity is the image-schema, in which spatial
structure is mapped into conceptual structure. Image-schemas
are notions such as PATH, UP-DOWN, CONTAINMENT, FORCE,
PART-WHOLE, and LINK, notions that are thought to be derived
from perceptual structure. For example, the image-schema
PATH is the simplest conceptualization of any object following
any trajectory through space, without regard to the characteristics of the object or the details of the trajectory itself. According
to Lakoff and to Johnson, image-schemas lie at the core of
people's understanding, even as adults, of a wide variety of
objects and events and of the metaphorical extensions of these
concepts to more abstract realms. They form, in effect, a set of
primitive meanings. (Primitive in this sense means foundational; it does not mean that image-schemas are atomic, unitary, or without structure.)
A good deal has been written about image-schemas, although, with the exception of formal linguistic treatments (e.g.,
Langacker, 1987), most discussions have been relatively informal and have been directed primarily to showing how image-schemas function within semantic theory. My focus here is
somewhat different because I am interested in the origins of
such meanings and the larger role they play in psychological
functioning. Therefore, after providing a definition of image-schemas, I will discuss various aspects that have not been
stressed by cognitive linguists, including the role that image-schemas play in preverbal concept formation.
Even though image-schemas are derived from perceptual
and (perhaps to a lesser extent) motor processes, they are not
themselves sensorimotor processes. I characterize them as condensed redescriptions of such processes. Perceptual analysis
involves a redescription of spatial structure and of the structure
of motion that is abstracted primarily from vision, touch, and
one's own movements. In this view, perceptual analysis involves
recoding perceptual inputs into schematic conceptualizations
of space. Hence, image-schemas can be defined as dynamic
analog representations of spatial relations and movements in
space. They are analog in that they are spatially structured representations. They are dynamic in that they can represent continuous change in location, such as an object moving along a path.
Their continuous, as opposed to discrete, nature means that
they are not propositional in character, although due to the
simplicity of the information they represent, it should be a relatively simple task to redescribe them into propositional form.
They are abstracted from the same type of information used to
perceive, but they eliminate most details of the spatial array
that are processed during ordinary perception. At the same
time they are redescriptions because they use a different vocabulary (described in the next section). These new representations are the primitive meaning elements used to form accessible concepts.⁵
To illustrate, the perception of movement trajectories
through space is recoded into the image-schema PATH. The infant sees many objects moving in many different ways, but each
can be redescribed in less detailed form as following a path
through space. In many cases, the infant sees the object beginning to move and also coming to rest, and these aspects of the
event can be represented in image-schematic form as well. Because image-schemas are analog in nature, they have parts. One
can focus on the path itself, its beginning, or its ending. In this
sense, image-schemas embed. BEGINNING-OF-PATH can be embedded in PATH; each can be considered an image-schema in its
own right. Similarly, perception of contingent motion is recoded into the notion of coupled paths or LINK (discussed
later). I propose that image-schemas such as PATH (with the
focus on BEGINNING-OF-PATH) and LINK constitute the meanings involved when a concept such as animacy is formed. Thus,
a first concept of animals might be that they are objects that
follow certain kinds of paths, that begin motion in a particular
kind of way, and whose movement is often coupled in a specific
fashion to the movement of other objects. Notice that several
image-schemas have been combined to form the concept of
animal. That is, the concept of animal is complex, involving
more than one meaning.
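The embedding and combination described above can be made concrete with a small bookkeeping sketch. This is only an illustrative assumption, not a representation proposed in the article: image-schemas are held to be analog and nonpropositional, whereas the discrete `ImageSchema` class, the `embeds` method, and the `END-OF-PATH` label below are hypothetical names introduced for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSchema:
    """A named schema that may embed other schemas as parts (hypothetical encoding)."""

    name: str
    parts: tuple[ImageSchema, ...] = ()

    def embeds(self, other_name: str) -> bool:
        """Return True if a schema with this name is this schema or is nested in it."""
        return self.name == other_name or any(p.embeds(other_name) for p in self.parts)


# One can focus on the path itself, its beginning, or its ending.
BEGINNING_OF_PATH = ImageSchema("BEGINNING-OF-PATH")
END_OF_PATH = ImageSchema("END-OF-PATH")  # label assumed for "its ending"
PATH = ImageSchema("PATH", parts=(BEGINNING_OF_PATH, END_OF_PATH))
LINK = ImageSchema("LINK")

# A first concept of animal combines more than one meaning.
ANIMAL_MEANINGS = frozenset({PATH, LINK})
```

The point of the sketch is only that each part is a schema in its own right and that a concept such as animal is a set of several meanings, not a single one.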
This article describes a format to represent the meanings that
make up several such concepts. To carry out thought with these
concepts requires relating one to another. Whether thinking in
this sense requires a propositional form of representation or can
be carried out by constructing a mental model of an image-schematic nature (e.g., along the lines of Fauconnier's, 1985,
mental spaces) is an open question and one that is beyond the

5 It will be noted that I am emphasizing spatial rather than temporal
analysis. I am assuming that temporal concepts are at least partly derived from spatial ones. Many linguists have noted that in all languages
studied, most if not all temporal terms have a spatial sense as their
primary meaning (e.g., Clark, 1973; Fillmore, 1982; Traugott, 1978).
Even though movement trajectories have both spatial and temporal
extent, it seems easier to analyze movement through space than movement through time; for example, a nonverbal analysis of a change in
location can be expressed by pointing or moving the hand, an option
not available for a change in time. Of course, even very young infants
use temporal parameters in their processing but that does not mean
they conceptualize them. I suggest that temporal concepts are in the
first instance represented as analogical extensions of spatial schemas.



scope of this article. However, it is worth noting that both Lakoff (1987) and M. Johnson (1987) have argued that the structure of image-schemas gives rise to various entailments, suggesting that at least some inferences do not require a propositional form of representation. That is, the dynamic and
relational nature of image-schemas provides a kind of syntax,
although not couched in predicate-argument structures that
specify truth conditions. If this view is correct it suggests that
an infant equipped with image-schemas could begin to think
about the world without a propositionally based language of thought.
An alternative architecture to a built-in propositional language, such as described by Fodor (1975), is one in which the
infant has the ability to simplify perceptual input, including
objects participating in events, thus creating analog sketches of
what it perceives. The resulting image-schemas provide the earliest meanings available to the infant for purposes of preverbal
thought (leaving open the issue of how they are concatenated)
and form the basis on which natural language rests.6 This formulation does not imply that image-schemas are themselves
accessible to consciousness; no language of thought is directly
accessible. Image-schemas represent the meanings from which
accessible concepts are formed. (See G. Mandler, 1985, for a
discussion of the processes involved in conscious constructions.) Hence, this architecture agrees with Karmiloff-Smith's
(1986) proposal that more than one level of redescription is
required before conscious access can be achieved. In the present case, the levels consist of inaccessible perceptual processing, redescription into image-schematic meanings, followed by
further redescription into conscious imagery or language.
An appealing aspect of an image-schema architecture is that
it provides a natural grounding for symbolic representation.
One problem that rarely appears in discussions of the language
of thought is how contentless symbols are chosen to represent
its meanings (see Harnad, 1990). Unless a symbol has been
innately assigned to a meaning, such as animacy, it is difficult
to understand how it becomes assigned. An image-schema
avoids this problem because its meaning resides in its own
structure; it does not require other symbols or another system to
interpret it. Furthermore, this architecture avoids positing an
enormous list of innate meanings. Forming an image-schema
requires only an innate mechanism of analysis, not innately
known content. New content can be added to the system whenever perceptual analysis takes place on aspects of the input not
previously analyzed.
The issue of why particular image-schemas are derived from
perceptual structure rather than others has rarely been discussed by cognitive linguists, and it certainly has not been resolved. Why, from observing an object in motion, should its
path be abstracted rather than some other aspect of the input?
One possibility is that our perceptual input systems are prewired or weighted to form schematic summaries of certain
types of perceptual information rather than others. On this
account one could say that the infant has an innate predisposition to form image-schemas of a certain type. However, it is
equally plausible that certain types of redescription are simply
the outcome of the way an infant's immature input systems
process the spatial structure that exists in the world. On this
account, one would not need to bias or preset the system that
processes spatial information. As a hypothetical example, a
newborn might not perceive much more in a rapid perceptual
encounter than a blurry object moving along a path. If so, a
mapping of Euclidean structure into topological structure
might be a predictable type of schematization for a processor
that is not yet skilled at analyzing the fine details of shape or
angular distance. This result would happen even though it was
not preset to abstract topological, rather than Euclidean, information. Although the resolution of this issue will ultimately be
of great importance, it seems premature to try to do more than
speculate about it at this time. First, it needs to be determined
whether a theory of early concept formation in terms of image-schemas is feasible and productive.
M. Johnson (1987) and Lakoff (1987) stressed that image-schemas are not the same as "real" images (what they called
"rich" images). Image-schemas are more abstract than images;
they consist of dynamic spatial patterns that underlie the spatial relations and movements found in actual concrete images.
It should be noted, however, that real images are typically not as
rich as Johnson and Lakoff implied. Furthermore, they are not
mental pictures if a mental picture means a copy of what has
been perceived. Visual imagery is constructed from what a person knows; images are not uninterpreted copies of reality but
rather are constructed from the underlying concepts the person
has already formed (Chambers & Reisberg, 1992; Intons-Peterson & Roskos-Ewoldsen, 1989; Intraub & Richardson, 1989;
Piaget & Inhelder, 1971). If concepts were represented largely in
propositional form, then one would need a complex theory
such as that proposed by Kosslyn (1980) to show how analog
images could be constructed from discrete information. However, if concepts, perhaps especially preverbal concepts, are
couched in image-schematic form, then the process of forming
images should be considerably simplified. The image-schema
itself structures an image space in which specific objects and
more detailed paths and spatial relationships can be filled in. It
is these concrete, or rich, images that appear in awareness, not
the image-schemas themselves. That is, when one consciously
thinks about what an animal is, the meanings from which this
concept is composed come to mind as specific images, words,
or both. Image-schemas, therefore, are a crucial part of our
mental architecture. They are not only used to create meanings
but also to help form the specific images that instantiate them
and to understand the words that refer to them. In summary,
then, perceptual analysis operates on perceptual information,
leading to image-schemas, which in turn form the foundation
of the conceptual system, a system that is accessible first via
imagery and later via language as well.
In the following sections, I will characterize several image-schemas used to create concepts that appear to be early developmental achievements: animacy, inanimacy, causality, agency,
containment, and support. To the extent that the relevant
image-schemas involve simplifying and redescribing perceptual input, they seem compatible with what is known about

6 This theory of the basis of thought is more or less the opposite of
that proposed by Piaget (1951). Piaget claimed that imitation of the
actions of others produces imagery, which then enables thought. My
position is that image-schemas provide the meanings that enable infants to imitate actions in the first place.

infant capabilities. As discussed earlier, perceptual schemas are
formed during the first few months of life and are therefore
available to be operated upon. There is no a priori reason why
redescribing these perceptual schemas into image-schematic
form should require another year of sensorimotor learning.
However, the time of onset of these concepts is not crucial for
the arguments I make. The issue is not the exact age at which
conceptualizations of the world begin to be constructed but
whether, in principle, an image-schema formulation can account for the preverbal conceptualization that occurs.

The Concept of Animacy

It seems that people judge motion to be animate on the basis
of perceptual characteristics of which they are not aware. Some
of the bases that adults use, primarily involving violations of
Newton's laws of motion, have been described by Stewart
(1984). Judging motion to be animate is similar to judging that
a face is male or female: people easily make this categorization but have little idea about the kinds of information they use
(J. M. Mandler, 1988). So, if someone is asked how they know
that a briefly viewed moving object is an animal, they might
say, "It moved like an animal," or "It started up on its own."
What is meant by such statements?
There are two broad types of onset of motion, self-instigated
motion (called self-motion here) and caused motion (see Premack, 1990). From an early age infants are sensitive to the difference between something starting to move on its own and
something being pushed or otherwise made to move (Leslie,
1988; see the discussion later on causality). In its simplest form,
self-motion means that an object is not moving, and then, without any forces acting on it, it starts to move. If one were asked to
express this notion without using words, the best one might be
able to do is to put one's hand at rest and then move it. Thus, an
image-schema SELF-MOTION might be graphically expressed as
in Graphic 1: a vector extending from a point A where A represents an entity at the beginning of its path. (This and the following diagrams are obviously not meant to be literal interpretations of image-schemas; they are merely attempts to illustrate
nonverbal notions.)




This is basically the notion of a trajector (M. Johnson, 1987;
Langacker, 1987; Lindner, 1981). Self-motion is the start of an
independent trajectory; that is, no other object or trajectory is
involved. By itself an object starting to move without another
visible trajector acting on it is not a guarantee of animacy. A
windup toy with a delay mechanism might appear to start itself,
or an unfelt tremor might cause a precariously balanced object
to fall from a surface. Although any such occurrence could
cause a mistaken judgment of animacy, perhaps especially by
an infant, the concept of animacy involves more than one
meaning, not self-motion alone. As just discussed, moving objects can be perceptually categorized into two kinds, and the
motion of a mechanical toy or a falling object does not fit into
the appropriate class on grounds other than the onset of its
motion, namely, the type of trajectory it follows. I address this
issue further in the following paragraphs.

My claim is that Graphic 1 is the most primitive characterization of what "moves by itself" means. It is a schematic representation abstracted from a class of varied and complex movements. The schematization could be achieved merely by analyzing observed movement in the environment. It might also be
abstracted from felt movement of the self and therefore have a
kinesthetic base. Nevertheless, the same representation should
result: a kind of spatial representation of movement in space.
This representation can be redescribed once again into language, of course, but the basic meaning is not propositional in
I assume that in addition to self-motion, infants represent
something about the form of the trajectories that self-moving
things follow. I also assume that whatever analysis infants carry
out is rather simple. Perceptual discrimination of biological
motion (Bertenthal, in press), like perceptual categorization of
faces as male or female, is something one's perceptual recognition system easily does, but this information is represented at
best crudely and often inaccurately. Biological motion is extremely complex; even theoretical psychologists are not yet certain which parameters are crucial to judgments of animacy
(Stewart, 1984; Wilson, 1986). Similarly for faces: Our perceptual mechanisms use complex information about the texture
and proportions of the face to distinguish male from female
(Bruce et al., 1991), but that information is not part of our
concept of what a male or female looks like. We do not seem to
have a sufficiently large image-schema vocabulary to accomplish detailed analyses of such complex phenomena and must
rely instead on simpler conceptualizations to think about them.
I assume that perceptual analysis of biological motion results
in little more than the notion that it does not follow a straight
line. Adults think of biological motion as having certain rhythmic but unpredictable characteristics, whereas mechanical motion is thought of as undeviating unless it is deflected in some
way. Therefore, an image-schema of ANIMATE MOTION may be
represented simply by an irregular path, as illustrated in
Graphic 2.

[Graphic 2: an irregular path illustrating ANIMATE MOTION]


Although crude, this image-schema of an animate trajectory is
sufficient for its purpose. In order to understand what animals
are, it is not necessary to analyze in detail the movements that
set up the perceptual category in the first place. Nevertheless,
because of the concentrated attention that infants give to moving objects, I assume that some analysis of animate trajectories
takes place along with the analysis of the beginning of their
paths. An example would be noticing that dogs bob up and
down as well as follow irregular paths when they move. In this
regard, we have observed an illuminating bit of behavior by
some of the 1- to 2-year-old children who have taken part in our
studies of early concepts (e.g., J. M. Mandler, Bauer, & McDonough, 1991). In a typical experiment, the children play with
little models of a variety of animals and vehicles, and we record
the sequences in which the children interact with them. Occasionally a child will respond to the animals by making them
hop along the table and respond to the vehicles by making
them scoot along in a straight line. Making a fish or a turtle hop
is a graphic example of a representation of animate movement
that is faithful to the analysis in Graphic 2, no matter how
inaccurate it is in detail.
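The crudeness of this distinction can be illustrated with a short sketch that sorts a sampled trajectory by one criterion only, whether it deviates from a straight line. This is an assumption-laden toy, not a model of infant perception: the function name `path_schema`, the tolerance `tol`, and the output label "INANIMATE MOTION" are hypothetical, and a path that reverses along the same line still counts as straight here.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_schema(points: Sequence[Point], tol: float = 1e-9) -> str:
    """Label a sampled 2-D path as irregular (animate) or straight (inanimate).

    Every consecutive triple of points is tested for collinearity with a
    cross product; any bend larger than `tol` makes the path irregular.
    """
    if len(points) < 2:
        raise ValueError("a path needs at least two points")
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if abs(cross) > tol:
            return "ANIMATE MOTION"
    return "INANIMATE MOTION"


hopping_turtle = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
scooting_car = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

Under these assumptions a hopping toy turtle and a scooting toy car fall on opposite sides of the distinction, however inaccurate the hop is as a description of how turtles move.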
The image-schemas for SELF-MOTION and ANIMATE MOTION
can be combined; that is, an image-schema embedding
Graphic 1 and Graphic 2 represents an object starting up on its
own and then moving on an animate path. In Graphic 1 the
focus is on the onset of the motion (the straight line in the
diagram is merely a default representation for a path), whereas
in Graphic 2 the focus is on the path itself. Combining the two
gives an image-schema that might be called SELF-MOVING ANIMATE, as illustrated in Graphic 3.


Graphic 3 illustrates the nonunitary and relational character of
most image-schemas. It has two parts (BEGINNING-OF-PATH
and PATH) and a simple kind of syntax in the sense that one can
move from one part to the other.
In addition to self-motion and animate motion, a third notion that I propose contributes to the concept of animacy is
contingency of motion between objects, especially contingency
that acts at a distance rather than through direct physical contact. The motor limitations of young infants severely restrict
their manipulation of objects (although they have ample opportunity to watch others do so). However, they are surrounded by
people who interact contingently among themselves and who
respond from a distance to the infant's actions and vocalizations in a way that inanimate objects do not. The seminal experimental work on this topic was that of Watson (1972). He
showed that at 2 months of age infants learned to make a mobile hanging above their crib turn when the movement was
contingent on their pressing their heads on a pillow. When the
mobile did not turn or turned noncontingently vis-a-vis the
pillow action, head presses did not increase. The most interesting response of the contingent group was that by the 3rd to 4th
day of exposure to the contingent mobile, the infants began to
smile and coo at it. Watson hypothesized that the mobile began
functioning as a social stimulus and that the initial basis of a
social stimulus is limited to some set of contingency experiences. It is not clear how one would differentiate between a
concept of a social stimulus and an animate stimulus in infancy;
perhaps a social stimulus is one that reacts contingently to one's
own movements, as opposed to reacting contingently to the
movement of other objects. In either case, one should expect to
see infants reacting to contingent toys or mobiles as if they were
The hypothesis is supported in research by Frye, Rawling,
Moore, and Myers (1983). These investigators used signal detection analyses of videotapes of infants responding either to
their mothers or to a toy; in addition, the mothers and toy
interacted with the infants in either a contingent or noncontingent fashion. Observers had to discriminate among the tapes,
on which only the infants could be seen. Three-month-olds
were judged to behave similarly when either their mothers or a
toy reacted contingently to them (i.e., the observers could not
tell these tapes apart) and to behave differently in those situations in which there was no contingent interaction. Ten-month-old infants, on the other hand, were judged also to differentiate
contingent mothers from contingent toys, suggesting that by
this age infants have developed a more detailed conception of
persons or animate things.
Poulin-Dubois and Shultz (1988) suggested that infants only
learn about the notion of independent agency in the last
quarter of the first year; they found that 8-month-olds habituated more to the sight of a chair acting as an independent agent
than to a person acting as an independent agent. Although it is
not clear exactly how to interpret this result, it seems reasonable
to suppose that if contingent motion and self-motion define
animacy for infants in the first months of life, then the "mistakes" that young infants make may not be because they do not
have a concept of animacy but because it is too broad. Infants
who smile and coo at a mobile that is contingent on their behavior or infants whose interactions with an animated toy cannot
be distinguished from interactions with their mother, may be
revealing that they think those things are animate, and so they
react accordingly.
How might the notion of contingency be conceptualized by a
young infant? This is a difficult question because the extent to
which infants interpret contingency as involving a causal relation is not known. It seems plausible that in the first instance
they do not and that the initial representation of contingency is
a simpler notion. A simple derivation of a notion of contingency
is possible from a redescription of spatial structure in terms of
the LINK image-schema. Lakoff (1987) discussed this schema
only briefly, noting that its structure consists of two entities and
a link connecting them. To my knowledge, there has not been
much analysis of this kind of image-schema, but there appears
to be a family of LINK schemas with related meanings, much as
Brugman (1988) and Lakoff (1987) used a family of schemas to
represent the various meanings of the term "over." The simplest
version of the LINK image-schema is illustrated in Graphic 4.
Other versions of LINK schemas are listed below it.
In Graphic 4 the link (which is not the same as a path) means
that two entities or events, A and B, are constrained by, or dependent on, one another even though they are not in direct
contact. Links can occur across both spatial and temporal gaps
and can be one-way or mutual. For example, to express the
contingency in the Watson (1972) experiment described earlier
in which the infant A acts and the mobile B turns, one might
use a LINK schema in which the contingency is one-way, as in
Graphic 4a. (The arrow is meant to represent the direction of
the contingency.) This kind of contingency seems likely to require a number of repetitions before an expectation is set up, as
occurred in the Watson experiment. Presumably, therefore,
analysis of the contingency would not happen on the first occasion. Repetition may also be required for Graphic 4b, which
represents the back-and-forth interaction between an infant
and its mother described in the Frye et al. (1983) experiment.
In addition, there are linked paths, in which two objects move
in tandem as linked trajectories, shown in Graphic 4c; if A
moves, then B moves too. It may be possible to analyze this
kind of contingency on the first exposure to it. Notice that in all
these cases motion is involved, not in the link but by the entities
themselves; that is, the entities are parts of events.
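The three versions of the LINK schema just described differ mainly in the direction of the dependency, which a minimal sketch can make explicit. The class names `LinkKind` and `Link` and the method `depends_on` are hypothetical, and treating linked paths as a dependency of B on A is an assumption based only on the phrase "if A moves, then B moves too."

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    ONE_WAY = "one-way contingency (Graphic 4a)"
    MUTUAL = "mutual contingency (Graphic 4b)"
    LINKED_PATHS = "linked paths (Graphic 4c)"


@dataclass(frozen=True)
class Link:
    """Two entities, a and b, constrained by one another without direct contact."""

    a: str
    b: str
    kind: LinkKind

    def depends_on(self, dependent: str, source: str) -> bool:
        """Return True if `dependent` is contingent on `source` under this link."""
        forward = source == self.a and dependent == self.b
        backward = source == self.b and dependent == self.a
        if self.kind is LinkKind.MUTUAL:
            return forward or backward
        # ONE_WAY and LINKED_PATHS: only b is contingent on a (assumed reading).
        return forward


# Watson (1972): the infant acts and the mobile turns.
watson = Link("infant", "mobile", LinkKind.ONE_WAY)
# Frye et al. (1983): back-and-forth interaction between infant and mother.
frye = Link("infant", "mother", LinkKind.MUTUAL)
```

The link itself carries no motion in this encoding; it only records which entity's behavior is contingent on which.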
Responsivity to contingency is one of the most basic predispositions of infants (or, for that matter, of organisms in general). For example, conditioning would not be possible without
it. Whether active analysis, or awareness, of contingency is required for conditioning to occur is a disputed issue and would
take us too far afield to discuss here, but responsivity to contingency, even in neonates, is not in doubt (Rovee-Collier, 1987).
One of the first kinds of contingency to be learned is that involved in S-S conditioning, that is, a contingency occurring
between two entities (or events) in the environment, rather
than a contingency occurring between an entity (or event) and
the neonate's own responses (see Sameroff & Cavanaugh,
1979). Therefore, it should be possible to analyze contingent
motion on the basis of the structure of environmental events
alone. The evidence to date is mostly about contingencies between self and the environment; for example, Greco, Hayne,
and Rovee-Collier (1990) suggested that 3-month-olds use motion that is contingent on their responses to categorize dissimilar appearing objects. However, analysis of objects' movements
vis-a-vis one's own responses, though useful, may not be required for analysis of contingent motion to take place. Whether
very young infants actually engage in perceptual analysis of
such environmental contingencies is another issue. My point
here is only that it is possible to associate contingent motion
with a particular category of moving objects without having to
observe their responsivity to self. Such a situation would allow
infants to conceptualize birds or animals in a zoo as animate
things without having to interact with them in a personal way.
The contingency of animate movement not only involves
such factors as one animate following another, as described by
LINKED PATHS, but also involves avoiding barriers and making
sudden shifts in acceleration. Stewart (1984) showed that adults
are sensitive to all of these aspects of animate movement, but
even though they appear to be perceptually salient, it is not yet
known whether infants are responsive to them or not. Nor has
anyone considered how factors such as barrier avoidance might
be represented in image-schema form. M. Johnson (1987) described several FORCE schemas, such as blockage and diversion,
that may be useful in describing barrier avoidance, but they
need to be differentiated into animate and inanimate trajectories. One might represent animate and inanimate differences in
response to blockage as a trajectory that shifts direction before
contacting a barrier versus one that runs into a barrier and then
either stops or bounces off from it.
I have outlined here some simple kinds of perceptual analysis
that give conceptual meaning to a category of moving things.
The claim is that infants generalize across the particulars of
perception to a representation that encompasses some abstract
characteristics the experiences have in common. Things that
move in one way can be conceptualized as starting up on their
own and interacting with other entities in a contingent way
without having to make contact with them. Such things can be
contrasted with another category of moving things that not only
move differently, but do not start up on their own and do not
interact contingently with other entities from a distance. These
analyses can be characterized as redescriptions of varied and
exceedingly complex perceptions into a few different kinds of
trajectories. Some of the trajectories imply self-motion; others
imply contingent response to objects taking part in events in
the environment. Such redescription could in principle be
carried out from a very young age by analysis of spatial structure into image-schematic form. It would provide the beginnings of a concept (or theory) of what an animate thing is,
above and beyond what it looks like.
So far I have concentrated on the conceptualization of animate things, saying little about the contrasting class of inanimates. Even though animates are the class of things that seem
to attract infants' interest and attention the most, infants do
attend with interest to all moving things, many of which are
inanimate. Therefore, I assume that conceptualization of inanimacy takes place concurrently with the formation of notions
about animacy; that is, there is no reason to assume that inanimates are merely a default class. A likely starting point is the
fact that inanimates involve caused motion rather than self-motion. How might caused motion be represented?

The Concepts of Causality and Inanimacy

The difference between self-motion and caused motion is
that in the latter case the beginning-of-path involves another
trajector. A hand picks up an object, whose trajectory then
begins, or a ball rolls into another, starting the second one on its
course. An image-schema of CAUSED MOTION could be graphically represented as in Graphic 5: a vector toward an object A,
with another vector leaving the point. The two trajectories are
not independent; the first one ends (or shifts direction) at the
point and time at which A begins its motion. The illustration
here, as in Graphic 1, is meant to emphasize the onset of motion. The trajectories are drawn as straight lines, again as a
default case (although they are appropriate for many kinds of
caused motion). In the case of the hand, the trajectories are
more complex; this issue is discussed in the following paragraphs.


Just as for animate motion, perceptual analysis of the paths of
objects caused to move, such as a launched ball or a car, results
in a representation of inanimate (or mechanical) motion. The
simplest way to express this notion is by a straight line as in
Graphic 6.



Again, combining the beginning-of-path with the path itself
forms an image-schema that might be called CAUSED-TO-MOVE
INANIMATE, as in Graphic 7. (The only difference between
Graphics 7 and 5 is that the beginning-of-path and the path
itself have been given equal emphasis in Graphic 7.)




Leslie (1982, 1988) has provided data on the perception of
causality in infancy, with findings similar to those found by
Michotte (1963) for adults. Leslie speculated that a concept of
causality is derived from this kind of perception. He found that
infants as young as 4 months distinguish between the causal
movement involved in one ball launching another and very
similar events in which there is a small spatial or temporal gap
between the two movements. In launching, the end-of-path of
the first trajectory is the beginning-of-path of trajector A, as
illustrated in Graphic 7. In the noncausal case there is no connection between the end of one trajectory and the beginning of
the next, as illustrated in Graphic 8.7
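The contrast between the launching event and the gapped event can be summarized in a few lines. This is a sketch under stated assumptions: the names `Onset` and `classify_onset` and the two threshold parameters are hypothetical, and the source reports no numerical thresholds, so the defaults of zero simply stand for "no gap."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Onset:
    """Gap between the end of one trajectory and the start of the next."""

    spatial_gap: float   # distance between end-of-path and beginning-of-path
    temporal_gap: float  # delay between the first stop and the second start


def classify_onset(
    onset: Onset,
    max_spatial_gap: float = 0.0,
    max_temporal_gap: float = 0.0,
) -> str:
    """Return the schema suggested by the onset of the second object's motion.

    If the first trajectory ends where and when the second begins, the onset
    is CAUSED MOTION; a spatial or a temporal separation means the second
    object appears to begin on its own.
    """
    connected = (
        onset.spatial_gap <= max_spatial_gap
        and onset.temporal_gap <= max_temporal_gap
    )
    return "CAUSED MOTION" if connected else "SELF-MOTION"
```

Either kind of separation alone is enough to break the connection, which is why both gaps must be within tolerance for the onset to count as caused.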

It is probably no accident that Leslie used films of balls being
launched that look similar to the sketch shown in Graphic 7
and contrasted them with films that looked much like Graphic
8. Launching is a paradigm case of causal motion, and the
image-schema underlying it may be used as the foundation of
causal understanding of all sorts. It also represents the paradigm case of the motion of inanimate objects. Inanimate objects are objects that typically do not move at all, but when they
do they are caused to move. The cause of motion can itself be
either animate or inanimate. In Graphic 7, A represents an
inanimate object; not only is it caused to move, but it follows an
inanimate trajectory. In Graphic 8, A also moves in an inanimate fashion, but creates an anomalous impression because it
starts up on its own.
If spatial analysis results in an early representation of causality, it would suggest that physical causality might be represented before psychological causality. This progression would
be the opposite of that usually assumed in development. According to Piaget (1954), psychological causation, or the awareness of one's own intention or efforts to make something happen, is the basis on which the later understanding of physical
causality rests. The spatialization of causal understanding is
said to begin only after infants experience many occasions of
drawing objects to themselves or pushing them away. However,
both Leslie's (1982, 1984) data and an approach to concept
formation in terms of redescription of spatial structure suggest
that the ontogenetic ordering may be the other way around. The
experience of intention or volition may not be required to form
an initial conception of causality. Indeed, internal states are
notoriously difficult to conceptualize, and children may require a good deal of experience before they begin to do so. We
have no evidence for any analysis of internal states by 4-month-old infants. In contrast, some analysis of the spatial structure of
moving objects has apparently begun by this age.8

Leslie (1982) provided other data on causal perception that
are relevant to the issue of how a concept of agency can be
represented. He showed that both 4- and 7-month-olds were
surprised when a hand appeared to move an object without
touching it. That is, the image-schema of CAUSED MOTION was
violated because the first trajector did not reach the point
where the second trajector began its path. Thus, an object
started up on its own but did not move in the way that animate
objects usually do; rather, it moved in an inanimate fashion as
in Graphic 6. Such a display creates an anomaly, similar to that
illustrated in Graphic 8. Related data (Leslie, 1984) indicated
that infants have also learned something about animates as
agents at an earlier age than Poulin-Dubois and Shultz (1988)
suggested. In this experiment, 7-month-olds dishabituated
when a hand appeared to pick up an object without contacting
it, but they did not react differentially to blocks of wood that
"picked up" objects with or without contact. The infants may
have found the sight of a block of wood picking up an object
difficult to interpret whether it contacted the object or not and
so did not differentiate between the two. (Even 3-month-olds
have already learned some of the characteristic behavior of
hands. Needham and Baillargeon, 1991, found that 3-month-olds
were surprised when a hand pushed an object off a surface
and the object did not fall; a similar display in which the object
was grasped by a hand did not surprise them.)
Leslie's (1984) data suggest that perceptual analysis of causal
and noncausal motion is involved not only in the formation of
concepts of animacy and inanimacy but also in the development
of the concept of an agent. Animate objects not only move
themselves but cause other things to move; it is the latter
characteristic, of course, that turns animates into agents. I offer a
representation of an image-schema of AGENCY in Graphic 9.
AGENCY is represented as an animate object, A, that moves
itself and also causes another object, B, to move. This image-schema
is hardly more complex than the image-schema of
CAUSED-TO-MOVE INANIMATE (Graphic 7). The only difference
is that two objects are represented, and the object A that causes
B to move follows an animate path.




One of the most frequent sights in early infancy is that of
people manipulating objects. These are typically objects that
rarely move at all under other circumstances. The trajectories
taken by manipulated objects in these cases are those of the
agent that is holding them. This complicates the analysis that
must be carried out but perhaps not so greatly as to be necessarily a late development. Inanimate objects are those that often
do not move at all, but when they do, the beginning-of-path is
caused motion, not self-motion. The paths they follow are inanimate paths, unless the cause of motion is an agent picking
them up. In that case, their paths are contingent on the path of
the agent. If the paths part company (i.e., the agent lets the
object go), the path of the manipulated object reverts to that of
inanimate motion.

The Concept of Agency


Note that the separation between the two trajectories in Graphic 8
can represent either a spatial or a temporal gap; that is, a spatial or a
temporal separation results in the assessment that A begins on its own.
Using a temporal gap to judge that an independent trajectory has begun is an example of using temporal information without necessarily
conceptualizing it as such.
If this argument is correct, it provides an ontological basis for our
pervasive tendency to conceptualize the mental world by analogy to
the physical world, rather than the other way around. Extensive analyses of this tendency have been provided by cognitive linguists such as
M. Johnson (1987), Sweetser (1990), and Talmy (1988).



I have discussed several kinds of spatial movement that
might be important in early concepts about objects, but there
are certainly a number of others. For example, Wagner, Winner,
Cicchetti, and Gardner (1981) found that 9- to 13-month-olds
were able to make a "metaphorical" match between certain
auditory and visual patterns. Specifically, the infants were more
apt to look at a broken line or a jagged circle when listening to a
pulsing tone and to look at a continuous line or a smooth circle
when listening to a continuous tone. Similarly, they looked
more at an upward pointing arrow (which may tend to make
the eyes move upwards) when listening to an ascending tone
and to a downward arrow when listening to a descending tone.
This finding is of particular interest because it appears to be
the same phenomenon described by M. Johnson (1987) and
Lakoff (1987) as the way in which adults project image-schemas from one domain to another, for example, conceptualizing
quantity in terms of verticality (more is up, less is down).
The Wagner et al. (1981) data are also interesting because of
their potential application to the debate over the basis of crossmodal matching in infancy. For instance, when listening to a
single sound track while being presented with two different
films, 4-month-old infants are more apt to look at the film that
matches the sound track (e.g., Spelke, 1976). Kellman (1988)
suggested that the auditory and visual transducers output the
same pattern of information in an amodal form, which could
account for the matching behavior. Spelke (1988), on the other
hand, suggested that a more "thoughtful" central process is
required. Although Spelke did not describe this process in
terms of image-schemas, her suggestion could be so interpreted. In such an account, the matching behavior would not be
due to the transducers resulting in the same amodal output;
rather, the infant would be described as creating the same
image-schema to represent the two outputs.
The Concepts of Containment and Support
The last concept I will discuss is containment, along with a
cluster of related notions: going in or out, opening and closing,
and support. Containment in particular is an important notion
to consider because of its potential relevance to preverbal thinking. The data that are available suggest it is an early conceptual
development. Some concept of containment seems to be responsible for the better performance 9-month-old infants show on
object-hiding tasks when the occluder consists of an upright
container, rather than an inverted container or a screen (Freeman, Lloyd, & Sinha, 1980; Lloyd, Sinha, & Freeman, 1981).
These authors suggested that infants already have a concept of
containers as places where things disappear and reappear.
More direct work on containment has been conducted by Kolstad (1991), who showed that 5.5-month-old infants are surprised when containers without bottoms appear to hold things.
This finding suggests that the notions of containment and support may be closely related from an early age.
According to Lakoff (1987), the CONTAINMENT schema has
three structural elements: interior, boundary, and exterior. Such
a notion seems likely to arise from two sources: first, from perceptual analysis of the differentiation of figure from ground, that
is, seeing objects as bounded and having an inside that is separate from the outside (Spelke, 1988), and second, from perceptual
analysis of objects going into and out of containers. The list
of containment relations that babies experience is long. Babies
eat and drink, spit things out, watch their bodies being clothed
and unclothed, are taken in and out of rooms, and so on. Although M. Johnson (1987) emphasized bodily experience as
the basis of the understanding of containment, it is not obvious
that bodily experience per se is required for perceptual analysis
to take place. Infants have many opportunities to analyze simple, easily visible containers, such as bottles, cups, and dishes,
and the acts of containment that make things disappear into
and reappear out of them. Indeed, I would expect it to be easier
to analyze the sight of milk going into and out of a cup than
milk going into or out of one's mouth. Nevertheless, however
the analysis of containment gets started, one would expect the
notion of food as something that is taken into the mouth to be
an early conceptualization.
In the same way as discussed for LINK, there appears to be a
cluster of related image-schemas used to form the concept of a
container. One expresses the meaning of CONTAINMENT itself.
Thus, Graphic 10 represents the meaning of one thing being
inside another, by showing an object, A, in an at least partially
enclosed space.



Related image-schemas express the meaning of going in or going out, in which a trajector goes from outside a container to the
inside, as in Graphic 10a, or vice versa, as in Graphic 10b.



As Kolstad's (1991) data suggest, still another aspect that
seems to be involved in an early concept of a container is that of
support: true containers not only envelop things but support
them as well. I have already mentioned work by Needham and
Baillargeon (1991) showing that infants as young as 3 months
are surprised when support relations between objects are violated. A primitive image-schema of SUPPORT might require
only a representation of contact between two objects in the
vertical dimension. An image-schema of SUPPORT is represented in Graphic 11 in the form of an object, A, above and in
contact with a surface.


Baillargeon, Needham, and DeVos (1991) documented some of
the changes that occur in understanding support over a period
of a few months. At 3 months infants expect objects to be supported if they are in full contact with a surface, and otherwise
they expect objects to fall. By 5 months they expect an object to
be supported if any part of the object rests on the surface. By 6
months they have begun to differentiate between partial but
inadequate support (15% overlap between the object and the
supporting surface) and adequate support (70% overlap).
These data suggest that the earliest notion of support does not
include considerations of gravity or weight distribution, but at
least some aspects of these characteristics are learned within a
few months.
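The developmental sequence just described can be restated as a small decision rule. The following Python sketch is only an illustration of the three reported stages, not a model proposed by Baillargeon, Needham, and DeVos (1991). The function name `expects_support`, the coding of contact as an overlap fraction, the treatment of "full contact" as an overlap of exactly 1.0, the carrying forward of each stage until the next reported age, and the use of `None` wherever the reported data are silent (for example, overlaps between the two tested values of 15% and 70%) are all assumptions made here for illustration.

```python
from typing import Optional


def expects_support(age_months: float, overlap: float) -> Optional[bool]:
    """Hypothetical restatement of the reported stages.

    overlap: fraction (0.0 to 1.0) of the object resting on the surface.
    Returns True if the infant is described as expecting the object to stay
    put, False if the infant expects it to fall, and None where the text
    reports no data.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0.0 and 1.0")
    if age_months < 3:
        return None  # nothing is reported for infants younger than 3 months
    if age_months < 5:
        # 3 months: supported only if in full contact, otherwise expected to fall
        return overlap == 1.0
    if age_months < 6:
        # 5 months: any part of the object resting on the surface is enough
        return overlap > 0.0
    # 6 months: 70% overlap is treated as adequate, 15% as inadequate
    if overlap >= 0.70:
        return True
    if overlap <= 0.15:
        return False
    return None  # intermediate overlaps were not reported


if __name__ == "__main__":
    for age in (3, 5, 6):
        for overlap in (1.0, 0.70, 0.15):
            print(age, overlap, expects_support(age, overlap))
```

Note that nothing in the rule refers to gravity or weight distribution before 6 months, which is the point made in the text: the earliest stages depend only on contact.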
Still another example of early understanding related to containment is that of opening and closing. Piaget (1951)
documented in detail the actions his 9- to 12-month-old infants performed while they were learning to imitate acts that
they could not see themselves perform, such as blinking. He noticed that before they accomplished the correct action
they sometimes opened and closed their mouths, opened and closed their hands, or covered and uncovered their eyes with a
pillow. His account testifies both to the perceptual analysis in which the infants were engaging and their analogical
understanding of the structure of the behavior they were trying to reproduce. Such understanding seems a clear case of
an image-schema of the spatial movement involved when anything opens or closes, regardless of the particulars of the
thing itself. The image-schema is not easily representable in a static sketch, but it involves an operation on the
boundary of a container, such that something can move in or out of it, as in Graphics 10a and 10b.
Lakoff (1987) pointed out that the CONTAINMENT image-schema forms a gestalt-like whole that has meaning by virtue of
the relationships among the parts. In does not have meaning without considering out and a boundary between them. Lakoff
went on to suggest that the CONTAINMENT schema forms the basis for the later meaningfulness of certain logical relations,
such as P or not P (in or not in), and for the Boolean logic of classes ("if all As are Bs and x is an A, then x is a B"
deriving from "if A is in container B and x is in A, then x is in B"). He claimed that our intuitive understanding of
the meaning of a set (even though it is defined formally in set-theoretical models of logic) is due to the CONTAINMENT
schema. In this view, meaning postulates are meaningful because they are propositional translations of the schemas that
structure our nonverbal thought.
Lakoff's (1987) claim about the basis of our intuitive understanding of logic is attractive insofar as it suggests how
representing early concepts in image-schematic terms can provide a grounding in preverbal thought for aspects of later
language learning and reasoning. As Braine and Rumain (1983) pointed out, it is difficult to imagine how children could
understand sentences containing logical connectives such as or if they did not already have what Braine and Rumain
called inference schemas. Although Braine and Rumain did not suggest that such schemas are within the competence of
infants, it becomes easier to see where they might come from if they consist of understanding the meaning of simple
notions like CONTAINMENT. The implication that something is either in or out of a container and is not both in and out
at the same time is spatially evident from examining a container and can be projected by means of the CONTAINMENT
image-schema to other or and not relations. A similar kind of analysis of if-then may be derived from the LINK schemas
in Graphics 4a and 4b. Such image-schema implications are nonpropositional; they do not consist of strings of discrete
symbols, yet they allow the meaning of certain inferences to be understood. Because these kinds of analog
representations have a relational structure, one might say that they contain within themselves the basis of what will
become in language (or logic) a predicate-argument structure.
Image-Schemas and Language Acquisition
Bowerman (1987) noted that there are two important aspects of the role of meaning in learning to talk that must be
solved before one can claim to have an adequate model of language acquisition. The first is a plausible theory of where
children's meaning representations come from, and the second is how they are used to facilitate language learning.
(Bowerman also noted that we have not made much progress on either problem.) The previous sections of this article
addressed the first of these issues, by showing how preverbal meanings might be formed and doing so in a
language-independent way. That is, the motivation for the characterization of the image-schemas I have described was not
to explain language learning but to account for the kind of conceptual activity that occurs in preverbal children. In
this section, however, I will discuss some of the ways in which image-schemas should be useful in language acquisition.
Image-schemas, which are based on a redescription of the spatial structure infants experience every day, would seem to
be particularly useful in the acquisition of various relational categories in language. On the basis of linguistic
analyses, cognitive linguists have proposed image-schemas and mental spaces as the roots of our understanding and use of
modals (Sweetser, 1990; Talmy, 1988), prepositions (Brugman, 1988; Talmy, 1983), verb particles (Lindner, 1981), tense
and aspect (Langacker, 1987), and presuppositions (Fauconnier, 1985). Some of these categories have seemed more
problematical for young language learners than learning the names for things. Even to learn names, it has been thought
necessary to propose that children have a bias to assume that labels refer to whole objects rather than to object parts
(Markman, 1989; Quine, 1960). A similar kind of bias may operate with respect to various relational constructions.
Brown (1973) and others showed that a number of relational morphemes, such as the prepositions in and on, are early
acquisitions. The meaning of these relational terms is given by the relevant image-schemas, just as the objects that
enter into the relations are given by the perceptual parser. It is true that objects can be identified ostensively in a
way that relations cannot. However, because of image-schematic analyses that have already been carried out, it should
not be difficult for the child to learn that in expresses a containment relation or on a support relation. Take for
example a 20-month-old child hearing the phrase "the spoon in the cup." On the basis of the data and image-schema
analysis of containment discussed in the last section, it can be assumed that the child will already have had more than
a year's experience in analyzing one thing as being in another. A single image-schema can be used to join the two
objects in a familiar relation. Hence, even though in cannot be pointed to, it can nevertheless be added to the semantic
system. Support for this view can be found in the fact that in and on are the earliest locative terms to be acquired in
all the relevant languages that have been studied (Johnston, 1988) and are acquired (at least in English) in virtually
errorless fashion (Clark, 1977).
The distinction between in and on is not a perceptual one. The perceptual system makes many fine gradations where
languages tend to make categorical distinctions. Linguistic distinctions are often binary oppositions, and when further
subdivided tend to be few in number. Languages vary as to where they make these cuts, and these the child must learn
from listening to the language. But the hypothesis I am operating under is that however the cuts are made, they will be
interpreted within the framework of the underlying meanings
represented by nonverbal image-schemas. That is, some of the
work required to map spatial knowledge onto language has already been accomplished by the time language acquisition begins. Children do not have to consider countless variations in
meaning suggested by the infinite variety of perceptual displays with which they are confronted; meaningful partitions
have already taken place. Some of the image-schemas I have
been describing provide roughly binary distinctions, as in the
case of self- versus caused motion, and are therefore tailor-made to
be propositionalized. Others are not binary oppositions but
represent fundamental concepts in human thinking, such as
containment and support. What remains for children to do is to
discover how their language expresses these partitions.
To take an example from Bowerman (1989), languages vary
as to how they treat containment and support relations. English
speakers make an all-purpose distinction between in and on.
These map easily onto the CONTAINMENT and SUPPORT image-schemas described in the previous section. According to Bowerman, Spanish speakers typically use en for both the English
meanings. This cut may emphasize the notion of support over
the notion of containment, but in any case the lack of a distinction here should not cause young language learners any particular difficulty. Dutch, on the other hand, divides on into two
kinds of support relations. Op is used to express horizontal
support, as characterized by the image-schema described in
the previous section, as well as to express certain kinds of nonhorizontal support, as in a poster glued to a wall. Aan, another
term for on, is used to express a variety of other support relations in which only part of an object is attached to a supporting
surface, as in a picture hanging on a wall. I would predict that
this finer distinction, depending as it does on some understanding of gravity, might cause production errors. (Such errors
need not occur in the earliest usage of a term. To the extent that
children learn commonly used, fixed phrases for many situations, confusion among the various uses of on might not be
evident until the child attempts to describe a new situation.)
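The three ways of cutting up containment and support just described can be laid out as a small lookup table. The Python sketch below is only a restatement of the examples in the text, under stated assumptions: the three-way coding of relations (`CONTAINMENT`, `SURFACE_SUPPORT`, `ATTACHED_SUPPORT`) and all names in the code are hypothetical labels introduced here, and the Dutch entry for containment (`in`) is not mentioned in the text and is added only to complete the table.

```python
from enum import Enum


class Relation(Enum):
    CONTAINMENT = "containment"            # e.g., a spoon in a cup
    SURFACE_SUPPORT = "surface support"    # horizontal support, or a poster glued to a wall
    ATTACHED_SUPPORT = "attached support"  # only part attached, e.g., a picture hanging on a wall


# Which word each language uses for each relation, following the text.
PREPOSITIONS = {
    "English": {
        Relation.CONTAINMENT: "in",
        Relation.SURFACE_SUPPORT: "on",
        Relation.ATTACHED_SUPPORT: "on",
    },
    "Spanish": {
        Relation.CONTAINMENT: "en",
        Relation.SURFACE_SUPPORT: "en",
        Relation.ATTACHED_SUPPORT: "en",
    },
    "Dutch": {
        Relation.CONTAINMENT: "in",  # assumption: not stated in the text
        Relation.SURFACE_SUPPORT: "op",
        Relation.ATTACHED_SUPPORT: "aan",
    },
}


def partitions(language):
    """Group the relations by the word a language uses to express them."""
    groups = {}
    for relation, word in PREPOSITIONS[language].items():
        groups.setdefault(word, set()).add(relation)
    return groups


if __name__ == "__main__":
    for language in PREPOSITIONS:
        summary = {
            word: sorted(relation.value for relation in relations)
            for word, relations in partitions(language).items()
        }
        print(language, summary)
```

On this coding, the same underlying relations are available in every case; the languages differ only in how many words they use and where the boundaries fall, which is the sense in which the child must discover how the language expresses partitions that are already in place.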
In these three languages (as well as many others) containment and support relations are expressed by prepositions that
are used quite generally with many different verbs so that children hear them in a great many situations. In addition, using
the prepositions alone (as children do in the one-word stage of
language production) is often sufficient to convey the appropriate meaning. There are languages, however, that have no
all-purpose morphemes equivalent to English in or on but express spatial meanings primarily by means of a variety of different verbs of motion (and sometimes by nouns such as top
surface or interior). Korean is one of these languages, and as
Choi and Bowerman (1992) documented, it presents a somewhat different task for the young language learner. In Korean
there are verbs that mean roughly to put into a container or to
put on a surface, but also a verb that means roughly to fit tightly
together that cuts across the English usages of in and on. My
interpretation of this three-way split is that the language provides a distinction between things going in or things going on,
but supersedes this distinction in the case in which the thing
going in or on results in a tight fit (as in a cassette tape going
into its case, or a lid being snapped onto a container). Choi and
Bowerman noted that these distinctions do not give Korean
children any difficulty, in the sense that the word for fit tightly
together is one of the first words they learn.
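My interpretation of the three-way split can be sketched as a rule in which tightness of fit is checked first and overrides the in/on distinction. The Python sketch below is an illustration only. It uses the rough English glosses given in the text rather than Korean verb forms, and the two boolean features (`goes_in`, `tight_fit`) and the function names are assumptions introduced here, not an analysis taken from Choi and Bowerman (1992).

```python
def english_preposition(goes_in: bool) -> str:
    """English makes a single all-purpose cut between in and on."""
    return "in" if goes_in else "on"


def korean_verb_gloss(goes_in: bool, tight_fit: bool) -> str:
    """Rough gloss of the Korean verb choice under the interpretation in the text.

    A tight fit supersedes the distinction between going in and going on.
    """
    if tight_fit:
        return "fit tightly together"
    return "put into a container" if goes_in else "put on a surface"


if __name__ == "__main__":
    # (goes_in, tight_fit); the first two examples come from the text,
    # the last two are hypothetical loose-fit cases added for contrast.
    examples = {
        "cassette tape going into its case": (True, True),
        "lid snapped onto a container": (False, True),
        "object placed loosely in a container": (True, False),
        "object placed on a surface": (False, False),
    }
    for description, (goes_in, tight_fit) in examples.items():
        print(description, "|", english_preposition(goes_in), "|",
              korean_verb_gloss(goes_in, tight_fit))
```

The first two examples receive different English prepositions but the same Korean gloss, which is the cross-cutting pattern at issue.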
Bowerman (1989) and Choi and Bowerman (1992) suggested that this kind of example casts doubt on in and on as
privileged spatial primitives that can be mapped directly onto
language. They suggested instead that children must be sensitive to the structure of their input language from the beginning.
At the same time, they noted that how children figure out language-specific spatial categories remains a puzzle. Although I
agree that their data provide strong evidence that young children are sensitive to the structure of their language from an
early age, the mapping is only a puzzle if one assumes that in
and on are the only kinds of spatial analyses of containment and
support that have been carried out.
It may well be that in and on appear as early as they do in
English because of their status as separate morphemes, their
frequent usage, and their relatively straightforward mapping
onto two simple image-schemas. Korean children in the one-word stage cannot get by with in and on because these all-purpose morphemes do not exist in their language. Because children add only a few words at a time in the early months of
language production, Korean children express only some of the
many usages that English-speaking children manage with their
more general in and on. One of the words they do use means to
fit together tightly. Such a concept does not appear in English
samples of early speech, because there is no single morpheme in
the language to express it. However, this does not mean that
English children do not have such a concept until a later stage
when they begin to combine several morphemes together. I
suggest that this is an easy word for Korean children to learn
because its meaning reflects the kinds of spatial analyses that
preverbal infants everywhere are carrying out. To fit tightly (or
for that matter, to fit loosely) does not seem an unduly difficult
notion for infants who are engaged in analyzing many kinds of
containment and support relations.
Unfortunately, there is not yet a rich enough database to provide a definitive answer about which conceptual distinctions children
have mastered before language begins. The work of Baillargeon
and her colleagues, discussed earlier, indicates that a good deal
of analysis of support relations has already taken place by 6
months. A further year will elapse before particles expressing
support enter the child's vocabulary, during which time a great
many more distinctions are likely to have been conceptualized.
However, it is not yet known whether various kinds of horizontal support are understood before vertical support or the age at
which infants learn about the kinds of attachment necessary for
the vertical support indicated by the word aan. Before it is
possible to make detailed predictions about the course of mastering Dutch versus English in this regard, then, much more
data are needed on infants' preverbal concepts. Similar comments can be made about the concept of tight-fittingness in
Korean. Because most of the existing research has been conducted by speakers of languages that do not have a single morpheme expressing tight-fittingness, the development of this concept in infancy has not even begun to be investigated. Until
such research is carried out it will not be possible to determine
whether a given language merely tells the child how to categorize a set of meanings the child has already analyzed or whether
the language tells the child it is time to carry out new perceptual analyses.
Containment and support may be more complex than would
at first appear, but they are spatial relations and therefore
clearly require spatial analysis. However, there are other linguistic categories that are less obviously spatial in character and
even more complex, yet in spite of this, they are early linguistic
achievements. One example is linguistic transitivity (how subjects and objects are marked in transitive clauses, such as Mary
throws the ball, versus intransitive clauses, such as John runs).
As Slobin (1985) discussed, linguistic transitivity is among the
first notions marked by grammatical morphemes in the acquisition of language. Although it is an abstract linguistic notion, it
nevertheless depends on concepts of animacy, inanimacy,
agency, and causality, and I have argued that these concepts also
result from spatial analysis. Because the relevant image-schemas occur preverbally, linguistic transitivity should not present
the young language learner with undue difficulty. One might
even argue that language ought to make some distinctions between these notions if their presence in the infant's representational system is as prominent as I have suggested. At the same
time, to the extent that a particular language goes beyond preverbal image-schemas one can predict particular kinds of difficulties in early acquisition.
Slobin (1985) gave a detailed example of this kind of difficulty in the case of transitivity. He pointed out that whether
expressed in the child's native language by accusative inflections, direct object markers, or ergative inflections, marking of
transitive phrases is underextended in early child language.9
The marking occurs at first only when the child is talking about
an animate agent physically acting on an inanimate object; only
later is it extended to less prototypical cases of transitivity. Slobin referred to this underextension as children first using the
relevant morphemes to talk about a manipulative activity scene.
This scene requires the child to represent a cluster of notions,
such as the concepts representing the objects involved, concepts
of physical agency involving the hands, and concepts of change
of state or location. Although Slobin suggested that there is no
basis for proposing any particular analysis of this complex of
notions, they fit remarkably well with the image-schemas that
were described earlier.
Thus, one of the earliest grammaticized relational notions
can be characterized as a linguistic redescription of the image-schemas I have described for representing an animate acting
causally on an inanimate. It is particularly interesting that at
first only inanimate patients (such as ball) are given accusative
marking, even when the native language uses distinct accusative forms for animate and inanimate objects. Slobin (1985)
reported that children are slow to learn this accusative subdistinction. I would suggest that the difficulty is due in part to the
initial conception of what an inanimate is: It is something that is acted
upon and that does not act by itself. So the child may first
interpret the accusative marker as a marker for inanimacy,
rather than as a marker for a patient. This is a case in which
syntactic bootstrapping (Gleitman, 1990; Landau & Gleitman,
1985) seems to be required if the child is to realize that it is not
just inanimates that are marked in the abstract role of patient.
These underextension errors are relatively minor, but indicate that a grammatical notion combining several image-schemas
in a complex way may require some trial and error before
the particular set the language expresses is fully mastered.
There are other cases in which a grammatical distinction
matches early image-schemas more exactly. An example is the
distinction in Korean between intransitive verbs involving self-motion and transitive verbs involving caused motion (Choi &
Bowerman, 1992). English does not mark these distinctions;
for example, English uses the same verb to say either that the
door opened or that Mary opened the door. Korean, however,
uses quite different verb forms for these two cases. Choi and
Bowerman reported that this distinction is respected by children as soon as they begin to use such words; the children they
studied never made a cross-category confusion between these
two types of verbs. Thus, learning this grammatical distinction
in Korean appears to be comparable to the acquisition of in and
on in English; a straightforward match between the underlying
image-schemas and the linguistic distinctions makes learning
rapid and easy.
Another example of a spatial analysis underlying a seemingly
nonspatial concept is that of possession. A number of writers
have noted this affinity (e.g., Brown, 1973; Lyons, 1968). Slobin
(1985) defined possession as a locative state in which an animate being enjoys a particular socially sanctioned relationship
to an object. He pointed out that young children may not yet
understand the social aspects of this concept and attend primarily to its locative aspects. In support of this view, he cited
Mills (1985), who found that German children sometimes conflate locative and possessive functors. For example, locative
functors such as to are sometimes used to mark possession, as in
the case of using zu (to) to express both going to and belongs to.
Slobin's analysis suggests that this conflation (or better, underextension of the adult use of the term) comes about because an
image-schema of a trajector has not yet been projected into the
social realm. The earliest meaning being expressed by a child
using the locative term to may well be spatial destination, represented as END-OF-PATH. In the same way that focus on BEGINNING-OF-PATH emphasizes the animate or inanimate character
of a moving object, focus on END-OF-PATH emphasizes its destination: In the simplest sense, the place where an object comes
to rest is where it "belongs." A too complex notion of possessiveness is probably ascribed to young children who pepper their
conversations with "my book," "my chocolate," "my watch," regardless of whose book, chocolate, or watch is under discussion. In observing a 2-year-old child exhaustively use this construction, it seemed to me that nothing like possession in the
adult sense was being claimed; rather, a request for a movement
of the item to the child as destination was being expressed.
Indeed, at the time of these observations the child had yet to
learn the construction Ben's book, which might indicate a notion of possession closer to the full adult sense (Deutsch &
Budwig, 1983).
9 In inflectional accusative languages, a marker is put on the direct
object in a transitive clause, whereas in ergative languages, transitivity
is marked on the subject noun (the agent).

Slobin (1985) discussed the acquisition of a large number of
nonreferential grammatical morphemes. At first glance, many
of them, such as distinguishing between transitive and intransitive
activities or use of the progressive aspect of verbs, not only
seem impossibly abstract for the very young child to master,
but also slightly odd.10 Of what value to communication or
understanding is it to make these distinctions? Indeed, some
languages do without them. They may occur in most languages
partly because of analyses that have been carried out by preverbal organisms. In any case, when the notions that underlie these
linguistic devices are analyzed in image-schematic terms, some
of the linguistic problems facing the child seem more tractable.
This is a serendipitous result of an analysis that was originally
motivated by quite a different problem, namely, how to account
for the appearance in infancy of concepts such as animacy and

Perception and Conception Revisited

My analysis of redescription of spatial structure into an
image-schematic form of representation can also account for
why it is often difficult to tell whether perceptual or conceptual
categorization is occurring in infancy. The relationship between perception and conception has long been an arena of
argument among psychologists. I have tried to make a firm
dividing line between the two in order to highlight differences
that are often ignored or glossed over. However, if the earliest
conceptual functioning consists of a redescription of perceptual structure, there may be no hard and fast line. The reason
why there have been so many arguments over whether it is possible to separate percepts and concepts is that there are great
similarities between the two, even though they are considered
to involve different kinds of processing. For example, when we
showed in my laboratory that infants respond to a global category of animals (J. M. Mandler, Bauer, & McDonough, 1991;
McDonough & Mandler, 1991), some of our colleagues tried to
convince us that this was merely a perceptual category: that all
animals look more alike than they look like plants or artifacts.
So what we were calling the concept of animal was really just an
extended "child-basic" category whose acquisition could be explained by perceptual factors alone. However, the perceptual
characteristics of animals are too varied to make such a characterization useful. To be sure, at a high level of description there
are some perceptual commonalities across animals; for example, it can be observed that animals all move by themselves,
even though they move in physically different ways. Self-motion, however, is not the same type of description as the parameters that describe movement; on the contrary, it is the concept
of self-movement that relates different perceptual descriptions
of the motion. The high-level parameters of motion that are
common to the hopping of a kangaroo and the gliding of a
shark are as yet only partially known. The perceptual system
may make use of such parameters to form a broad perceptual
category of biological motion, but the conceptual system constructs a simpler notion to express this commonality.
In our studies (J. M. Mandler & Bauer, 1988; J. M. Mandler,
Bauer, & McDonough, 1991; McDonough & Mandler, 1991) of
the development of the concept of animals we have typically
used small plastic models of animals and other objects. In one
of our tasks, we put a number of varied animals and vehicles in
front of the children and observe the way the children interact
with them. If a child touches the animals in sequence more
often than would occur by chance, we say that the child has a
category or concept of animals, in the sense that the animals are
seen to be related to each other. However, because the models
do not move, one might ask: If movement is the basis for the
concept of animals, why does the child respond systematically
in the absence of the cue to the conceptual meaning? Having a
concept of what an animal is is not the same thing as identifying
one. There are undoubtedly a number of perceptual bases used
to recognize something as an animal. For example, things that
move themselves have appendages of various sorts that are involved in the movement (legs, wings, or fins). These shape
characteristics can serve to identify something as an animal in
the absence of movement. Thus, shape is a good identifier of a
categorical exemplar, even though shape only specifies what the
exemplar looks like, not what it is. Most important, the shapes
of various appendages do not themselves look alike. They are
alike only because they all signify self-motion. If the basis for
sequential touching of animals were merely the physical appearance of appendages, it would be difficult to understand why
winged creatures are associated with legged and finny ones, but
not with winged airplanes (McDonough & Mandler, 1991).
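The chance criterion in the sequential-touching task described above can be made concrete with a small sketch. This is an illustration only, not the scoring procedure used in the cited studies: the run-length statistic, the shuffle-based baseline, and the names `mean_run_length` and `chance_p_value` are all assumptions introduced here.

```python
import random


def mean_run_length(touches):
    """Average length of runs of consecutive touches to the same category."""
    if not touches:
        return 0.0
    runs = 1
    for prev, cur in zip(touches, touches[1:]):
        if cur != prev:
            runs += 1  # a category switch starts a new run
    return len(touches) / runs


def chance_p_value(touches, n_shuffles=10_000, seed=0):
    """Estimate how often a random ordering of the same touches yields
    runs at least as long as those observed (hypothetical baseline)."""
    rng = random.Random(seed)
    observed = mean_run_length(touches)
    shuffled = list(touches)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mean_run_length(shuffled) >= observed:
            hits += 1
    return observed, hits / n_shuffles


# "A" = touch to an animal model, "V" = touch to a vehicle model.
observed, p = chance_p_value(list("AAAAVVVV"))
print(observed, p)
```

A small estimated proportion means that random orderings rarely produce runs as long as the observed ones, which is the sense in which touching is "more often than would occur by chance" in this sketch.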
There is a limit to the usefulness of characterizing knowledge
as perception. It is true that to the extent that concepts are
derived from percepts, there must often be perceptual bases for
conceptual decisions. The same image-schema can be used to
give a brief structural description of a percept or to give a brief
description of the meaning of a simple concept. Thus, the
image-schema of SELF-MOVING ANIMATE shown in Graphic 3
can be used to describe in part what a fish looks like as it begins
to swim away or what a cat standing on a mat looks like as it
walks off the mat. The details of the paths vary, but Graphic 3
describes their commonality. At the same time, Graphic 3 describes the conceptual meaning of what has happened in the
two situations. In spite of its perceptual roots, however, what I
have been calling the concept of animal brings together a number of diverse perceptual analyses into a more abstract characterization of what an animal is, that is, what Wellman and Gelman (1988) called the nonobvious aspects of something seen. It
is these nonobvious characteristics that 3-year-olds use conceptually to differentiate pictures of statues from pictures of animals they have never seen before (Massey & Gelman, 1988).
They say such things about a new animal as, "It can go by itself
because it has feet," even when no feet are visible in the picture.11 Such inferential ability (in this case, the use of inheritance properties) is the hallmark of concepts and cannot be
characterized as perceptual knowledge. The claim that I have
made here is that 2- to 3-year-olds can only make such inferences because they have previously formed concepts through
the redescription of perceptual information into image-schematic form. The image-schemas allow them to represent the
idea of animacy, which in turn allows the inference of unseen
appendages. It is also what allows infants to categorize winged
and legged creatures together.

10 Focus on an object's path, rather than its beginning or end, emphasizes the ongoing trajectory itself. An image-schema of TRAVERSAL-OF-PATH could provide the basis for the early understanding of the progressive aspect of verbs.
11 The claim that the animal has feet is a conceptual justification.
The actual basis of recognition of the animals versus the statues may be
their different textures. Massey and Gelman (1988) reported children
saying such things as, "The statue can't go by itself because it is shiny."
Smith (1989b) made a similar point about the importance of texture in
object recognition.

In conclusion, I have proposed that human infants represent
information from an early age at more than one level of description. The first level is the result of a perceptual system that
parses and categorizes objects and object movement (events). I
assume that this level of representation is roughly similar to
that found in many animal species. In addition, human infants
have the capacity to analyze objects and events into another
form of representation that, while still somewhat perception-like in character, contains only fragments of the information
originally processed. The information in this next level of representation is spatial and is represented in analog form by means
of image-schemas. These image-schemas, such as SELF-MOTION
and CONTAINMENT, form the earliest meanings that the mind
represents. The capacity to engage in this kind of perceptual
analysis allows new concepts to be formed that are not merely
combinations of earlier image-schemas; in this sense, it is a
productive system. This level of representation allows the organism to form a conceptual system that is potentially accessible;
that is, it contains the information that is used to form images,
to recall, and eventually to plan. A similar level of representation apparently exists in primates as well (Premack, 1983) and
perhaps in other mammals too (although the format would
presumably vary as a function of the particulars of the sensory
systems involved). Humans, of course, add still another level of
representation, namely, language. Whatever the exact nature of
the step required to go from image-schemas to language, it may
not be a large one, at any rate not as large as would be required
to move directly from a conceptless organism to a speaking one.

References

Armstrong, S., Gleitman, L. R., & Gleitman, H. (1983). What some
concepts might not be. Cognition, 13, 263-308.
Baillargeon, R. (in press). The object concept revisited: New directions. In C. E. Granrud (Ed.), Visual perception and cognition in infancy: Carnegie-Mellon symposia on cognition. Hillsdale, NJ: Erlbaum.
Baillargeon, R., DeVos, J., & Graber, M. (1989). Location memory in
8-month-old infants in a non-search AB task: Further evidence. Cognitive Development, 4, 345-367.
Baillargeon, R., Needham, A., & DeVos, J. (1991). The development of
young infants' intuitions about support. Unpublished manuscript,
University of Illinois, Urbana, IL.
Bates, E., O'Connell, B., & Shore, C. (1987). Language and communication in infancy. In J. Osofsky (Ed.), Handbook of infant development
(2nd ed., pp. 149-203). New York: Wiley.
Bertenthal, B. I. (in press). Infants' perception of biomechanical motions: Intrinsic image and knowledge-based constraints. In C.
Granrud (Ed.), Visual perception and cognition in infancy: Carnegie-Mellon symposia on cognition. Hillsdale, NJ: Erlbaum.
Bloom, L. (1970). Language development: Form and function in emerging grammars. Cambridge, MA: MIT Press.
Bonvillian, J. D., Orlansky, M. D., & Novack, L. L. (1983). Developmental milestones: Sign language and motor development. Child Development, 54, 1435-1445.
Bowerman, M. (1973). Early syntactic development: A cross-linguistic
study with special reference to Finnish. Cambridge, England: Cambridge University Press.
Bowerman, M. (1987). Commentary. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Erlbaum.
Bowerman, M. (1989). Learning a semantic system: What role do cognitive predispositions play? In M. L. Rice & R. L. Schiefelbusch
(Eds.), The teachability of language (pp. 133-169). Baltimore, MD:
Braine, M. D. S., & Rumain, B. (1983). Logical reasoning. In P. H.
Mussen (Series Ed.) & J. H. Flavell & E. M. Markman (Vol. Eds.),
Handbook of child psychology: Vol. 3. Cognitive development. New
York: Wiley.
Brown, R. (1973). A first language: The early stages. Cambridge, MA:
Harvard University Press.
Bruce, V., Burton, M., Dench, N., Hanna, E., Healey, P., & Mason, O.
(1991, July). Sex discrimination: How do we tell the difference between
male and female faces? Paper presented at the meeting of the Experimental Psychology Society, Sussex, England.
Brugman, C. M. (1988). The story of over: Polysemy, semantics, and the
structure of the lexicon. New York: Garland.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA:
MIT Press.
Chambers, D., & Reisberg, D. (1992). What an image depicts depends
on what an image means. Cognitive Psychology, 24, 145-174.
Choi, S., & Bowerman, M. (1992). Learning to express motion events
in English and Korean: The influence of language-specific lexicalization patterns. Cognition, 41, 83-121.
Clark, E. V. (1977). Strategies and the mapping problem in first language acquisition. In J. Macnamara (Ed.), Language learning and
thought. San Diego, CA: Academic Press.
Clark, H. H. (1973). Space, time, semantics, and the child. In T. E.
Moore (Ed.), Cognitive development and the acquisition of language
(pp. 27-63). San Diego, CA: Academic Press.
Cohen, L. B., & Younger, B. A. (1983). Perceptual categorization in the
infant. In E. Scholnick (Ed.), New trends in conceptual representation.
Hillsdale, NJ: Erlbaum.
Deutsch, W., & Budwig, N. (1983). Form and function in the development of possessives. Papers and Reports on Child Language Development, 22, 36-42.
Fagan, J. F., III, & Singer, L. T. (1979). The role of simple feature differences in infant recognition of faces. Infant Behavior and Development, 2, 39-46.
Fauconnier, G. (1985). Mental spaces. Cambridge, MA: MIT Press.
Fillmore, C. (1982). Towards a descriptive framework for spatial
deixis. In R. J. Jarvella & W. Klein (Eds.), Speech, place, and action.
New York: Wiley.
Fodor, J. (1975). The language of thought. New York: Crowell.
Fodor, J. (1981). Representations. Cambridge, MA: MIT Press.
Fox, N., Kagan, J., & Weiskopf, S. (1979). The growth of memory
during infancy. Genetic Psychology Monographs, 99, 91-130.
Freeman, N. H., Lloyd, S., & Sinha, C. G. (1980). Infant search tasks
reveal early concepts of containment and canonical usage of objects.
Cognition, 8, 243-262.
Frye, D., Rawling, P., Moore, C., & Myers, I. (1983). Object-person
discrimination and communication at 3 and 10 months. Developmental Psychology, 19, 303-309.
Gelman, R. (1990). First principles organize attention to and learning
about relevant data: Number and the animate-inanimate distinction
as examples. Cognitive Science, 14, 79-106.
Gleitman, L. (1990). The structural source of word meanings. Language Acquisition, 1, 3-55.
Golinkoff, R. M., & Halperin, M. S. (1983). The concept of animal:
One infant's view. Infant Behavior and Development, 6, 229-233.
Greco, C., Hayne, H., & Rovee-Collier, C. (1990). Roles of function,
reminding, and variability in categorization by 3-month-old infants.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 617-633.
Harnad, S. (1990). The symbol grounding problem. Physica D, 42,
Intons-Peterson, M. J., & Roskos-Ewoldsen, B. B. (1989). Sensory-perceptual qualities of images. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 15, 188-199.
Intraub, H., & Richardson, M. (1989). Wide-angle memories of close-up scenes. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 15, 179-187.
Janowsky, J. S. (1985). Cognitive development and reorganization after
early brain injury. Unpublished doctoral dissertation, Cornell University, Ithaca, NY.
Johnson, M. (1987). The body in the mind: The bodily basis of meaning,
imagination, and reasoning. Chicago: University of Chicago Press.
Johnson, M. H., & Morton, J. (1991). Biology and cognitive development: The case of face recognition. Oxford, England: Blackwell.
Johnston, J. R. (1988). Children's verbal representation of spatial location. In J. Stiles-Davis, M. Kritchevsky, & U. Bellugi (Eds.), Spatial
cognition: Brain bases and development (pp. 195-205). Hillsdale, NJ:
Karmiloff-Smith, A. (1986). From meta-processes to conscious access: Evidence from children's metalinguistic and repair data. Cognition, 23, 95-147.
Karmiloff-Smith, A. (1991). Beyond modularity: Innate constraints
and developmental change. In S. Carey & R. Gelman (Eds.), The
epigenesis of mind: Essays on biology and cognition (pp. 171-197).
Hillsdale, NJ: Erlbaum.
Keil, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Kellman, P. J. (1988). Theories of perception and research in perceptual development. In A. Yonas (Ed.), Perceptual development in infancy (pp. 267-281). Hillsdale, NJ: Erlbaum.
Kellman, P. J., Gleitman, H., & Spelke, E. S. (1987). Object and observer motion in the perception of objects by infants. Journal of
Experimental Psychology: Human Perception and Performance, 13,
Kolstad, V. T. (1991, April). Understanding of containment in 5.5-month-old infants. Poster presented at the meeting of the Society for
Research in Child Development, Seattle, WA.
Kosslyn, S. M. (1980). Image and mind. Cambridge, MA: Harvard
University Press.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories
reveal about the mind. Chicago: University of Chicago Press.
Landau, B. L., & Gleitman, L. R. (1985). Language and experience:
Evidence from the blind child. Cambridge, MA: Harvard University
Langacker, R. (1987). Foundations of cognitive grammar (Vol. 1). Stanford, CA: Stanford University Press.
Leslie, A. M. (1982). The perception of causality in infants. Perception,
Leslie, A. (1984). Infant perception of a manual pick-up event. British
Journal of Developmental Psychology, 2, 19-32.
Leslie, A. (1988). The necessity of illusion: Perception and thought in
infancy. In L. Weiskrantz (Ed.), Thought without language (pp. 185-210). Oxford, England: Clarendon.
Lindner, S. (1981). A lexico-semantic analysis of verb-particle constructions with up and out. Unpublished doctoral dissertation, University
of California, San Diego, CA.
Lloyd, S. E., Sinha, C. G., & Freeman, N. H. (1981). Spatial reference
systems and the canonicality effect in infant search. Journal of Experimental Child Psychology, 32, 1-10.
Lyons, J. (1968). Existence, location, possession and transitivity. In B.
van Rootselaar & J. F. Staal (Eds.), Logic, methodology and philosophy of science III (pp. 495-504). Amsterdam: North-Holland.
Mandler, G. (1985). Cognitive Psychology. Hillsdale, NJ: Erlbaum.
Mandler, J. M. (1988). How to build a baby: On the development of an
accessible representational system. Cognitive Development, 3, 113-136.
Mandler, J. M. (1991). Prelinguistic primitives. In L. A. Sutton & C.
Johnson (Eds.), Proceedings of the seventeenth annual meeting of the
Berkeley Linguistics Society (pp. 414-425). Berkeley, CA: Berkeley
Linguistics Society.
Mandler, J. M., & Bauer, P. J. (1988). The cradle of categorization: Is
the basic level basic? Cognitive Development, 3, 237-264.
Mandler, J. M., Bauer, P. J., & McDonough, L. (1991). Separating the
sheep from the goats: Differentiating global categories. Cognitive
Psychology, 23, 263-298.
Markman, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.
Massey, C. M., & Gelman, R. (1988). Preschoolers' ability to decide
whether a photographed unfamiliar object can move itself. Developmental Psychology, 24, 307-317.
McDonough, L., & Mandler, J. M. (1990, April). Very long-term recall in
two-year-olds. Poster presented at the meeting of the International
Conference on Infant Studies, Montreal, Canada.
McDonough, L., & Mandler, J. M. (1991, June). Differentiation of globally-defined categories in 9- and 11-month olds. Poster presented at
the meeting of the American Psychological Society, Washington,
Medin, D., & Ortony, A. (1989). Psychological essentialism. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp.
179-195). Cambridge, England: Cambridge University Press.
Medin, D. L., & Wattenmaker, W. D. (1987). Category cohesiveness,
theories and cognitive archeology. In U. Neisser (Ed.), Concepts and
conceptual development: Ecological and intellectual factors in categorization (pp. 25-62). Cambridge, England: Cambridge University
Meier, R. P., & Newport, E. L. (1990). Out of the hands of babes: On a
possible sign advantage in language acquisition. Language, 66, 1-23.
Meltzoff, A. N. (1988). Infant imitation and memory. Nine-month-olds in immediate and deferred tests. Child Development, 59, 217-225.
Michotte, A. (1963). The perception of causality. London: Methuen.
Mills, A. E. (1985). The acquisition of German. In D. I. Slobin (Ed.),
The crosslinguistic study of language acquisition (Vol. 1). Hillsdale,
NJ: Erlbaum.
Myers, N. A., Clifton, R. K., & Clarkson, M. G. (1987). When they
were very young: Almost-threes remember two years ago. Infant
Behavior and Development, 10, 123-132.
Needham, A., & Baillargeon, R. (1991). Reasoning about support in
3-month-old infants. Unpublished manuscript, University of Illinois,
Urbana, IL.
Nelson, K. (1974). Concept, word, and sentence: Interrelations in acquisition and development. Psychological Review, 81, 267-285.
Nelson, K. (1985). Making sense: The acquisition of shared meaning.
San Diego, CA: Academic Press.
Oakes, L. M., Madole, K. L., & Cohen, L. B. (1991). Infants' object
examining: Habituation and categorization. Cognitive Development,
6, 377-392.
Palermo, D. S. (1986). Metaphor: A portal for viewing the child's mind.
In L. P. Lipsitt & J. H. Cantor (Eds.), Experimental child psychologist: Essays and experiments in honor of Charles C. Spiker. Hillsdale,
NJ: Erlbaum.
Petitto, L. A. (1987). On the autonomy of language and gesture: Evidence from the acquisition of personal pronouns in American Sign
Language. Cognition, 27, 1-52.
Piaget, J. (1951). Play, dreams, and imitation in childhood. London:
Kegan Paul, Trench, & Trubner.
Piaget, J. (1952). The origins of intelligence in children. Madison, CT:
International Universities Press.
Piaget, J. (1954). The child's construction of reality. New York: Basic
Piaget, J., & Inhelder, B. (1971). Mental imagery in the child. London:
Routledge & Kegan Paul.
Poulin-Dubois, D., & Shultz, T. R. (1988). The development of the
understanding of human behavior: From agency to intentionality. In
J. W. Astington, P. L. Harris, & D. R. Olson (Eds.), Developing theories of mind (pp. 109-125). Cambridge, England: Cambridge University Press.
Premack, D. (1983). The codes of man and beast. The Behavioral and
Brain Sciences, 6, 125-137.
Premack, D. (1990). The infant's theory of self-propelled objects. Cognition, 36, 1-16.
Quine, W. V. O. (1960). Word and object. Cambridge, MA: MIT Press.
Quinn, P., & Eimas, P. (1986). On categorization in early infancy.
Merrill-Palmer Quarterly, 32, 331-363.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the
internal structure of categories. Cognitive Psychology, 7, 573-605.
Rovee-Collier, C. (1987). Learning and memory in infancy. In J. D.
Osofsky (Ed.), Handbook of infant development (2nd ed., pp. 98-148). New York: Wiley.
Ruff, H. A. (1986). Components of attention during infants' manipulative exploration. Child Development, 57, 105-114.
Sameroff, A. J., & Cavanaugh, P. J. (1979). Learning in infancy: A
developmental perspective. In J. D. Osofsky (Ed.), Handbook of infant development (pp. 344-392). New York: Wiley.
Sinclair, H. (1971). Sensorimotor action patterns as a condition for the
acquisition of syntax. In R. Huxley & E. Ingram (Eds.), Language
acquisition: Models and methods (pp. 121-135). San Diego, CA: Academic Press.
Slobin, D. (1985). Crosslinguistic evidence for the language-making
capacity. In D. I. Slobin (Ed.), The crosslinguistic study of language
acquisition: Vol. 2. Theoretical issues (pp. 1157-1256). Hillsdale, NJ:
Smith, E. E., & Medin, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Smith, L. B. (1989a). From global similarities to kinds of similarities:
The construction of dimensions in development. In S. Vosniadou &
A. Ortony (Eds.), Similarity and analogical reasoning (pp. 146-178).
Cambridge, England: Cambridge University Press.
Smith, L. B. (1989b, April). In defense of perceptual similarity. Paper
presented at the meeting of the Society for Research in Child Development, Kansas City.
Spelke, E. S. (1976). Infants' intermodal perception of events. Cognitive Psychology, 8, 626-636.
Spelke, E. S. (1985). Perception of unity, persistence, and identity:
Thoughts on infants' conceptions of objects. In J. Mehler & R. Fox
(Eds.), Neonate cognition: Beyond the blooming buzzing confusion.
Hillsdale, NJ: Erlbaum.
Spelke, E. S. (1988). Where perceiving ends and thinking begins: The
apprehension of objects in infancy. In A. Yonas (Ed.), Perceptual
development in infancy (pp. 197-234). Hillsdale, NJ: Erlbaum.
Stewart, J. (1984, November). Object motion and the perception of animacy. Paper presented at the meeting of the Psychonomic Society,
San Antonio, TX.
Strauss, M. S. (1979). Abstraction of prototypical information by
adults and 10-month-old infants. Journal of Experimental Psychology: Human Learning and Memory, 5, 618-632.
Sugarman, S. (1987). Piaget's construction of the child's reality. Cambridge, England: Cambridge University Press.
Sweetser, E. (1990). From etymology to pragmatics: Metaphorical and
cultural aspects of semantic structure. Cambridge, England: Cambridge University Press.
Talmy, L. (1983). How language structures space. In H. L. Pick, Jr. &
L. P. Acredolo (Eds.), Spatial orientation: Theory, research, and application. New York: Plenum Press.
Talmy, L. (1985). Force dynamics in language and thought. In W. H.
Eilfort, P. D. Kroeber, & K. L. Peterson (Eds.), Papers from the parasession on causatives and agentivity at the twenty-first regional meeting. Chicago: Chicago Linguistic Society.
Talmy, L. (1988). Force dynamics in language and cognition. Cognitive
Science, 12, 49-100.
Traugott, E. C. (1978). On the expression of spatio-temporal relations
in language. In J. H. Greenberg (Ed.), Universals of human language:
Vol. 3. Word structure (pp. 369-400). Stanford, CA: Stanford University Press.
Wagner, S., Winner, E., Cicchetti, D., & Gardner, H. (1981). "Metaphorical" mapping in human infants. Child Development, 52, 728-731.
Watson, J. (1972). Smiling, cooing, and "the game." Merrill-Palmer
Quarterly, 18, 323-340.
Wellman, H. M., & Gelman, S. A. (1988). Children's understanding of
the nonobvious. In R. Sternberg (Ed.), Advances in the psychology of
intelligence (Vol. 4). Hillsdale, NJ: Erlbaum.
Werner, H., & Kaplan, B. (1963). Symbol formation. New York: Wiley.
Wilson, N. (1986). An implementation and perceptual test of a principled model of biological motion. Unpublished master's thesis, University of Pennsylvania, Philadelphia, PA.

Received January 24, 1991
Revision received November 19, 1991
Accepted December 4, 1991