

Proceedings of the

Fourth Annual Conference

of the

Cognitive Science Society

Sponsored by the Program in Cognitive Science

of The University of Chicago and The University
of Michigan,
supported in part by a grant from the
Alfred P. Sloan Foundation.

Ann Arbor, Michigan, August 4-6, 1982

SYMPOSIUM -- CONSCIOUSNESS
Conscious, Subconscious, Unconscious: A Neodissociation Perspective
  John F. Kihlstrom

SYMPOSIUM -- REPRESENTATION OF PROCESSES AND TIME
Modeling Events, Actions, and Time
  James Allen
Some Issues on Mechanistic Mental Models
  Johan de Kleer and John Seely Brown
A Note Concerning Qualitative Process Theory
  Kenneth Forbus

The Preconceptual Basis of Experiential Metaphor
  Mark Johnson
Towards a Computational Model of Metaphor in Common Sense Reasoning
  Jaime G. Carbonell
Metaphors for Marriage in Our Culture
  Naomi Quinn
Metaphoric Gestures
  David McNeill
Metaphor and the Construction of Reality
  George Lakoff

Why is it Easy to Control Your Arms?
  Peter H. Greene
Internal Directional Reference Frames for Motor Coordination
  Curtis Boylls, Jr.
Conscious and Unconscious Components of Intentional Control
  Bernard J. Baars and Diane N. Kramer

SUBMITTED PAPERS
How Do Children Learn to Judge Grammaticality? A Psychologically Plausible Computer Model
  Mallory Selfridge
Pathfinder: Investigating the Acquisition of Communicative Conventions
  Robert Cummins, Eric Dietrich
Play Considered as a Strategy for Knowledge Acquisition
  Paul D. Scott
An Experimental Architecture that Supports Non-Temporal Prediction
  Paul Robertson
The Logic of Events
  John M. Morris
Fuzzy Semantic Networks: A New Knowledge Representation Structure
  Douglas D. Dankel II, Kenneth W. Sprague
Getting and Using Context: Functional Constraints on the Organization of Knowledge
  James A. Galambos, John B. Black
Conceptual Combination and Fuzzy Set Theory
  Edward E. Smith, Daniel N. Osherson
Natural Language Processing Using Spreading Activation and Lateral Inhibition
  Jordan Pollack, David Waltz
Using the Dance to Investigate the Pragmatic/Semantic Boundary Between Artificial and Natural Languages
  Laura Silver, Lawrence J. Mazlack
What Can Philosophy Contribute to the Study of Natural Language Processing?
  Martin Ringle
Recognizing Humor in Newspaper Cartoons by Resolving Ambiguities Through Pragmatics
  Lawrence Mazlack, Noemi M. Paz
Defaults Revisited or "Tell me if you're guessing."
  Jane Terry Nutter
Pragmatic Factors in Pronoun Reference Assignment
  Valerie C. Abbott, John B. Black
Topic and Comment in Spoken Sentence
  Hans Brunner
On-Line Processing of Pragmatic Inferences
  Colleen M. Seifert, Scott R. Robertson, John B. Black
Generation of Useful Problem Representations in a Semantically Rich Domain: The Example of Physics
  Joan I. Heller, F. Reif
Analogical Reasoning Patterns in Expert Problem Solving
  John Clement
RABBIT: Cognitive Science in Interface Design
  Michael D. Williams, Frederich N. Tou, Richard E. Fikes, Austin Henderson, Thomas Malone
Constructing Runnable Mental Models
  Allan Collins, Dedre Gentner
Bi-Directional Inference
  Stuart Shapiro, Joao Martins, Donald McKay
Actively Learning to Use a Word Processor
  John M. Carroll, Robert Mack
Examples in The Legal Domain: Hypotheticals in Contract Law
  Edwina L. Rissland
Learning Recursive Procedures by Middleschool Children
  Yuichiro Anzai, Yuzuru Uesato
Prior Knowledge Occupies Cognitive Capacity in Chess Problem Solving, Reading, and Thinking
  Bruce K. Britton, Abraham Tesser
Dynamic Construction of Finite Automata From Examples Using Hill-Climbing
  Masaru Tomita
Retrieving Memories of Personal Experiences
  Brian J. Reiser, John B. Black, Robert P. Abelson
Personal Memory, Generic Memory, and Skill: A Re-Analysis of the Episodic-Semantic Distinction
  William F. Brewer
Temporal Judgements about Natural Events
  Norman R. Brown, Lance J. Rips, Steven K. Shevell
Psychological Issues Raised by an AI Model of Reconstructive Memory
  Janet L. Kolodner, Lawrence W. Barsalou
Soft Control of Cognitive Processes
  Michael R. Fehling (sponsored by Gary M. Olson)
Styles of Thinking: From Algebra Word Problems to Programming Via Procedurality
  Kate Ehrlich, Elliot Soloway, Valerie Abbott
Arithmetic Procedures in Everyday Situations
  Jean Lave
How Novices Solve Physics Problems
  Eileen Scanlon, Tim O'Shea (sponsored by Jon Slack)
Associative Encoding at Synapses
  William B. Levy (sponsored by Richard B. Millward)
Neural Hardware and the Presumed Autonomy of Psychology
  William Bechtel, Bernard Ecanow
The Integrated Implementation of Imaginal and Propositional Data Structures in the Brain
  John Barnden
Programmers' Mental Models of Their Programming Tasks: The Interaction of Real-World Knowledge and Programming Knowledge
  Hank Kahney, Marc Eisenstadt
Natural Problem Solving Strategies and Programming Language Constructs
  Jeffrey Bonar
Tacit Programming Knowledge
  Elliot Soloway, Kate Ehrlich
The Role of Metaphors In Novices Learning Programming
  Ann Jones (sponsored by Richard Young)
Programs, Theories, and Models
  Paul Thagard
On Changing the "Logic" of Proposed Logics of Scientific Discovery
  S.C. Grover
A General Model for Simulating Information Processing Experiments
  Earl Hunt, Pollyanna Pixton
Architecture-Directed Processing
  Richard M. Young
Question Answering: Two Separate Processes
  Marc Luria (sponsored by Charles Fillmore)
Exploded Connections: Unchunking Schematic Knowledge
  Steven L. Small
The Context Model: Language Understanding in Context
  Yigal Arens (sponsored by Robert Wilensky)
Judgmental Inference: A Theory of Inferential Decision-Making During Understanding
  Richard H. Granger
Structure-Mapping: A Theoretical Framework for Analogy and Similarity
  Dedre Gentner
Principles of Procedures Composition
  Christopher K. Riesbeck, Edwin L. Hutchins
A Computer Simulation Approach to the Study of Emotional Behavior
  Rolf Pfeifer (sponsored by Herbert A. Simon)
Where Do Goals Come From?
  Jaime G. Carbonell
Surprise and Coherence: Sensitivity to Verbal Humor in Right Hemisphere Patients
  Hiram H. Brownell, Dee Michel, John Powelson, Howard Gardner
Language Dominance and Gesture Hand Preferences
  Debra Stephens (sponsored by David McNeill)
Knowledge Constraints and Language Comprehension in Aphasia
  Victor Rosenthal, Patrizia Bisiacchi, Evelyne Andreewsky
A Unified Theory of Cognitive Reference Frames
  Michael Leyton (sponsored by Stephen E. Palmer)
Knowing, Understanding, and Believing
  Yutaka Sayeki
Knowledge and Belief as Logical Levels of Representation
  Gabriella Airenti, Bruno G. Bara, Marco Colombetti
Representativeness Reconsidered
  Maya Bar-Hillel

Symposium -- Consciousness
Conscious, Subconscious, Unconscious:
A Neodissociation Perspective
John F. Kihlstrom
University of Wisconsin

What gives us the impression that we are conscious? What kind of evidence would convince us that a machine such as a computer, or a lower animal such as a dolphin or a chimpanzee, or -- for that matter -- another human being, was conscious? Cognitive scientists of all stripes, especially those who specialize in psychology, philosophy, and artificial intelligence, disagree violently on the answers, and even on whether these are sensible questions. But nobody doubts that we humans, at least, possess consciousness. The facts that erase any doubt about ourselves are the facts of experience. As James put it in the Principles, "the first fact for us, then. . .is that thinking of some sort goes on" (p. 224). Introspectively, the experience of consciousness seems to have to do with two things: monitoring ourselves and our environment, such that certain perceptual events and memories come to be accurately represented in phenomenal awareness; and controlling ourselves and our environment, such that we are able to voluntarily initiate and terminate behavioral and cognitive activities.

Cognitive science has been vexed by the problem of consciousness since its prehistory. It has had a checkered past, for example, in psychology: almost the whole of the field for James, but a virtual nonentity with the onslaught of the behaviorist movement. Interest in the topic persisted in the hands of the psychoanalysts, and was revived within mainstream psychology with the cognitive revolution and its emphasis on attention and the span of apprehension. Neurologists commonly encounter disorders of consciousness of various types, and those associated with the "split-brain" syndrome have recently received much notice. Ethologists and behavioral biologists have considered whether lower animals possess the capacity for awareness and voluntary control over their actions -- though this concern within comparative psychology has been supplanted to some degree by a series of similar questions having to do with the capacity for language. Parallel concerns have sometimes caught the fancy of those in the artificial intelligence movement, who must deal with the question of whether computers will ever possess consciousness in the sense of awareness and voluntary control over what they are doing. The problems posed by the experience of consciousness for contemporary cognitive science boil down to questions like these: What is the nature of consciousness? What is it good for? Are there unconscious mental processes, and if so what are they like and what are they good for? Finally, who cares? That is, would cognitive science proceed any differently if its practitioners did not ask questions like these? Let us get some perspective on these questions by turning to some early authorities, before examining some more recent theoretical and empirical developments.

William James devoted the better part of four chapters of the Principles to the topic of consciousness. At the same time, he argued vigorously against the notion of unconscious thought, although he did agree that there were brain processes associated with mental activity of which we might not be aware. As if in warning to Freud and the other psychoanalysts who were to follow, James asserted that the concept of unconscious states of mind "is the sovereign means of believing what one likes in psychology, and of turning what might become a science into a tumbling-ground for whimsies" (p. 163). But the Freudian psychology which was yet to come shared the force of James' critique with other trends in the psychology of his time, such as those which implicated unconscious inference in perception and judgment. To the contrary, he argued that either the allegedly unconscious thought was rapidly forgotten; or that it represented a revision of an earlier (and conscious) thought; or that it was not a thought at all, but merely an innate or habitual brain process. For James, thought and consciousness were identical. It was as difficult for him to contemplate unconscious thought as it was for Hume to contemplate a round square cupola on Berkeley College.

Nevertheless, James did admit that under some circumstances "the total possible consciousness may be split into parts which coexist but mutually ignore each other, and share the objects of knowledge between them" (p. 206). Following Janet and Prince, from whom he drew most of his examples, he referred to this phenomenon as representing "secondary" consciousness, rather than "unconsciousness." In order to understand what James had in mind, it is necessary to consider an important but almost-forgotten school of thought within psychiatry and psychology at the turn of the century.

It is commonly thought that the concept of unconscious mental processes traces its origin to Freud and the theory of psychoanalysis. To the contrary, as Ellenberger has shown, the idea has a long history before Freud. In 1775, with the appearance of Mesmer on the European medical scene, speculation about the unconscious combined with rationalized, materialistic versions of primitive psychotherapeutic procedures to form what is known as the First Dynamic Psychiatry, whose leader was the French neurologist and psychiatrist J.-M. Charcot. This psychiatry was concerned with demonstrable "functional" as opposed to "organic" mental illnesses -- that is, those pathological syndromes which appeared not to be associated with brain insult, injury, or disease. It attempted to account for a wide range of phenomena, including hysteria, fugue (then called ambulatory automatism), and multiple personality; the "magnetic diseases" of catalepsy, lethargy, and somnambulism (so named because of their resemblance to certain phenomena of animal magnetism, a precursor of hypnosis); spiritistic practices such as automatic writing and crystal-gazing; hypnosis; and suggestibility in the normal waking state. Each of these phenomena, the school held, represented the power of ideas to turn into action (one of the meanings of "dynamic" in the psychological sense); and each seemed to reflect a change in consciousness, as thought and actions occurred outside phenomenal awareness and voluntary control.

The First Dynamic Psychiatry, with its emphasis on unconscious mental contents and processes, invoked one or another of two explicit models of the mind. The point of view known as dipsychism (e.g., Dessoir) held that the mind consisted of two layers, each of which in turn consisted of chains
of associations. The "upper consciousness" was active in the normal waking state, while the "lower consciousness" was active in such phenomena as dreams, hysteria, and hypnosis. According to the "closed" version of dipsychism, the lower consciousness contained mental contents which passed into it through the upper consciousness: unattended stimuli, forgotten memories, and various daydreams and fantasies. This point of view contrasts with the less materialistic "open" version, in which the lower consciousness was held to be in direct communication with other minds.

According to polypsychism (e.g., Durand de Gros), each segment of the anatomy was served by its own mental structures, called egos, each of which was capable of perception, memory, and thought. These structures, in turn, were subject to the control of a superordinate structure which was identified with normal consciousness. When the link between subordinate and superordinate egos was broken, certain aspects of cognition and action were carried out subconsciously. Clearly, the concepts of dipsychism and polypsychism are at the root of Freud's first (conscious-preconscious-unconscious) and second (id-ego-superego) models of the mind.

The issues confronted by the First Dynamic Psychiatry were subsequently taken up by another French psychiatrist, Pierre Janet. Following the principle of analysis-then-synthesis familiar in physiology, Janet began by considering the elementary parts of the mental system. Instead of following the lead of the earlier faculty psychology, or the chemical analogies of the structuralists, he argued that the elementary structures of the mind were psychological automatisms: complex acts, tuned to environmental and personal circumstances, preceded by an idea and accompanied by an emotion. Each of these psychological automatisms, by combining cognition, conation, and emotion with action, represented a rudimentary consciousness. According to Janet, all of these elementary automatisms ordinarily were bound together into a single, united stream of consciousness, and operated in awareness and under voluntary control. Under certain circumstances, however, one or more of these automatisms could be split off -- Janet's term was disaggregation -- from the rest, functioning either outside awareness, or voluntary control, or both.

This dissociation view of the unconscious, as distinct from the repression view elaborated by Freud and his followers, was further developed by the American psychologist and psychiatrist Morton Prince. Prince, following the practice of his day as exemplified by James' ten arguments against the existence of unconscious thoughts, reserved the term "unconscious" for the dormant traces of forgotten memories and unattended perceptual inputs, as well as the strictly neurophysiological processes associated with mental activity. Instead, he offered the term coconscious, referring to mental activity which takes place outside phenomenal awareness. Prince preferred this term because it connoted mental activity rather than the lack of mentation (as in the ordinary-language conception of unconsciousness associated with concussion or coma); and because it permitted the division of consciousness into parallel streams without one or more of these being outside awareness. Coconscious mental activities performed outside awareness, together with unconscious mental contents and brain processes, formed the subconscious.

This conceptualization of consciousness was very popular on both sides of the Atlantic, featured prominently in the pages of the then-new Journal of Abnormal and Social Psychology (founded and edited by Prince), and was the chief alternative within dynamic psychiatry to Freudian psychoanalysis. However, it was a conceptualization which was short-lived. The eventual dominance of psychoanalysis in clinical psychology and scientific personology led investigators to be interested in different syndromes and phenomena, a different model of the mind, and the eventual replacement of dissociation by repression as the hypothetical mechanism for blocking mental contents from consciousness. At the same time, the behaviorist revolution in academic psychology removed consciousness (not to mention the unconscious) from the vocabulary of the science. At fault as well were the dissociation theorists themselves, who often made extravagant claims for the centrality of their phenomenon and whose investigations were often methodologically flawed. The final blow to the concept stemmed from the interpretation that dissociated streams of consciousness, because they were ignorant (Janet's term) of each other, should not influence each other. Numerous demonstrations of mutual interference between ostensibly dissociated tasks showed the contrary, and reference to dissociation gradually disappeared.

In part, the insistence of both early and late dissociation theorists on non-interference between dissociated mental activities seems to stem from a misunderstanding of James' metaphor of the stream of consciousness. Following the metaphor, it is sometimes held that two streams of water, running parallel but separated by tall banks, should not affect each other. However, if the two streams originate from the same source, each will certainly draw some of the flow from the other. Given a model of attention such as Kahneman's, in which a single source of attentional capacity may be deployed in multiple directions, James' metaphor would certainly lead one to predict some degree of mutual interference between simultaneous, though dissociated, tasks. In fact, the available evidence indicates that simultaneous tasks performed outside of awareness (for example, in hypnosis) do interfere with each other, with the extent of interference a function of the attentional demands of the tasks in question. Where the tasks are easy, there is little or no interference; where one or both are difficult, interference increases proportionately. Awareness and control are the defining features of dissociation, while noninterference is an open, empirical question.

Viewed in these terms, a number of phenomena -- observed in the laboratory, the clinic, and in the ordinary course of everyday living -- seem to invite a notion such as dissociation. Some of the observations are dramatic, some mundane; the quality of some of the research is impeccable; some demonstrations are marred by poor methodology or contaminated by extraneous social-psychological variables. Some of the results are open to alternative interpretations, and the possibility of performing a definitive experiment seems slim. Some of the claims, in fact, may turn out on close investigation to be false. But not all of them are false. To deny some of them is to deny the facts of our everyday experience. In each of these instances, some aspect of past or present experience cannot be brought into phenomenal awareness, or voluntary control has been lost over thought and action.

Consider, first, the observations of cerebral commissurotomy patients (and intact subjects run under special laboratory conditions), whose right hand literally does not know what the left one is
doing: Here is a division in consciousness associated with a literal division in brain structures. Or consider Korsakoff's syndrome, whose dominant feature is an extremely dense anterograde amnesia: recent experiments have revealed, somewhat surprisingly, that these patients can acquire new information, and that this new learning can have an impact on subsequent cognition and action -- even though the patients have no recollection of the learning experience, and cannot voluntarily retrieve the critical memories. Turning from neurology to psychiatry, there are the very syndromes that caught the attention of the practitioners of the First Dynamic Psychiatry: hysterical anesthesias, paralyses, and amnesias, in which a person complains that he or she cannot remember certain events from the past, perceive stimuli in certain modalities, or voluntarily move certain portions of the body -- all in the absence of any demonstrable organic brain syndrome; fugue states, in which a person loses his or her identity as well as the whole of the autobiographical record, relocates, and takes up a new life under a new name; and multiple personality, where separate personalities, each with its own identity, characteristic features, and personal history, seem to inhabit the same body, separated by amnesic barriers and alternating control over overt action and phenomenal awareness.

In the laboratory, phenomena phenotypically similar to the symptoms of hysteria -- analgesia and other negative hallucinations, spanning all the perceptual modalities; paralyses; compulsive automatisms in the form of posthypnotic suggestions; and posthypnotic amnesia for events and experiences transpiring during the state -- can be induced in normal subjects simply by the hypnotist's spoken word -- provided that the subjects are hypnotizable to begin with. Under more familiar conditions, we have numerous experiments on divided attention in which information in the unattended channel influences performance outside awareness; and experiments on multiple simultaneous tasks in which complex activities, executed at an acceptable level of performance, are unrecalled afterwards. Then there are all the experiments on perceptual defense and subliminal perception. In the domain of memory, there are of course the phenomena of state-dependent retention, context-dependent retention, and other manifestations of the encoding specificity principle. There are also compelling demonstrations that unremembered experiences can influence perceptual recognition, and of significant savings in relearning material which appears, even after sensitive testing, to have been completely forgotten.

Examples of dissociation can also be found in abundance outside the clinic and the laboratory. One such experience is familiar to all of us: the dream of REM sleep, in which vivid images are constructed without our intending to do so, and in which complex plots are played out five or more times a night (on average), only to be completely forgotten in the morning. Similarly, there is the pavor nocturnus (night terror) common in children, which scares the daylights out of their parents even though the episodes are never remembered by the children themselves. The sleepwalker carries out complex motor activities while deeply in NREM sleep, and remembers nothing of it in the morning. (Sleeptalking, by the way, which also occurs in NREM sleep, is a doubtful case of dissociation, because the speech does not seem to be [...] there [have] been demonstrations [...] suggestions during (REM) sleep, and continue responding on subsequent nights even though they are amnesic for their actions, and the suggestions, during intervening periods of wakefulness.

Given observations such as these, Hilgard has recently revived the concerns of the First Dynamic Psychiatry by proposing a "neodissociation" theory of divided consciousness. He begins with the assumption that the cognitive apparatus is organized hierarchically, with various subsystems monitoring and controlling thought and action in various domains. Under ordinary circumstances, each subsystem is in communication with each of the others, and with a superordinate central executive structure. It is this central executive which is the source of our subjective feelings of awareness and intentionality. Under certain circumstances, Hilgard holds, a subsystem (or more than one) can lose contact with the central executive. In this case, percepts, memories, and actions represented in one of the subsystems fail to be represented in phenomenal awareness; or perceptual exploration, memorial reconstruction, and overt action occur outside the control of the central executive. Despite this loss of communication with the central executive, the dissociated subsystems can, in principle, continue to interact with each other. This continued interaction is the source of the facilitation and interference effects which formed the basis of the empirical critique of the initial versions of dissociation theory.

It should be clear that the subconscious of neodissociation theory is rather different from the unconscious as it is conceptualized by other schools within psychology. Neodissociation theory differs from psychoanalysis, for example, because the subconscious is not restricted to primitive sexual and aggressive impulses, and those memories and ideas associated with them. Nor do subconscious mental processes operate according to the irrational "primary process" principles associated with the Freudian unconscious (as opposed to the rational, "secondary process" of the ego). Dissociated percepts and memories can be closely tied to objective reality; and dissociated ideas can be rational and even creative. Equally important, rendering something subconscious is not necessarily motivated by defense against anxiety, as is the case with Freudian repression. It can simply happen, as in the case of hysteria, fugue, or multiple personality; or it can be done for entirely adaptive purposes, as in the case of the subjects who voluntarily enter hypnosis or go to a movie precisely so they will become totally absorbed in the action on the screen, forgetting for a while their everyday concerns (and even who they are).

The subconscious of neodissociation theory also differs in important ways from the manner in which unconscious mental contents and processes are construed, at least implicitly, in classical theories of human information processing. Here four major trends can be discerned: an identification of consciousness with attention, short-term memory, or working memory -- in other words, what we are aware of apprehending at any particular moment; with complex as opposed to simple, or difficult as opposed to routine, information-processing procedures; with the availability of linguistic representations for ideas and experiences; and with declarative, as opposed to procedural, knowledge. But the subconscious of neodissociation theory is not restricted to the procedural knowledge by which we detect features in perceptual tasks, encode and decode memories, and [...] routine judgments. It can also [...]
factual knowledge, both semantic and episodic in nature, concerning the presence of certain stimuli or the occurrence of certain past events. Nor is it restricted to the simple, automatic, and routine: complex cognitive and behavioral activities apparently can be performed outside awareness. Linguistic contents can be rendered subconscious, and percepts and memories can be subconscious even though the person's linguistic abilities remain intact. Nor, within the realm of declarative knowledge, is the subconscious simply the repository of unattended perceptual inputs, weak memory traces, and the products of early, simple, and automatic cognitive operations.

Neodissociation theory links a diverse set of real-world and laboratory phenomena under a unified descriptive rubric, and challenges cognitive science to account for them. It comes as no surprise that attention can be divided, though that fact in itself poses problems for those information-processing theories which are predicated on the existence of limited-capacity channels or storage structures. But if attention can be divided with one stream of complex, deliberate, cognitive activity proceeding outside awareness, this seems to cause some problems for the way we usually think about things. The empirical base for the theory is sometimes problematic, but the phenomena of dissociation are trying to tell us something about the nature of conscious, subconscious, and unconscious mental processing. If we do not take these phenomena seriously, and consider their implications for our understanding of the cognitive system, our models of the mind may be led seriously astray. This seems reason enough to continue to pursue neodissociation theory, and to incorporate its insights into larger theories, to [...] a comprehensive view of the mind in order and disorder.

Paper presented at the 4th annual conference of the Cognitive Science Society, Ann Arbor, August 1982. The point of view presented in this essay developed in part from research supported by Grant #MH-35856 from the National Institute of Mental Health, United States Public Health Service. I thank Patricia A. Register and Leanne Wilson for their comments during the preparation of this paper, and Ernest R. Hilgard for promoting the concept of dissociation. An expanded version of this paper, with references, is forthcoming in K. S. Bowers & D. Meichenbaum (Eds.), The unconscious: Several perspectives (Wiley).

Symposium -- Representation of Processes and Time
Modeling Events, Actions, and Time
James F. Allen
Department of Computer Science
University of Rochester
Rochester, NY 14627

This brief note concerns what types of knowledge one must possess in order to
be able to reason about events and actions. In particular, in comprehending
stories or dialogues, many inferences are made based on what events and
actions are described. These range from inferences about the temporal
ordering of events to inferences concerning the beliefs and motivations of
the actors. Here I will concentrate on the nature of events and actions and
discuss their relation to temporal reasoning. The references below provide
more detail on all these issues.

The formalism for actions and events used in most natural language
understanding systems is based on case grammar. Each action is represented by
a set of assertions about the semantic roles the noun phrases play with
respect to the verb. Such a formalism is a start, but does not explain how to
represent what an action actually signifies. If one is told that a certain
action occurred, what can one conclude about how the world changed (or didn't
change!)? One possibility for such a mechanism is found in the work on
problem-solving systems (e.g., [Fikes and Nilsson, 1971]), which suggests one
common formulation of action. An action is a function from one world state to
a succeeding world state and is described by a set of prerequisites and
effects, or by decomposition into more primitive actions. While this model is
extremely useful for modeling physical actions by a single actor, it does not
cover a large class of actions describable in English. For instance, many
actions seemingly describe non-activity (e.g., standing still), or acting in
some non-specified manner to preserve a state (e.g., preventing your
television set from being stolen).

Difficult problems also arise in this model concerning the simultaneous
occurrence of actions in domains with more than one agent. For example,
consider a simple blocks world with one block and two robots. Let there be
two actions, PUSHR, push the block to the right, and PUSHL, push the block to
the left. We would like to define the effect of these actions in terms of the
block moving. But if the two robots perform a PUSHL and PUSHR simultaneously,
the block does not move. Yet, we still want to say that each robot pushed the
block. If we cannot express simultaneity of actions, the best we could do to
model this situation would be to have the block oscillate as the robots
pushed alternately.

The approach suggested here does not attempt to answer what an event or
action actually is. Whatever an event is, the only way we can reason about
one is by considering how the world changes (or remains constant) during some
time interval in which the event occurred. Thus it is crucial that the
temporal model in the logic be general enough to capture the scope of
possible events. Actions are then defined as a subclass of events that
involve agents and are described in a similar manner. The notions of
prerequisite, result, and methods of performing actions do not play a central
role in this study. While they are important for reasoning about how to
attain goals, they don't play an explicit role in defining when an action can
be said to have occurred. To make this point clear, consider the simple
action of turning on a light.

There are few physical activities that are a necessary part of performing the
action of turning on a light. Depending on the context, vastly different
patterns of behavior can be classified as the same action. For example,
turning on a light usually involves flipping a light switch, but in some
circumstances it may involve tightening the light bulb (in the basement), or
hitting the wall (in an old house). Although we have knowledge about how the
action can be performed, this does not define what the action is. The key
defining characteristic of turning on the light seems to be that the agent is
performing some activity which will cause the light, which is off when the
action starts, to become on when the action ends. The importance of this
observation is that we could recognize an observed pattern of activity as
"turning on the light" even if we had never seen or thought about that
pattern previously.

With this model, it is theoretically simple to describe two actions occurring
simultaneously. The temporal conditions for each will be asserted to hold
over the same time interval. It is then up to the reasoning component to
infer any interactions that may arise. While this has not solved anything by
itself, at least the complex problem can be expressed in the temporal logic,
and reasoning techniques can then be investigated.

With respect to modeling time, I want to make just two basic claims. The
first is that representations based on assigning dates for each time are
unworkable. The second is that the underlying logic of time should be based
on the notion of time intervals rather than time points.

There are many difficulties that arise in systems based on date lines. In
such an approach, each time is represented by a value (e.g., a number) and
relationships between times can be computed by some operation on the values
(e.g., numeric ordering). One problem is that dates are not often supplied.
Much temporal information in English is supplied only on a relative basis
(e.g., E occurred before E'), both by the explicit mention of such
relationships and by tense. For example, in the sentence

    "We found the letter while John was away,"

the temporal connective "while" indicates that the time of the find event
occurred during the time that John was away, and the past tense indicates
that both events occurred in the past (i.e., before now).

The other major difficulty with date-based systems is that there can be
considerable uncertainty in our temporal knowledge. For instance, we might
know that either event E occurred before event E', or vice versa, but in any
case, the times of E and E' did not overlap. One can only capture such
information with a partial ordering relationship: no dates can be assigned
that capture these constraints. This is not to say that dating is not a
useful technique when it is possible; it just cannot be the foundation of the
representation.

Turning to the time interval/time point controversy, we can easily observe
that both appear to be referred to in English. Thus, we can say,

    "We found the letter at 12 o'clock."
    "We found the letter yesterday."

The most straightforward approach to dealing with time then seems to be to
introduce points in time and then define intervals from those points (e.g.,
[McDermott, 1981; Bruce, 1972]). I do not use this scheme for two reasons.
The first is
that such a representation is too uniform and does not facilitate structuring
knowledge in a way convenient for typical temporal reasoning tasks. The
second is that it encourages one to think of time as being isomorphic to the
real line, which is a serious mistake.

The central issue concerning the first point is the importance of the during
relation for reasoning. A major part of our temporal knowledge appears to be
of the form

    "event E occurred during event E'."

Our knowledge of the during relation allows a highly structured
representation of time. In particular, a common way of inferring that some
condition P holds during an interval T is to show that P holds in an interval
that contains T. For instance, I might know that my office is locked today
because it has been locked all week.

Furthermore, such a during hierarchy allows reasoning processes to be
localized so that irrelevant facts are never considered. For instance, if one
is concerned with what is true "today," one need consider only those
intervals that are during "today," or above "today" in the during hierarchy.
If a fact is indexed by an interval wholly contained by an interval
representing "yesterday," then it cannot affect what is true now.

On the second issue, some annoying characteristics arise from allowing
zero-width time points. For instance, two intervals that meet must either
have a point in common or have a point between them. Thus to describe an
event consisting of a light being transformed from being off to being on,
either the interval where it is off meets the interval where it is on, and
thus there is a point where the light is both on and off, or the interval
where it is off is strictly before the interval where it is on, and thus
there is a point between the two intervals where the light is neither on nor
off. This can be avoided by a technical trick such as treating all intervals
as open on their beginning and closed on their end, but such tricks simply
emphasize the unnaturalness of the approach. In an interval-based system,
such issues need not arise: two intervals may meet without having any point
in common.

Given this interval-based representation of time, what is the equivalent of
time points? For instance, we often talk of the beginning or ending times of
events. There is no reason to assume, however, that the beginning and ending
times are instantaneous points. One might suggest that there is a minimum
size e of intervals, such that all intervals of size less than or equal to e
are considered to be points. The consequence of this would be that two such
point intervals could then only be related by the relations < and =. This
approach is useful but only if there is not one fixed value for e, for the
size at which an interval is considered to be a point depends on the
reasoning task being done. For instance, the smallest time intervals we care
about in everyday life are probably of the order of seconds; as physicists or
computer scientists, we may consider times on the order of nanoseconds. Thus
the interval size that we want to consider as points varies depending on the
task as well as the proximity to the current [...]

References

Allen, J.F., "An interval-based representation of temporal knowledge,"
    Proc., 7th IJCAI, Vancouver, B.C., 1981.
Allen, J.F., "What's necessary to hide?: Reasoning about action verbs,"
    Proc., 19th Annual Meeting, Assoc. Computational Linguistics, 77-81,
    Stanford U., 1981.
Bruce, B.C., "A model for temporal references and its application in a
    question answering program," Artificial Intelligence 3, 1972.
Fikes, R.E., and N.J. Nilsson, "STRIPS: A new approach to the application of
    theorem proving to problem solving," Artificial Intelligence 2, 189-205,
    1971.
McDermott, D., "A temporal logic for reasoning about processes and plans,"
    Research Report 196, Dept. Computer Science, Yale U., March 1981.
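
The inference over the during hierarchy described in this note (a condition P holds during T if it holds in some interval that contains T) can be illustrated with a small sketch. This is only a toy under stated assumptions, not the representation used in the cited papers: the class name `DuringHierarchy`, its methods, and the interval labels are hypothetical, and the hierarchy is assumed to be a simple tree in which each interval has at most one containing interval.

```python
class DuringHierarchy:
    """Toy 'during' hierarchy (hypothetical): each interval is contained
    by at most one other interval, so the hierarchy forms a tree."""

    def __init__(self):
        self.parent = {}  # interval name -> name of the interval containing it
        self.facts = {}   # interval name -> conditions asserted over all of it

    def add_interval(self, name, during=None):
        self.parent[name] = during
        self.facts.setdefault(name, set())

    def assert_holds(self, condition, interval):
        self.facts[interval].add(condition)

    def containing(self, interval):
        """Yield the interval itself, then every interval above it."""
        current = interval
        while current is not None:
            yield current
            current = self.parent[current]

    def holds(self, condition, interval):
        # Only the interval and its containers are searched, so facts
        # indexed under sibling intervals are never examined.
        return any(condition in self.facts[i] for i in self.containing(interval))


h = DuringHierarchy()
h.add_interval("this week")
h.add_interval("yesterday", during="this week")
h.add_interval("today", during="this week")
h.assert_holds("office locked", "this week")
h.assert_holds("raining", "yesterday")

locked_today = h.holds("office locked", "today")  # inherited from "this week"
raining_today = h.holds("raining", "today")       # "yesterday" is never consulted
```

The lookup for "today" walks only upward through containing intervals, which is one way to read the claim that facts indexed under "yesterday" need not be considered. A fuller treatment would also examine intervals during "today" and would support the other interval relations (such as meets and before).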

Some Issues on Mechanistic Mental Models
Johan de Kleer and John Seely Brown

Cognitive and Instructional Sciences
3333 Coyote Hill Road
Palo Alto, California 94304

Our long-range goal is to develop a model of how a person acquires an
understanding of mechanistic devices such as physical machines, electronic
and hydraulic devices, or reactors. We lay out a framework for investigating
the structure of what we call mechanistic mental models: people's mental
models of physical devices. Doing so involves developing a precise notion of
a qualitative simulation. The concept of qualitative simulation derives from
the common intuition of "picturing in one's mind's eye, how the machine
operates."

Although one would intuitively expect qualitative simulations to be simpler
than quantitative simulations of a given device, they turn out to be equally
complex, but in a different way. These complexities arise, in part, from the
fact that devices may appear nondeterministic and underconstrained when the
quantities and forces involved in their makeup are viewed solely from a
qualitative perspective. Therefore, if the qualitative simulation of the
device is to behave deterministically, additional knowledge and reasoning
must be used to disambiguate these "apparent" ambiguities.

It is surprisingly difficult to construct mental models of a device that are
capable of predicting the consequences of events not considered during the
creation of the model. Thus, the process for constructing a good mental model
involves a different kind of problem-solving than the process for "running"
the resultant mental model, a distinction that we find crucial for
understanding how people use mental models. In fact, simply clarifying the
differences between the work involved in constructing a qualitative
simulation (a process we call envisioning) and the work involved in
simulating the result of this construction (a process we call running) turns
out to have both theoretical and practical ramifications.

A Basis for Mechanistic Mental Models

Complex devices, such as machines, are built from combinations of simpler
devices (components). Let us assume we know the behaviors of the components,
as well as the way in which they are connected to form the composite device.
The behaviors of the components are described qualitatively, such as "going
up" or "going down," "high" or "low." The qualitative simulation always
presents the events in the functioning of the machine in their causal order.
Figure 1 illustrates a conventional door-buzzer (for the moment ignoring the
button that activates the buzzer). The buzzer is a simple device, but complex
enough to use for illustrating ideas of qualitative simulation.

Figure 1: Buzzer

The buzzer's qualitative simulation might be described as: The clapper-switch
of the buzzer closes, which causes the coil to conduct a current, thereby
generating an electromagnetic field which in turn pulls the clapper arm away
from the switch contact, opening the switch, shutting off the magnetic field,
allowing the clapper arm to return to its closed position, and thereby start
the whole process over again.²

The simplicity of the qualitative simulation as expressed in the preceding
example is deceptive. Qualitative simulation encompasses a variety of ideas
which need to be carefully differentiated. For example, we must distinguish
simulation as a process from the results of that process. A simulation
process operates on a representation describing the device, producing another
representation that describes how the device functions. One source of
confusion is that this latter representation can likewise be "interpreted" or
simulated, but doing so will produce very little more than what is already
explicitly represented in the functional representation produced by the first
kind of simulation.³

¹ This paper is an abridged and revised version of de Kleer & Brown (82).
² The repetitive opening and closing of the switch (i.e., its vibration)
  produces an audible [...]
³ Note that this latter kind of simulation is just one of the kinds of
  inference mechanisms that can use or "interpret" the functional
  representation. Others can inspect it in order to answer such questions as
  "Could x cause y to happen?"

We need to distinguish four related notions which form the basic distinctions
for a theory of qualitative reasoning. The most basic, device topology, is a
representation of the structure of the device (i.e., of its physical
organization). For example, the steam plant's structure consists of a steam
generator, turbine, condenser, their connecting pipes, etc. The second,
envisioning, is an inference process which, given the device's structure,
determines its function. The third, causal model, describes the functioning
of the device (i.e., a description of how the device's behavior results from
its constituent components which is stated in terms of how the components
causally interact). The last is the running of the causal model to produce a
specific behavior for the device, by giving a chain of events each causally
related to the previous one. Thus, both the structure and functioning of a
device are represented by some knowledge-representation scheme
(device topology and causal model respectively), with the former being the
input to the envisioning process and the latter being its output; this output
causal model is, in turn, then used in the running. The example of
qualitative simulation presented earlier is ambiguous as to whether it refers
to the envisioning, the causal model, or the running.

Envisioning, i.e., determining the functioning of a device solely from its
structure, often requires some very subtle reasoning. The task, in essence,
is to figure out how the device works given only its structure and the
knowledge of some basic principles. Structure describes the physical
organization of the device, namely the constituent components and how they
are connected, but it does not describe how the components function in the
particular device. The "behaviors" of each component are described assuming
nothing about the particular context in which the component is embedded
(i.e., the description is context-free). These behaviors form a component
model (or schema) which characterizes all the potential behaviors of the
component; the envisioning process instantiates a specific behavior for each
component from these models. These component models are the basic principles
which the envisioning process draws upon to derive the functioning from the
structure.

To determine the functioning of the overall device each component's model
must be examined and an individual specific behavior instantiated for it.
Thus, the functioning of the entire device is determined, in part, by "gluing
together" the specific behaviors of all of its components. The problem for
envisioning is determining for each component which behavior, given all the
possible behaviors its model characterizes, is actually being manifested.

What makes the problem-solving effort involved in the structure-to-function
inference process difficult is that the behavior of the overall device is
constrained, not only by local interactions of its component behaviors, but
also by global interactions. Therefore, in principle, the behavior models of
the components which are specified qualitatively may not provide enough
information to identify the correct functioning of the device. For example,
if values are described qualitatively, often fine-grained distinctions cannot
be made between them. Thus, in the case of the buzzer, the envisioning may
not be able to determine which is greater, the force of the magnetic field or
the restoring force of the spring. Knowing which is greater may, in fact, be
crucial to deducing the correct functioning of the device.

In order to describe how the resultant behavior derives from the behaviors of
the constituents, first, each important event in the overall behavior must be
causally related to preceding events. Then, each causal relationship must be
explained by some fragment of the component model of one of its components.
The example describing Figure 1 is, at best, an abridged description of the
buzzer's function. It causally relates each event to the preceding one, but
fails to state any rationale for these causal connections. Because it is
impossible to tell, a priori, whether the component models lead to unique
behavior, the problem-solver must entertain the possibility that the
structural evidence is underconstraining. Therefore the envisioning must take
into account the possibility that one structure may have multiple possible
functionings among which the envisioning cannot, in principle, distinguish.

"Running" the resulting causal model is closest to the original psychological
intuition of "picturing, in one's mind's eye, how the machine operates." By
running the model, one, in essence, does a straightforward simulation of the
machine; the running itself does not have to determine or "prove" the causal
or temporal ordering of events, as the envisioning process already has done
so, and encoded the information in the causal model which serves as the input
data for the running process.

The simplicity and elegance of the running process is the result of the
complex problem-solving (i.e., envisioning) that constructed it. Our
intuition that "picturing, in one's mind's eye, how the machine operates" is
simple is manifested by this running process. However, that sense of
simplicity is deceptive, for the running is not possible without the more
complex problem-solving which preceded it, removing all the ambiguities about
how the machine might be functioning.

Understandably, the problems that arise in constructing causal models and the
mechanisms that suffice in solving these problems are important for cognitive
psychology and artificial intelligence. For psychology, they are important
because they provide a framework for analyzing the "competency" involved in
determining how a novel machine functions. Inasmuch as envisioning is
restricted to being based solely on structural evidence, it becomes an
interesting inference strategy in its own right for artificial intelligence
applications, especially given the desire for artificial intelligence systems
to be robust, and to be capable of dealing with novel situations. The
resulting models are more likely to be void of any implicit assumptions or
built-in presuppositions based on how the device was intended to behave.

AMBIGUITIES AND ASSUMPTIONS

Origin of Ambiguities

In general, ambiguities originate from the fact that the information
available to the qualitative analysis underdetermines or only partially
characterizes the actual behavior of the overall device. There are three
reasons for this underdetermination. The first and most obvious is that the
quantities referenced by the component models are qualitative and thus
fine-grained distinctions cannot be made between the attribute values or
component states. Second, because the implicit time progression in the
simulation is qualitative, it is not always possible to determine the actual
ordering of events. And the third reason, not directly related to the
qualitative nature of the models, comes from the limitations on the kinds of
information captured by the models. Because envisioning tries to identify a
global flow of action by piecing together local cause-effect rules of the
component model, a component model encodes only those aspects of the
component's behavior that can be used in such a fashion. However, our
understanding of a given component often involves more knowledge than is (or,
perhaps, could be) encoded in such mechanistic rules. For example, in
modeling the internal operation of a pump we know from the laws of physics
that fluid is conserved in passing through the pump. But, because this piece
of knowledge is a constraint, it cannot be represented by any cause-effect
rule: the inability to encode it can lead to a given component model being
underdetermined.

Origin of Assumptions

In the buzzer example, because of the qualitative nature of the attribute
values, the envisioning process cannot determine whether the spring is
stronger than the magnetic field. In this "impasse," it is forced to consider
two hypothetical situations: one in which it assumes the spring is stronger
than the magnetic field and one in which it assumes the spring is weaker than
the field.

Impasses occur when envisioning cannot evaluate a transition condition (e.g.,
the condition of the switch being open) or invoke an attribute equation
(e.g., that of field strength being proportional to coil current) to
determine the value of an unknown attribute. In order to proceed around
impasses, the envisioning must introduce assumptions about the truth or
falsity of conditions or about the values of unknown attributes.

The buzzer example can be used to illustrate an impasse which arises from the
envisioning being unable to determine whether a transition condition holds.
In this impasse, the envisioner introduces an assumption that the condition
"force from the coil > restoring force of the spring" is true, and then
proceeds to analyze the new resulting state. Of course, the resulting causal
model will then contain two accounts of the device's functioning: one in
which the clapper rises and one in which it does not. Additional knowledge
and reasoning strategies must then be used to verify or reject the various
assumptions that were created to enable the envisioner to proceed around such
impasses. These strategies, combined with a much more extensive analysis of
the kinds of assumptions needed in order to construct a causal model, have
been detailed in the expanded version of this paper.

REFERENCES

de Kleer, J. and J.S. Brown, "Assumptions and Ambiguities in Mechanistic
    Mental Models," to appear in Mental Models, edited by D. Gentner and
    A. S. Stevens, Erlbaum, 1981.
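
The distinction between envisioning (which may have to fork on an assumption at an impasse) and running (which merely replays an already-built causal account) can be illustrated with a deliberately small sketch. Everything here is a hypothetical simplification for illustration: the function names `envision` and `run`, the event strings, and the hand-written event chains are assumptions, and a real envisioner would derive the events from component models rather than list them by hand.

```python
UNKNOWN = None
CONDITION = "coil force > spring force"


def coil_beats_spring(assumptions):
    # Qualitative values alone cannot order the two forces, so the answer
    # is UNKNOWN unless an assumption has already been introduced.
    return assumptions.get(CONDITION, UNKNOWN)


def envision(assumptions=None):
    """Build causal accounts of the buzzer, forking at the impasse.

    Returns a list of (assumptions, events) pairs, one per account.
    """
    assumptions = dict(assumptions or {})
    verdict = coil_beats_spring(assumptions)
    if verdict is UNKNOWN:
        # Impasse: proceed by assuming the condition true, then false.
        accounts = []
        for guess in (True, False):
            accounts.extend(envision({**assumptions, CONDITION: guess}))
        return accounts

    events = [
        "switch closes",
        "coil conducts current",
        "magnetic field pulls on clapper arm",
    ]
    if verdict:
        events += [
            "clapper arm moves away from contact",
            "switch opens",
            "magnetic field shuts off",
            "spring returns clapper arm",
            "switch closes",
        ]
    else:
        events += ["clapper arm stays at contact"]
    return [(assumptions, events)]


def run(events, cycles=2):
    """'Running': replay a causal account; no further inference is done."""
    if events[-1] != events[0]:
        return list(events)  # the account does not loop back to its start
    return events[:-1] * cycles + [events[0]]


accounts = envision()
buzzing_trace = run(accounts[0][1], cycles=2)
```

Here `envision` returns two accounts, one per assumption, while `run` is trivial by comparison because the causal ordering has already been fixed. Choosing between the two accounts would require the additional knowledge and reasoning strategies mentioned above, which this sketch does not attempt.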
A Note Concerning Qualitative Process Theory
Ken Forbus
MIT AI Lab
545 Technology Square
Cambridge, Mass. 02139 USA

1. Introduction

Many kinds of changes occur in physical situations. Things move, collide,
flow, bend, heat up, cool down, stretch, break, and boil. These and the other
things that happen to cause changes in objects over time are intuitively
characterized as processes. Much of formal physics consists of
characterizations of processes by differential equations which describe how
the parameters of objects change over time. But the notion of process is
richer and more structured than this. We often reach conclusions about
physical processes based on very little information. For example, we know
that if we heat water in a sealed container the water can eventually boil,
and if we continue to do so the container can explode. To understand common
sense physical reasoning we must understand how to reason qualitatively about
processes, their effects, and their limits. I have been developing a theory,
called Qualitative Process theory, for this purpose [Forbus, 1981, 1982]. I
expect this theory, when fully developed, to provide a representational
framework for understanding human common sense physical reasoning. It should
also be useful for constructing computer programs that reason about complex
physical systems as well as common sense reasoning. Programs that explain,
repair and operate complex systems such as nuclear power plants and steam
machinery will need to draw the kinds of conclusions this theory sanctions.

Qualitative reasoning about quantities is a problem that has long plagued
Artificial Intelligence and Cognitive Science. Many schemes have been tried,
including simple symbolic vocabularies (TALL, VERY TALL, etc.), real numbers,
intervals, fuzzy logic and so forth. None are very satisfying. The reason is
that none of the above schemes makes distinctions that are relevant to
physical reasoning. Reasoning about processes provides a strong constraint on
the choice of representation for quantities. Processes usually start and stop
when orderings between quantities change. For example, when two objects with
unequal temperatures are brought into contact there will be a heat flow from
one to the other which will stop when the temperatures are equal. In
Qualitative Process theory the values of quantities are represented by a
partial ordering of other quantities determined by the domain physics. The
representation appears both useful and natural.

QP theory is mainly concerned with the form of physical theories and only
indirectly about their specific content. For example, heat flow processes
which don't conserve energy and transfer "caloric fluid" can be written as
well as the classical physical description. Newtonian, Aristotelian, and
Impetus theories of motion can all be encoded. Thus QP theory provides a
language for writing physical theories. In particular, the primitives are
simple processes (such as flows, state changes, and motion), the means of
combination are sequentiality and shared parameters, and the means of
abstraction are naming these combinations, including encapsulating a piece of
the process history (a kind of behavioral description, see [Hayes, 1979]) for
the situation as a new process.

The basic Qualitative Process theory is not intended to capture the full
range of qualitative reasoning about the physical world. Instead it is
concerned with describing the weakest kind of information that still allows
useful conclusions to be drawn. There are two reasons why this weak level of
description is interesting. First, conclusions from weak information are
often required to drive the search for conclusions from more detailed
information (an illustration is [deKleer, 1975]). More importantly, I believe
that the basic theory can be used to write what corresponds to people's
common sense physical knowledge. To capture more sophisticated kinds of
physical reasoning (for example, how an engineer makes estimates of circuit
parameters or stresses on a bridge) extension theories containing more
detailed representations of quantity, functions, and processes will be
needed. Examples of extensions could include order of magnitude estimates and
numbers. By providing a shared basic theory, future studies of more
sophisticated domains may yield a way to classify kinds of physical reasoning
according to the extension theories they require.

2. An Example

There are several kinds of reasoning that can be performed using Qualitative
Process theory, including reasoning about the limits of processes ("What
might happen if this valve is left open?") and consequences of alternate
situations ("How would turning up the stove affect the heating of the
kettle?") as well as explaining some problems involved in causal reasoning.
Several examples of common sense phenomena have been examined in this
context, including modelling a boiler, motion, materials (saying that you can
pull with a string but not push with it), and an oscillator. An informal
example will illustrate its flavor. Here is a simple problem involving
physical systems that we solve easily:

    Imagine looking at a large tank, partially filled with water. You can
    see two pipes leading into it, and you note that the level in the tank
    is dropping. Your goal is to figure out why this is happening.

In QP theory terms, "why this is happening" means finding a set of processes
which are causing the changes in the situation. (In the complicated physical
systems which comprise much of our technology, this is much harder than the
simple example depicted here, because the relationship between what we can
observe (through instruments) and the processes which serve as an explanation
is much less direct.) The reasoning goes as follows:

[1] No process affects level directly, but level is qualitatively
    proportional to Amount-of fluid.
[2] The only processes which affect Amount-of a contained fluid are boiling,
    evaporation, and fluid flow.
[3] No heat source is visible, so boiling can be ruled out.
[4] The time scale is short, so evaporation can be ruled out.
[5] By exclusion, fluid flow must be the source of the influence.
[6] Fluid flow requires a fluid path.
[7] Only two pipes are visible, so assume those are the only fluid
    connections to the tank.
[8] Only two fluid flows are possible, one through each pipe. Fluid flow can
    be measured: in this case both flows are into the tank.
[9] Therefore the influence of the fluid flows is positive.
[10] Therefore the level of the tank should be increasing, not decreasing.
[11] Either (1) Other processes affecting amount-of exist
            (2) Evaporation or Boiling are occurring
            (3) Measurements are wrong
            (4) Other fluid paths exist
[12] Pragmatically, (4) is the most likely - e.g., a large leak in the tank.

Knowing what can be measured and the pragmatic information used in ruling out
evaporation and in accepting the leak as the best prospect are not part of QP
theory, but instead illustrate the interaction of the theory with other kinds
of world knowledge. Note that the key to the deduction is the assumption of a
finite vocabulary of processes that could cause the observed change. Hayes
[Hayes, Liquids] suggests reasoning by elimination is a powerful technique in
common sense reasoning; organizing physical knowledge around a vocabulary of
processes provides further opportunity to do so.

3. Current State of the Theory

The current state of the theory is described in [Forbus, 1982]. Further
theoretical developments are being carried out in the context of reasoning
about simple fluid and mechanical systems. An implementation is underway.
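
The elimination argument in steps [1] through [11] of the tank example can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation mentioned above: the function name `explain_level_change`, its parameters, and the string labels are hypothetical, and the pragmatic ranking in step [12] is left out because the text places it outside QP theory.

```python
# Hypothetical encoding of the tank example; the names and data layout are
# illustrative assumptions, not part of QP theory as published.
AMOUNT_PROCESSES = ("boiling", "evaporation", "fluid flow")  # step [2]


def explain_level_change(observed, heat_source_visible, short_time_scale, pipe_flows):
    """Reason by elimination about why the tank level is changing.

    observed   -- "increasing" or "decreasing"
    pipe_flows -- measured direction per visible pipe: "in" or "out"
    Level is taken to be qualitatively proportional to Amount-of (step [1]).
    """
    candidates = set(AMOUNT_PROCESSES)
    if not heat_source_visible:
        candidates.discard("boiling")        # step [3]
    if short_time_scale:
        candidates.discard("evaporation")    # step [4]

    predicted = None
    if candidates == {"fluid flow"}:         # step [5]
        directions = set(pipe_flows.values())  # steps [6]-[8]
        if directions == {"in"}:
            predicted = "increasing"         # steps [9]-[10]
        elif directions == {"out"}:
            predicted = "decreasing"

    if predicted is None or predicted == observed:
        return {"candidates": candidates, "predicted": predicted, "revisit": []}

    # Step [11]: the prediction contradicts the observation, so one of the
    # assumptions made along the way has to be given up.
    revisit = [
        "other processes affecting amount-of exist",
        "evaporation or boiling is occurring",
        "measurements are wrong",
        "other fluid paths exist",
    ]
    return {"candidates": candidates, "predicted": predicted, "revisit": revisit}


result = explain_level_change(
    observed="decreasing",
    heat_source_visible=False,
    short_time_scale=True,
    pipe_flows={"pipe 1": "in", "pipe 2": "in"},
)
```

For the situation described in the example, `result["predicted"]` is "increasing" while the observation is "decreasing", so the sketch returns the four alternatives of step [11]. The finite process vocabulary is what makes the elimination possible; with an open-ended vocabulary, step [5] would not follow.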

4. References
Clement, John "A Conceptual Model Discussed by Galileo and Used
Intuitively by Physics Students" to appear in Mental Models, D.
Gentner and A. Stevens, editors.
deKleer, Johan "Qualitative and Quantitative Knowledge in Classical
Mechanics" TR-352, MIT AI Lab, Cambridge, Massachusetts, 1975
Forbus, K. "Qualitative Reasoning about Physical Processes"
Proceedings of IJCAI-7, 1981
Forbus, K. "Qualitative Process Theory" MIT AI Lab Memo No. 664,
February, 1982
Hayes, Patrick J. "Naive Physics 1 - Ontology for Liquids" Memo,
Centre pour les etudes Semantiques et Cognitives, Geneva, 1979
McCloskey, M. "Naive Theories of Motion" to appear in Mental
Models, D. Gentner and A. Stevens, editors.

Symposium: Metaphor
The Preconceptual Basis of Experiential Metaphor
Mark Johnson
Department of Philosophy
Southern Illinois University at Carbondale

Standard nodels of metaphoric comprehension share and values, and (ill) personality traits).
at least the follouing set of basic assumptions: (1) I am claiming that understanding a metaphor in-
Meaning is conceptual structure. (2) Comprehending a volves more than grasping conceptual structure—it
metaphor of the form "A is B" requires a grasp of the also involves preconceptual elements that are neither
appropriate conceptual structure for the "A" and "B" discrete predicates nor structured relations. Such
(topic-vehicle) components, and it also requires the elements are a basic part of our ordinary experience
ability to map the B^ domain onto the A domain in a without which no metaphor could have the power it
contextually appropriate fashion. (3) The mapping or does to shape our understanding, action, and language.
projection procedure depends principally on underly- If this analysis is correct, it calls for a rethink-
ing similarities between the two domains. Versions ing of certain fundamental assumptions guiding work
of this position differ as to the nature of the map- on metaphor in cognitive science.
ping mechanism. Some treat the metaphoric projection
as a simple transfer of discrete properties or rela-
tions from the B^ domain over to the A domain, with
appropriate changes being made to apply the transfer-
red predicates to the new domain. Others argue that
a more complex model is needed, one in which the en-
tire system of predicates for the B domain, with all
of its complex internal relations, must somehow be
projected as a whole in such a way as to restructure
Che conceptual system for the A domain.
It is commonly believed by those who operate with
some version of this standard model that the chief
problem posed by netaphor for artificial intelligence
is to discover the way in which contextual clues de-
termine the precise nature of the projective process
of metaphoric understanding. While I agree that this
is the main difficulty, I want to suggest that it is
less amenable to solution than moat cognitive scien-
tists believe. The reason for my pessimism is that,
contrary to the accepted view, understanding a meta-
phor is not Just a process of grasping certain con-
ceptual structurings. In the metaphors of ordinary
and technical discourse alike, there Is also a pre-
conceptual basis in experience that gives the meta-
phor the meaning it has and that cannot be reduced
to concepts or conceptual structure (as mental repre-
My argument is based upon an analysis of some of
the preconceptual factors involved in the comprehen-
sion of what I call "experiential" metaphors. An
experiential metaphor is a process of experiencing,
conceptualizing, and calking about one domain of ex-
perience as it is structured in terms of another do-
main of a different kind. Such metaphors are basic
processes of everyday experience, and they are not
mere linguistic ornaments or rhetorical modes of ex-
pression. The experiential metaphor MARRIAGE IS A
BUSINESS PARTNERSHIP, for example, is one of several
metaphors in American culture that structures the way
some people understand, act out, and reason about
their marriages. It is not a matter of mere words
that ue use to calk about marriage; rather, it is one
possible structuring of marital relations that pro-
vides coherence, order, and significance in the lives
of chose who live by che metaphor.
phor is more than a conceptual structuring of some
aspects of one's marriage. It involves non-struc-
tural, preconceptual elements without which the meta-
phor would have no significance for us. These pre-
conceptual elements in experience consist of various 12
capacities, skills, values, and purposes in which the
conceptual structures are rooted and from which they
take cheir nourishment. With reference to the BUSI-
NESS PARTNERSHIP metaphor I Identify four such ele-
ments: (1) General human purposes, (2) Cultural
(4) Individual andcharacteristics
(1) individual practices,
purposes, (3) Theoretical
and patterns paradigms,
Individual (includ-
Towards a Computational Model of Metaphor in Common Sense Reasoning

Jaime G. Carbonell
Carnegie-Mellon University
Pittsburgh, PA 15213

1. Introduction

The theory that metaphor dominates large aspects of human thinking, as well as playing a significant role in linguistic communication, has been argued with considerable force [10, 8, 3, 1]. However, the validity of such a theory is a matter of continuing debate that appears neither to dissuade its proponents nor convince its detractors. Being among the proponents, I propose to develop a computationally effective, common sense reasoning system based on underlying metaphors. I claim that if such a system exhibits cognitively plausible common sense reasoning capabilities, it will demonstrate the utility of metaphorical reasoning. Moreover, if the model can account for observed instances of naive human reasoning better than existing inference systems, it will provide convincing evidence in favor of the metaphorical reasoning theory. This brief paper investigates aspects of the metaphorical reasoning phenomenon and describes the initial steps towards developing a computational model.

2. Experiential Reasoning vs Formal Systems

Humans learn from experience to a degree that no formal system, AI model, or philosophical theory can match. The statement that the human mind is (or contains) the sum total of its experiences is in itself rather vacuous. A more precise formulation of experience-based reasoning may be structured in terms of coordinated answers to the following questions: How are experiences brought to bear in understanding new situations? How is long term memory modified and indexed? How are inference patterns acquired in a particular domain and adapted to apply in novel situations? How does a person "see the light" when a previously incomprehensible problem is viewed from a new perspective? How are the vast majority of irrelevant or inappropriate experiences and inference patterns filtered out in the understanding process? Answering all these "how" questions requires a process model capable of organizing large amounts of knowledge and mapping relevant aspects of past experience to new situations. Some meaningful starts have been made towards large-scale episodic-based memory organization [14, 15, 12, 9] and towards episodic-based analogical reasoning [5, 4, 2].

Bearing these questions in mind, I turn towards the issue of common sense reasoning in knowledge-rich mundane domains. My central claim is that reasoning in mundane, recurrent situations is qualitatively different from reasoning in more abstract and experientially unique situations (such as some mathematical or puzzle-solving domains). The former consists of recalling appropriate past experiences and inference patterns, whereas the latter requires knowledge-poor search processes more typical of past and present AI problem solving systems. Since computer programs perform much better in simple, elegant, abstract domains than in "scruffy" experience-rich human domains, it is evident that a fundamental reasoning mechanism is lacking from the AI repertoire. The issue is not merely that AI systems lack experience in mundane human scenarios -- they would be unable to benefit from such experience if it were encoded in their knowledge base. I postulate that the missing reasoning method is one of metaphor-based transfer of proven inference patterns and experiential knowledge across domains. This is not to say that humans are largely incapable of more formal reasoning, but rather that such reasoning is seldom necessary and when applied requires a more concerted cognitive effort than mundane metaphorical inference.

3. Towards Metaphorical Reasoning: The Balance Metaphor

Consider a prevalent metaphor: reasoning about imponderable or abstract entities as though they were objects with a measurable weight. One of several reasoning patterns based on this simple metaphor is the balance principle. The physical analog of this reasoning pattern is a prototypical scale with two balanced plates. Large numbers of metaphors appeal to this simple device coupled with the processes of bringing the system into (and out of) equilibrium. First, consider some examples of the basic metaphor, in which the relevant aspect of an abstract concept maps onto the weight* of an unspecified physical object.

* Mass is virtually synonymous with weight in naive reasoning.

    Arms control is a weighty issue.

    The worries of a nation weigh heavily upon his shoulders.

    The Argentine air force launched a massive attack on the British fleet. One frigate was heavily damaged, but only light casualties were suffered by British sailors. The Argentines paid a heavy toll in downed aircraft.

    Not being in the mood for heavy drama, John went to a light comedy, which turned out to be a piece of meaningless fluff.

    Pendergast was a real heavyweight in the 1920s Saint Louis political scene.

    The crime weighed heavily upon his conscience.

    The weight of the evidence was overwhelming.

Weight clearly represents different things in the various metaphors: the severity of a nation's problems, the number of attacking aircraft, the extent of physical damage, the emotional affect on audiences of theatrical productions, the amount of political muscle (to use another metaphor), the reaction to violated moral principles, and the degree to which evidence is found to be convincing. In general, more is heavier; less is lighter. One may argue that since language is heavily endowed with words that describe weight, mass and other physical attributes (such as height and orientation [10]), one borrows such words when discussing more abstract entities [13] -- for lack of alternate vocabulary. Whereas this argument is widely accepted, it falls far short of the conjecture I wish to make.

    Conjecture: Physical metaphors directly mirror the underlying inference processes. Patterns of inference valid for physical attributes are mapped invariant and reinstantiated in the target domain of the metaphor.

In order to illustrate the validity of this conjecture consider a common inference pattern based on the weight of physical
objects: The inference pattern is the balance principle mentioned earlier as applied to a scale with two plates. The scale can be in balance or tipped towards either side, as a function of the relative weights of objects placed in the respective plates. Inference consists of placing objects in the scale and predicting the resultant situation -- no claim is made as to whether this process occurs in a propositional framework or as visual imagery, although I favor the former. How could such a simple inference pattern be useful? How could it apply to complex, non-physical domains? Consider the following examples of metaphorical communication based on this inference pattern:

    The jury found the weight of the evidence favoring the defendant. His impeccable record weighed heavily in his favor, whereas the prosecution witness, being a confessed con-man, carried little weight with the jury. On balance the state failed to amass sufficient evidence for a solid [case].

    The SS-20 missile tips the balance of power in favor of the Soviets.

    Both conservative and liberal arguments appeared to carry equal weight with the president, and his decision hung on the balance. However, his long-standing opposition to abortion tipped the scale in favor of the conservatives.

    The Steelers were the heavy pre-game favorites, but the Browns started piling up points and accumulated a massive half-time lead. In spite of a late rally, the Steelers did not score heavily enough to pull the game out.

    The job applicant's shyness weighed against her, but her excellent recommendations tipped the scales in her favor.

In each example above the same basic underlying inference pattern recurs, whether representing the outcome of a trial, statements of relative military power, decision-making processes, or the outcome of a sporting event. The inference pattern itself is quite simple: it takes as input signed quantities -- whose magnitudes are analogous to their stated "weight" and whose signs depend on which side of a binary issue those weights correspond to -- and selects the side with the maximal weight, computing some qualitative estimate of how far out of balance the system is. Moreover, the inference pattern also serves to infer the rough weight of one side if the weight of the other side and the resultant balance state are known. (E.g., if Georgia won the football game scoring only 17 points, Alabama's scoring must have been really light.)

The central issue in my discussion is that this very simple inference pattern based on a physical metaphor accounts for very large numbers of inferences in mundane human situations. Given the existence of such a simple and widely applicable pattern, why should one suppose that more complicated inference methods explain human reasoning more accurately? It is my belief that there exist a moderate number of general inference patterns such as the present one, which together span most mundane human situations. Moreover, the few other patterns I have found thus far are also rooted in simple physical principles or other directly experienced phenomena. However, since the current study is only in its initial stages, the hypothesis that metaphorical inference predominates in human cognition retains the status of a conjecture, pending additional investigation. I would say that the weight of the evidence is as yet insufficient to tip the academic scales.

4. Requirements on a computational model

Metaphorically-based general patterns of inference do not appear confined to naive reasoning in mundane situations. Gentner [7] and Johnson [8] have argued the significant role that metaphor plays in formulating scientific theories. In our preliminary investigations, Larkin and I [11] have isolated general inference patterns in scientific reasoning that transcend the traditional boundaries of a science. For instance, the notion of equilibrium (of forces on a rigid object, or of ion transfer in aqueous solutions, etc.) is, in essence, a more precise and general formulation of the balance metaphor. Reasoning based on recurring general inference patterns seems to pervade every aspect of human cognition. These patterns encapsulate sets of rules to be used in unison, and thereby bypass the combinatorial problems in traditional rule-based deductive inference. The inference patterns are frozen from experience and generalized to apply in many relevant domains.

I have started working on a computational model that acquires and generalizes recurring inference patterns from prior experience [6], but let us focus on the equally basic issue of how such patterns may be used in the reasoning process. Conceptually, the process may be divided into three stages:

1. Index the relevant inference patterns appropriate to the situation at hand. The establishment of the appropriate metaphor is the really difficult part. This is why it is much easier to understand someone's description of observed or experienced events (the metaphor is explicitly referenced by the choice of words), than to generate appropriate action -- the typical distinction between planning and plan [...].

2. Instantiate the inference patterns in the specific situation. Computationally, the process of instantiation and the process of searching for appropriate inference patterns are two aspects of the same mechanism.

3. Carry out the inferences stipulated in the retrieved patterns, and check whether additional inference patterns are invoked as a result of the expanded knowledge state.

At the present stage in the investigation, I am searching for general inference patterns and the metaphors that give rise to them, both in mundane and in scientific scenarios. As these patterns are discovered, they are cataloged according to the situational features that indicate their presence. The basic metaphor underlying each inference pattern is recorded along with exemplary linguistic manifestations. The internal structure of the inference patterns themselves is simple to encode in an AI system. The difficulty arises in connecting them to the external world (i.e., establishing appropriate mappings) and in determining the conditions of applicability for each inference pattern (which are more accurately represented by continuous functions than simple binary tests). For instance, it is difficult to formulate a general process capable of drawing the mapping between the "weight" of a hypothetical object and the corresponding aspect of the non-physical entity under consideration, so that the balance inference pattern may apply. It is equally difficult to determine the degree to which this or any other inference pattern can make a useful contribution to novel situations that bear sufficient similarity to past experience [4].

5. Future Directions

If one lends credence to the metaphorical reasoning hypothesis, several avenues of continued research suggest themselves.

• Continue the development of a computational model to test the theory of metaphorical inference and thereby force a finer-grain analysis of the phenomenon.

• Examine the extent to which linguistic metaphors reflect underlying inference patterns. The existence of a number of generally useful inference patterns based on underlying metaphors is not incompatible with the possibility that the vast majority of metaphors remain mere linguistic devices, as previously thought. In essence, the existence of a phenomenon does not necessarily imply its universal presence. This is a matter to be resolved by more comprehensive future investigation.

• Investigate the close connection between models of experiential learning and metaphorical inference. In fact, my earlier investigation of patterns of analogical reasoning in learning problem solving strategies first suggested that the inference patterns that could be acquired from experience coincide with those underlying many common metaphors.

• Exploit the human ability for experientially-based metaphorical reasoning in order to enhance the educational process. In fact, Sleeman and others have independently used the balance metaphor to help teach algebra to young or learning disabled children. Briefly, a scale is viewed as an equation, where the quantities on the right and left hand sides must balance. Algebraic manipulations correspond to adding or deleting equal amounts of weight from both sides of the scale, hence preserving balance. First, the child is taught to use the scale with color-coded boxes of different (integral) weights. Then, the transfer to numbers in simple algebraic equations is performed. Preliminary results indicate that children learn faster and better when they are able to use explicitly this general inference pattern. I foresee other applications of this and other metaphorical inference patterns in facilitating instruction of more abstract concepts. The teacher must make the mapping explicit to the student in domains alien to his or her past experience. As discussed earlier, establishing and instantiating the appropriate mapping is also the most problematical phase from a computational standpoint and therefore should correspond to the most difficult step in the learning process.

6. References

1. Burstein, M. H., "Concept Formation Through the Interaction of Multiple Models," Proceedings of the Third Annual Conference of the Cognitive Science Society, 1981.
2. Carbonell, J. G., "A Computational Model of Problem Solving by Analogy," Proceedings of the Seventh International Joint Conference on Artificial Intelligence, August 1981, pp. 147-152.
3. Carbonell, J. G., "Metaphor: An Inescapable Phenomenon in Natural Language Comprehension," in Knowledge Representation for Language Processing Systems, W. Lehnert and M. Ringle, eds., New Jersey: Erlbaum, 1982.
4. Carbonell, J. G., "Learning by Analogy: Formulating and Generalizing Plans from Past Experience," in Machine Learning, R. S. Michalski, J. G. Carbonell and T. M. Mitchell, eds., Palo Alto, CA: Tioga Pub. Co., 1982.
5. Carbonell, J. G., "Invariance Hierarchies in Metaphor Interpretation," Proceedings of the Third Meeting of the Cognitive Science Society, August 1981, pp. 292-295.
6. Carbonell, J. G., "Acquiring Problem Solving Skills by Analogy," Proceedings of the Second Meeting of the American Association for Artificial Intelligence, 1982.
7. Gentner, D., "The Structure of Analogical Models in Science," Tech. report 4451, Bolt Beranek and Newman.
8. Johnson, M., "Metaphorical Reasoning," 1982. Unpublished manuscript.
9. Kolodner, J. L., Retrieval and Organizational Strategies in Conceptual Memory: A Computer Model, PhD dissertation, Yale University, Nov. 1980.
10. Lakoff, G. and Johnson, M., Metaphors We Live By, Chicago University Press, 1980.
11. Larkin, J. H. and Carbonell, J. G., "General Patterns of Scientific Inference: A Basis for Robust and Extensible Instructional Systems," 1982. Proposal to the Office of Naval Research.
12. Lebowitz, M., Generalization and Memory in an Integrated Understanding System, PhD dissertation, Yale University, Oct. 1980.
13. Ortony, A. (Ed.), Metaphor and Thought, Cambridge University Press, 1979.
14. Schank, R. C., "Reminding and Memory Organization: An Introduction to MOPS," Tech. report 170, Yale University Comp. Sci. Dept., 1979.
15. Schank, R. C., "Language and Memory," Cognitive Science, Vol. 4, No. 3, 1980, pp. 243-284.

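The balance inference pattern described in Section 3 of the preceding paper can be made concrete with a short sketch. The Python fragment below is only an illustration under stated assumptions, not the computational model under development: the function names `balance` and `infer_other_side`, the side labels "A" and "B", and the 0.25 threshold separating a slight tilt from a heavy one are all hypothetical choices made for this example.

```python
def balance(weights):
    """Forward use of the balance pattern (illustrative sketch).

    `weights` holds signed quantities: positive values load side "A",
    negative values load side "B" (a labeling assumed for this example).
    Returns (favored_side, qualitative_degree).
    """
    weights = list(weights)
    side_a = sum(w for w in weights if w > 0)
    side_b = -sum(w for w in weights if w < 0)
    if side_a == side_b:
        return ("neither", "in balance")
    favored = "A" if side_a > side_b else "B"
    # Relative imbalance in (0, 1]; the 0.25 cut-off is an arbitrary assumption.
    ratio = abs(side_a - side_b) / (side_a + side_b)
    degree = "slightly tipped" if ratio < 0.25 else "heavily tipped"
    return (favored, degree)


def infer_other_side(known_weight, known_side_won):
    """Reverse use: bound the unknown side's weight from the outcome.

    Returns a (low, high) interval for the weight on the other plate.
    """
    if known_side_won:
        return (0, known_weight)          # the other side must be lighter
    return (known_weight, float("inf"))   # the other side must be heavier


if __name__ == "__main__":
    # An impeccable record (+3) and a weak prosecution witness (-1).
    print(balance([3, -1]))
    # Georgia won scoring only 17 points: bounds on Alabama's score.
    print(infer_other_side(17, True))
```

In this sketch the reverse inference returns only an interval, which matches the qualitative flavor of the Georgia and Alabama example: a win with 17 points implies that the opponent's score was below 17, without fixing its value.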
Metaphors for Marriage in Our Culture
Naomi Quinn
Duke University

Lakoff and Johnson, in Metaphors We Live By (1980), frequently allude to metaphors "in our culture." This paper explores the way in which culture can be said to constrain metaphorical thinking in one domain, that of American marriage. It undertakes systematic analysis of a sample of metaphors used by 22 American interviewees, spouses in 11 marriages, over an average of 15-16 hour-long interviews per individual. Superficially, the particular metaphorical expressions used by a given individual would seem to vary widely. But these expressions can be shown to cluster around one or another of a small number of underlying metaphorical models to which that individual consistently returns. Thus a husband who conceptualizes his marriage as BUILDING A DURABLE PRODUCT is able to express this underlying metaphorical model in terms of a number of different concrete products or general types of products, sometimes switching from product to product in a single utterance: a metal in "we forged a lifetime proposition," an unspecified construction of the sort one might build in one's home workshop in "marriage is a do-it-yourself project," something capable of structural improvement in "our marriage was strengthened," an edifice in "we made that the cornerstone," once again an edifice and then something made out of a malleable material, perhaps clay, in "they had a basic solid foundation in their marriages that could be shaped into something good," and something like a car built out of cannibalized parts, which then takes on the properties of a chemical such as epoxy glue in "we have both looked into the other person and found their best parts and used these parts to make the relationship gel." Another husband who conceptualizes marriage as TRAVEL sometimes speaks of the marriage as a train or trolley capable of "getting off the track," other times as foot travel in "he's running the same path I was before I got married," and "If I weren't married I'd be running down the same line," and still elsewhere as some kind of maneuverable vehicle in "I just observe others' marriages and try to run mine down the middle."

Moreover, analysis reveals that the vast majority of these stable metaphorical models themselves fall into two broad classes: metaphors of marriage as some kind of effortful activity (e.g., WORK, BUILDING A DURABLE PRODUCT, A QUEST, AN INVESTMENT, GROWTH, A STRUGGLE, A JOURNEY, TRAVEL) or marriage as some kind of dual relationship (e.g., A PARTNERSHIP, TWO PATHS CROSSING, MUTUAL PARENTING, BEING A UNITED FRONT, BEING A PAIR, BEING ONE PERSON, BEING A COUPLE, A SPATIAL RELATIONSHIP). Thus the superficially variable metaphors which interviewees employ can be seen to be highly constrained. What constrains them is apparently some kind of (still deeper) underlying folk theory about the nature of the marital relationship, which says that such an enduring attachment between two people takes effort to achieve or insure.

Individuals are free to conceptualize their marriages in terms of any kind of experience drawn from either or both of the two classes, dual relationship and effortful activity. They may also choose to foreground metaphor from one of these to the neglect of the other, understanding their marriage primarily or entirely in terms of the nature of the relationship they have, or alternatively, in terms of the effort involved in sustaining that relationship. However, given individuals are very likely to employ metaphorical models of both classes. When this is done, the metaphorical model selected from the class of effortful activities matches an entailment of the metaphorical model which characterizes the nature of the relationship. The mapping is one of goal implementation: that is, given some entailment of the relationship conveyed in one metaphorical model, how can such an entailment plausibly be implemented? Thus, for example, the husband with the model of marriage as BUILDING A DURABLE PRODUCT (an effortful activity) is implementing, in this enterprise, the goal of making a permanent marriage. Permanence is entailed by BEING A COUPLE (a dual relationship), a metaphorical model central in his thinking about the nature of his marriage and others he knows (Quinn 1981). For this husband, being a couple entails being permanently coupled together, hence "durably built."

The husband who regards marriage as TRAVEL (an effortful activity) means, by keeping his marriage on the track and running it down the middle, that he regulates the proportion of time he and his wife spend together and apart, or as he puts it, the proportion of time their "paths run together" and "run apart." His metaphorical model for their marital relationship is one of TWO PATHS CROSSING (a dual relationship). Achieving a balance between "crossed" time and separate time, which he views as the central entailment of marriage as TWO PATHS CROSSING, he then conceptualizes in terms of the necessity to stay on the path, keep on the track, and steer a correct course.

Still another husband views his marriage as a SPATIAL RELATIONSHIP (a dual relationship) in which spouses must be "pretty clear where each of us are" in some kind of uncharted territory, and "try to get a good sense of where we are" or else "we might satellite far enough away so we're not sure what's in between us" in what appears to be outer space. If two people are constantly shifting position vis-a-vis one another, as in this SPATIAL RELATIONSHIP model of marriage, the overriding concern is to keep in contact. This is met, by this husband, with an INQUIRY model (effortful activity), which involves "space to kind of work out where each of us were," "a lot of searching," "miles of talking," and "communicating." These husbands all seem to agree that marriage requires some effortful activity. The particular effort required depends upon a prior conceptualization of the nature of the relationship itself. In each case, the metaphorical model of the relationship is problematic in some entailment, the problem solution becomes a goal of the marriage, and a solution for this marital goal is couched in a further metaphorical model. Whether this view of marriage as problem and solution is universal cross-culturally, or whether it is a distinctively American way of viewing marriage, perhaps certain other relationships, and even other aspects of life, I can only speculate.

What do these observations suggest for a theory of metaphorical understanding? First, ongoing metaphorical understandings are relatively stable
and these stable understandings are based in underlying metaphorical models. Metaphorical expressions which instantiate a given model can be varied at will to take advantage of different properties of concrete objects or events. Thus MARRIAGE IS [BUILDING A DURABLE PRODUCT allows a husband to] understand his marital experience in terms of cornerstones, reassembled parts, and the gelling process with equal facility.

Second, these underlying metaphorical models themselves cannot be anything at all. They are constrained to members of those classes which are culturally appropriate source domains for the target experience (to use Carbonell's [1981] terms). Members of a culture share knowledge of these appropriate source domains. A considerable economy of learning and memory is achieved in this organization of cultural knowledge by metaphorical class that would be lost if target experiences were assigned directly to culturally permissible metaphors or metaphorical models. Within classes, an individual has latitude in selecting whatever metaphorical model does the best job of characterizing, for that person, the target experience.

Third, underlying metaphorical models cannot be studied in isolation from one another. In ongoing understanding, they frequently bear relationships to one another. Here I have given an example of metaphorical models which are mapped onto entailments of other metaphorical models by way of goal implementation. Elsewhere, Johnson (1982) has provided a hypothetical example of a different kind of mapping, which we might distinguish as substitution. While metaphorical models, as I have claimed here, are relatively stable understandings of experience, it often happens that one such model ceases to adequately capture experience for its user. Another model which shares multiple entailments with the earlier one may then be substituted for it; the shared entailments serve as bridges. If time allowed, I would give additional examples of such substitution from my material. For instance, the husband who conceptualized marriage as BEING A COUPLE felt that he and his wife were growing closer over time, and spoke of his more recent marital experience as BEING ONE PERSON. Given goal implementation, substitution, and other possible relationships between metaphorical models, it becomes critical to study the understanding process in the context of life story discourse.

Carbonell, Jaime, Jr.
1981 Metaphor: an inescapable phenomenon in
natural language comprehension. Unpub-
lished ms.
Johnson, Mark
1982 Metaphorical reasoning. Unpublished ms.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago: University
of Chicago Press.
Quinn, Naomi
1981 Marriage is a do-it-yourself project: the
organization of marital goals. In
Proceedings of the Third Annual Conference
of the Cognitive Science Society, Berkeley,
California, August 19-21. Pp. 31-40.

Metaphoric Gestures
David McNeill
University of Chicago

An analysis of the internal structure of saying, for example,
"that's ok" as based on pointing and saying "ok," suggests a form in
which this (virtual) action could be expressed—namely, as a pointing
gesture. Such a gesture would be regarded as a second manifestation
of the internal action structure—the utterance of "that's ok" being
the first (first in communicative importance). The same can be said
of other utterances. The externalization of action structures in
gestures offers a way of studying the internal organization of
language actions that is separate from speech. The gesture and the
speech can be compared in a relationship that is comparable in its
ability to bring out details to triangulation.

Internal thoughts of actions—manipulations and movements of objects
in the world—seem to play a metaphoric role in language actions. In
producing speech a concept or meaning is shown through a (virtual)
action—this imaginary manipulation or movement of objects. In the
following example the concepts of pursuit and inaccessibility are
presented in a complex gestural image of moving but non-closing
objects. This image immediately presents a global and undivided
picture of the conceptual content, while concurrently the content is
segmented into words and arranged across time in the speech channel
(the fact that the gesture image arises first shows that it is not a
response to the words).

(1) Speech: they um wanted to get
where Anansi was
Gesture: both hands held apart
in the air, right hand
flutters back and forth
(where the underlining
shows the temporal
extent of the gesture).

The synthesis of thoughts on which this language action was based (as
revealed in the gesture) was a (virtual) placement of two objects,
one in motion, but without closure. This image shows directly the
concepts of pursuit and inaccessibility. The utterance of "they
wanted to get where Anansi was" is an expression of the same internal
structure, as numerous detailed parallels of form between the speech
and gesture channels show. For example, the participants (referred to
by the pronoun and proper name) correspond to the two hands (that is,
the gesture was two handed rather than one handed). The two hands
were held apart at spatial extremes, and in the sentence appear at
temporal extremes (rather than together as would have been possible
in a frame such as "the sons [coreferent of "they"] and Anansi
couldn't get together"); one participant is not in motion, and in the
sentence is referred to in a stative locative construction ("where
Anansi was"); the other participants are in motion, and in the
sentence are referred to by the subject of a verb of motion ("they
wanted to get"); and the motion of these participants in the gesture
was of small extent and ineffectual, and in the sentence are referred
to by the subject of the verb "want." All of these parallels are
explicable if the gesture and utterance were joint manifestations of
the same internal structure—a synthesis based on the idea of
placement and movement of objects. This idea is a metaphor for
pursuit and inaccessibility. (It is well to remind ourselves that the
relationship between the structure of language actions and that of
language objects—these being two completely different perspectives—is
anything but clear; therefore it is not particularly interesting to
ask how thoughts based on actions such as placement of objects
translate into deep structures or other linguistic object
configurations.)

Gesture evidence reveals a very widespread use of metaphoric thinking
in performing language actions in which thoughts related to actions
are used to show meanings of a non-action kind.

Mathematics discussions are accompanied by a flow of gesture which
shows mathematical ideas in the form of actions. The mathematical
meaning of a dual is that each concept is replaced by its converse;
for example, the dual of upward is downward. The following examples
(2-4), taken from non-consecutive places in a technical mathematics
discussion, each contain a gesture in which a hand rotates through
the air from one orientation to the opposite orientation; the
gestures therefore show the concept of a dual in the action realm.

(2) Speech: this gives complete
duality
Gesture: right hand palm rotates
upward
(3) Speech: when you dualize
Gesture: right hand palm rotates
downward
(4) Speech: the powers of x kind of
give a dual
Gesture: right hand palm rotates
front to back

Another mathematical concept is that of a limit, and in the following
examples the hands move toward some boundary marked by the other hand
or a sudden stop; thus these gestures also are images of a
mathematical concept in the action realm.

(5) Speech: it's an inverse limit
Gesture: right hand flattens;
left hand moves up to
right hand
(6) Speech: the inverse limit
of... (trails off)
Gesture: right hand goes down,
then up as to a
(7) Speech: which is a limit, a
Gesture: right hand moves down,
then up as to a
Example (5) also included a second
gesture that showed the concept of an
(5') Speech: it's an inverse limit
Gesture: right hand moves in a
tight loop
The concept of finiteness is shown by
enclosing or pinching down on a space by
curling the fingers and hands; thus here
too is a mathematical concept in the
action realm.
(8) Speech: through the finite
Gesture: fingers curl inward
(9) Speech: to get the finite group
Gesture: fingers curl inward
(10) Speech: some finite group
Gesture: forms a two handed
bounded shape with
palms facing and
fingers curled
A rule of gesture production is that
new movements indicate changes of meaning;
and so a gesture can indicate the emergence
in discourse of a new element of meaning
("information focus"). Thus in (2), for
example, the new element was the concept of
duality, and the other examples can be
interpreted in a parallel way.
Utterances are structured to make
salient the same elements of meaning. This
is another parallel that suggests a common
source for gesture and speech. In (2),
"that gives complete duality" was
structured and pronounced to achieve the
same effect as the gesture: reference to
the concept of duality was held off until
the final sentence position (the position
of the rheme) where it was given main
stress, and was introduced in full lexical
form. On the other hand, the sentence
topic was announced first with a pronoun,
and was weakly stressed. The transitive
sentence form also enhanced the information
focus of duality. Internally the model for
(2) seems to have been that something (the
sentence topic) was pushing forward the
example the gesture demonstrated of duality
(hence the use of "gives").
by George Lakoff
University of California at Berkeley

Johnson and I (1980) have argued that metaphors are essentially
conceptual in nature, rather than linguistic, and that a metaphor
provides a way of understanding one kind of thing in terms of
another. Since we base our actions in our understanding, and since
actions are real, it follows that reality, especially on the social,
interpersonal, and emotional domains, is structured according to our
metaphors. Though I think this is an essentially correct view, it is
certainly oversimplified and in need of more detailed study.

Here are some of the questions that I think need to be answered:
- Which conceptual metaphors do we live by, and which do we use
"merely" for the sake of understanding?
- What does it mean to live by a metaphor?
- Which metaphors do we believe, and which don't we believe?
- Are some metaphors more essential than others in defining a
concept?
And finally:
- To what extent does a given metaphor "create" the structure of a
concept it defines, and to what extent does it merely "decorate" an
already given structure?

In addition, I will review very briefly some results on image-based
metaphorical concepts. These results suggest to me that even spatial
understanding may not be universal.

S y m p o s i u m — C o n t r o l o f A r m s
Peter H. Greene
Computer Science Department
Illinois Institute of Technology
Chicago, Illinois 60616

How can our nervous systems control all the variables needed to guide
our arms? How can we represent the abstract pattern of an action such
as handwriting so that it may be realized in any of an infinity of
variants—large, small, horizontal, vertical, with the hand weighted,
or even by holding the pencil stationary and moving the paper? Most
researchers who try to represent movements of artificial arms by
means of computer programs have chosen, in the interest of supposed
computational simplicity, to use the smallest number of "degrees of
freedom", or independent joint movements that will allow desired hand
movements. I will discuss the opposite idea: namely, the idea that a
large number of "redundant" degrees of freedom, when used in the
style that I will discuss, can simplify the control task, in that, if
there are enough ways of moving, a recipe involving just a few of
them can usually be found that will approximate any desired movement.
In particular, the presence of "redundant" degrees of freedom allows
us to rely more on ballistic (free-swinging) movements than is
generally done in research on artificial arms, so that physics,
rather than computation, accounts for much of the trajectory.
Computations are required to set up the constraints defining and
initializing a low-degree-of-freedom "virtual arm" in such a way that
a satisfactory ballistic movement exists. Thus, a complicated
physical arm behaves like a family of easily controlled virtual arms.

Among the points to be discussed: Using momentum saves energy, and it
simplifies control. Movements may be controlled by sending new
parameters to systems that control the muscles, rather than by
controlling the muscles directly. The principal task in tracking a
moving object, rather than being to minimize instantaneous errors,
may be to synchronize an internal pattern generator to the movement.
The present style of control identifies similar movements as cousins,
rather than regarding them as unrelated computations. The same overt
muscle movement can be more or less difficult, depending upon the
higher up patterns in this hierarchy from which it is derived.

(See Greene, P.H. (1972), Problems of organization of motor systems,
in Rosen, R. and Snell, F.M., eds., Progress in Theoretical Biology,
Vol. 2, New York: Academic Press and Greene (1982), Why is it easy to
control your arms? Journal of Motor Behavior, to appear.)
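
The claim that, with enough ways of moving, "a recipe involving just a few of them" can approximate a desired movement can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the control style described in the abstract: the arm is a hypothetical planar chain of revolute joints, the function names (`hand_position`, `joint_effects`, `two_joint_recipe`) are invented here, and the recipe is found by a simple search over joint pairs using a first-order (small-movement) approximation.

```python
import itertools
import math


def hand_position(angles, lengths):
    """Forward kinematics of a planar chain; angles are relative joint angles in radians."""
    x = y = heading = 0.0
    for angle, length in zip(angles, lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


def joint_effects(angles, lengths):
    """Hand displacement (dx, dy) per unit rotation of each joint taken alone."""
    hx, hy = hand_position(angles, lengths)
    effects = []
    for j in range(len(angles)):
        # Position of joint j is the tip of the chain made of the links before it.
        px, py = hand_position(angles[:j], lengths[:j])
        # Rotating about (px, py) moves the hand perpendicular to the lever arm.
        effects.append((-(hy - py), hx - px))
    return effects


def two_joint_recipe(angles, lengths, desired):
    """Assumed illustration: choose the pair of joints that produces the small
    hand displacement `desired` with the least total joint rotation."""
    effects = joint_effects(angles, lengths)
    ex, ey = desired
    best = None
    for i, j in itertools.combinations(range(len(angles)), 2):
        (a, c), (b, d) = effects[i], effects[j]
        det = a * d - b * c
        if abs(det) < 1e-9:
            continue  # this pair cannot move the hand in two independent directions
        qi = (d * ex - b * ey) / det
        qj = (a * ey - c * ex) / det
        cost = qi * qi + qj * qj
        if best is None or cost < best[0]:
            best = (cost, i, j, qi, qj)
    if best is None:
        raise ValueError("no usable pair of joints for this posture")
    _, i, j, qi, qj = best
    new_angles = list(angles)
    new_angles[i] += qi
    new_angles[j] += qj
    return new_angles


if __name__ == "__main__":
    angles = [0.3, 0.4, -0.2, 0.5, 0.1]   # five joints: more than the two needed in a plane
    lengths = [1.0, 1.0, 1.0, 0.5, 0.5]
    moved = two_joint_recipe(angles, lengths, (0.01, 0.0))
    print(hand_position(angles, lengths), hand_position(moved, lengths))
```

With five joints there are ten candidate pairs, so a usable low-effort pair is easy to find; with only two joints there would be exactly one recipe, and it could be a poor one near a singular posture. The sketch does not model ballistic movement or momentum.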

Internal Directional Reference Frames for
Motor Coordination

C.C. Boylls
Rehabilitative Engineering
Research and Development Center
Palo Alto Veterans Administration Medical Center
Palo Alto, California 94304

Several decades ago, Graham Brown (11) found that the spontaneous
walking of a high-decerebrate cat can be continuously transformed
from rectilinear locomotion into either circling or uphill/downhill
progression by appropriate changes of head position. The cat's
performance thus carries with it an attribute of "spatial
directionality" which can be independently regulated by the CNS; and
the method of regulation relies, in this instance, upon postural
biases created by tonic neck and labyrinthine reflexes.

Recently, experiments using decerebrate cats similar to Graham
Brown's have indicated that activity within the olivocerebellar
system of the brainstem is associated with postural alterations
resembling those elicited from neck and labyrinths (4,5). These, too,
bias the locomotor musculature so as to influence the overall
directionality of walking in a wide variety of ways. However, there
is one area in which the directional control exerted by the
olivocerebellar system differs considerably from that seen by Graham
Brown: It has "memory", in that a posture adopted by an animal as a
function of olivocerebellar activity is retained for many tens of
seconds after that activity ceases. By contrast, the postures of
Graham Brown's animals reflect only the current position of the head,
without any apparent recollection of previous positions. The
directional skews associated with head movement can thus be changed
in "real time" from step to step, while olivocerebellar skews
establish an enduring postural context within which many steps (or
other activities) may occur. It thus is tempting to hypothesize that
the olivocerebellar system exists in the CNS to regulate, via
postural mechanisms, an internal directional reference frame within
which motor actions are elaborated and, perhaps, evaluated. But then,
why should such a faculty exist?

The idea of an internal directional reference for movement was first
derived theoretically from consideration of CNS mechanisms to
simplify the controllable degrees of freedom in the skeletomotor
system (2,8; P.H. Greene, this volume). The technique for doing this
is to create functional dependencies (e.g., fixed ratios) amongst
movement parameters affecting different joints, as is frequently
encountered experimentally (10,9). One particular form of functional
dependence employs so-called "muscle linkages" (3) of synergists at
different joints, the activities of which covary in some prescribed
manner (cf., ref. 12 for experimental examples). Actions carried out
with such a linkage are characterized by a distinct directional skew
that becomes quite apparent as the covarying parameters of the
linkage are altered. Graphic illustrations of such a process may be
seen in, for instance, Graham Brown-like changes in the coactivation
of human leg musculature (elicited with galvanic labyrinthine
stimulation) as a continuous function of neck position (13).
Consequently, one might well see olivocerebellar directional biasing
as just another way to parameterize muscle linkages and simplify the
motor control process. But this would provide no facile [rationale]
for the extended time-course of such biasing, nor would it define the
conditions that presumably spur the olivocerebellar system into
establishing a particular directional reference frame.

A speculative approach to the last question is suggested by
neurophysiological studies of the olivocerebellar system and its role
in regulating eye movement (ref. 1 for review). In brief, activation
of the appropriate (anatomically) part of the system institutes a
seconds-long nystagmus of the eyes seemingly equivalent to the
olivocerebellar postural biasing of the skeletal muscles described
above. This nystagmus also resembles the phenomenon of optokinetic
after-nystagmus (OKAN) that occurs in humans and animals following
exposure to whole-field motion of the visual world. It may come as no
surprise, therefore, that the olivocerebellar system has proved to
receive retinal image-motion cues which are nearly optimal for
optokinetic eye movements. What is more interesting is that, in
stationary human subjects, the development of OKAN is associated with
illusory sensations of self-motion or "vection", which, in darkness
following exposure to the moving visual stimulus, persist for
prolonged periods of time (7). The rationale for this persistence, or
"memory", would appear to involve an appreciation for momentum: The
subject feels accelerated to some velocity by the moving visual
world, and has no reason to feel decelerated when that world is no
longer visible. While appropriate studies seem not to have been done,
it seems reasonable to suppose that humans and animals experiencing
vection will alter their motor behavior as a function of this
sensation, just as they would were they experiencing actual
self-motion. Because of the long time-course associated with vection,
such motor adjustments will likely take the form of "static" postural
biasing altering the directionality of movement. Might this be the
sort of directional skewing produced by the olivocerebellar system?
Might the perception of self-motion along particular trajectories be
associated with the creation of olivocerebellar directional reference
frames for movement?

The arguments above have been helped somewhat by the demonstration
that vection sensations (and accompanying "OKAN") can be released by
proprioceptive cues from the limbs (6) — which, besides providing a
role for the massive somatosensory input to the olivocerebellar
apparatus, indicates that self-motion cues derive from multisensory
processing. Those cues probably also owe themselves to knowledge of
efferent command signals, since the quality of self-motion illusions
depends upon a subject's assumptions about movement he or she is
producing voluntarily. Fortunately, it appears possible to go back to
Graham Brown's cat and its olivocerebellar system to see whether the
directional skewing it participates in can be triggered by those
conditions leading to vection in humans. Work is now underway toward
that end.

References
1. Barmack, N.H. Immediate and sustained influences of
visual olivocerebellar activity on eye movement.
In: Talbott, R.E., and Humphrey, D.R. (Ed.),
Posture and Movement. Raven (New York), 1979:
2. Bernstein, N.A. On the Construction of Movement,
Medgiz (Moscow, 1947).
3. Boylls, C.C. A theory of cerebellar function with
applications to locomotion. II. The relation of
anterior lobe climbing fiber function to locomotor
behavior in the cat. In: COINS Technical Report
75-1, Dept. Computer & Information Sciences, Univ.
Massachusetts (Amherst), 1976.
4. Boylls, C.C. Prolonged alterations of muscle
activity induced in locomoting premammillary cats
by microstimulation of the inferior olive. Brain
Res., 197fi, 195: 445-450.
5. Boylls, C.C. Contributions to locomotor
coordination of an olivocerebellar projection to
the vermis in the cat: Experimental results and
theoretical proposals. In: Courville, J., Lamarre,
Y., and de Montigny, C. (Ed.), The Inferior Olivary
Nucleus: Anatomy and Physiology, Raven (New York),
1980: 321-348.
6. Brandt, T., Buchele, W., and Arnold, F.
Arthrokinetic nystagmus and ego-motion sensation.
Exp. Brain Res., 1977, 30: 331-338.
7. Brandt, T., Dichgans, J., and Koenig, E.
Differential effects of central versus peripheral
vision on egocentric and exocentric motion
perception. Exp. Brain Res., 1973, 16: 476-491.
8. Greene, P.H. Problems of organization of motor
systems. Prog. Theor. Biol., 1972, 2: 304-338.
9. Kelso, J.A.S., Holt, K.G., Rubin, P., and Kugler,
P.N. Patterns of human interlimb coordination
emerge from the properties of non-linear, limit
cycle oscillatory processes: Theory and data. J.
Mot. Behav., 1981, 13: 226-261.
10. Lacquaniti, F., and Soechting, J.F. Coordination
of arm and wrist motion during a reaching task. J.
Neurosci., 1982, 2: 399-408.
11. Lundberg, A., and Phillips, C.G. T. Graham Brown's
film on locomotion in the decerebrate cat. J.
Physiol. (Lond), 1973, 231: 90P-91P.
12. Nashner, L.M. Fixed patterns of rapid postural
responses among leg muscles during stance. Exp.
Brain Res., 1977, 30: 13-24.
13. Nashner, L.M., and Wolfson, P. Influence of head
position and proprioceptive cues on short latency
postural reflexes evoked by galvanic stimulation
of the human labyrinth. Brain Res., 1974, 67:

Conscious and unconscious components
of intentional control.
Bernard J. Baars and Diane N.
State University of New York at Stony Brook.

How is intentional action controlled? Other papers in this symposium
provide evidence for a style of motor control in which executives
issue very general commands, which are interpreted "distributively"
by intelligent specialized sub-systems, which are sensitive to local
context. Likewise, there are classical suggestions that conscious
components of intentional control serve an executive function, but
without controlling motor systems in great detail: instead, the
sub-systems controlling actions interpret very simple conscious
contents intelligently, with a view to local context (James, 1890).
We suggest that there is much to be said for this view of intentional
control; further, this view fits a conception of conscious processes
advanced by Baars (in press), suggesting that conscious
representations are global, coherent, and informative in a nervous
system consisting of distributed specialists which control all
information processing details (Figure 1).

Table 1: Capability constraints on a theory of conscious contents.

   Conscious processes             Unconscious processors

1. Computationally                 Highly efficient in
   inefficient.                    specialized tasks.
2. Great range &                   Limited domains &
   relational capacity.            relative autonomy.
3. Apparent unity,                 Very diverse, parallel,
   seriality, & limited            and together have
   capacity.                       great capacity.

Table 1 shows a set of widely-accepted facts about conscious vs.
unconscious processes which fit this general view. Like conscious
processes, entirely global processes are computationally inefficient
because they require the cooperation or tacit consent of many other
processes to remain global. They have great range of possible content
since any specialist, or set of specialists has potential access to
the global data base, and great relational capacity, for the same
reason. Global representations, like conscious contents, have
apparent unity because internal contradictions would imply
competition between different processors, which would destabilize the
global representation; hence any competing representations must be
displayed serially, and the global component would seem to have
limited capacity. Similarly, the unconscious processors of Table 1
resemble the specialized processors of Figure 1. Though this is only
a first-approximation model of conscious vs. unconscious activity, it
will serve as a basis for approaching conscious vs. unconscious
components of intentional activity.

[Figure 1: Globally distributed information (conscious) and
specialized systems (unconscious).]

Note that conscious contents are globally available, but most
detailed information processing is performed locally by a large set
of specialized, distributed processors. The specialized processors
maintain the processing initiative.

Now consider the facts shown in Table 2 about contrasts between
conscious and unconscious aspects of intentional activities.

Table 2

Conscious components               Unconscious components (*)

Problem assignment                 Problem incubation
Problem solution ("Aha!")
Goal representation                Goal execution
Goal feedback                      Open-loop adjustment
                                   of future actions.
Biofeedback signal                 System controlling
Seriality of non-                  Parallelism of automatic
automatic tasks.                   tasks.
Stimulus for reflexes              Detailed control of
and externally-driven              reflexes and automatic
automatic tasks.                   tasks.
Intentional modulation
of reflexes and automatic
tasks.

(*) Some of these may be momentarily conscious,
but too briefly to be retrievable subsequently.

Note first that in classical problem-solving tasks, the stage of
problem-assignment — the accumulation of constraints on a possible
solution — is conscious; however, all the detailed processes working
toward a solution operate unconsciously, while the solution itself
[becomes] conscious unexpectedly, as an "Aha!" experience. In
intentional problem solving, the very fact that a goal is made
conscious serves to trigger unconscious systems able to contribute to
this goal. This fits the rough model of Figure 1, since distributed
specialists can be triggered by a global display of a goal. These
specialists then work locally on a solution, and can return a
solution to the global display when they reach it. In the classical
problem-solving case, the differences between conscious and
unconscious parts are quite obvious; however, much the same
components also operate in other cases of intentional control, where
they may occur much more quickly and less discretely.

For example in biofeedback training, a conscious feedback signal is
triggered by an otherwise unconscious neural process. In itself, this
is sufficient for intentional control of the unconscious process to
develop. The model suggests that the feedback is "broadcast"
globally, throughout the nervous system, so that one subsystem out of
many millions that can control the feedback can "decide" to act
whenever the feedback occurs. In this fashion, sensory feedback can
come to control otherwise totally unrelated neural processes: thus, a
feedback click can come to control a single motor unit (Basmajian,
1963), and the taste of saccharin can come to elicit suppression of
immune function (Ader & Cohen, 1982).

The "executive ignorance" of conscious processes is not limited to
new or exotic intentional control tasks. William James (1890), among
others, has pointed out that "we" do not know in any detail how we do
anything. One can account for this ignorance by assuming that we do
not need to know anything: we can just know the goal consciously, and
unconscious but very intelligent specialists will take care of
execution of the goal.

Note also in Table 2 that feedback from an intentional action is
conscious, a fact that presumably permits unconscious improvements in
planning and execution to take place, in preparation for the next
time that the action will be performed. This is especially true if
there is a mismatch between the intended action and its performance.

But the case has so far been oversimplified. In fact, we cannot think
of an action as being controlled by a single goal. Baars & Mattson
(1981) maintain that an intention is indeed a multi-leveled goal
structure, of which only a few goals tend to be conscious. Further,
the multi-leveled [...] be separated into presuppositions of the
[...] and subordinate [...] the conscious goal. [...] all intentional
actions [...] of a continuous mixture of conscious and [...] most
[...] tend to be largely unconscious, while those components that are
new or involve some choice-point may be conscious. Thus, in skilled
typing, we may be conscious of non-routine starting points of action,
of input and output, and of attempts to override, modulate, or
interrupt the typing task. Generally we seem to be unconscious of the
mapping between letters and finger-strokes, of the details of motor
control, and of highly repetitive input or output.

As we acquire proficiency in a task, it tends to become less and less
conscious — in terms of the model, it tends to be consigned to
specialized, autonomous systems with fewer global messages. Schneider
(1980) has found that tasks which are initially slow, serial and
capacity-limited become increasingly fast, parallel and unlimited as
they become automatic with practice. This is almost a perfect
characterization of the difference between global and local processes
in the current model.

Competition:
One of the most important properties of the model is that it permits
competition; there is much reason to think that competition plays a
central role in the control of intentional activity (Norman &
Shallice, 1980). One can imagine a number of different kinds of
competition in this model:
1. Conflicting intentions: intentions may be incompatible. In this,
often the mismatching components seem to become conscious.
2. Conflict between superordinate and subordinate components of a
single intention. This is typically the case with psychopathologies
(see Table 3).
3. Conflict between an intention and its execution. Slips can be
defined as actions that violate the actor's own expectations (Baars &
Mattson, 1981). Slips often become conscious, perhaps because global
broadcasting helps to recouple a previously decoupled goal component,
whose absence permitted the slip to occur.
4. Conflict between intentions and external reality. And of course,
sometimes the means needed to carry out an intention are unexpectedly
unavailable.

Table 3

Perceived intentional vs. unintentional activities.

Intentional:                       Unintentional:
Sense of some                      Sense of no
conscious control                  conscious control (*)

Most ordinary actions,             Actions: compulsions,
thoughts, images, and              undesired habits, slips,
feelings.                          tics, speech defects,
                                   and addictions.
                                   Thoughts and images:
                                   phobic, obsessive,
                                   hallucinatory, anxiety-
                                   provoking, depressive.
                                   Feelings: anxious,
                                   depressive, etc.

Effect of "paradoxical             Resisted unintentional
intention" on unin-                activities.
tentional activities

Success in well-known              Failure in well-known
tasks                              tasks (TOT phenomenon)

Skeletal                           Reflexes, autonomic func-
muscle control                     tions, and automatic pro-
                                   cesses cued externally.

Internally motivated               Externally coerced actions.
actions.                           Actions triggered by direct
                                   brain stimulation (Penfield &
                                   Roberts). Slips induced
                                   experimentally.

Activities whose pace              Activities that are forced
is unforced.                       at a pace faster than normal.

(*) Some processes which do not yield a sense of
conscious control may in fact be triggered by
brief conscious contents that cannot be retrieved.
We suggest the following general conclusions,
based on the material presented in Table 3:
Intentional activities appear to be triggered
by conscious contents. Intentions are violated not
only when the action is unexpected, but also when
the subordinate system appears to resist control
— e.g. when it takes longer to find a certain
word than one expects. This suggests that inten-
tions carry information about the typical duration
and difficulty of a known task. Further, it also
suggests that "mental effort" occurs not as a
function of the complexity of a task, but rather,
as a function of the degree of perceived resis-
tance to the intention, compared to the expected
duration and difficulty of the task. This view
may also help explain the related case of
perceived coercion (a case of unintentional-
ness which is not just a political fact, but
also occurs very often in our educational system).
Such perceived coercion from an outside
source may bring about a great deal of internal
competition between systems attempting to exert
executive control in a way that is insensitive
to the demands of the subordinate system. One
implication is that intentions, too, have their
own "ecology": a successful intention must fit
into the system as a whole, or competition will
occur which will increase the perceived effort
in carrying out the intention.
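
The architecture described in this paper (a global display whose contents are broadcast to many specialized processors, each of which decides locally whether to act) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' model: the class names `Specialist` and `GlobalWorkspace`, the string-valued goals, and the rule that incompatible proposals are logged one after another are all hypothetical choices made here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Specialist:
    """Hypothetical unconscious processor: it holds local know-how and decides
    for itself whether a broadcast goal is its business."""
    name: str
    respond: Callable[[str], Optional[str]]  # returns None when the goal is ignored


@dataclass
class GlobalWorkspace:
    """Hypothetical global display: contents are visible to every specialist."""
    specialists: List[Specialist]
    conscious_log: List[str] = field(default_factory=list)  # serial record of global contents

    def broadcast(self, goal: str) -> List[Tuple[str, str]]:
        # The goal itself is the only thing the "executive" needs to know.
        self.conscious_log.append(goal)
        proposals = []
        for specialist in self.specialists:
            result = specialist.respond(goal)  # detailed work happens locally
            if result is not None:
                proposals.append((specialist.name, result))
        # Assumed rule: incompatible proposals compete, and the conflict is
        # displayed globally one item at a time.
        if len({result for _, result in proposals}) > 1:
            for name, result in proposals:
                self.conscious_log.append(f"conflict: {name} proposes {result}")
        return proposals


if __name__ == "__main__":
    typing = Specialist("typing", lambda g: "press c, a, t" if g == "type cat" else None)
    speech = Specialist("speech", lambda g: "say 'cat'" if g in ("say cat", "output cat") else None)
    writing = Specialist("writing", lambda g: "write 'cat'" if g == "output cat" else None)
    workspace = GlobalWorkspace([typing, speech, writing])
    print(workspace.broadcast("type cat"))
    print(workspace.broadcast("output cat"))
    print(workspace.conscious_log)
```

The caller of `broadcast` never specifies how a goal is executed, which mirrors the "executive ignorance" discussed above; only the goal, and any conflict among proposals, appears in the serial log.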

Baars, B.J. Conscious contents provide the
nervous system with coherent, global information.
In R.J. Davidson, G. Schwartz, & D. Shapiro (eds.)
Consciousness & self-regulation (Vol. 3). N.Y.:
Plenum, in press.
Baars, B.J. & Mattson, M.E. Consciousness and
intention: A framework and some evidence. Cognition
& Brain Theory, 1981, 4(3), 247-265.
Basmajian, J.V. Control and training of indi-
vidual motor units. Science, 1963, 141, 440-441.
James, W. The principles of psychology. N.Y.:
Holt, 1890.
Schneider, W. & Fisk, A.D. Dual task automatic
and controlled processing of temporal and spatial
patterns. (Tech. Rep. 8002) University of Illinois,
February, 1980.

S u b m i t t e d P a p e r s
How do Children Learn to Judge Grammaticality?
A Psychologically Plausible Computer Model
Mallory Selfridge
Department of EE and CS
University of Connecticut
Storrs, CT. 06268

1.0 Introduction children, and learns a subset of the word meaning

and syntax which children learn. After learning,
If a young child is asked whether the sentence CHILD can correctly understand utterances which it
"ball me the throw" sounds "silly" or "ok", chances previously misunderstood, and can generate English
are the child will respond "silly." Encouraged to describing events it "observes."
"fix it up," the child may well generate "throw me CHILD's language comprehension process is a
the ball." Such behavior was reported by Gleitman version of the CA program [4] which incorporates
et al. [7] for children of two-and-a-half and five mechanisms derived from Wilks' [16] preference
years. It implies that by these ages children have parsing. CHILD's analysis process combines Con-
acquired at least some ability to judge a ceptual Dependency (CD) [ U ] word meanings to form
sentence's grammaticality. Further, Gleitman et a CD representing the meaning of the entire
al. report that by age five, children's judgements utterance. Understanding begins when the meanings
increase in sophistication. Thus children's abili- of input words are placed in a short term memory.
ty to judge grammaticality apparently increases CHILD then retrieves semantic requirements associ-
as they learn language. ated with those slots but specific to that parti-
Unfortunately, little is known about the cular word. It searches the short term memory for
mechanisms responsible for the development of such a word meaning which best satisfies those features,
abilities. Pinker [10], reviewing language acqui- and fills the empty slot with that meaning. The
sition models, reports no work in this direction. syntactic features are formed from the positional
Anderson's [1] model of language learning does predicates PRECEDES and FOLLOWS. These relate
not address learning to make grammaticality judge- the position of a candidate slot filler to either
ments. Recent research (e.g. [2,3,8]) on syntac- the word they were stored under, a filler of
tic recognition and learning has not been integra- another slot in that word's meaning, or a lexical
ted into a model of child learning. The question function word. Each slot in the meaning of a
ranains: "how do children leam to make gramnati- word has a collection of features describing where
callty Judgements?" in the input a filler is expected to be. In order
This paper addresses this question by proposing to understand different voices, these features
a three stage model, implemented and tested in the are organized into disjunctive "feature sets."
CHILD program [12,13,14,15]. During stage one Each set characterizes one order in which slot
CHILD knows word meanings but not syntax, and can fillers appear. During understanding, feature set
understand sentences, but cannot tell that word selection Is performed by considering which set
order is incorrect. During the second stage, CHILD most successfully characterizes the input.
has learned active syntax, and notices incorrect CHILD learns syntax by acquiring syntactic
word order for active sentences. During the third features and build dlsiunctlve feature sets. After
stage, CHILD learns passive syntax, and notices having understood an utterance, CHILD examines a
incorrect word order for both active and passive record of the input, and examines the meaning of
sentences, this progression corresponds generally every input word. It then examines every empty
to Gleitman et al.'s finding that as children slot in each such meaning. It accesses the record
learn more language their ability to make grammati- of the input to find where in the input the filler
cality judgements increases. for that slot occurred. It creates a description
child's mechanisms may provide part of the of this position using PRECEDES and FOLLOWS. CHILD
answer to the problem of how children learn to make must then decide whether this description consti-
grammatical!ty judgements of sentences with incor- tutes a new feature set or should be merged with
rect word order. These mechanisms have been an existing feature set. CHILD'S strategy is based
developed to account for a number of different data on a suggestion by Iba [8]. CHILD compares the
about child language learning [l4], and their ex- features extracted from the current input with any
tension to the problem of granmaticality judge- existing feature sets: the position description
ments has been straightforward. The CHILD model is merged with a previous set only if one is a sub-
suggests that children learn to make such Judge- set of the other. Otherwise, the description Is
ments almost entirely as a side-effect of mechan- learned as a new feature set.
isms whose primary function is directed elsewhere. CHILD notices that a sentence is ungrammatical
This paper describes the CHILD program, and pre- if any syntactic features within the selected
2.0 The
sents CHILDoutput.
sample ProgramThe question of learning
feature set characterizing the position of a slot
to make grammatlcality
CHILD is a computer model Judgements is considered,
of the development of filler are not true of the position of that filler.
and several
language predictions and
comprehension are generation
described which may
abilities CHILD uses these features to generate an explana-
written orin deny
and currently running on a tion of
3.0 why thetosentence
Learning was ungrammatical,
Make Grammatlcality and
DEC VAX 11/780. It begins with world knowledge and uses its language
The following generation
example abilities
is edited [6] to
from a complete
language experiences similar to those received by generate a correct version based on the sentence's
understood meaning according Co whatever word mean-
ing and syntax it knows about at that stage of
run of the program during which it learns meaning and syntax for all the words it knows. The example begins after CHILD has learned meanings for "throw," "me," "Child," "Mom," "on," "table," and "ball." For this example, the meaning of "throw" has been simplified from a complex CD into $THROW and processing of "the" has been ignored. As shown below, CHILD is initially given an ordinary sentence which it understands correctly. The second sentence has incorrect word order. CHILD understands this sentence correctly, but fails to notice the incorrect order.

    CHILD hears "throw me the ball"
    CHILD understands
    ($THROW ACTOR (CHILD) OBJECT (BALL1) TO (PARENT1))

    CHILD hears "ball me the throw"
    CHILD understands
    ($THROW ACTOR (CHILD) OBJECT (BALL1) TO (PARENT1))

Transition to the second stage occurs when CHILD learns active syntax for "throw." This occurs when it hears an example sentence whose interpretation is unambiguous, and has heard the word a number of times without modifying its meaning. Given this sentence, CHILD notes the positions of the fillers (summarized in linear order here), and stores them in a feature set under the word "throw."

    CHILD hears: "throw me the ball"
    CHILD's understanding is:
    CHILD learns syntax of "throw"
    order is: $THROW TO OBJECT
    CREATING NEW FEATURE SET

Having learned active syntax for "throw," CHILD uses this knowledge during stage 2 understanding. It notices when the word order of a sentence is incorrect, as the first sentence below shows. CHILD prints out the reasons it thought the sentence was incorrect, and generates a correct version from the understood meaning of the sentence. However, CHILD also decides that a passive sentence with correct order is incorrect, as the second sentence below demonstrates.

    CHILD hears "ball me the throw"
    CHILD understands
    "throw" should precede "ball"
    "throw" should precede "me"
    CORRECTION: "throw me the ball"

    CHILD hears
    "the ball was throw n on the table by Child"
    CHILD understands
    ($THROW ACTOR (CHILD) OBJECT (BALL1)
    TO (TOP VAL (TABLE1)))
    INCORRECT SENTENCE NOTICED:
    "Child" should precede "throw"
    "throw" should precede "ball"
    CORRECTION: "Child throw ball on table"

The transition to the third stage occurs when CHILD learns passive syntax for "throw," and creates a second feature set for the new syntactic features.

    CHILD hears: "the ball was throw n to Mom by
    CHILD's understanding is:
    CHILD learns syntax of "throw"
    order is: OBJECT "was" $THROW "to" TO "by" ACTOR
    ATTEMPTING MERGE OF NEW FEATURES WITH EXISTING SET
    MERGE FAILS
    CREATING NEW FEATURE SET

Once CHILD has learned passive syntax (reported in more detail in [15]), it can then judge passive sentences. It correctly judges the passive sentence which it previously judged incorrect. The second sentence below is an incorrect passive, and CHILD correctly understands it, prints out the reasons it was judged incorrect, and generates its corrected version.

    CHILD hears
    "the ball was throw n on the table by Child"
    CHILD understands

    CHILD hears: "the ball to Mom throw n by Child"
    CHILD's understanding is:
    "the ball" should precede "was"
    "thrown" should precede "to Mom"
    CORRECTION: "the ball was thrown to Mom by Child"

As shown above, CHILD does progress through a series of stages which generally correspond to data reported by Gleitman et al., during which it learns to make increasingly accurate and complex grammaticality judgements. Initially knowing no syntax, all sentences are judged correct. After learning active syntax, it successfully judges active sentences, but judges passive sentences as if they should have been active. Upon learning passive syntax, CHILD judges both active and passive sentences correctly. In the complete run, CHILD also learns to understand noun phrases, prepositional phrases, and adverbial phrases, and learns to make judgements about sentences containing these.

4.0 How Do Children Learn to Make Grammaticality Judgements?

CHILD's answer to this question depends upon a number of factors: a) the representation of language syntax as a set of independent features characterizing the position in the input where a slot filler may occur; b) learning of syntactic features while learning to understand; c) the evaluation of syntactic features and semantic preferences as a necessary part of understanding. Given these mechanisms, children make grammaticality judgements by analyzing syntactic violations occurring during understanding. They generate correct versions of incorrect sentences by applying their language generation ability to the understood meaning of that sentence. Thus children acquire the ability to make grammaticality judgements as a side effect of acquiring syntactic features needed for understanding.
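
The feature-set mechanism described in sections 2.0 and 3.0 can be illustrated with a minimal Python sketch. It is not the CHILD program: the function names, the use of role labels such as "OBJECT" in place of actual slot fillers, and the reduction of all position descriptions to pairwise PRECEDES features are assumptions made only for illustration. The sketch keeps the two rules stated in the text: a new description is merged with an existing set only if one is a subset of the other, and a sentence is judged incorrect when features of the selected set are not true of the input.

```python
from itertools import combinations

def position_features(order):
    """Describe an observed order as PRECEDES features over every pair (assumed encoding)."""
    return frozenset(("PRECEDES", a, b) for a, b in combinations(order, 2))

def learn(feature_sets, order):
    """Merge the new description into an existing set only if one is a subset
    of the other; otherwise store it as a new disjunctive feature set."""
    new = position_features(order)
    for i, existing in enumerate(feature_sets):
        if new <= existing or existing <= new:
            feature_sets[i] = existing | new
            return "MERGED"
    feature_sets.append(new)
    return "CREATING NEW FEATURE SET"

def judge(feature_sets, order):
    """Return the reasons the input is judged incorrect (empty list means ok)."""
    if not feature_sets:
        return []  # stage one: no syntax known, so nothing can be violated
    pos = {item: i for i, item in enumerate(order)}

    def holds(feature):
        _, a, b = feature
        return a in pos and b in pos and pos[a] < pos[b]

    # Select the set that most successfully characterizes the input.
    best = max(feature_sets, key=lambda fs: sum(holds(f) for f in fs))
    return [
        f'"{a}" should precede "{b}"'
        for _, a, b in sorted(best)
        if a in pos and b in pos and pos[a] > pos[b]
    ]

sets = []
active = ["$THROW", "TO", "OBJECT"]
passive = ["OBJECT", "was", "$THROW", "to", "TO", "by", "ACTOR"]
print(judge(sets, ["OBJECT", "TO", "$THROW"]))   # stage one: no violations noticed
print(learn(sets, active))                       # stage two begins
print(judge(sets, ["OBJECT", "TO", "$THROW"]))   # incorrect order is now noticed
print(judge(sets, passive))                      # a correct passive is judged incorrect
print(learn(sets, passive))                      # merge fails, so a second set is created
print(judge(sets, passive))                      # stage three: the passive is accepted
```

Under these assumptions the three stages fall out of the amount of syntax stored: with no feature sets nothing is violated, with only the active set a passive order violates active features, and once the passive set exists it is selected for passive input.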
This account of learning to make grammaticality judgements makes several predictions. First, this model predicts that people's judgements of incorrect sentences will not merely be "grammatical" or "not grammatical," but rather judgements as to the relative grammaticalness of a sentence. This prediction follows from CHILD's generation of a number of different reasons for a sentence's incorrectness. Second, this model predicts that as a child learns increasing amounts of syntax he will find certain sentences increasingly ungrammatical. This is because newly learned syntax becomes available to judge grammaticality, and thus the number of violated syntactic features increases. Third, this model predicts that before learning passive syntax children will judge non-reversible passive sentences to be ungrammatical. This is because at this stage they are using active syntax to understand passive sentences. Later, when they have learned passive syntax, they will no longer judge non-reversible passive sentences ungrammatical.

Clearly, this work has not completely solved the problem of how children learn to make grammaticality judgements, since there are certainly a large number of complex syntactic constructions which CHILD cannot handle. In addition, it is not even clear what exactly constitutes such judgements, since Gleitman et al. report that children think sentences are "silly" for a number of reasons not discussed here. It is hoped, however, that this approach will prove a promising direction for further research, since it is grounded in mechanisms which manifest and explain a number of other psychological data.

Thanks to Rich Cullingford for sponsoring this paper, and to Peter Selfridge, Oliver Selfridge, Don Dickerson, Jason Engleberg, and Marie Bienkowski for helpful discussions of this work and for commenting on drafts of this paper.

[1] Anderson, J.R. (1981). A Theory of Language Acquisition Based On General Learning Principles. Proc. 7th IJCAI, Vancouver, Canada.
[2] Baker, C.L. and McCarthy, J.J. (1981). The Logical Problem of Language Acquisition. M.I.T. Press, Cambridge, Mass.
[3] Berwick, R.C. (1977). Learning Structural Descriptions of Grammar Rules from Examples. Proc. 5th IJCAI, Cambridge, Mass.
[4] Birnbaum, L., and Selfridge, M. (1981). Conceptual Analysis of Natural Language, in Inside Computer Understanding: Five Programs plus Miniatures. Schank R. and Riesbeck C.K. (eds.), Lawrence Erlbaum Associates, Hillsdale,
[5] Chomsky, N. (1965). Aspects of the Theory of Syntax. M.I.T. Press, Cambridge, Mass.
[6] Cullingford, R.E., Krueger, M.W., Selfridge, M. and Bienkowski, M. (1981). Automated Explanations as a Component of a CAD System. IEEE Trans. SM&C, Special Issue on Human Factors and User Assistance in CAD, December, 1981.
[7] Gleitman, L.R., Gleitman, H. and Shipley, E.F. (1972). The Emergence of the Child as Grammarian. Cognition, 1-2/3:137-164.
[8] Iba, G. (1979). Learning Disjunctive Concepts from Examples. M.I.T. A.I. Memo #548, M.I.T., Cambridge, Mass.
[9] Marcus, M. (1980). A Theory of Syntactic Recognition for Natural Language. M.I.T. Press, Cambridge, Mass.
[10] Pinker, S. (1979). Formal Models of Language Learning. Cognition, 7:217-283.
[11] Schank, R.C. (1973). Identification of Conceptualizations Underlying Natural Language. In R.C. Schank and K.M. Colby (eds.) Computer Models of Thought and Language. W.H. Freeman and Co., San Francisco.
[12] Selfridge, M. (1980). A Process Model of Language Acquisition. Computer Science Technical Report 172, Yale University, New Haven, Ct.
[13] Selfridge, M. (1981a). Why Do Children Say "Goed"? A Computer Model of Child Generation. Proc. Third Annual Meeting of the Cognitive Science Society. Berkeley, CA.
[14] Selfridge, M. (1981b). A Computer Model of Child Language Acquisition. Proc. 7th Int. Joint Conf. on Artificial Intelligence. Vancouver, Canada.
[15] Selfridge, M. (1982). Why Do Children Misunderstand Reversible Passives? The CHILD Program Learns to Understand Passive Sentences. Submitted to the 3rd Annual AAAI Conference, Pittsburgh, Penn.
[16] Wilks, Y. (1973). Parsing English II. In E. Charniak and Y. Wilks (eds.) Computational Semantics. North-Holland Publishing Co., NY,

PATHFINDER: INVESTIGATING THE ACQUISITION OF COMMUNICATIVE CONVENTIONS

Robert Cummins
The University of Wisconsin-Milwaukee
Eric Dietrich
Martin Marietta Corporation

This work was supported in part by a grant from the National Science Foundation, and by the Institute of Cognitive Science, University of Colorado (Institute of Cognitive Science publication no. ).

ABSTRACT

PATHFINDER is a system that solves coordination problems that require acquisition of a convention governing the intended meaning of a symbol. LEADER blazes a trail through a maze by leaving symbols in the various paths, and FOLLOWER must find LEADER by discovering the intended meanings of these blazes. PATHFINDER is the first step in a project to design a system that can solve a variety of coordination problems of the sort implicated in language acquisition. Solving certain coordination problems is communicating. Since coordination problem solution can become conventional (as David Lewis has shown), communication can become conventional, and that is language in its most general form. As conventions are acquired, more sophisticated coordination problems can be solved, and more sophisticated conventions can be acquired. Eventually, it should be possible to acquire conventions governing identifiers and general terms, and this will enable use of a first order language via a recursive procedure adapted from Tarski by Cummins.

The PATHFINDER project is a study of the acquisition of the capacity to communicate by means of convention-governed symbols, and of the knowledge structures required for such communication. The project revolves around a series of PATHFINDER programs, each of which contains two programs, LEADER and FOLLOWER, which together solve coordination problems in a way that requires acquisition of conventions governing the meaning of a symbol. We begin by sketching the theoretical background, then turn to PATHFINDER itself.

In 1973, Jonathan Bennett (Bennett, 1973, 1976) outlined a two phase account of language acquisition based on the pioneering work of Grice on meaning (1957, 1969) and Lewis on conventions (1969). In phase one, he explains along Gricean lines what we shall call pre-conventional communication: cases in which a speaker S performs some action and thereby communicates with an audience A in a way that doesn't depend on the prior existence of any shared rules or conventions. In phase two, he imports Lewis' account of conventions to show how pre-conventional cases could lead to the establishment of a convention between S and A with the result that S's act-type comes to have a conventional meaning. Since Bennett's work in this area has not received the attention it deserves outside of philosophy (especially in AI and cognitive psychology), we begin with a brief review of his two-phase account.

Phase One: Pre-conventional Communication. Bennett takes from Grice the following conditional.

    (GC) If S utters E, intending thereby to get A to believe that p, and relies for the achievement of this upon the Gricean Mechanism (GM), then S means by E that p.

Here is what we shall understand by the Gricean Mechanism.

    (GM) A recognizes S's intention to get A to believe that p, and is led by that recognition, through trust in S, to believe that p.

This is a simplified version of Grice's more recent accounts, but we require only a rather crude sufficient condition at this stage of the account. Bennett claims that (GC) could be satisfied by pre-linguistic S and A, i.e., by S and A who share no conventional means of communication. We agree with this assessment for reasons that will emerge later. For now we shall simply assume that pre-linguistic S and A could satisfy (GC), though perhaps only rarely and in rather special circumstances, and that (GC) does in fact formulate a sufficient condition for communication between S and A.

Phase Two: Conventionalization. The second phase of Bennett's account imports Lewis' treatment of conventions to show how a convention could emerge between S and A governing S's communicative actions. For present purposes, the crucial feature of Lewis's theory is this.

    (L) When a group achieves coordination in a certain situation by acting in a certain way, and they act that way because (i) they wish to achieve coordination, and (ii) each actor knows, and knows the others know, that that is how coordination has been achieved in the past, then the group has a convention governing that situation.

(L) applies to cases involving coordination of action, whereas our problem involves coordination between S's action and A's beliefs. But (L) is easily extended to accommodate this fact because the sorts of reasons A can have for adopting a belief so as to coordinate with S are the same sorts of reasons A will typically have for acting so as to coordinate with S. In particular, A can have as a reason for adopting the belief that S intends A to believe that p in uttering E the fact that A knows, and knows that S knows, that in the past S's intention in uttering E has been to get A to believe that p. If A is then led, through trust in S, to believe that p, we have a case that satisfies (GC) because S's utterance of E is governed by a convention existing between S and A. This yields the following account of conventional meaning.

    (CM) Utterance-type E conventionally means that p when uttered by S to audience A if (a) in the past, S has uttered tokens of E to A only when S meant that p, and (b) this fact is mutually known to S and A, and (c) because of this mutual knowledge it continues to be the case that when S utters tokens of E, S means, and is understood by A to mean, that p.
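
The three clauses of (CM) can be read as a test over a record of past utterances. The Python sketch below is only an illustration under stated assumptions: the `Utterance` record, the `conventional_meaning` function, and the `mutually_known` flag are hypothetical names, mutual knowledge is supplied as a given rather than derived, and the causal "because" in clause (c) is approximated by checking that the most recent use was understood as intended.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Utterance:
    token_type: str   # the utterance-type E
    meant: str        # what S meant by this token
    understood: str   # what A took S to mean

def conventional_meaning(history: List[Utterance], token_type: str,
                         mutually_known: bool) -> Optional[str]:
    """Return p if E conventionally means that p under this reading of (CM), else None."""
    uses = [u for u in history if u.token_type == token_type]
    if not uses:
        return None
    meanings = {u.meant for u in uses}
    if len(meanings) != 1:
        return None            # (a) fails: E was not used only when S meant p
    (p,) = meanings
    if not mutually_known:
        return None            # (b) fails: the regularity is not mutual knowledge
    if uses[-1].understood != p:
        return None            # (c) approximated: the latest use was not understood as p
    return p

# An early, pre-conventional use may be misunderstood; later uses succeed.
history = [Utterance("E", "p", "q"), Utterance("E", "p", "p")]
print(conventional_meaning(history, "E", mutually_known=True))
```

The sketch makes one point of the definition concrete: a single token used with a different meaning is enough to defeat clause (a), whereas an early misunderstanding by A is not, since (CM) constrains only what S meant in the past and what continues to happen now.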
We can put the pre-conventional case and the conventional case together in an obvious way. Suppose S intends to get A to believe that a coconut is about to fall on A, and S goes through a certain performance that results in A recognizing S's intention and, via trust in S, adopting the belief that A is about to be hit by a coconut. Here we have a pre-conventional case in which communication occurs only because conditions are especially propitious, and because S's performance has a certain natural suggestiveness. Next time, however, the mechanism of convention will set in, and, as repetitions occur, the special conditions favoring the original success will no longer be necessary. S's performance can be streamlined by a process akin to stimulus substitution to the point where it need have no special features beyond the fact that A and S perceive it to be of the same type as its predecessors. Thus, the account allows for the fact that a sign may, so far as its physical characteristics go, have any meaning whatever.

Extending the Account. As it stands, the account just sketched hasn't a chance of being a full-scale theory of communicative conventions, for it begins and ends with sentence meanings: meanings that have the form "that p" where p is a proposition. Since there cannot be infinitely many meaning conventions, it follows that the account just rehearsed runs afoul of the fact that a natural language contains infinitely many non-compound sentences having distinct meanings.

This defect has been repaired in Cummins (1978), by introducing Gricean meanings for identifiers and general terms. Here are the relevant conditions.

    (ST) There is a convention whereby N refers to x in S's language if (a) in the past S has uttered N only when intending to identify x, and (b) this fact is mutually known by S and S's audience, and (c), because of this mutual knowledge it continues to happen that when S utters N, S identifies x.

    (P) There is a convention whereby G means yellow in S's language if (a) in the past S has uttered G only when he/she/it meant yellow, and (b) this fact is mutually known to S and S's audience, and (c), because of this mutual knowledge it continues to happen that when S utters G, S means, and is understood to mean, yellow.

We can now state a relation between these meanings and satisfaction conditions, and import the standard recursion on the latter, to generate conventional meanings (though not meaning conventions) for an infinity of non-compound sentences.

    (S) 'The i-th member of the sequence f is red' gives the satisfaction condition for a token consisting of the general term G applied to the i-th variable iff the (or a) conventional meaning of G is 'red'.

(S) allows us to go back and forth between satisfaction conditions and conventional meanings. If we start with cases for which conventions exist for the primitive general terms, we get satisfaction conditions for those terms by moving from the meaning to the satisfaction part. We can then use the standard recursion to get a satisfaction condition for any first order combination of the primitive general terms. Then, moving from the satisfaction part of (S) to the meaning part, we get conventional meanings, though not meaning conventions, for complex general terms. It is well-known that this suffices to fix the [...] for each sentence in a first-order language.

Investigating Convention Acquisition. The acquisition and use of communicative conventions has not been very extensively investigated by researchers in artificial intelligence or cognitive psychology, presumably because the requisite theoretical background has seemed lacking. However, putting Grice's account of communication together with Lewis' account of conventions yields a powerful theory of the acquisition of communicative conventions. Extending the account to apply to acquisition of conventions governing identifiers and general terms makes it possible to use the recursive apparatus of Tarski's theory of truth definitions to generate meaning conventions for every sentence in a first order language having a finite number of semantically primitive terms. The upshot is a theory of language acquisition for first order languages. This theory, however, is incomplete or vague at several critical points. (1) The theory tells us what it is to be a party to a communicative convention governing a symbol with a propositional meaning, but it does not tell us how humans can or do actually solve primitive communicative convention acquisition problems. (2) The theory tells us what it is to be a party to a communicative convention governing an identifier or general term, but it does not tell us how humans can or do acquire such conventions on the basis of simpler shared communicative conventions, viz., conventions governing symbols with propositional meanings.

We propose to meet point (1) by adding the hypothesis (i) that primitive communication problems can be solved, and appropriate conventions acquired, in the course of solving simple coordination problems that contain the communicative problems as sub-problems. The problem analyzed by PATHFINDER is just such a containing problem. We propose to meet point (2) by adding two hypotheses: (ii) that the power of a group of agents to solve coordination problems increases as that group acquires communicative conventions; (iii) that solving relatively more complex containing problems enables agents to acquire relatively more sophisticated communicative conventions. It is these three hypotheses that the PATHFINDER PROJECT is primarily [designed] to investigate.

PATHFINDER: Embedding [...] in other Coordination Problems. Communication problems are difficult to solve in part because propositional attitudes are hidden. It is [difficult] for a speaker-audience pair to determine whether or not they have succeeded. This difficulty can be overcome by embedding primitive communication problems in other non-communicative coordination problems that are more tractable. If S and A are engaged in some cooperative activity, the success or failure of their efforts to communicate will be more or less obviously reflected in the success or failure of that activity.

In PATHFINDER, LEADER and FOLLOWER must solve such an embedded coordination problem. LEADER blazes a trail through a maze by leaving symbols in the various paths, and FOLLOWER must find LEADER by discovering the intended meanings of these blazes: LEADER must enable FOLLOWER to find LEADER. In the process, they must solve a primitive communication problem. For example, in the level-one version of PATHFINDER, FOLLOWER may learn that when LEADER marked a path "Y", LEADER meant that that path is to be avoided. Suppose FOLLOWER locates LEADER by avoiding paths marked "Y". Then LEADER and FOLLOWER will have solved their main coordination problem, and they will have solved a primitive communication
problem as well. Most importantly, however, they will have solved a primitive convention acquisition problem: both know that "Y" means "avoid this path". This convention can be used in the solution of other related coordination problems, thereby increasing the power of LEADER and FOLLOWER to solve such problems, and hence increasing their power to acquire other conventions. For example, it is evidently easier for FOLLOWER to grasp an identifier in the context of an already understood instruction. "Avoid Y at zz," links use of the identifier to solving the embedding coordination problem (find LEADER), thereby making it possible for LEADER and FOLLOWER to recognize successful communication, and hence to acquire a convention governing use of the identifier. Conventions are a special kind of knowledge that increase capacity to solve coordination problems far more effectively than other types of shared knowledge. Advanced LEADER-FOLLOWER pairs will come to share conventions governing such things as the identifiers, general terms, and syntactic rules of a relatively sophisticated language.

Preliminary research has suggested a list of parameters of two types, intrinsic and contextual, the values of which define a relative level of sophistication. The coordination problems analyzed by PATHFINDER are significantly different from each other depending on the type of maze FOLLOWER faces (intrinsic parameters) and the amount and type of knowledge, including conventions, shared by LEADER and FOLLOWER (contextual parameters). This is especially significant given the hypothesis that the capacity of two parties (LEADER and FOLLOWER, SPEAKER and AUDIENCE) to solve coordination problems should increase as simple problems are solved and conventions are acquired for future use.

Intrinsic Parameters. FOLLOWER will eventually have to face mazes that vary in at least the following ways: (i) number of branches per node; (ii) number of symbols per branch (including blanks); (iii) complexity of symbols, e.g., context sensitivity and reference to other parts of the maze; (iv) noise, e.g., symbol-like objects in the maze not left by LEADER.

Contextual Parameters. To solve the coordination problem set by a relatively general maze, LEADER and FOLLOWER will have to share some knowledge. The amount and type of shared knowledge are contextual parameters of the coordination problem, for they specify the cognitive context in which the coordination problem is attacked. These include: (i) previously acquired conventions, if any; (ii) mutual knowledge of capacities, e.g., can LEADER cut down a tree, and does LEADER know FOLLOWER knows this? (iii) mutual knowledge of what is likely to be a natural rather than an artefactual feature, e.g., that pine cones are noise in a forest, but possible blazes in a building; (iv) mutual antecedent knowledge of the territory; (v) mutual knowledge of behavioral and cognitive tendencies. These parameters are best thought of as "passed" to LEADER and FOLLOWER from containing systems that specify the goals (blaze trail; find LEADER), contain records of mutual knowledge, and handle general reasoning and decision making, including when to give up, or to give up trying hard and just "try something" (a common strategy in [...]

The level-one version of PATHFINDER (which has already been implemented) involves a maze in which all branching is binary, there is at most one symbol per branch, and noise is limited by the assumption that only the symbols encountered at the first node are significant. In a level-one maze, [FOLLOWER] faces a relatively simple but non-trivial task. A maze that is general along all four dimensions specified above will evidently require a highly "experienced" LEADER-FOLLOWER team, a team that, we suspect, will have to share several powerful conventions to be effective.

Summary. The PATHFINDER project is designed to investigate the following strategy for language acquisition. S and A, given some shared knowledge and goals, but no shared conventional means of communication, solve a coordination problem such as that faced by LEADER and FOLLOWER. Several successes produce a shared convention. Now that S and A share a convention, they can solve more difficult coordination problems, hence acquire more sophisticated conventions. Eventually, S and A will be able to acquire conventions governing identifiers and general terms, and hence, by a recursive process, a first order language. Since solving certain coordination problems is communicating, and coordination problem solution can become conventional, communication can become conventional, and that is language. Standard approaches to the problem of symbolic communication have emphasized acquisition of knowledge of a language. Yet it seems clear that learning a language is neither necessary nor sufficient for communication. Knowledge of a language is a means to understanding a speaker, or communicating with an audience. Language use and understanding is not likely to be properly understood if it is studied independently of the cognitive task that motivates it. The present project, in emphasizing the acquisition of communicative conventions, focuses on the cognitive task which language learning subserves and thereby avoids studying language acquisition "out of context".

REFERENCES

Bennett, Jonathan (1973). "The Meaning-Nominalist Strategy." Foundations of Language 10, 141-168.
Bennett, Jonathan (1976). Linguistic Behavior. Cambridge: Cambridge University Press.
Cummins, Robert (1978). "Intention, Meaning and Truth-Conditions." Philosophical Studies 35, 345-360.
Grice, H. P. (1957). "Meaning." The Philosophical Review 66, 377-388.
Grice, H. P. (1969). "Utterer's Meaning and Intentions." The Philosophical Review 78, 147-177.
Lewis, David (1969). Convention. Cambridge: Cambridge University Press.

Paul 0. Scott
Department of Computer and Communication
University of Michigan
April 1982

1: INTRODUCTION

If you ask a layman what he means by the
term 'play' he will probably reply "activities
which are useless but fun" or something very
similar. If you ask a developmental
psychologist the same question you will probably
get much the same answer although he is likely
to phrase it differently:-
    "Play consists of behaviors and
    behavioral sequences that are organism
    dominated rather than stimulus
    dominated, behaviors that appear to be
    intrinsically motivated and apparently
    performed 'for their own sake' and
    that are conducted with relative
    relaxation and positive affect."
                Weisler and McCall (1976)
Patterns of behavior which appear to have no
external purpose but are nevertheless enjoyable
for the participant present something of a
biological paradox. The majority of activities
which are accompanied by positive affect clearly
promote, either directly or indirectly, the
participant's homeostatic or reproductive goals.
An adequate theory of play must resolve this
paradox by attributing a function to play.

A number of theories have been advanced
which attempt to do this by suggesting what the
organism may gain by engaging in play. Space
does not permit a discussion of the relative
merits of these theories but see Weisler and
McCall (1976) and Gilmore (1966) for reviews.
Fortunately one particular theory appears to
enjoy almost universal support. This we shall
call the 'Cognitive Development Hypothesis'.
Its basic premise is that the organism learns
something through the process of play which is
of value in later life. This theory has been
advanced in a bewildering variety of forms which
largely reflect the enormous range of things
which a child learns. Taken together these
various theories amount to a claim that play is
the fundamental learning strategy by which
children acquire mastery of themselves and of
the perceptual, motor, cognitive and social
skills which they will need throughout life.

The cognitive development hypothesis
provides an explanation of the function of play
and hence resolves the paradox. It is very
widely accepted by developmental psychologists,
primatologists, pediatricians and laymen.
Strangely it has received little acknowledgment
from learning theorists. Thus a large and
reputable text on learning theory (Hilgard and
Bower 1975) contains no index reference for
play. Piaget does assign a relatively minor
role to play in his model but regards it as a
particular case of assimilation rather than a
fundamental learning strategy. Play has been
equally ignored by artificial intelligence
researchers interested in machine learning. I
am not aware of any program which explicitly
incorporates play as a learning activity
although I think it would be fair to describe
the behavior of AM (Lenat 1976) as playing with
numbers. Otherwise AI programs seem to be based
on the assumption that learning must be either a
classroom experience (learning with a teacher)
or an apprenticeship (learning while doing the
task). This paper is intended to exhort both
learning theorists and AI workers to take play
more seriously.

Although the cognitive development
hypothesis provides an explanation of the
function of play it does not constitute a
complete theory. Such a theory must provide an
account of how play activity is instigated,
motivated and rewarded. It must explain the
content and structure of play activities. The
cognitive development hypothesis only provides a
framework within which more complete theories
may be developed. The rest of this paper is
devoted to sketching the outlines of one such
theory.

2: A THEORY OF PLAY

If play is a method of building a cognitive
representation then any theory of play must make
some assumptions about the nature of the
representation which is built. I therefore
begin the development of the theory with the
following postulate:-
    Play is an activity directed towards
    building a representation of the world
    in terms of the organism's abilities
    to do things to or with the entities
    which it encounters in the world.
This hypothesis makes a strong claim about what
is learned during play. It asserts that the
organism is attempting to discover what it can
do rather than what it should do. That is, it
is not primarily concerned with learning what
actions have desirable outcomes. It is of
course possible, and indeed probable, that the
organism will obtain information about what it
should do as a side effect of trying to discover
what it can do, but the claim made in the
hypothesis is that such information is not the
goal of play behavior. Note that this does not
imply that the organism will not be trying to
determine the consequences of its actions but
only that it will not be directly concerned with
the values of those consequences.

This form of representation in which the
world is modelled in terms of how it relates to
the organism's behavioral capabilities has some
obvious merits. For example, it is an essential
prerequisite for any kind of problem solving
behavior since it enables the organism to
generate alternative courses of action in a
given situation. However, since it is most
readily understood in terms of simple motor
responses to a given event there is a serious
danger of underestimating its power and
generality. It is therefore worth pointing out
that it strongly resembles Gibson's notion of
'affordances' (Gibson 1977). It is also closely
related to the pragmatic theory of meaning due
to Peirce (1878) and subsequently elaborated by
James, Dewey and Mead among others. For this
reason we shall refer to it as a 'pragmatic
representation'. Object-based programming
languages such as SIMULA and SMALLTALK represent
entities using what is essentially a pragmatic
representation.

3: IMPLICATIONS OF THE HYPOTHESIS

We now explore some of the implications of
the hypothesis that play is a strategy for
building a pragmatic representation of the
world. In executing an ordinary goal oriented
task the organism is attempting to effect some
change of state in its world. In doing this it
uses knowledge of the properties of the world.
In play the organism is attempting to effect
some change of state within its own
representation of that world. In doing this it
will use knowledge regarding that
representation. Thus it can be seen that the
goals of play are metagoals and hence that play
involves access to metaknowledge.

What sort of metaknowledge would be
relevant for the development of a pragmatic
representation? If the organism is to discover
what it can do then it presumably needs to have
some representation of what it does not know it
can do. That is, the metaknowledge must
represent the organism's ignorance. Such a
representation could be used to determine the
course of play behavior. Thus the organism
would in effect conduct experiments whose
purpose is to reduce its own ignorance of its
capabilities in a manner loosely analogous with
scientific research.

The introduction of the concept of
metaknowledge raises the spectre of an infinite
regress. Where does the metaknowledge come
from? Is it necessary to play at playing in
order to discover how to play? The threat of an
endless regress can be avoided if the same
activities which provide information for the
pragmatic representation of the world also
provide the information needed to build a model
of the organism's ignorance. This constraint is
not only satisfiable but also explains one of
the basic empirical findings regarding play and
exploratory behavior: the probability that a
child will play or explore is related by an
inverted-U curve to the novelty of a situation.
In a highly familiar situation the child will
have a detailed pragmatic representation and
correspondingly low ignorance and thus there is
little to be gained by play. Conversely in a
totally unfamiliar situation the child will have
virtually no pragmatic representation and hence
have no knowledge of its own ignorance. In such
circumstances he or she would essentially not
know how to play. Only in the intermediate case
in which a partial pragmatic representation
exists is the child able to construct
potentially useful play activities.

4: A SIMPLE IMPLEMENTATION

In order to clarify the ideas discussed in
the preceding section by providing a concrete
example and to demonstrate that such hypotheses
can indeed lead to a successful learning
program, I shall now describe a very simple
concept learning program which learns by
playing.

The organism in this case is a LISP program
called PAN. PAN operates in a simple
blocksworld type of environment. In this world
are numerous objects which each have the
properties of color, size, texture and shape.
Each of these properties may take one of several
discrete values. PAN is able to apply three
types of action to these objects. It can push
them, kick them and pick them up. However these
actions will only result in the object moving in
certain cases. For example the operation of
picking up might only result in the object
moving if the object were small. Initially PAN
does not know what classes of objects its three
kinds of action will succeed on. Thus PAN's
task is to discover the equivalence classes of
kickable objects, pushable objects and objects
which may be picked up. It must experiment
entirely without external guidance until it is
confident that it can predict the applicability
of an action to an object.

PAN does this by developing a class
hierarchy. Initially it possesses only one
class - the class 'Things'. All objects are
instances of this class and all actions are
initially attached to this class. As part of
the attachment of an action to a class PAN
stores an estimate of the probability that the
action can be applied to a member of that class.
This probability estimate is revised every time
PAN tries to apply that action to an instance of
that class. If this probability estimate is
very large or very small then PAN is relatively
certain about the applicability of the action to
instances of the class and hence has no need to
conduct further experiments. If however the
probability is in the region of 0.5 then PAN is
highly uncertain and further development of the
class hierarchy is needed. The actual measure
of uncertainty used in the system is Shannon's
information function (Shannon and Weaver, 1949).
In fact any function of the probability which
was unimodal in the interval 0 to 1 with a
maximum at 0.5 and a value of 0 at 0 and 1 would
serve. The use of the Shannon function has the
advantage of allowing one to interpret it as the
informational value of a new subclass rather
than being a meaningless number. The initial
probability estimate assigned to each action for
its attachment to the class Things is 0.5 and
hence they each have an uncertainty of 1.

A cycle of the system is called an
experiment. In each experiment the system finds
the action which is attached to a class with
highest uncertainty. An instance of that class
is selected and the action applied. If the
action is successful then, apart from the
increase in the estimated probability, nothing
else happens and the system begins a new
experiment.

If the action is unsuccessful then one of
two processes may occur: a new subclass may be
created or the action may be detached from the
class. If the uncertainty exceeds a certain
threshold then PAN will attempt to construct a
subclass of the class in which the action has
just failed and then attach the action to the
new subclass. (The action remains attached to
the original class). It does this by repeatedly
selecting instances of the original class until
it finds one on which the action succeeds. It
then selects a random attribute of that
instance, for example its color or its size, and
uses that as the criterion for membership of the
new subclass. The action is then attached to
the new subclass with an initial probability
estimate which is identical to the current
probability estimate for the attachment of the
action to the original class.

This newly created subclass may or may not
contain a higher percentage of objects to which
the action may be applied. If it does then the
subclass is clearly useful and hence will be
retained. If it does not then the probability
estimate will eventually fall below that of the
corresponding estimate in the parent class.
When this happens the action is detached from
the subclass. If any class has no actions
attached then it is removed. Hence only useful
classes are retained. In this way the system
develops a hierarchy of classes as it attempts
to reduce its uncertainty. The system is able
to learn both conjunctive and disjunctive
concepts and will eventually reach a stage when
all uncertainties are below threshold. In this
situation PAN announces that it is bored and
halts. Note the system does not necessarily
find a minimal set of classes to represent the
concepts it is discovering. This could be done
at the expense of more elaborate rules for
modifying the hierarchy. It does however
achieve a correct if redundant representation.
In this respect its behavior resembles that of
human beings.

The above account is simplified in one
respect. Once subclasses have been constructed
any given object may be an instance not only of
a given class but also of one or more of its
subclasses. Thus, if PAN is doing an experiment
which involves applying an action to an instance
of some class, the particular instance selected
may also be a member of a subclass to which the
action is also attached. In these circumstances
the experiment is effectively transferred to the
subclass which has the highest probability
estimate. The result of the experiment modifies
the probability estimates of both the subclass
and the parent class. However a second
probability estimate is also kept for each
attachment which is a measure of the proportion
of attempted applications which were not passed
down to a subclass. This second probability
estimate, called usage, is multiplied by the
Shannon information function in determining the
uncertainty. This is analogous to Shannon's
measure of the entropy of an information source.

The reason for this modification is that if
it were omitted the uncertainty of parent
classes would remain high even when the
appropriate subclasses had been constructed.
This would lead to endless redundant
experimentation. The modification described
ensures that a class with successful subclasses
will have low uncertainty values despite not
having probability estimates close to 1.

As indicated earlier PAN is only intended
as a demonstration that the play theory can be
used as the basis of a learning system which
works without the assistance of a teacher. We
are developing a much larger version of the
system in which objects may possess relational
attributes and actions may change those
relations. PAN is however only a simple
instantiation of the use of a play based
learning strategy.

The pragmatic representation takes the form
of a class hierarchy with actions attached to
classes. The metaknowledge of its own ignorance
takes the form of the associated uncertainties.
The same experiments which lead to alterations
in the pragmatic representation also change the
representation of ignorance.

Because PAN operates in a very restricted
universe it eventually learns all that can be
learned. Generally we should not expect this to
happen. As the pragmatic representation becomes
richer the organism has more things to be
uncertain about. Hence the process of building
the representation becomes a never ending search
for something even better while retaining the
best that has been achieved so far.

5: REFERENCES

Bruner,J.S., A.Jolly and K.Sylva 1976
"Play - Its Role in Development and
Evolution"
Penguin Books Ltd., Harmondsworth, England
Gibson,J.J. 1977
"The Theory of Affordances"
In "Perceiving, Acting and Knowing: Toward
an Ecological Psychology" Eds. R.Shaw and
J.Bransford, Erlbaum, pp 67-82
Gilmore,J.B. 1966
"Play: A Special Behaviour"
In "Current Research in Motivation", Ed.
R.N. Haber, Holt, Rinehart and Winston, pp
343-355
Lenat,D. 1976
"AM: An Artificial Intelligence Approach to
Discovery in Mathematics as Heuristic
Search"
Doctoral dissertation, Stanford University,
July 1976.
Peirce,C.S. 1878
"How to Make Our Ideas Clear"
Popular Science Monthly, January 1878, pp
286-302
Reprinted in "Charles S. Peirce: Selected
Writings"
Ed. P. P. Wiener, Dover, 1966
Shannon,C.E. and W.Weaver 1949
"The Mathematical Theory of Communication"
University of Illinois Press, Urbana,
Chicago, London
Weisler,A. and R.B.McCall 1976
"Exploration and Play - Resume and
Redirection"
American Psychologist, 31, pp 492-508
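
The uncertainty measure that drives PAN's choice of experiment (the Shannon information function of the probability estimate, multiplied by usage) can be illustrated with a minimal Python sketch. This is not the original LISP program: the `Attachment` class, the pseudo-count update of the probability estimate, and the treatment of usage as 1 before any application has been attempted are all assumptions made for illustration, since the paper does not give the update rule.

```python
import math


def shannon(p):
    """Binary Shannon information function, in bits.

    It is 0 at p = 0 and p = 1 and reaches its maximum of 1 at p = 0.5.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


class Attachment:
    """One action attached to one class (hypothetical data structure)."""

    def __init__(self, probability=0.5, prior_trials=2.0):
        # Assumption: the estimate is a success frequency seeded with
        # `prior_trials` pseudo-trials; the paper gives no update rule.
        self.successes = probability * prior_trials
        self.trials = prior_trials
        self.attempts = 0  # applications that reached this class
        self.kept = 0      # of those, the ones not passed down to a subclass

    @property
    def probability(self):
        return self.successes / self.trials

    @property
    def usage(self):
        # Treated as 1 until an application has been attempted (assumption).
        return 1.0 if self.attempts == 0 else self.kept / self.attempts

    @property
    def uncertainty(self):
        return self.usage * shannon(self.probability)

    def record(self, success, passed_down=False):
        """Revise both estimates after one application of the action."""
        self.trials += 1.0
        self.successes += 1.0 if success else 0.0
        self.attempts += 1
        if not passed_down:
            self.kept += 1


def most_uncertain(attachments):
    """Return the key of the attachment with the highest uncertainty."""
    return max(attachments, key=lambda key: attachments[key].uncertainty)


# The three actions start attached to the class 'Things' with estimate 0.5.
things = {"push": Attachment(), "kick": Attachment(), "pick-up": Attachment()}
for _ in range(8):
    things["push"].record(success=True)
print(round(things["push"].uncertainty, 3), most_uncertain(things))
```

After eight successful pushes the uncertainty attached to pushing falls well below 1, so the next experiment is chosen from the actions PAN still knows nothing about. A parent class whose applications are all passed down to a subclass has usage 0 and therefore uncertainty 0, whatever its probability estimate.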
An Experimental Architecture that supports
Non-Temporal Prediction
Paul Robertson
University of Texas at Dallas
Natural Sciences & Mathematics
Box 638, Richardson
Texas 75080. USA

Abstract

A constructive theory of memory organisation
has been developed, based upon the principle of
non-temporal prediction. The theory predicts much
of the experimental findings on recall and
forgetting and provides a computational foundation
for some of the intuitive notions of the society of
mind theory. This paper describes an experimental
architecture that is being used to study this form
of learning. The architecture is a highly
distributed system that achieves "structural"
learning through the application of a particularly
powerful form of natural constraint.
Keywords - Non-Temporal Prediction, Distributed
Problem Solving, Society of Mind, Models of
Learning, Skill Acquisition.

Introduction
Progress in VLSI techniques along with the
emergence of some highly distributed architectures
such as Fahlman [ 1 ] and Hillis [ 2 ] has awakened
an interest in examining what can be done with
certain architectures based on simple 'neuron like'
processors, such as Hinton [ 3 4 ] and Feldman
[ 5 ]. A theory of learning based on the principle
of non-temporal prediction has been developed that
is completely data-driven [ 6 7 ]. In this paper,
we describe an experimental architecture that is
being used to study this form of learning.

Non-Temporal Prediction
Learning and memory can be viewed as
mechanisms for the acquisition of knowledge.
Knowledge itself can be viewed as a means of
predicting events in the world. Our survival is in
a large part dependent on our ability to 'predict'
the world. It is supposed that learning has evolved
to meet this need. Making predictions about the
world can be classified into two broad categories.
First, there is the class of predictions that are
time related. An understanding of 'gravity' might
be classified in this way; to understand 'gravity'
is to predict that when a thing is dropped it will
fall to the ground (or the class of predictions of
which that is a simple example). This form of
prediction is time related because the two defining
events (the dropping and the hitting on the floor)
are disparate in time. The second category, to
which this paper is specifically addressed,
concerns predictions that are unrelated to time.
This kind of prediction concerns the classification
of events. Thus, learning the concept of an 'arch'
(say) is making a prediction about what objects
constitute an 'arch'. When examples of arches that
conform to this prediction are encountered they
will be recognised as such, just as dropping an
object that subsequently falls to the ground is
recognised as indicating the presence of 'gravity'.
The difference is that the second category is
unrelated to time. There are several reasons why it
is useful to make this form of distinction.
(1) Many theories of learning and forgetting
are based upon the notion of trace decay.
Recency explains certain observable
phenomena, but is difficult to justify
computationally and gives rise to some
serious problems when dealing with predictive
situations of vastly disparate times.
(2) Many of the effects for which recency
was proposed can be adequately explained
without reference to 'time' or 'trace decay'.
(3) Many problems that at first appear to be
temporal in nature can be expressed in terms
of the non-temporal paradigm. It is not known
whether all situations can be transformed in
this way. It may be that learning for
'temporal' situations is itself a learned
strategy; there is some evidence to support
this conjecture.
(4) It is possible to solve the problem of
non-temporal prediction computationally in
terms of a highly distributed architecture of
simple processors.

Learning by Modification
The notion that learning usually takes the
form of modifying an existing skill is intuitively
attractive. Many attempts at capturing this
intuition computationally have been tried. STRIPS
[ 8 ] employed an augmented triangle table that
allowed old plans to be 'modified' to suit new
situations, an idea recently extended by Carbonell
[ 9 ]. Minsky [ 10 ] discussed a form of learning
in which new agents arise by 'splitting off' from
old ones, with only small changes and essentially
the same data connections. The mechanism presented
in this paper follows the spirit of Minsky's 'agent
splitting' but differs in detail. The architecture
presented differs in that instead of splitting a
single process (by copying) and then modifying the
copy, it supports multiple copies of (almost)
identical agents. Learning involves taking a
'suitable subset' of these agents and modifying it.
Before describing the architecture itself, we
should make a few points regarding the significance
of this difference.
(1) It seems likely that natural systems
such as the human brain can support this form
of 'redundancy'.
(2) Having multiple copies introduces a
degree of 'fault tolerance'. In particular,
the 'Grandmother Problem' does not arise.
(3) Most significantly, having many copies
means that a data driven mechanism can be
utilized to achieve the 'split' instead of
needing a top down decision to split.

Understanding Discontinuous Changes in Capability
Instead of having a single agent that can
perform a given task, the architecture supports
many such agents. We will refer to a set of similar
agents as a process-set. The agents of a
process-set compete to influence the state of the
system. Each agent provides its own prognosis and
some indication of how reliable it believes this
prognosis to be (based on a simple probabilistic
analysis). One agent's prognosis will be chosen as
the most credible alternative. Credibility will
also be computed on the basis of a
simple probabilistic analysis. Instead of
hypothesizing that when a thing is learned its
strength gradually increases, or when it is
forgotten, it gradually decreases (trace decay),
this model of learning distinguishes several phases
of learning. First, the agent is generated in
[...]. [...] the agent must be refined
(discover its own boundaries and be able to
accurately compute the reliability of its own
prognosis). Finally, the agent must be discovered
by other agents already in the system. This final
stage is one in which the agent's credibility is
computed as the result of a probabilistic analysis,
and corresponds closely to the notion of forming
K-lines expounded by Minsky [ 11 ]. When a new and
necessary agent is created, its success causes its
credibility to rise until enough samples have been
obtained to raise its credibility to a level above
that of the previous 'favorite' agent for this
task. At this point, the new agent will suddenly be
used in place of the previous favorite, giving rise
to an observable discontinuous change in
performance.

The Experimental Architecture
The experimental architecture can be described
at several levels. At one level is the general
system topology defined by a number of intuitive
connectivity restrictions described in [ 7 ] and in
more detail in [ 6 ]. Space prevents a discussion
of this aspect of the architecture. The hierarchy
can be decomposed into neighborhoods of agents that
will, for the purposes of this paper, be totally
connected (the overall hierarchy allows the
connectivity complexity to be kept linear despite
the total connectivity within neighborhoods;
furthermore, the connectivity within a neighborhood
can be relaxed [ 2 ] without loss of generality). A
neighborhood contains two computationally distinct
components: the processors, that may be programmed
to compute a predictive rule, and the creators, that
program processors for the purposes of generating
new agents (learning) and replenishing process-set
size when process splitting has resulted in an
insufficient process-set cardinality
(housekeeping). We will discuss the creator and the
processor objects separately. A programmed
processor will be referred to as an agent.

The Anatomy of a Neighborhood
Consider a neighborhood to be a two
dimensional sheet of processing elements. Each
processor in the region receives an input from
outside the neighborhood; being totally connected,
each processor also receives inputs from the
outputs of every other processor in the
neighborhood.

Figure 1

The neighborhood itself is divided into smaller
overlapping 'regions' (see Figure 1). Each region
contains a single creator and a large number of
processors. The creator has access to all local
inputs to the processors within the region, and the
outputs of each processor in the region. The
creator can cause one or more of the processors in
its region to be re-programmed.

Computation performed by a Creator
The creator monitors both the inputs local to
the region and the number of processors that
respond to the input. If too few processors respond
to an input, the creator selects the processors
that are least successful and re-programs them so
as to increase the process-set cardinality. The
creator is continually performing the following
sequence of computations.
(1) Compute the activity of the inputs to
the region. This involves counting the number
of active inputs locally. Let the activity be
denoted by activity.
(2) Compute the response size. This involves
counting the number of processors in the
region that responded to the inputs. Let the
response size be denoted by response.
(3) Compute the expected response size. In
the present system, the expected response
size is a linear function of the activity.
(4) If response is less than expected, re-program
expected-response processors. This involves
choosing the required number of processors;
the least successful ones are chosen first.
Each processor keeps a record of its success.
In our implementation, each region keeps a
sorted list of processors; when n new
processors are required, the first n are
taken from this sorted list. In a truly
parallel system such as might be found in
biological systems, this process can be
achieved simply by broadcasting a re-program
command to all processors and using a system
of inhibition to prevent re-programming of
the better processors (for a development of
this idea see [ 6 ]).

Processing Inputs
It is convenient to describe the operation of
the processors in two stages. First, how each input
to a processor is handled on an individual basis,
and second how these inputs are combined to form a
prognosis.

Figure 2

Each input to a processor is processed by an input
weighting function. Figure 2 illustrates the
function of this process. Each input weighting
function (corresponding to input_i) samples its
input whenever the process is active. In this way,
the input weighting function computes for its
input the credibility that that input is
indicative of the event being diagnosed, that is, the
probability that the input will be active when the
event is diagnosed, P(input_i | [...]). Once the
inputs have been weighted according to their
credibility, they can be combined to form
the prognosis.
Figure 3

Figure 3 illustrates the basic operation of the
diagnosis part of a processor. The value of success
is adjusted whenever the agent is active. Space
prevents further development of this idea here;
however, low success values indicate failure in
implementing a predictive rule, and such processors
will be re-programmed by their creator when a new
agent is required. The inherent limitations of
simple linear threshold devices, such as the
prognosis function, are used as a powerful natural
constraint. This guarantees that most agents that
are created will eventually die (success will fall
until it is eventually re-programmed). This gives
rise to a very economical use of processors without
the need for a knowledge driven resource
(processor) allocation system (these ideas are
developed in detail in [ 6 ]).

Due to a lack of space, many significant
details and much of the theory had to be omitted.
Experiments with a LISP based implementation of the
system outlined in this paper have been
encouraging. Complex structural descriptions can
be learned by the system. The system is robust in
that usually no agent is so important that its
removal will be critical (due to duplication), and
a high degree of noise can be tolerated. An
analysis of the system's noise immunity can be found
in [ 6 ]. It is interesting that as the regions
approach saturation (most processors are
successfully programmed as agents), it becomes
increasingly difficult to learn a new rule. This is
because, before a new agent can achieve a
respectable success it is re-programmed by its
creator because it is still the least successful
agent. Only intensive training will result in the
new agent being learned, and this will be at the
cost of one of the other successful agents. Full
details of the architecture, and justification of
its design can be found in [ 6 ].

1. Fahlman,S.E.
NETL: A System for Representing and Using
Real-World Knowledge The M.I.T. Press.
Cambridge Massachusetts 1979. ISBN
2. Hillis,W.D.
The Connection Machine
M.I.T. AI Memo 646 September 1981.
3. Hinton,G.
A Parallel Computation That Assigns
Canonical Object-Based Frames of Reference
Proceedings of IJCAI-7 1981.
4. Hinton,G.E. & Anderson,J.A. (eds)
Parallel models of associative memory.
Hillsdale, NJ: Erlbaum, 1981.
5. Feldman,J.A.
A Connectionist Model of Visual Memory
In [ 4 ] above.
6. Robertson,P.
Process Dependant Localized Memory
University of Texas at Dallas Technical
Report.
7. Robertson,P.
Non-Temporal Prediction: A Distributed
System For Concept Acquisition
Proceedings of the Fourth National
Conference of the CSCSI/SCEIO 1982
8. Fikes,R.E. & Nilsson,N.J.
STRIPS: A new Approach to the Application of
Theorem Proving to Problem Solving
Artificial Intelligence Journal, Vol. 2,
no.3/4. 1971
9. Carbonell,J.G.
A Computational Model of Analogical Problem
Solving
Proceedings of IJCAI-7. 1981.
10. Minsky,M.
Plain talk about Neurodevelopmental
Epistemology
Proceedings of IJCAI-5. 1977.
11. Minsky,M.
K-Lines: A Theory of Memory
In 'Perspectives on Cognitive Science'
Donald A. Norman ed.
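
The four-step sequence listed under "Computation performed by a Creator" can be sketched in Python as follows. This is only an illustration under stated assumptions, not the LISP implementation mentioned in the paper: the dictionary layout of a processor, the function names, and the slope and intercept of the linear expected-response function are all hypothetical, and the number of processors re-programmed is taken to be the shortfall, expected minus response.

```python
def expected_response(activity, slope=0.5, intercept=1.0):
    """Expected response size as a linear function of input activity.

    The slope and intercept are placeholders; the paper only states
    that the relation is linear.
    """
    return int(slope * activity + intercept)


def creator_step(inputs, processors, slope=0.5, intercept=1.0):
    """Run one pass of a region's creator.

    `inputs` is a sequence of truthy/falsy input lines local to the region.
    `processors` is a list of dicts with the keys 'responded' and 'success'
    (a hypothetical layout). Returns the processors chosen for
    re-programming, least successful first.
    """
    # (1) activity: the number of active inputs in the region
    activity = sum(1 for line in inputs if line)
    # (2) response: the number of processors that responded
    response = sum(1 for proc in processors if proc["responded"])
    # (3) expected response size
    expected = expected_response(activity, slope, intercept)
    # (4) re-program the shortfall, taking the least successful first
    shortfall = expected - response
    if shortfall <= 0:
        return []
    ranked = sorted(processors, key=lambda proc: proc["success"])
    return ranked[:shortfall]


region = [
    {"name": "a", "responded": True, "success": 0.9},
    {"name": "b", "responded": False, "success": 0.1},
    {"name": "c", "responded": False, "success": 0.5},
    {"name": "d", "responded": False, "success": 0.3},
]
chosen = creator_step([1, 1, 0, 1], region)
print([proc["name"] for proc in chosen])
```

Sorting on every pass stands in for the sorted list that each region is said to maintain; a closer rendering would keep the list ordered as success values change.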
John M. Morris
Measurement Conoept Corporation
Rone NY 13140

Some of the earliest work in the logic of events appears in Hempel [2]. Here are some examples of what he meant by "event": "the first solar eclipse of the twentieth century," "the eruption of Mt. Vesuvius in A.D. 79," "the assassination of Leon Trotsky," "the stock market crash of 1929." The events are whatever these phrases refer to.

Events occur in both time and space, but the edges of the event may be fuzzy. An event like the collapse of the German economy during the 1920s or an increase in tension between Russia and China is not the sort of thing that can be confined to a definite region of space-time. Still, even though the location is vague or fuzzy, it always makes sense to ask where and when an event is located. The German banks, bankers, and householders that fell victim to the economic collapse were located in Germany; and the increase in tension between Russia and China includes editorials, posters, speeches, military movements, and the hearts and minds of people at definite points within the two countries. Similarly, it makes sense to inquire when an event occurs, even when the time boundaries are fuzzy. So we can always include a place and time reference in our descriptions of events, even though the edges of the events may be blurred.

A major problem in the development of a logic of events has been a criterion of identity for events, that is, a way of telling when two descriptions refer to the same event. A single set of objects in a single space-time segment may be involved in an indefinitely large number of events. A Russian soldier near the Chinese border squeezes the trigger of his rifle. Among the many events which occur are these: (1) various neurological and physiological events in the Russian's body, together with physical processes associated with the firing of the rifle, and the resulting physiological processes in the body of the Chinese soldier who is killed by the bullet; (2) an attack on a Chinese outpost; (3) from a psychological point of view, a Russian soldier's expression of his boredom, frustration, and contempt for the Chinese; (4) the first incident in a major Russian-Chinese war.

Some people, like Anscombe, would prefer to say that only one event has occurred and that we have given four different descriptions of it. Goldman and others have shown that these cannot be regarded as a single event [1]. Goldman's proof, which is very simple, is this: We may say that the Russian, in this example, expressed his boredom by firing his rifle; we say that the shooting constituted an attack on the Chinese outpost; and we say that the killing became an international incident because of later reactions to it. We would not speak in this way if all of these were descriptions of the same event, because the converse of these statements would not be true. We would not say that the soldier fired his rifle by expressing his boredom, or that an attack on the outpost constituted the shooting, or that an international incident became a killing. If (1), (2), (3), and (4) above were identical, then relationships among them should be symmetrical; but they are not. For this reason they are not descriptions of the same event.

The important thing is that it will be impossible to specify an event unambiguously simply by specifying the objects and the portions of space and time in and to which it occurred. Since an indefinitely large number of events may occur at the same point in space and time, we need additional specifications in order to describe an event uniquely.

Distinguishing among events is important for current events analysis, because different events will have different consequences. The psychological state of an isolated Russian soldier is likely to be unimportant to the current affairs historian; but the outbreak of a war along the Russian-Chinese border is of major importance. An effective system for current events analysis will identify the event in terms of its relevance to the historian's goals.

Suppose that, following the incident, a Chinese radio broadcast is heard to characterize the shooting as "inhuman butchery" and to describe the incident in other emotionally loaded terms. We can say (1) that the Chinese reported on the shooting, and (2) that the Chinese attacked the Russians as "butchers." Precisely the same broadcast, at precisely the same time, used the same set of words to perform both of these actions. But the event reported as (2) is more significant for the historian than the event reported as (1). From the historian's point of view (1) and (2) are different events.

In the symbolism developed by Jaegwon Kim [3,4] an event is represented by an expression of the form:

[(x_1, ..., x_n, t), P^n]

where (x_1, ..., x_n) is an ordered n-tuple of concrete objects, P^n is an n-adic empirical attribute, and t is the time at which (x_1, ..., x_n) is said to exemplify the attribute P^n. The n-tuple of objects may be written in vector notation as X. The event is said to "exist" if and only if X does exemplify P^n at time t. (The place can be included among the x_i.)

Thus [(x_1, x_2, t), P^2] might signify the event of an Israeli F-4 Phantom II aircraft flying over the Suez Canal at 1:06 a.m. on August 4, 1982. Here, x_1 represents the aircraft, x_2 the Suez Canal, t the time, and P^2 the attribute of overflying. (The superscript "2" indicates that it is a two-place predicate.) It may seem somewhat strange to speak of an event like "overflying" as an attribute, but this generalization makes the symbolism applicable to states, conditions, and other qualities, as
well as to events.

A problem of particular importance for the designer of an event logic will be that of determining when two descriptions refer to the same event. In the example just given, when we receive a dozen reports of an F-4 flight over the Suez Canal, we will want to know whether there was just one flight or a dozen flights. Goldman and Kim propose a rather strong criterion of identity for two events: [(x,t),P] = [(y,t'),Q] if and only if x=y, t=t', and P=Q. This makes "flies over the Suez Canal" a different event from "threatens Egyptian frontiers." From a pragmatic point of view, the role of these two descriptions in an information system will be different, and we will take them as representing different events, even though the physical objects and their raw, physical motions are the same.

The description of the flight as a "threat" depends on the context of world events in which it takes place. Although the flight is located in the area of the canal, its significance is not located there at all. The significance of the flight is in the various government officials whose attitudes make it a threat. It would not be a threat if it were not for these attitudes. The claim that the threat is located only along the flight path is what Whitehead called the "fallacy of simple location."

A complete analysis of the logic of events will provide us with rules for going from one event description to another. We will want to know how to go from "Israeli plane flies over Suez Canal" to "Israel threatens Egyptian frontier."

To see how the logic works, let us suppose a hypothetical event. Let us suppose that a Soviet officer at the Chinese border, General [Andronovich], is [promoted] to [Field Marshal]. In itself, this event does [not have] any clear significance for the two countries. But if we add the information that Andronovich is noted for his outspoken anti-Chinese attitudes, then his promotion becomes a significant predictor for future Soviet-Chinese relations. At least two events have taken place: (1) a Soviet officer named Andronovich has been promoted; and (2) anti-Chinese attitudes have been encouraged in the USSR.

Now, if we know that Andronovich is anti-Chinese in attitude, then we know that he belongs to the class of anti-Chinese Soviet officials. Our event logic should permit us to say that anything which happens to Andronovich is also an event which happens to an anti-Chinese official of the USSR. From this, it should be possible to derive the more general event, in which anti-Chinese attitudes have been encouraged. Finally, from this event, it should be possible to predict deterioration of Soviet-Chinese relations. The role of the logical apparatus is to provide the hypotheses upon which the historian can predict the deterioration of relations.

In an automated system for current events analysis a central problem will be to determine, from a general description of an event, which properties are going to be significant -- which properties are "constitutive" of the particular event, and which are merely "exemplified" by the event. It is just conceivable, for instance, that the historian is collecting the names of Soviet officers that begin with the letter "A" -- for some obscure reason we can only guess at -- and the important information is the first letter of the name of the new Field Marshal. (This would be part of the historian's "user view," the viewpoint from which he or she would want to look at the data.) The first letter of the name would be constitutive of the significant event (in the sense that it would be that which makes it significant), and the political attitudes of the Marshal would then be nothing more than irrelevant noise.

The problem is in distinguishing the significant or constitutive features of an event. For human observers there is little difficulty in locating just those features of an event which are relevant to their interests. One fascinating characteristic of human perception is the way in which humans fail to notice elements in a situation which have no interest for them. For an automated information system, however, the problem of relevance becomes acute, because the machine has no interests of its own. We must be able to tell the machine how to locate those features in the information which will be useful in discriminating among relevant patterns of events.

In summary, the problem for analysis is determining those features, among the infinite number of features which can be extracted from events in the world around us, which will be significant for the goals of the current historian -- such as the detection of a potential world [...] that tension is rising between [...].

REFERENCES

1. Goldman, Alvin I., A Theory of Human Action. Englewood Cliffs: Prentice-Hall, 1970.
2. Hempel, Carl, Aspects of Scientific Explanation. New York: Free Press.
3. Kim, Jaegwon, "Events and Their Descriptions: Some Considerations," Essays in Honor of Carl G. Hempel, Nicholas Rescher, et al., editors, Dordrecht: D. Reidel Publishing Co., 1970, pp. 198-215.
4. Kim, Jaegwon, "Causation, Nomic Subsumption, and the Concept of Event," The Journal of Philosophy, Vol. LXX, no. 8, April 1973, pp. 217-236.
5. Morris, John M., "The Need for Context in Event Identity," Third Annual Conference of the Cognitive Science Society, 1981, pp. 197-199.
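
Kim's event notation and the Goldman-Kim identity criterion discussed in this paper can be illustrated with a small program. What follows is only a minimal Python sketch and is not part of the original paper: the `Event` class, its field names, the helper functions, and the string encodings of objects, times, and attributes are all assumptions made for the example. An event is stored as the objects X, the time t, and the attribute P, and two descriptions count as the same event only when x=y, t=t', and P=Q.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical encoding of Kim's [(x_1, ..., x_n, t), P^n]."""
    objects: Tuple[str, ...]   # the ordered n-tuple X of concrete objects
    time: str                  # the time t (encoded here as a plain string)
    attribute: str             # the n-adic attribute P exemplified by X at t


def same_event(a: Event, b: Event) -> bool:
    """Goldman-Kim criterion: identical iff x=y, t=t', and P=Q."""
    return (a.objects == b.objects
            and a.time == b.time
            and a.attribute == b.attribute)


def count_distinct(reports: Iterable[Event]) -> int:
    """Number of distinct events described by a batch of reports."""
    return len(set(reports))


# Illustrative values based on the overflight example in the text.
objects = ("Israeli F-4", "Suez Canal")
when = "1982-08-04 01:06"

overflight = Event(objects, when, "flies over")
threat = Event(objects, when, "threatens")  # same objects and time, different attribute

# A dozen reports that all carry the same objects, time, and attribute.
reports = [Event(objects, when, "flies over") for _ in range(12)]

print(same_event(overflight, reports[0]))  # True under the criterion
print(same_event(overflight, threat))      # False: P differs from Q
print(count_distinct(reports))             # the dozen reports describe one event
```

Under this strict criterion the duplicate reports collapse into a single event, while the "threat" description stays separate, which matches the pragmatic reading argued for in the text. The sketch says nothing about the harder problem the paper raises, namely deciding which attributes are constitutive for a given historian's goals.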

ABSTRACT

This paper introduces a new method of knowledge representation called a fuzzy semantic network (FUSEN). FUSENs were created to model continuous or fuzzy knowledge using concepts from artificial intelligence, fuzzy set theory, and cognitive psychology.

FUSENs have the ability to model three theories from cognitive psychology: the theory of natural categories, the family resemblance theory, and the feature-set theory. They can also perform as most of the knowledge structures from artificial intelligence and as a fuzzy set structure. Presented is their structure and several examples illustrating their use.

INTRODUCTION

To have a complete understanding of an entity one must be aware of how it acts, what rules apply to it, and in what situations one might expect to find it. For example, it is possible to describe the color, shape, size, and subparts of a 'dog'. It is easy to define the sets to which 'dog' belongs and the members of the set called 'dog'. But, the concept of 'dog' is not complete unless one knows what 'dog's do and how they act. There should be specific memories of 'dog's. There should be anticipations of what to expect from 'dog's in general and from specific 'dog's in particular. There must also be an understanding of time, space, and the physical reality in which 'dog's operate. A complete concept of a 'dog' includes all of this knowledge.

FUSENs divide this complex knowledge into four separate classes: entities and categories; actions and processes; literal and deep sentences; and rules and hypotheses. This paper examines the first of these classes and briefly discusses the relationships between FUSENs and three theories from cognitive psychology: natural categories, family resemblance theory, and feature-set theory.

Figure 1 shows the graphical representation of FUSENs. The owner label defines the owner of a head node and the type label defines the association existing between the nodes. The weights represent the association strengths between nodes. A head node can be associated with any number of sub-nodes. Each instance of a head node and its sub-nodes is called a fuse. All nodes of a fuse can be sub-nodes or head nodes of other fuses.

Figure 2 is a fuse representing a set of attributes for the category 'fruit'. This is determined by examining the head node name, 'fruit'; and the type label '(attrib)'. The type label is a reserved word, denoted by the surrounding parentheses, describing the relationship between the sub-nodes and the head node. '(Attrib)' defines all the sub-nodes as attributes of the head node name 'fruit'. The owner label defines the parent node(s) of the head node. This label resolves any ambiguity created when two or more fuses have the same head node name. For example, if two fuses have the head node name of 'color', one would look at the owner label to see what they referenced. There could be fuses concerned with automobile colors, leaf colors, or colors in general. In Figure 2 the owner label is '()' or null. This means this fuse is about 'fruit' in general.

Each sub-node is a different attribute of 'fruit'. The weights associated with each sub-node reflect how strongly that particular attribute is associated with 'fruit'. The link labels define the domain over which the sub-node is defined. In Figure 2 'red' and 'yellow' are defined as colors of 'fruit'.

The weights are viewed as frequency counts. In Figure 2 the head node weight of 137 states that 137 instances of 'fruit' have been observed. The ratio of the sub-node's weight to the head node weight is that sub-node's association strength. 'Red' has an association strength of 66/137 or 48.2%.

Figure 3 shows a fuse representing a set of apple attributes. The type label is '(attrib)', so the syntax of this fuse is the same as that of Figure 2.

NATURAL CATEGORIES

The theory of natural categories was developed by Rosch [ANDE80]. Natural categories are levels of abstraction that people seem to naturally develop and use. Rosch feels categorization occurs to go beyond insignificant individual differences and to obtain the most information from the smallest amount of categorization.

Figures 2 and 3 can be used as an example of natural categories. According to these figures, a certain object that is small, red, and sweet can be seen as an apple or a piece of fruit. Since these attributes match both the 'apple' and the 'fruit' fuses a computer algorithm would say the object is both an apple and a piece of fruit, which is correct. But, in communicating with humans, the algorithm will have to pick the most appropriate level of abstraction or as Rosch called it, the 'basic' level.

The way the algorithm can find the basic level is to look at the head node weight. The highest weight is the most frequently conceptualized concept or the basic level. In this example the object would be called an 'apple'.

FAMILY RESEMBLANCE

The family resemblance theory was also developed by Rosch [ANDE80]. This theory states
that every category is defined by an open-ended set of attributes or features. Natural categories have no fixed boundaries. For any particular category there might not be even one attribute in common with all the category members. An entity is judged to be a good member of a category if it has many attributes overlapping with the attributes of the category.

The FUSEN structure models this theory very well. The 'fruit' and 'apple' fuses show how the concept is defined by a set of attributes. The number of sub-nodes and their weights are dynamic and can constantly change as new examples of the category are observed. If a green fruit is observed, the sub-node 'green' with a weight of 1 will be added to the 'fruit' attribute fuse. In addition the 'fruit' head node weight will be incremented by 1.

FEATURE-SET THEORY

Feature-set theory [ANDE80] assumes people recall how frequently they have seen all the various attributes of a concept. The more frequently seen attributes have a higher correlation or association strength with the category.

This is exactly how fuses work. Figures 2 and 3 show two categories. The association strengths for each sub-node reflect how strongly it is associated with the head node. Notice that 'red' is more strongly associated with 'apple' than 'fruit', and 'tart' is more strongly associated with 'fruit'.

Sprague [SPRA82] has shown how fuses can also perform as many other knowledge structures. In particular he discusses production rules, semantic networks, expert knowledge systems, frame theory, fuzzy sets, and stimulus-response

SUMMARY

This paper briefly introduces a new method of knowledge representation called a fuzzy semantic network. The theory is based on the idea that knowledge can be represented by the associations between symbols and that these symbols and associations can be explicitly represented by a semantic network. Using semantic networks as a base, a general method of knowledge representation was developed to include ideas from many areas: artificial intelligence, mathematics, psychology. It is hoped that when the complete syntax is developed FUSENs will be able to represent almost any kind of semantic knowledge.

[ANDE80] Anderson, J.R. 1980. Cognitive Psychology and Its Implications. San Francisco, CA: W.H. Freeman.

[SPRA82] Sprague, K.W. 1982. 'Fuzzy Semantic Networks'. Gainesville, FL: MS thesis, University of Florida.

Figures 2 and 3 on following page.

[Figure 1 labels: owner label, head node name, type label, head node weight, sub-node, node weight]
FIGURE 1. Diagram of FUSEN structure

[Figure 2 labels: type label (attrib); head node 'fruit', head node weight 137]
Figure 2. Example of fruit attribute fuse

Figure 3. Example of apple attribute fuse
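
The fuse mechanics described in the text (weights as frequency counts, association strength as a ratio, updating on a new observation, and choosing the basic level by head node weight) can be sketched in a few lines. This is a minimal Python sketch and not the authors' implementation: the `Fuse` class, its method names, and `basic_level` are assumptions introduced here, link labels are omitted for brevity, and every number other than the 'fruit' head weight of 137 and the 'red' weight of 66 is hypothetical because the figures did not survive.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Fuse:
    """Hypothetical model of one fuse: a head node plus weighted sub-nodes."""
    name: str                                   # head node name, e.g. 'fruit'
    head_weight: int = 0                        # number of observed instances
    type_label: str = "(attrib)"                # reserved word in parentheses
    owner: Tuple[str, ...] = ()                 # '()' or null means "in general"
    sub_nodes: Dict[str, int] = field(default_factory=dict)  # attribute -> count

    def association_strength(self, attribute: str) -> float:
        """Ratio of the sub-node weight to the head node weight."""
        if self.head_weight == 0:
            return 0.0
        return self.sub_nodes.get(attribute, 0) / self.head_weight

    def observe(self, attributes: Iterable[str]) -> None:
        """Record one new instance: bump the head weight and each attribute."""
        self.head_weight += 1
        for attribute in attributes:
            self.sub_nodes[attribute] = self.sub_nodes.get(attribute, 0) + 1


def basic_level(candidates: List[Fuse]) -> Fuse:
    """Pick the matching fuse with the highest head node weight."""
    return max(candidates, key=lambda fuse: fuse.head_weight)


# 137 and 66 come from the text; the apple numbers are made up for illustration.
fruit = Fuse("fruit", head_weight=137, sub_nodes={"red": 66})
apple = Fuse("apple", head_weight=200, sub_nodes={"red": 150})

print(round(100 * fruit.association_strength("red"), 1))  # 48.2

# Observing a green fruit adds 'green' with weight 1 and increments the head.
fruit.observe(["green"])
print(fruit.sub_nodes["green"], fruit.head_weight)         # 1 138

# Both fuses match a small, red, sweet object; report the basic level.
print(basic_level([fruit, apple]).name)                    # apple
```

Keeping raw counts rather than stored ratios is what lets a fuse change as new examples arrive: the association strengths are always derived from the current counts.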


James A. Galambos
John B. Black
Yale University
Cognitive Science Program

Studies of text comprehension (Bower, Black, and Turner 1979, Mandler and Johnson 1977, Schank and Abelson, 1977) have relied on the notion of a script or schema. A script represents world knowledge about common activities, events, and situations. It includes information about the components of these activities and the relations among the components. In this paper we examine scripts for common activities (e.g., cashing a check, or going to restaurants) as they exist prior to their instantiation in prose. The questions addressed in this paper are: How is the script knowledge structure accessed? and once the script is accessed how are its components made available? In other words, since context is so important in comprehension, we want to know how we get a context and once we have it, how does it help?

Schank and Abelson discuss how the script knowledge structure is activated during the comprehension of narrative. Clearly the easiest way to invoke a particular knowledge structure is to refer to it by name. Thus if the narrative explicitly mentions a situation, the retrieval of the knowledge structure should be straightforward. This can be done by a title of a passage, or by setting statements. We are interested in cases where the context is not given explicitly. Implicit reference to the activity can be made in a number of ways. For instance a goal mentioned in the narrative can serve as access cue for the script typically involved in accomplishing that goal. We are concerned with a different case where the presentation of one of the actions in the script leads to the accessing of the script itself. Thus on encountering the sentence:

John walked through the door and saw the head waiter.

in a narrative, the restaurant script should be activated to contextualize subsequent sentences.

We test the claim that component actions will serve as access cues for their scripts if those actions are distinctive to the script. An action is distinctive to a script if that action is performed in few if any other scripts. Thus, for the restaurant script, the action SEE THE HEAD WAITER is highly distinctive, since it occurs in few if any other activities. The action of walking through the door occurs in so many activities that it is extremely low in distinctiveness to the restaurant script. This aspect of script structure has been developed and examined in Galambos, 1981 and 1982, and Galambos and Black, 1981.

We are also concerned with how the components of a script become available when the script is accessed. The question here is whether accessing the script makes all its components immediately available or whether some components have a more prominent status. In other experiments (Galambos and Rips, 1982), we have defined a measure of prominence called centrality. The centrality of an action is a measure of the importance of the action to the performance of the main goals of the activity. For example, in the restaurant activity the action EAT THE MEAL is highly central. Our hypothesis is that central actions should have a greater availability than less important actions when using an accessed context to aid comprehension.

Note that it is possible to select actions in such a way that these two dimensions are independent. The distinctive seeing the head waiter action is not particularly central to dining at restaurants, and the central eat the meal action is not particularly distinctive (since eating can occur in many other contexts; a plane, at home, a picnic, etc.). In terms of these dimensions our hypotheses are that the distinctiveness of an action should determine whether or not the script is accessed. The centrality of an action should influence whether the action becomes available when the script has been accessed. We designed a reaction time experiment in order to test these hypotheses.

The subjects' task was to decide whether or not two presented action phrases were components of the same activity. The first phrase was presented on a CRT screen for 1500 msec. This phrase then disappeared, and the second phrase was presented. The second phrase remained on the screen until the subject responded. The response latency was measured from the onset of the second phrase to the subject's response.

Four actions from each of 22 activities were chosen to sample the combinations of high and low levels of both centrality and distinctiveness. Thus from each activity one action (Hi-C/Lo-D) was high in centrality in the activity and low in distinctiveness, a second action (Lo-C/Hi-D) was low in centrality and high in distinctiveness. The third action (Hi-C/Hi-D) was high in both centrality and distinctiveness, and the fourth was Lo-C/Lo-D. For example the four actions selected in the activity of cashing a check were:
Action Type     Action

Hi-C/Lo-D       write your signature
Lo-C/Hi-D       record the amount
Hi-C/Hi-D       go to bank
Lo-C/Lo-D       wait in line

Twelve pairs of actions were constructed for each activity by combining the four types of actions in all pairs at each order. These twelve conditions were equated for length and word frequency. The sequential presentation order of the two actions matched the real order of the actions for exactly half of the trials in each condition.

Stimuli were constructed for each subject so that all 12 conditions and all 22 activities were equally represented, but each action was presented only once. There were an equal number of negative trials using actions not involved in the positives. Twenty-four Yale undergraduates participated in the experiment.

The mean RTs for each of the twelve (positive) conditions were:

Condition                   Mean (msec)

Hi-C/Hi-D -> Hi-C/Lo-D      873
Lo-C/Hi-D -> Hi-C/Hi-D      880
Lo-C/Hi-D -> Hi-C/Lo-D      964

Hi-C/Hi-D -> Lo-C/Hi-D      898
Hi-C/Hi-D -> Lo-C/Lo-D      1059
Lo-C/Hi-D -> Lo-C/Lo-D      963

Hi-C/Lo-D -> Hi-C/Hi-D      986
Hi-C/Lo-D -> Lo-C/Hi-D      1124
Hi-C/Lo-D -> Lo-C/Lo-D      1081

Lo-C/Lo-D -> Hi-C/Hi-D      1193
Lo-C/Lo-D -> Hi-C/Lo-D      1013
Lo-C/Lo-D -> Lo-C/Hi-D      1073

The nomenclature here is perspicuous; for example the first entry indicates that a highly central and highly distinctive action was presented in the first position followed (after 1.5 seconds) by a highly central but non-distinctive action, and the mean reaction time was 873 msec.

If we are right that distinctive actions access their script, then conditions where a distinctive action (Lo-C/Hi-D or Hi-C/Hi-D) is presented first should facilitate the response. This is because the script should be accessed in the 1.5 seconds before the second action is presented. Having the appropriate context should speed the interpretation and processing of the second action, as well as simplify the sameness decision. When the first action is not distinctive (Hi-C/Lo-D or Lo-C/Lo-D), then the script is not accessed and subjects must try to access a contextualizing structure when the second action is presented. This prediction is equivalent to a comparison of the first six and the last six means above. The first six contained a distinctive action in the first position ( /Hi-D). The prediction was confirmed. The difference between the two sets of means was significant [min F'(1,35) = 7.31, p < .02]. The context accessed by a distinctive first action does help subjects to confirm that the second action is in the same script.

Our second prediction involves the centrality of the second actions following distinctive first actions. Central actions are the main goals and components of the activity. This prominence should be represented in the organization of the underlying knowledge structure. When the script is accessed by a distinctive first action, central second actions should be confirmed more quickly as components of that script compared with less central second actions. This prediction is tested by a comparison of the first three and second three means in the list above ( /Hi-D -> Hi-C/ vs. /Hi-D -> Lo-C/ ). In this case the min F' was not significant but the F for the subjects was 4.83 which was significant at the .04 level for one and 23 degrees of freedom [for materials, F(1,21) = 2.03, p < .18]. Thus the claim that central actions are more available than non-central actions when the script is accessed also received a certain amount of support.

It is possible to examine more fine-grained predictions for these data. Perhaps the purest test of our assumptions can be obtained by comparing conditions Lo-C/Hi-D -> Hi-C/Lo-D and Hi-C/Lo-D -> Lo-C/Hi-D. This compares the same actions in different presentation order. Clearly the preferred order is when the distinctive (non-central) action is presented before the central (non-distinctive) action. The first action accesses the script, and the centrality of the second action makes it more available for confirmation. The reversed order should be much more difficult since the script is not accessed by the first action and the second action is not prominent in the script. There is a very large difference (160 msec) in favor of the optimal order of these two action types. The point is that the optimal order is facilitative because it exploits the functional organization of the knowledge structure.

The results of this experiment indicate the presence of two functional constraints on the organization of knowledge about common activities. Knowledge structures (like the scripts examined here) are used to provide context to better understand experience. This implies that the knowledge structures can be quickly accessed when the need for them becomes apparent. When an isolated action is encountered it is necessary to find a context into which it fits. The organization of knowledge structures must reflect this necessity. The distinctiveness of an action to a script can be represented as a link to the superordinate script concept. If a distinctive action is encountered, then this link can be traversed and the script concept retrieved. If the action is not distinctive then either the retrieval path is unavailable or too many available scripts are accessed and the context is ambiguous. Distinctive actions then provide one way to find an unambiguous context. Our results demonstrate that distinctiveness is a relevant structural characteristic in the functional organization of knowledge structures for common activities.

A second functional constraint is that knowledge structures must organize information in such a way as to have the necessary components available for utilization by the comprehension processes. In other words, having a context means (among other things) being able to generate predictions about subsequent input in order to lessen the processing load when that input is encountered. This constraint would be satisfied if a list of all information that could possibly be relevant to the context were activated when the context was retrieved. Alternatively, since some of the information in a contextualizing knowledge structure is likely to be more relevant, it might be that this more relevant information is more available or more easily accessed. Such relevant information might include the main goals of the activity and the most important actions in the performance of those goals. If the comprehension system can keep only a limited amount of information about a context available for prediction, then this information is probably the best sort to have. For instance, if the restaurant context is involved in a narrative then it is a very good prediction that subsequent input will include something about the action of eating. Our results indicate that this more central information does benefit from a greater availability once the context is accessed. Here again we have demonstrated an important aspect of the functional organization of knowledge structures.
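
The comparisons among the twelve condition means reported above can be reproduced with simple arithmetic. The following Python sketch is an illustration only: the variable and function names are assumptions, the third entry is read as 964 (the scan is unclear at that digit, and 964 is the reading consistent with the reported 160 msec difference), and averaging condition means is not a substitute for the min F' tests reported in the text.

```python
# (first action type, second action type, mean RT in msec), in the listed order.
MEAN_RT = [
    ("Hi-C/Hi-D", "Hi-C/Lo-D", 873),
    ("Lo-C/Hi-D", "Hi-C/Hi-D", 880),
    ("Lo-C/Hi-D", "Hi-C/Lo-D", 964),   # assumed reading of an unclear digit
    ("Hi-C/Hi-D", "Lo-C/Hi-D", 898),
    ("Hi-C/Hi-D", "Lo-C/Lo-D", 1059),
    ("Lo-C/Hi-D", "Lo-C/Lo-D", 963),
    ("Hi-C/Lo-D", "Hi-C/Hi-D", 986),
    ("Hi-C/Lo-D", "Lo-C/Hi-D", 1124),
    ("Hi-C/Lo-D", "Lo-C/Lo-D", 1081),
    ("Lo-C/Lo-D", "Hi-C/Hi-D", 1193),
    ("Lo-C/Lo-D", "Hi-C/Lo-D", 1013),
    ("Lo-C/Lo-D", "Lo-C/Hi-D", 1073),
]


def is_distinctive(action_type: str) -> bool:
    return action_type.endswith("Hi-D")


def is_central(action_type: str) -> bool:
    return action_type.startswith("Hi-C")


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Prediction 1: distinctive first action (first six) vs. non-distinctive (last six).
distinctive_first = average(rt for a, _, rt in MEAN_RT if is_distinctive(a))
nondistinctive_first = average(rt for a, _, rt in MEAN_RT if not is_distinctive(a))

# Prediction 2: after a distinctive first action, central vs. non-central second action.
central_second = average(
    rt for a, b, rt in MEAN_RT if is_distinctive(a) and is_central(b))
noncentral_second = average(
    rt for a, b, rt in MEAN_RT if is_distinctive(a) and not is_central(b))

# Fine-grained test: the same two action types in the two presentation orders.
lookup = {(a, b): rt for a, b, rt in MEAN_RT}
order_effect = (lookup[("Hi-C/Lo-D", "Lo-C/Hi-D")]
                - lookup[("Lo-C/Hi-D", "Hi-C/Lo-D")])

print(distinctive_first, round(nondistinctive_first, 1))      # 939.5 1078.3
print(round(central_second, 1), round(noncentral_second, 1))  # 905.7 973.3
print(order_effect)                                           # 160
```

On these figures, a distinctive first action is associated with responses roughly 139 msec faster on average, and a central second action with responses roughly 68 msec faster, in line with the direction of both predictions.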
In conclusion, we take this research to be a beginning in the specification of the functional organization of information in knowledge structures for common activities. Furthermore, we think our results outline a theory of getting and using context in order to understand experience.

We are grateful to Robert Abelson, Kate Ehrlich, Brian Reiser, Scott Robertson, and William Salter for their help. This research was supported by a grant from the Systems Development Foundation.

Bower, G. H., Black, J. B., & Turner, T. J. Scripts in memory for text. Cognitive Psychology, 1979, 11,
Galambos, J.A. Question answering and the plan structure of routine activities. Paper presented to the American Educational Research Association Annual Meeting. New York, New York, March 1982.
Galambos, J.A. Question-Answering and the Structure of Event Knowledge. Paper presented to American Psychological Association. Washington, D.C., August,
Galambos, J.A. & Black, J.B. Why do we do what we do? Proceedings of the Third Conference of the Cognitive Science Society. Berkeley, California, 1981.
Galambos, J.A. & Rips, L.J. Memory for routines: just one thing after another? To appear in Journal of Verbal Learning and Verbal Behavior, August 1982, 22, no. 4.
Mandler, J.M. & Johnson, N.S. Remembrance of things parsed: Story structure and recall. Cognitive Psychology, 1977, 9, 111-151.
Schank, R. C., & Abelson, R. P. Scripts, plans, goals, and understanding. Hillsdale, N.J.: Erlbaum, 1977.

Conceptual Combination and
Fuzzy Set Theory
Edward E. Smith
Bolt Beranek and Newman Inc.
Daniel N. Osherson
Massachusetts Institute of Technology

Conceptual combination is the process by which people combine existing simple concepts (e.g., brown and apple) into novel combinations (e.g., brown-apple). As a possible formalism for conceptual combination, most proponents of prototype concepts endorse fuzzy-set theory (e.g., Zadeh, 1965). Osherson and Smith (1981), however, argue that the amalgamation of fuzzy-set theory and prototype concepts is fraught with problems.

Some fuzzy-set theory. A key notion in fuzzy-set theory is that of a characteristic function, which maps entities into numbers in a way that indicates the degree to which the entity is a member of some set or concept. To illustrate, consider the characteristic function, $c_F$, which measures degree of membership in the concept fish (F). When applied to any creature x, $c_F(x)$ yields a number between 0 and 1, where the larger $c_F(x)$, the more x belongs to F. Thus, our pet guppy may not be very typical of fish, so it gets a characteristic-function value of .80. Our pet dog will get a very low value, say .05. If we now consider pets (P), and its characteristic function $c_P$, then our guppy and dog might be assigned the values .70 and .90.

The issue of conceptual combination has often been reduced to a question about characteristic functions: namely, given that concepts P and F are combined to form the complex concept P&F, how do we specify P&F's characteristic function ($c_{P\&F}(x)$) on the basis of those of P and F ($c_P(x)$ and $c_F(x)$)? The answer from fuzzy-set theory is that $c_{P\&F}(x)$ is the minimum of $c_P(x)$ and $c_F(x)$. Applying this min rule to our pet guppy, g, yields

$c_{P\&F}(g) = \min(c_P(g), c_F(g)) = \min(.70, .80) = .70$

This says that our guppy is less typical of pet fish than it is of fish. And therein lies the problem. For as Osherson and Smith (1981) point out, intuition suggests that a guppy will be more typical of the conjunction pet fish than of either constituent, pet or fish. Osherson and Smith argue that this pet-fish example is just one of an indefinite number of counterexamples to the min rule.

Rationale for the present work. There are two problems with the Osherson and Smith (1981) counterexamples. First, they rest only on Osherson's and Smith's intuitions; such claims need to be tested against typicality ratings of naive subjects. Second, there is no indication of the generality of the failure of fuzzy-set theory; perhaps Osherson and Smith's counterexamples are of a few types in some underlying taxonomy of conjunctions, where other types might conform to the theory. To deal with these problems, we first present a taxonomy of adjective-noun conjunctions, and then describe some relevant experimental work.

An initial taxonomy of adjective-noun conjunctions. All counterexamples of the Osherson-Smith variety, such as pet-fish and brown-apple, have the following characteristics: the adjective concept (i.e., the property denoted by the adjective) is relevant to the noun concept (i.e., the object denoted by the noun) and negatively diagnostic of it; e.g., being brown is relevant to whether an object is an apple, and counts against it. More precisely, an adjective is negatively diagnostic of a noun to the extent that knowing that the adjective is a true description of some object increases the probability that the noun is a false description of that object, and knowing that the adjective is false of some object increases the probability that the noun is true of that object. An adjective is positively diagnostic of a noun to the extent that knowing that the adjective is true (false) of some object increases the probability that the noun is true (false) of that object. And an adjective is nondiagnostic of a noun to the extent that knowing that the adjective is true (false) of some object has no bearing on whether the noun is true or false of that object. Thus, in sliced-apple the adjective is largely nondiagnostic; in red apple the adjective is positively diagnostic; and in brown apple the adjective is negatively diagnostic.

In addition to the relation between the constituents, we also considered the degree to which the conjunction provides a true description of an object that is to be categorized. To keep things simple, we consider only the degree to which the to-be-categorized object manifests the property denoted by the adjective in the conjunction, and we let the object take either a high or low value on this
property. This gives a total of six cases, presented in Table 1.

Table 1
Initial Taxonomy of Adjective-Noun Conjunctions

Relation of Adjective      Degree to Which Object Manifests Property
Concept to Noun Concept    High                       Low

Nondiagnostic              (1) unsliced apple         (2) unsliced apple
                               object is unsliced         object is sliced
Positively Diagnostic      (3) red-apple              (4) red-apple
                               object is red              object is brown
Negatively Diagnostic      (5) brown-apple            (6) brown-apple
                               object is brown            object is red

Consider now how people might judge the typicality of various objects vis a vis the different kinds of conjunctions in Table 1. In Case 1, since the constituent concepts are relatively independent of one another, people might separately judge the extent to which an object is an instance of the adjective concept and of the noun concept, and then combine the outcomes of these two distinct judgements into an overall typicality rating. Since this is the key idea behind fuzzy-set theory, some variant of the theory might prove adequate for Case 1. In contrast, Case 5, where the adjective is negatively diagnostic of the noun, captures the counterexamples used by Osherson and Smith (1981). Here, intuition suggests that an object with a high value on the property (e.g., an apple that is indeed brown) will be rated more typical of the conjunction (brown-apple) than of either constituent (brown or apple). The outcomes for the remaining Cases (2, 3, 4, and 6) might fall somewhere in between these extremes.

An experiment to test the taxonomy. For each of 48 pictured objects, one group of 20 subjects rated the object's typicality with respect to an adjective concept (e.g., red, brown, sliced), a second group of 20 subjects rated its typicality vis a vis a noun concept (e.g., apple), and a third group of 20 rated its typicality with respect to an adjective-noun conjunction (e.g., red apple, brown apple, sliced apple). The adjective-noun conjunctions were such that all six cases of our taxonomy were tested.

In the Noun group, on each trial the experimenter spoke the name of a noun, then a pictured object appeared and subjects rated how good an example it was of the noun concept. Each picture was presented once. In the Adjective group, on each trial the experimenter spoke the name of an adjective, then a pictured object appeared and subjects rated how good an example the pictured property was of the adjective concept. Now, each picture was presented twice, once with an adjective denoting a property that the pictured object had a high value on, and once with an adjective denoting a property that the picture had a low value on; e.g., the picture of a red apple and that of a brown apple were presented once with "red" and once with "brown." In the Adj-Noun group, on each trial the experimenter spoke the names of an adjective and noun, then a picture was presented and subjects rated how good an example the pictured object was of the conjunctive concept. Each picture was presented twice, once with a conjunction whose adjective denoted a property the picture had a high value on, and once with a conjunction whose adjective denoted a property that the picture had a low value on; e.g., the picture of a red apple was presented once with "red apple" and once with "brown apple." All subjects had ten seconds to make a judgement, the judgements being made on a 10-point scale, where higher numbers indicated better examples.

The top half of Table 2 contains the data for the three cases of the taxonomy where the object has a high value on the property denoted by the adjective. For Case 1, we expected the minimum rule to work. The results are otherwise: the conjunction's typicality clearly exceeds the minimum of its constituents. A comparable deviation from the min rule also occurred in Case 3. For Case 5, where we expected the largest violations of the min rule, the conjunction's typicality exceeds the minimum value of the constituents by virtually half the scale! For all three cases, the deviation from the min rule is significant by a sign test.

Table 2
Typicality Ratings for Three Groups, Separately for Each Case

                             Adjective   Noun     Adj-Noun   Adj-Noun
                             Rating      Rating   Rating     Minus Minimum
a. Object Has High Value on Property
   1: Nondiagnostic          8.71        7.25     8.65        1.40
   3: Positively Diagnostic  8.50        7.81     8.87        1.06
   5: Negatively Diagnostic  6.93        3.54     8.52        4.98
b. Object Has Low Value on Property
   2: Nondiagnostic           .45        7.25      .52         .07
   4: Positively Diagnostic   .02        3.54      .10         .08
   6: Negatively Diagnostic   .81        7.81      .39        -.42

As for alternative rules within fuzzy-set theory, none seems to do a better job. Goguen's (1969) multiplicative rule suggests that the conjunction's typicality rating should be less than the minimum value of the constituents, which fits the data even worse than the min rule. Another alternative is that the conjunction's typicality value be the average of its constituents, but this too is violated by the data (see Table 2). The best-fitting post hoc rule is that the conjunction's typicality is the maximum of its constituents. The max rule works well for Cases 1 and 3 but fails for Case 5; and it is not really a serious possibility in fuzzy-set theory, for if conjunctive concepts are represented by a maximum then there is no obvious way to represent
disjunctive concepts.
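
The comparison of candidate combination rules against the top half of Table 2 can be sketched in a few lines of Python. This is only an illustrative check, not the authors' analysis: it assumes that ratings on the 10-point scale can be divided by 10 to act as membership values between 0 and 1, and the names `TABLE_2_HIGH`, `RULES`, `predicted_rating`, and `deviations` are hypothetical.

```python
# Mean ratings from the top half of Table 2 (10-point scale):
# case -> (adjective rating, noun rating, observed adjective-noun rating)
TABLE_2_HIGH = {
    "1: nondiagnostic": (8.71, 7.25, 8.65),
    "3: positively diagnostic": (8.50, 7.81, 8.87),
    "5: negatively diagnostic": (6.93, 3.54, 8.52),
}

SCALE = 10.0  # assumption: rating / 10 is treated as a membership value in [0, 1]

# Candidate rules for combining two membership values.
RULES = {
    "min": min,
    "product": lambda a, n: a * n,
    "average": lambda a, n: (a + n) / 2,
    "max": max,
}


def predicted_rating(rule, adjective, noun):
    """Apply a rule to the scaled ratings and map the result back to the 10-point scale."""
    return RULES[rule](adjective / SCALE, noun / SCALE) * SCALE


def deviations(case):
    """Observed conjunction rating minus each rule's prediction, rounded to 2 places."""
    adjective, noun, observed = TABLE_2_HIGH[case]
    return {
        rule: round(observed - predicted_rating(rule, adjective, noun), 2)
        for rule in RULES
    }


if __name__ == "__main__":
    for case in TABLE_2_HIGH:
        print(case, deviations(case))
```

A positive deviation means the observed conjunction rating exceeds what the rule predicts. Because both scaled ratings lie between 0 and 1, the product can never exceed the minimum, so under this scaling the multiplicative rule underpredicts at least as badly as the min rule in every case.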
The bottom half of Table 2 contains
the results for cases where the pictured
object had a low value on the property
denoted by the adjective. For all three
cases the min rule works well, but only
because subjects in the Adjective and Adj-
Noun groups judged the pictured objects to
be nonmembers of the relevant concepts.
Thus, when presented a picture of a brown
apple and asked to judge its typicality of
red or of red-apple, most subjects gave it
0 ratings. This floor-effect, which
prevents us from taking the data in the
bottom of Table 2 as a sensitive test of
the min rule, reflects a poor choice of
how to experimentally implement the extent
to which an object instantiates the
property denoted by the adjective. Thus,
for the concept red, had we used pictures
of red apples and reddish-brown apples, we
might not have obtained so many 0 ratings
for the concepts red and red-apple. This
change has been made in our subsequent
In conclusion, for cases where an
object "fits" a concept well, fuzzy set
theory fails to provide an adequate
account of conceptual combination.
Osherson, D.N., & Smith, E.E. On the
adequacy of prototype theory as a
theory of concepts. Cognition, 1981,
9, 35-58.
Zadeh, L.A. Fuzzy sets. Information and
Control, 1965, 8, 338-353.

Natural Language Processing
Using Spreading Activation
and Lateral Inhibition
Jordan Pollack & David Waltz
Coordinated Science Laboratory
University of Illinois


The knowledge needed to process natural language comes from many sources. While the knowledge itself may be broken up modularly, into knowledge of syntax, semantics, etc., the actual processing should be completely integrated. This form of processing is not easily amenable to the type of processing done by serial "von Neumann" computers. This work in progress is an investigation of the use of a spreading activation and lateral inhibition network as a mechanism for integrated natural language processing.

This work was supported in part by the Office of Naval Research under contract N00014-75-C-0612.

INTRODUCTION

It has long been thought that the modular decomposability of language knowledge into syntax, semantics and pragmatics implied that language processing could be similarly decomposed; that natural language could be processed by first parsing the syntax, then fleshing out the meaning of a syntactic derivation tree, and finally (if we could ever get to this point!) attempting to interpret the speaker's intentions. Nowadays, it has become apparent that this processing is integrated in humans [Marslen-Wilson, 1980], and that it should, thus, also be in computer models [Schank & Birnbaum, 1980; DeJong, 1980]. However, the natural inclination of von Neumann computers to run one step at a time presents a severe roadblock to the kind of integration needed for NLP.

What is needed is an integration mechanism sensitive to interpretation pressures from several directions. A promising approach would seem to be the use of a quantitative spreading activation / lateral inhibition network. This kind of network, similar in conception to relaxation techniques for low-level vision, and to neural network models, works through iterative adjustment of real-valued node weights.

[...] AND RELATED [...]

The term "spreading activation" is almost as overworked as the term "frame," but most systems which spread activation do it in one of two ways: as marker-passing intersection search [Quillian, 1968; Collins & Quillian, 1972; Fahlman, 1980], in which a parallel intersection search is simulated by binary marking of adjacent nodes in a breadth-first manner, or as quantitative weight balancing [Ortony, 1976; McClelland & Rumelhart, 1981], in which activation energies assigned to all nodes are iteratively adjusted, based on local activation energies and strength of connections. One of the well-known dangers of spreading activation is its potential for overkill; an intersection search, under certain circumstances, may generate too many useless intersections, and quantitative adjustment may result in "heat death," where every node becomes activated. (A solution for this latter form of activation involves the use of decay, dampening factors, or the spread of negative energy - lateral inhibition.) Nonetheless, both forms of spreading activation display interesting behavior. For example, the previously mentioned work by Collins and Quillian showed how spreading activation could account for aspects of human memory priming, while Fahlman's work demonstrated that many forms of problem solving could be simplified when an intersection search was computationally "free." Ortony, on the other hand, built a system for schema selection using damped activation, and McClelland and Rumelhart effected a close simulation of experimental results on human letter and word perception in context.

Other work in parallel approaches to natural language processing has been done by Small [1981] and Rieger [1977] in which the traditional practice of breaking down knowledge into syntax and semantics was turned on its head, and knowledge of all kinds was distributed to individual "word experts"; by Hendler & Phillips [1981] who are working on an ACTOR-based [Hewitt, 1976] NLP system; and by Gigley [1982] in which a neurolinguistically-inspired [...] simulating aphasic behavior [...]

The authors of this paper are presently building a NLP system in which the knowledge sources are modular, but the processing is fully integrated. The integration mechanism is an activation/inhibition network similar in nature to the one used by McClelland and Rumelhart and described below.

An activation/inhibition network is a weighted directed graph, where node weights represent activation levels, and link weights represent strength of activation if positive, or strength of inhibition if negative. The process of spreading activation / lateral inhibition involves iterative recomputation of activation levels. At each cycle, every node receives a contribution from each of its neighboring nodes equivalent to the neighbor's activation level multiplied by the weight of the intervening link. This contribution (scaled to range between -1 and 1) causes a proportional change in the activation level of the node; a contribution of 1 zaps the node up to its (predefined) maximum activation level while a contribution of -1 saps the node of all its strength. Eventually, a static condition is reached where some nodes reach their maximum or minimum strength, while the rest of them receive contributions of 0. (For a complete mathematical formulation see Pollack [1982b].)

NETWORK CONSTRUCTION

An activation/inhibition network such as this
can smoothly model the flow of quantitative constraints up and down a multilevel system. For natural language processing, the main problem becomes how to build such a multilevel network. We feel that a proper network can be built through the judicious instantiation of network fragments which are represented in standard knowledge representation structures, such as frames [Minsky, 1975].

The frames in our system contain the knowledge of syntax, of semantic features, and of case roles, organized to efficiently generate pieces of network on demand. These frames are richly interconnected with activation and inhibition links, and constitute the general knowledge base of the system. When sentences are input, a temporary network is constructed out of fragments stored within lexically accessed frames. These fragments are organized into a network by the same sort of breadth-first operation used in a chart parser [Kay, 1973]. The resulting network has activation links between phrase markers and their constituents, and inhibition links between pairs of [...] that have common constituents. (So far, we have done the network building by hand.)

In more detail, the required actions are as follows:

First, there is breadth-first instantiation of nodes representing phrase markers, case roles, and expectations for other nodes. These expectations are triggered when lexical items or grammatical constituents are encountered, and consist of simple feature patterns to match and connection procedures to be carried out if the match occurs. Secondly, there is pattern-based connection whereby if a newly instantiated node matches a pattern, specific linkages are made. As an example of these two processes, if a node of type NP is instantiated, it will then cause the instantiation of an expectation that a VP will occur; if a VP is found, an S is generated and connected to both the NP and VP. Of course, if more than one candidate for a pattern shows up, the two candidates are connected with an inhibition link, so one will eventually be eliminated.

The activation and inhibition processes reinforce nodes that are supported by activation links and inhibit those which are not, so, for example, expectations that are not quickly fulfilled will die. Furthermore, activation and inhibition are also happening in the background frame system by a purely word associative scheme, which helps prime good word senses (and aids in schema selection). Finally, nodes which become inhibited below a certain point are garbage collected, thus keeping the active network as small as possible.

Some preliminary results are presented here which demonstrate the feasibility of the activation/inhibition approach to NLP. As mentioned above, since the system is in its early stages, the networks presented were built by hand. We demonstrate how the system reacts to syntactic ambiguity, how a lexical preference can affect its behavior, and finally how semantic constraints can be integrated.

Consider, then, the following sentence, which, in the absence of any semantic knowledge, is syntactically ambiguous due to the lexical ambiguity of "up":

John ate up the street.

The hand-built network for this sentence is shown in figure 1 with arrows denoting activation links, and circles denoting inhibition links (following McClelland & Rumelhart). Note that each node in this network is suffixed by two numbers which [...] of that node.

[Figure 1 - [...] Activation/Inhibition Network for "John Eats Up The Street". Node labels recoverable from the diagram: S 0 5, S 0 5 P, VP 1 5, PP 2 5, NP 0 1, VP 1 3, PREP 2 3, NP 3 5, VP 1 2, PART 2 3, DET 3 4, N 4 5, JOHN 0 1, ATE 1 2, UP 2 3, THE 3 4, STREET 4 5.]

One would expect a robust NLP system to be confused by ambiguity but then to gracefully resolve it. This is indeed what happens. Figure 2 contains a graph of the activation levels over time for all the nodes in the network. Each node is depicted by a single letter, and each activation cycle by a horizontal row in the graph. When a letter traces a path to the left, it is being inhibited, and when it moves to the right, it is being activated.

[Figure 2 - "Confused". Plot of activation level per cycle. Legend: john01 is shown as $, np01 as A, s05 as B, s05p as C, ate12 as D, vp12 as E, vp13 as F, vp15 as G, up23 as H, part23 as I, prep23 as J, the34 as K, det34 as L, street45 as M, n45 as N, np35 as O, pp25 as P.]

The most interesting node pairs to watch are B and C, the mutually inhibitory sentences, and G and F, the mutually inhibitory verb phrases:
B = (John) (ate (up the street))
C = (John) (ate up) (the street)
G = (ate (up the street))
F = (ate up)

The system is confused at first: B is more heavily weighted than C, so the sentence with the preposition is selected, while F is more strongly activated than G, so the verb-particle phrase is selected. This selection is, obviously, inconsistent. But then, after about 30 cycles, the system "decides" ("Look Ma, no homunculus!") on a consistent reading of "up" as a preposition, and weights G more heavily than F.

In the absence of semantic preferences (e.g. a preference for interpreting "street" as a location), syntactic preferences can play a role. Certain words have lexical tendencies, as, for instance, the word "does", which is most often a verb, but which is also a plural noun, meaning several female deer.

Figure 3 demonstrates the sensitivity of an activation/inhibition network to syntactic preferences. The link strength from "up" to "particle" has been increased, corresponding to a lexical preference. Notice that the phrases related to interpreting "up" as a preposition (B, G, J, and P) become inhibited much more quickly this time.

[Figure 3 - Syntactic Preference for "Up" as Particle. Plot of activation level per cycle.]

However, when humans process this sentence, they also take into account the knowledge that "street" is a good candidate for a location, but a bad candidate for the object of eating. The next example demonstrates the sensitivity of our NLP approach to this semantic knowledge. Four nodes have been added and connected into the network. The verb phrase "ate" is linked to "ate-loc" and "ate-obj," and the verb phrase "ate up" is linked to "ate-up-loc" and "ate-up-obj." These nodes represent "cases" [Fillmore, 1968] of their respective nodes and are a subset of those that would be instantiated by our system. The pattern-matching connection component would connect the prepositional phrase "up the street" to "ate-loc" based on its span and on inherited features from "up" and "street".

[Figure 4 - Semantically Augmented Network.]

The modified network is shown in figure 4, and figure 5 graphs the response of the activation/inhibition network to this new information. As one can see, after 15 cycles, all nodes related to interpreting "up" as a particle (T, S, C, F, and I) are being rapidly inhibited.

[Figure 5 - Added Semantics. Plot of activation level per cycle. Legend entries for the added nodes: ateloc is shown as Q, ateobj as R, ateuploc as S, ateupobj as T.]

PROSPECTS

The results given above are interesting in that they demonstrate the sensitivity of activation/inhibition networks to slight
differences in knowledge. Currently we are working to complete the automatic instantiation and connection components of the system.

The use of a parallel and decentralized decision process can be brought to bear on many other interesting problems in NLP as well. For instance, there are indications that the timing and volume of spoken language both play useful roles in disambiguation [Wales and Toner, 1979]. A system based on activation and inhibition could be designed for sensitivity to these clues, since time is, after all, a crucial element in the activation/inhibition process.

Furthermore, the processing of garden path sentences, which are an interesting but not well-understood phenomenon in natural language, could quite possibly be handled by an activation/inhibition network. Marcus [1979] built a parser which attempted to account for garden-path sentences as a result of memory limitations. Unfortunately, there are garden path sentences his parser could (though shouldn't) handle [Milne, 1980], such as:

The prime number few.

Within the framework of activation/inhibition networks, garden path sentences would be accounted for by irreversible inhibition of expectations.

Also we have recently begun to consider ways of integrating a novel form of knowledge representation, 'event shape diagrams' [Waltz 1982], to model certain kinds of metaphor understanding and adverbial modification. As an example, these methods should allow us to interpret sentences such as:

Robbie's metal legs ate up the space between himself and Susie.

as meaning a kind of PTRANS [Schank 1975].

Finally, a practical system based on activation/inhibition networks could be the starting point for new computing architectures. In this vein, [Pollack, 1982] has designed a VLSI cell for parallel simulation of activation/inhibition networks, thus showing that a programmable set of logical connections (i.e. links) can be run on a machine with fixed and regular physical connections (i.e. wires).

CONCLUSION

The processing of natural language requires the sensitive integration of multiple sources of knowledge. A mechanism very likely to achieve this integration is an activation/inhibition network.

REFERENCES

Collins, A. and M.R. Quillian, "Experiments on semantic memory and language comprehension", in L.W. Gregg (Ed.) Cognition in Learning and Memory, Wiley, New York, 1972.
Church, K. and R. Patil, "Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table," presented at 56th Linguistic Society of America meeting, December, 1981.
DeJong, G., "Prediction and Substantiation: A new approach for Natural Language Processing", Cognitive Science V.3 #.3 pp. 251-273, 1980.
Fahlman, S.E., NETL: A System for Representing and Using Real-World Knowledge, MIT Press, 1979.
Gigley, H., "Neurolinguistically Based Modeling of Natural Language Processing," Doctoral Thesis (in preparation), University of Mass, Amherst, 1982.
Hendler, J. and B. Phillips, "A Flexible Control Structure for the Conceptual Analysis of Natural Language Using Message-Passing," Technical Report 08-81-03, Texas Instruments, 1981.
Hewitt, C., "Viewing Control Structures as Patterns of Passing Messages," MIT AI Memo 410, 1976.
Hobbs, J.R., "A Metalanguage for Expressing Grammatical Restrictions in Nodal Spans Parsing of Natural Language", Report NSO-2, Courant Institute, NYU, January 1974.
Kay, M., "The MIND System", in Rustin (Ed.) Natural Language Processing, Algorithmics Press, New York, 1973.
Marcus, M.P., A Theory of Syntactic Recognition for Natural Language, MIT Press, 1980.
Marslen-Wilson, W. and L. K. Tyler, "The Temporal Structure of Spoken Language Understanding," Cognition V.8 #.1 pp. 1-72, 1980.
Milne, R., "Using Determinism to Predict Garden Paths," DAI Research Paper 142, University of Edinburgh, 1980.
Minsky, M., "A Framework for Representing Knowledge", in Winston (Ed.) The Psychology of Computer Vision, McGraw Hill, New York, 1975.
McClelland, J.L. and D.E. Rumelhart, "An Interactive Activation Model of the Effect of Context in Perception", Technical reports 91 & 95, Center for Human Information Processing, UCSD.
Ortony, A., "SAPIENS: Spreading Activation Processing of Information Enclosed in Associative Network Structures", unpublished, 1976.
Pollack, J., "An Activation/Inhibition Network VLSI Cell", WP #31, Advanced Automation Group, Coordinated Science Laboratory, Urbana, January, 1982.
Pollack, J., "An Activation/Inhibition Approach to Natural Language Processing", WP #35, Advanced Automation Group, Coordinated Science Laboratory, Urbana, April, 1982.
Schank, R.C., "The Primitive ACTS of Conceptual Dependency", in Schank & Nash-Webber (Eds.) Theoretical Issues in NLP, ACL, Arlington, Va.
Schank, R.C. and L. Birnbaum, "Memory, Meaning, and Syntax," Research Report 189, Yale C.S. Department, November 1980.
Rieger, C., "Viewing Parsing as Word Sense Discrimination," in R. Dingwall (Ed.) A Survey of Linguistic Science, Greylock, 1977.
Wales, R. and H. Toner, "Intonation and Ambiguity", in W.E. Cooper and C.T. Walker (Eds.) Sentence Processing: Psycholinguistic Studies Presented to Merrill Garrett, Erlbaum, New Jersey, 1979.
Waltz, D.L., "Event Shape Diagrams", To Appear in Proc. NCAI, Pittsburgh, August, 1982.
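
The activation/inhibition cycle described in the paper above (each node receives, from every neighbor, the neighbor's activation level times the link weight; a contribution of 1 drives the node to its maximum and a contribution of -1 drains it) can be illustrated with a minimal Python sketch. It is not the authors' formulation, which the paper defers to a separate working paper. The clipping of the summed contribution to [-1, 1], the activation bounds of 0 and 1, the synchronous update, and all link weights below are assumptions made for illustration; only the node names `up23`, `part23`, and `prep23` are borrowed from the figure legends.

```python
MAX_ACT, MIN_ACT = 1.0, 0.0  # assumed bounds on activation levels


def step(activation, links):
    """One synchronous cycle: every node is updated from the previous cycle's levels."""
    new = {}
    for node, level in activation.items():
        # Sum over incoming links of (neighbor activation * link weight).
        net = sum(activation[src] * weight
                  for (src, dst), weight in links.items() if dst == node)
        net = max(-1.0, min(1.0, net))  # assumed scaling: clip to [-1, 1]
        if net >= 0:
            # A contribution of 1 moves the node all the way to its maximum.
            new[node] = level + net * (MAX_ACT - level)
        else:
            # A contribution of -1 removes all of the node's strength.
            new[node] = level + net * (level - MIN_ACT)
    return new


def run(activation, links, cycles):
    """Iterate the update rule for a fixed number of cycles."""
    for _ in range(cycles):
        activation = step(activation, links)
    return activation


# Toy fragment: the word node supports two rival readings of "up",
# which inhibit each other. Weights are hypothetical; the particle
# link is made slightly stronger to mimic a lexical preference.
links = {
    ("up23", "part23"): 0.4,
    ("up23", "prep23"): 0.3,
    ("part23", "prep23"): -0.6,
    ("prep23", "part23"): -0.6,
}
start = {"up23": 1.0, "part23": 0.5, "prep23": 0.5}

if __name__ == "__main__":
    final = run(start, links, 50)
    for node, level in final.items():
        print(node, round(level, 3))
```

In this toy fragment the slightly stronger activation link is enough for the particle reading to climb toward its maximum while lateral inhibition drives the preposition reading toward zero, which is the kind of winner-take-all settling the figures in the paper depict.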
Laura Silver
University of Pittsburgh
Lawrence J. Mazlack
University of Cincinnati

0.0 ABSTRACT It is of further Interest to develop an under-

standing of the relationship between the natural
This work addresses the pragmatic and semantic
language system and the artificial language
distinctions between natural and artificial
systems also developed by humans. These systems
languages by the development of a context-free
are intentionally created in order to represent
generative grammar to describe motions in modern
systematically in systems of signs other percep-
dance. The dance is a particularly good vehicle as
tions In the symbolic manner or representation.
it conveys meaning, but is undescribed by a
Systems of signs can be represented as systems of
generative grammar. Whether or not a grammar
signification where perceptions exist within the
describing dance motion can be considered to be for
plane of content and are represented by the
a natural or artificial language is unclear.
symbolic plane of expression. Artificial language
systems such as mathematics and logic, are
There are two different kinds of languages: usually referred to as symbol systems, however
natural and artificial. Artificial languages have they too are language systems and function as
been developed to deal with formal systems of man- semiotic systems, as they are formal systems of
created knowledge. Natural languages enable signification.
naturally arising entities to deal with their In order to extend the analysis of informa-
environment. Generally, artificial languages deal tion distinctions between the semantics of natural
only with truth or knowledge that is specific to and artificial systems have to be clarified as do
their artificial environment. the distinctions between information and prag-
Both the written and spoken forms of human matics.
speech are universally considered to be languages. 2.0 BACKGROUND
Animals as well as humans appear to communicate Language, communication and information are
with each other through body motions. Whether or three tightly interwoven concepts. The problem
not body motion should be considered a language is of information representation and communication
open to debate. Some workers believe that the term is the focus of this work.
"language" should be narrowly defined to Include 2.1 Language
only signaling systems which are capable of
manipulating abstractions. Others would consider Language is a process of symbolization that
any organized system of signaling to be a language. enables signification of some thing by represent-
It is agreed that whatever a language is, its ing it by something else. The "thing" represented
construction and interpretation is constrained by has an existential space-time reality; the
a specification mechanism. In languages, the con- representation is an abstraction of the reality.
struction specification is called a grammar. Meaning is derived from the relationship between
Precisely how humans come to know the grammar the physical and the symbolic
of a language is unknown. One group of workers 2.1.1 Meaning
holds that it is learned. The other group,
believes that the capability Is innate. Irregard- Language provides the capability to function-
less of how men come to icnow the structures of ally relate symbolized meanings. However, lang-
spoken language, they certainly are capable of uage is more than individual relationships among
learning the grammars of artificial languages, the meanings. Words, which are symbols, recur-
for example, automata. sively become things themselves as they are
Both artificial and natural language can utilized. As things themselves, they can be used
carry meaning; i.e., have semanticity. However, symbolically to express or represent concepts as
the semantic information represented by artificial the next order or abstraction. Signs, syabols,
languages appears to be of a different type than words, tokens, pictographs are the tangible prod-
that of the information carried by a natural lang- ucts of the interrelationship between the thought
uage. and the referent. This interface provides an
In order to develop an understanding of the operational definition for the nature of the con-
pragmatic and semantic differences between natural cept of meaning, the property cf language defined
and artificial languages, a generative grammer is as "semanticity." That is, "the property of being
being developed to represent dance generation. able to convey meaning" [LYON 79].
The developed grammar is artificial, that describ- 2.1.2 Semiology: An .Analytic Tool
ed appears to communicate naturally. Whether or De Saussure defined language as the Semiotic
not the dance is a language is open to question as system; i.e., the science of signs. The sign is a
the tokens of the dance are never abstractions. subsystem, or a component of Che system of lang-
The problem is to understand the nature of uage. The principle of "signification" indicates
language: of how humans perceive, understand and the relationship between the thing signified (the
represent their world in their semiotic system. signified) and the things signifying it (the
signifier). Signifiers exist within the "plane of expression" and signifieds exist within the "plane of content." This relationship is expressed by the sign as:

    sign = (signifier, signified)

which is a specific relation between the plane of physical reality or "content" and symbolic reality or "expression." More generally, the sign is defined as:

    sign = (plane of expression, plane of content)

A language is considered to be comprised of a set or system of signs.

"Semiology aims to take in any system of signs, whatever their substance and limits; images, gestures, musical sounds, objects, and the complex associations of all these...constitute, if not languages, at least systems of signification" [BART 9]. Semiology will be used as a tool in the analysis conducted by this work.

2.2 Natural and Artificial Language

Whether or not artificially constructed languages, or language schemas, can be considered as language "proper" is not central to this work. Semiology, although initially concerned with natural signalling or communication systems, set the stage for the analysis of any system of signs, whether they be natural languages or artificial language systems.

In discussing languages, Carnap states

    "so long as we are concerned with building this language, and not with its application and interpretation respecting a given theory, the signs of our language remain uninterpreted. Strictly speaking, what we construct is not a language but a schema or skeleton of a language: out of this schema we can produce at need a proper language (conceived as an instrument of communication) by interpretation of certain signs." [CARN].

Cherry discusses the difference between the natural and artificial kinds of languages.

    "By 'language' we shall mean those organically developed systems, whether spoken or scribed, by which humans transmit messages; but the word 'cipher,' or 'code,' will be used to mean any invented, self-consistent system, whereby one set of symbols may be transformed into another for certain special stated purposes" [CHER 93, 94].

This difference between "language" and "code" can be understood not as a difference in structure, but as a difference in development. The concept of language generally implies an organic or natural development, and is consequently referred to as "natural" language. The concept of code implies an intentional development, and is consequently referred to as "artificial" language systems.

2.3 Analyzing Language

The language being observed is usually called the object-language. The language used to discuss the object-language is called the metalanguage. The semiotic of the object language is formulated in the metalanguage system. Carnap divided the semiotic analysis of the object language into the three components of syntax, semantics, and pragmatics.

The terms syntax, semantics, and pragmatics are somewhat ambiguously applied. Part of the ambiguity of these terms is a function of whether the object language under analysis is the natural or the artificial form of language.

According to Carnap, syntax "attends strictly to the expressions and their forms." However, "syntax may include rules which determine certain logical relations between sentences, e.g., the relation of derivability" [CARN 79]. The inclusion of the property of derivability in the syntactic component blurs the boundary between syntax and semantics.

2.4 Differing Semantics: Descriptive and Logical

Both natural and artificial languages contain the components of syntax, semantics and pragmatics. In the artificial language system, semantics refers only to the expressions and their designations without reference to any particular external system. In the natural language system, semantics includes the analysis of meaning by pointing to referents in the extensional world.

In essence, there are two kinds of semantics, which can be understood as depending on either context-sensitive or context-free grammar. In artificially or "logically" constructed language systems, the grammar is context-free. In natural language, the grammar is context-sensitive. This latter form of semantics could be referred to as descriptive semantics, following the terminological distinction that "descriptive linguists" do the analysis. Their analyses are context-sensitive, in that they include the pragmatic component of meaning. In contrast, the form of semantics pertaining to the artificially or logically constructed language system can be labeled logical-semantics.

2.5 Communication and Information

Languages can be both naturally developed and artificially created. At the semiotic metalevel form of analysis, both forms of language are treated as object level languages, as both fulfill the need to signify; i.e., to represent perceptions and abstractions. This process of signification is more generally known as communication, where the language serves as an instrument of communication.

In the analysis of the problems inherent in communication, workers such as Shannon and Weaver have identified three levels of difficulties which complicate the problem of identifying information, particularly at the semantic level: (a) accuracy of symbol transmission, (b) communication of meaning, and (c) effectiveness (how conduct is affected). These three levels are all concerned with the concept that is labeled information, yet the sense in which "information" applies is not consistent across the three levels. Level A uses "information" as the amount of signal transmission, whereas at Levels B and C "information" is used in the semantic and pragmatic sense.
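The Level A sense of "information" as an amount of signal transmission can be made concrete with Shannon's entropy measure. The following is a minimal sketch in Python and is not part of the original work; the function name `entropy_bits` and the sample token sequences are hypothetical illustrations. It estimates the average information per symbol from symbol frequencies alone.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Average information per symbol, in bits, estimated from frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Shannon entropy: sum over distinct symbols of p * log2(1 / p),
    # where p = n / total is the relative frequency of the symbol.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Two hypothetical sequences of movement tokens with identical symbol statistics
# but different ordering.
sequence_a = ["step", "turn", "step", "turn"]
sequence_b = ["turn", "turn", "step", "step"]

print(entropy_bits(sequence_a))
print(entropy_bits(sequence_b))
```

Both sequences receive the same value, because the measure depends only on how often each symbol occurs. It is silent about what a sequence means (Level B) or what effect it has on the receiver (Level C), which is the inconsistency in the use of "information" noted above.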
3.0 PROBLEM DOMAIN

In order to develop a context-free information representation, a domain other than human verbal communication had to be selected. Verbal communication is too context-sensitive. Rather than working with the ambiguities of human verbal communication, where it is difficult not to be pragmatic, or with information system design, where the objective is to be pragmatic, the information system of human movement communication was selected.

Just as linguists have attempted to develop the notation for natural language grammars, seeking to represent the logical-semantic component, a similar grammatical structure of human movement can be developed. Generally, the domain of human communication is categorized into the verbal and the non-verbal. The verbal includes both the verbal and written forms of natural language. The non-verbal includes everything that is not verbal communication. Within this large category of non-verbal, the domain of human movement communication has been selected in order to construct a formal grammar representing the logical-semantic component of human movement information.

3.1 Purpose

This research investigates the semantic component of the artificial language system. The concern addressed is the clarification of the distinction between descriptive-semantics, with its context-sensitive grammar (CSG) representation, and logical-semantics, with its context-free grammar (CFG) representation. The purpose is to illustrate the separation of the logical-semantic from the pragmatic, in order to demonstrate that it is possible to separate the information structure from potential meaning. The CFG is a template, providing the structure for the set of possible constructions any eventual user could select in order to represent any intended meaning. Prior to the representation of meaning, a structure has to be defined whereby meaning representation can be made possible. Just as the information system can be both viewed as a process and referred to as a thing, so can a grammar. It is a template and therefore a thing, but it is a dynamic processing structure.

3.2 Existing Systems Representing Human Movement

The representational systems for human movement are data systems in that they are bound to some pragmatic component and that they are context-sensitive. Each has a basic set of symbols representing units that the user needed to represent. Although each representation system identifies a variety of syntactical units, no logical-semantic or grammar has yet been developed. Thus there is as yet no representation for the process of human movement information.

3.2.1 Notations

Labanotation is one of the most widely used representational systems for notating human movement [DNOT] [HUTC]. The notations are syntactic representations specifying syntactic units: direction, level, timing, and areas of the body, which are represented by unique symbols. It is not possible to use the system for anything but the description of the movement in the units provided by the initial symbol set. There are no structures or rules indicating relations among the syntactical units to generate more complex units. Consequently there is no representation by a grammar expressing the information system that is communicated by the movement.

Other notation systems are similar [SILV]. Eshkol-Wachman, Benesh, Kinesics, and Choreometrics, to name only a few, differ only in the specific particularization of the syntactic representation.

Why is there no system for representing the human body and its movement apart from any context? Perhaps because in the development of the representational systems, distinctions between the syntax and the logical-semantic were never clearly [drawn].

3.2.2 Models and Simulations

The objectives of various designers of models and simulations of human movement have been to extend the representational facilities of the human by mechanizing the laborious task of describing and computing problems in human movement. The goal was to develop a computer graphic display of a human model [POTT] [BILL].

Another area of research was the development of an interactive graphic editor for Labanotation [BROW]. The objective was to use the computer to facilitate the laborious process of hand writing Labanotation. This work was extended as part of the development of a graphic simulation for human motion [BADL] [TRAC].

3.2.3 Recognizing What To Know

The visual aspects of the perception of movement are essential in the design of mobile robots, and the context-sensitive forms of representation are useful. However, a context-free form of representation is preferable prior to any context-sensitive (i.e. applied) form of representation. For example, the visual aspect can be specified as a context-sensitive situation, which subsequently can be defined using a context-free grammatical structure. Research on this problem is important not only for the solution to problems in movement understanding, representation and generation, but also to illustrate the context-sensitivity of systems.

If we wish to represent the dynamic process of information, research must be done on abstracting the information from the pragmatics of use of that information. The structure of the process of information must be represented prior to the pragmatic application of the information.

3.3 Separating Logical and Semantic Descriptive Structures

Human movement and human verbalization both associate meaning with the sensorially transmittable component of movement and of speech that is transferred as a product of the information system. Where natural language has the symbolic representational facility of the written form, providing another channel for the transfer of the information, the movement notations do not. Just as the information process of human verbalization has been grammatically coded, the aspect of the problem that first needs to be addressed is the definition of the logical-semantic, i.e. the formal representation of the grammatical structure to code the information transferred via human
movement.

Before the grammar can be context-sensitive, it needs a context-free form. The problem is to develop a way to write movement information such that context-sensitive meanings can then be communicated in a written symbolic form.

3.4 Communication Systems: Human Movement Compared to Natural Language

The semiotic system of human movement communication was selected as the domain in which to investigate representation of a natural activity in an artificial language.

Adequate representation of the generative structure of the semiotic system seems to be the necessary and sufficient condition which linguists, anthropologists and philosophers require for "language" identification. Where natural language encompasses both verbal and written forms of the sounds and their meanings, movement language exists only at what can be equated with the verbal level of natural languages. This difference can be illustrated by equating the notations for movement with the phonological orthographic representations of the sounds of natural languages. Each natural language has particular orthographic symbols necessary to represent the sounds of that language, just as each form of movement has developed notational symbols to represent the visual perception of movements particular to that form. However, verbal language has not only the particular phonological representation; it further has a representational form, called the written form, in which the meaning in the experiential form can symbolically be represented.

4. INVESTIGATIVE STRUCTURE

The specific problem addressed is the development of a prototype for information representation, by experiment with representations of the logical-semantic structure of human movement information using the BNF form of the context-free grammar as the analytic tool.

It is posited that the situation in natural language representation is analogous to problems in information system design. Before the grammatical structure is constructed, comprised of the vocabulary elements of the system and the set of relations among them, particular referents to the units are assigned, building the context-sensitivity into the initial design of the system. The idea of the context-free form of representation preceding any context-sensitive representation is the direction of this research.

4.1 Role of the [Grammar]

The grammar itself is a representational template, in that it does not contain the meaning, but rather provides a structure. The grammar is a process in that it is used as a template. It is a commodity in that it is a tool constructed for analytical and representational purposes. It is tied to a particular form of representation, but it is relatively context-free. Any language system is a particular form of a semiotic system useful to communicate a range of meanings, and sensitive to that range. (Whorf defined this concept as "linguistic relativity" [WHOR].) Yet, the same semiotic system is context-free, in that it has the feature of productivity, and can generate valid expressions to represent new meanings even in the extensional world to which it is bound.

Looked at in this way, a grammar exists at the meta-level, providing a form of analysis for an information structure for any possible object-level expression that is generated in that language system.

BNF was chosen to represent the grammar. It provides a method of notation with the capability to code information that is dense and non-linear. BNF also lends itself to consistency verification.

4.2 Scope: Context-Free Representation of an Information System

In the analysis of the communication or semiotic system of human movement that is to be represented, only the logical-semantic form of the semantic component will be considered, in order to illustrate that context-free representation is possible when the coding is only of the information system rather than including the pragmatics of the communication system.

This reception of data as information by the receiver is the pragmatic component, which is added to the input data from the sender. Meaning is the result of the contextual processing of data given some information input.

In order to develop a context-free grammar for the logical-semantic of human movement information, a non-purposeful context needs to be examined, i.e., one where the movement is not intended to communicate any meaning but where the units of movement are learned for the production of movement itself, which subsequently can be used in various contexts to communicate a variety of meanings.

4.3 Dance Units

Dance instructors teach the units of the movement language without any intended transfer of information other than how to produce the units of movement. The vocabulary of movement that is used for dance is a complex series of units, which are derivable in terms of initial units, plus rules for connecting the various units. These more complex units are referred to as "combinations." The units and the combinations are the information communicated in dance instruction.

4.4 The Goal: A Movement Semantic

This methodology formally can be represented as an operation of the logical structure of the BNF grammar, operating upon the selected scope of the verbal channel of the domain of the information system of human movement, yielding as a product a grammatical representation of the logical-semantic component of the information system.

The movement semantic will be the grammar derived from the operation of the template processing the logical-semantic structure of the information into a representational form. This product will represent the results of research on the representation of the dynamic structure of the information process in a grammatical context-free form. The form of representation is that of a formal logical system.

4.5 Verification

A form of logical verification can be accomplished
by using LEX, the lexical analyzer, and YACC, the compiler compiler of the UNIX operating system. One of the advantages of using the BNF notation is that the movement semantic being developed and the code that LEX recognizes are both in the context-free form which is based upon the BNF notation.

4.6 Project Summary

The project will: 1) represent a portion of the logical-semantic information structure of a selected domain of human movement information, 2) represent a prototype for a written code of a representational rather than an experiential human movement language, where 3) the symbolic representation of human movement information is accomplished using a grammatical rather than a descriptive template. The aim is to define a subset of human movement information code that meets these criteria of the logical or artificial system, such that it can be used without the problems of contradictory and ambiguous expressions that are inherent, for example, in natural language systems.

BADL Badler, N. I., Smoliar, S. W., "Digital Representation of Human Movement," Computing Surveys, Vol. 11, No. 1, March 1979.
BART Barthes, Roland, Elements of Semiology, Lavers, Smith (trans.), Hill and Wang, New York, 1968, c1964 Éléments de Sémiologie.
BILL Billings, M. P., Yucker, W. R., "The Computerized Anatomical Man (CAM) Model," NASA-CR-134043, MDC-G4655, CUT: NAS9-13228, Issue 23, 1970-71.
BROW Brown, M. D., Smoliar, S. W., "A Graphic Editor for Labanotation," Computer Graphics, Vol. 10, No. 2, Summer 1976.
CARN Carnap, Rudolf, Introduction to Symbolic Logic and Its Applications, Meyer, W. H., Wilkinson, J. (trans.), Dover Publications, Inc., New York, 1958, c1954 Einführung in die symbolische Logik.
CHER Cherry, Colin, On Human Communication, MIT Press, Cambridge, Massachusetts, 1980, c1957.
DNOT The Dance Notation Bureau, Courses and Programs, New York, 1980.
HUTC Hutchinson, A., Labanotation, Theatre Arts Books, New York, 1977.
LABN Laban, R., The Language of Movement: A Guide to Choreutics, Plays, Inc., Boston, 1974, c1941.
LYON Lyons, J., Semantics, Cambridge University Press, Cambridge, 1977.
OTTE Otten, K., "Basis for a Science of Information," Information Science: Search for Identity, Debons, A. (ed.), Marcel Dekker, New York, 1974.
POTT Potter, T. E., Willmert, K. D., "Three-Dimensional Display Model," Office of Naval Research, July 1975.
SAVA Savage, G. J., Officer, J. M., "CHOREO: An Interactive Computer Model for Dance," International Journal of Man-Machine Studies, Vol. 10, 1978.
SHAN Shannon, C., Weaver, W., The Mathematical Theory of Communication, University of Illinois Press, Urbana, Chicago, 1980, c1949.
SILV Silver, L. D., "Towards a Movement Language: On the Representation of Movement Knowledge," manuscript, Interdisciplinary Department of Information Science, University of Pittsburgh, Pittsburgh, Pennsylvania, 1981.
TRAC Tracton, W. P., "GEL: A Graphic Editor for Labanotation with an Associated Data Structure," Movement Project Report No. 15, The Moore School of Electrical Engineering, The University of Pennsylvania, Philadelphia, August 1979.
WHOR Whorf, B. L., Language, Thought and Reality, The MIT Press, Cambridge, Massachusetts, 1979, c1956.

What Can Philosophy Contribute to the Study of Natural Language Processing?

Martin Ringle
Computer Science
Vassar College
Poughkeepsie, NY
For the past twenty years philosophers have observed the development of research in natural language processing (NLP) and have offered periodic critiques of both its methods and its goals. (See Bar-Hillel, 1964; Matson, 1976; Dreyfus, 1978; Searle, 1980; and Odell, 1981.) Much of the criticism has proven to be valuable, and artificial intelligence workers such as Winograd (1980) and Woods (1981) have acknowledged the positive influence of philosophical input to their work.

A great deal of philosophical criticism of natural language processing (and of artificial intelligence in general), however, rests on misconceptions about the actual goals and claims of this research. This is partly due to the fact that NLP workers have not explicitly established a set of methods and aims for their work; it is also due to the fact that there are actually a number of different goals which motivate NLP research.

The purpose of this paper is to spell out the different objectives in the field of natural language processing in order to identify the places where philosophical criticism is legitimate and useful as well as those areas where it is inappropriate. Hopefully, this analysis will be valuable to philosophers and AI workers alike.

Research in natural language processing can easily be misconstrued to be a concerted effort towards a single goal. In the simplest terms, this goal would be the implementation of a system whose linguistic powers matched those of a literate, native user of a natural language such as English or French. In fact, however, research in natural language processing is a loose amalgam of projects aimed at a variety of goals. Even though common research requirements exist, such as the development of techniques for parsing, inference, memory organization, and so forth, presuppositions, methodologies, and criteria of success differ in significant ways from one project to the next. It is somewhat misleading, therefore, to appraise or to criticize the theoretical foundations of natural language processing as a single enterprise. Yet some philosophers (e.g., Odell, 1981) have assumed that the principal goal of NLP is the unified goal just mentioned, and have proceeded to question the plausibility of the research on that ground alone.

Consider the following formulations of the aims of natural language processing research:

1. To design systems which will allow a user to perform some traditional operation(s) on a computer (such as database query) without thereby requiring the user to learn an artificial language or a set of formal constraints which must be applied to the use of natural language.

2. To design systems which will be capable of processing textual material in order to produce accurate summaries, reliable translations (from one natural language to another) or stylistically acceptable prose.

3. To design systems which will permit the user to initiate and direct a dialogue, in natural language, in a particular topic domain, with the latitude and fluency available in ordinary human [conversation].

4. To design systems which will persuasively exhibit the full range of human linguistic abilities, such as reading, translating, paraphrasing, interrogating, conversing, and so on.

5. To design systems which are able to use and understand natural language in precisely the same way that people do.

6. To design systems whose workings provide us with an explanatory model of the structures and processes responsible for human language use and understanding.

The first four formulations involve pragmatic goals, the fifth represents an epistemological goal, and the sixth an explanatory goal. The vast majority of efforts in natural language processing fall into the category of pragmatic goals. (See Waltz, 1982 for descriptive surveys of recent NLP projects.) Most, in fact, are examples of the first, or Type 1, goal. Systems such as LIFER (Hendrix, 1977), ROBOT (Harris, 1979) and LUNAR (Woods, Kaplan, and Nash-Webber, 1972), for instance, provide natural language front-ends which are used principally for database query. The research aims which motivate the construction of these systems (and others like them) are relatively modest insofar as the use of natural language is constrained by topic, vocabulary, syntactic breadth and user dialogue goals.

The Type 2 goal is slightly more ambitious, since the analysis, generation, or translation of text may require a system to deal with a broad range of topics, a large vocabulary, complex syntactic constructions, and the intentions of an author (or reader) which may be less than obvious. Progress towards this goal has not been as substantial as progress towards the first goal, but there are programs which can analyze and paraphrase text (DeJong, 1982), produce modest translations from one natural language to another (Wilks, 1973) and generate moderately smooth English prose from an internal semantic representation (Mann & [...]).

Serious efforts towards the Type 3 goal are very few in number and have appeared only within the past five years. Systems in this category include SRI's TDUS (Robinson, 1980) and BBN's HWIM (Bruce, 1982). Neither these, nor other systems of this sort, have achieved a level of combined reliability and efficiency which would make them suitable for broad implementation. However, there has recently been
a great deal of attention turned towards this area and a greater effort to achieve this goal can be expected in the near future.

The Type 4 goal is one which has been popularized in science fiction and the lay press, but it is not cited by AI researchers as the rationale for any serious NLP programming effort. This is not to say, of course, that AI workers have not entertained the idea of such a goal as a backdrop for their activities. In the proper perspective, such a goal is analogous to the one which underlies physics (and the natural sciences), namely, the eventual discovery of all lawful relationships among natural objects. Physics, after all, is dedicated to the objective of ultimately explaining the universe in terms of quantitative laws. One does not, however, invoke this goal as the aim of any particular research project. Moreover, it would be absurd to try to criticize a particular line of research in physics by attempting to show that this long-range goal is untenable. Even if the universe is not ultimately knowable in terms of the principles of physics, the enterprise still provides us with an ever-increasing understanding of natural phenomena. The same holds true for research in natural language processing: even if the long-range goal is unattainable (and that remains to be shown), this does not affect the plausibility of the other three pragmatic goals, nor does it invalidate the knowledge of natural language processing derived from programs designed to achieve those goals.

The fifth goal raises a completely different set of questions. Here we are concerned with the status of the performance rather than with the performance itself. In the case of pragmatic goals, the criterion of success is the degree to which a system is able to deal effectively with linguistic input (or output). The phrase "deal effectively with" may be interpreted differently for different applications, but in general it implies that the system is able to carry out a function which would involve use and understanding of natural language if performed by a human. The claim is not made, however, that the system actually uses or understands natural language itself. We can appreciate the point of this last statement by considering the following question: Can the pragmatic goals be pursued without pursuing the epistemological goal as well?

Some AI researchers would undoubtedly say 'yes' in answer to this question and would point to the success of natural language interfaces such as the one used in MYCIN (Shortliffe, 1976), which are not generally characterized as "language understanding" systems. Cautious researchers, such as Winograd (1973) and Leitner (1977), have emphasized the epistemological limitations of their programs by putting the word "understand" in quotation marks when using it to refer to their natural language systems.

[...] crucial one. If we are concerned with a Type 1 pragmatic goal, then genuine understanding is probably superfluous. A Type 1 interface can be limited to such a well-defined area of natural language that we can design systems to "deal effectively with" the range of anticipated linguistic input by means of deterministic production rules, discrimination nets, or similar methods. But if we are interested in Type 2, 3, or 4 pragmatic goals, then we must accept the fact that the potential for novelty, diversity, and deviant usage of linguistic inputs may be so great that a system would be effective under such conditions only if it were able to process the meanings of those inputs. And this implies that it must be able to genuinely understand natural language.

It follows, then, that while the Type 5 goal may be irrelevant to the majority of pragmatic systems of the present (and recent past), it is essentially related to the development of the more ambitious pragmatic systems of the future. It is in this context that philosophical evaluations of natural language processing become relevant: by analyzing the conceptual requirements of genuine language understanding, the philosopher can illuminate the theoretical conditions which an NLP system must meet. Moreover, unless these conditions are met, the epistemological goal cannot be achieved and thus the more ambitious pragmatic goals cannot be realized. Whether or not AI workers explicitly view the Type 5 goal as a motivating force in their research, therefore, they must acknowledge its indirect relevance if they intend to pursue a Type 2, 3, or 4 pragmatic goal.

The Type 6 goal is one which has drawn a great deal of attention in artificial intelligence due to statements such as the following:

    We consider the theory and model of semantic nets to be a computational theory of superficial verbal understanding in humans (Simmons, 1973, p. 63).

    ...[w]e shall describe a model of human language understanding that forms the basis for a set of computer programs... (Schank, 1973, p. 187).

Both of these statements were published nearly ten years ago and since then there has been a considerable change in the claims made for the psychological significance of AI programs. Nevertheless, some AI researchers (especially members of the Yale Group) still view the explanatory goal (Type 6) as a primary one, and some philosophers (e.g., T. Simon, 1979) still find the view to be worthy of criticism.

An argument to demonstrate the relevance of the Type 6 goal to the rest of natural language processing might go something like this:

    Genuine understanding is necessary for any natural language understanding system capable of achieving Type 2, 3, or 4 pragmatic goals.

    Genuine understanding can be achieved only by
Other researchers, however, freely speak of processing language in the same way that humans
Cheir programs as nacural language undersCanders. process language.
Schank and Rlesbeck, for example, go one step A system which does things in the same way as
furcher and argue chat natural language programs humans do them can serve as a model for
must be directed cowards genuine understanding: explaining human language processing.
The point
Computer thatthat
programs Schank and Rlesbeck
attempt make is a
to replicate Therefore: Pursuit of goal Types 2 - 5
60 understanding without simulating the human entails pursuit of goal Type 6.
understanding process are doomed to fail- There are, however, several problems with such
ure when it comes to very complex processes. an argument. The second premise asserts a
Nowhere has this been clearer Chan in "process-produce" identity relation which is very
natural language processing (Schank &
Rlesbeck, 1981, p. 2 ) .
much open to dispute. There are numerous instances (e.g., the synthesis of urea) where an artificial process results in a substance, event, or function which is identical to a natural substance, event, or function in every respect save its mode of origin. It has yet to be shown that a cognitive ability, such as the understanding of natural language, can be produced only by employing exactly the same processes and structures which are involved in human language understanding. Indeed, it has yet to be conclusively shown that all human beings understand language by means of exactly the same processes and structures.

However, even if we accept the second premise (under some interpretation of the phrase "in the same way") the conclusion still does not follow. A program for natural language processing is not, itself, an explanation of anything. In order to be viewed as explanatory, the details of a program (its variables and data structures, its control structures, and so on) must be interpreted with respect to human processes and structures. Any program can be legitimately interpreted in a variety of ways, few of which will bear any relation to the concerns of human psychology. The explanatory value of a natural language program, therefore, is not inherent in the program itself but arises, rather, from the use which can be made of it by someone who is concerned with cognitive modeling. The use of AI programs to theorize about human language processes, in fact, is not AI research at all. It is a tool of cognitive psychology or, if one prefers, a methodological heuristic for a multidisciplinary investigation of phenomena such as discourse comprehension, text comprehension, and so forth. It does not follow, therefore, that research in natural language processing entails explanatory goals of Type 6; consequently, philosophical objections to AI programs as theories of human language abilities are irrelevant to the plausibility of AI research [...]

Of all the types of goals ascribed to natural language research, then, philosophical evaluation is directly pertinent only to the epistemological goal formulated as Type 5, above. More specifically, the only valid judgment philosophy can provide is one which says that "the concept of natural language understanding entails X, hence a system must (be, do or have) X or it will not be capable of natural language understanding." The only valid objection philosophy can make to natural language processing is that "a computer, in principle, cannot (be, do or have) X."

REFERENCES

[...] Y. "The Present State of Automatic Translation of Language." In F. L. Alt (ed) Advances in Computers, New York, Academic Press, [...]
Bruce, B. C. "Natural Communication Between Person and Computer." In Lehnert & Ringle, pp. 55-88.
De Jong, G. "An Overview of the FRUMP System." In Lehnert & Ringle, pp. 149-176.
Dreyfus, H. WHAT COMPUTERS CAN'T DO, Revised Edition, New York, Harper & Row, 1978.
Harris, L. "Experience with ROBOT in Twelve Commercial Natural Language Database Query Applications," IJCAI Proceedings, 1979, 6: 365-368.
Hendrix, G. "The LIFER Manual," SRI Technical Note No. 138, 1977.
Lehnert, W. & Ringle, M. (eds) STRATEGIES FOR [...] Inc., 1982.
Leitner, H. "The Determination and Conceptual Structuring of Restricted Domains of Discourse for 'Intelligent' Interactive Systems," SIGART Newsletter, 1977, 61: 51-52.
Mann, W. C. & Moore, J. A. "Computer Generation of Multiparagraph English Text," American Journal of Computational Linguistics, 1981, 7: 17-29.
Matson, W. SENTIENCE, Berkeley, California, University of California Press, 1976.
Odell, S. J. "Are Natural Language Interfaces Possible?" IBM Systems Research Technical Report TR73-024, 1981.
Robinson, A. "Understanding Natural Language Utterances in Dialogs About Tasks," SRI Technical Note No. 210, 1980.
Schank, R. "Identification of Conceptualizations Underlying Natural Language." In Schank & Colby, pp. 187-247.
Schank, R. & Colby, K. (eds) COMPUTER MODELS OF THOUGHT AND LANGUAGE, San Francisco, W.H. Freeman, 1973.
Schank, R. & Riesbeck, C., INSIDE COMPUTER UNDERSTANDING, Hillsdale, NJ, LEA, Inc., 1981.
Searle, J. "Minds, Brains, and Programs," The Behavioral and Brain Sciences, 1980, 3: 417-457.
Shortliffe, E. H. COMPUTER-BASED MEDICAL CONSULTATIONS: MYCIN, New York, North Holland, 1976.
Simmons, R. "Semantic Networks: Their Computation and Use for Understanding English Sentences." In Schank & Colby, pp. 63-113.
Simon, T. W. "Philosophical Objections to Programs as Theories." In M. Ringle (ed) PHILOSOPHICAL [...] Jersey, Humanities Press, 1979.
Waltz, D. "The State of the Art in Natural Language Processing." In Lehnert & Ringle, pp. 3-32.
Wilks, Y. "An Artificial Approach to Machine Translation." In Schank & Colby, pp. 114-151.
Winograd, T. "A Procedural Model of Language Understanding." In Schank & Colby, pp. 152-186.
Winograd, T. "What Does it Mean to Understand Language?," Cognitive Science, 1980, 4: 209-241.
Woods, W., Kaplan, R. & Nash-Weber, B. "The Lunar Sciences Natural Language Information System: Final Report," BBN Report No. 2378, 1972.

Lawrence Mazlack
Noemi M. Paz

ABSTRACT

Newspaper cartoons can graphically display the results of ambiguity in human speech. The result can be unexpected and funny. Captioned cartoons derive their humor from a sudden incongruity which can be made to follow by a human being who can automatically use stored world knowledge to resolve the ambiguous situation.

Likewise, computer analysis of natural language statements also needs to successfully resolve ambiguous situations. Computerized understanding of dialogue that takes place between humans must not only include syntactical and semantical analysis, but also pragmatical analysis. Pragmatics consists of an understanding of the speaker's intentions, the context of the utterance, and social implications of polite human communication.

Computer techniques have already been developed to use restricted world knowledge in resolving ambiguous language use. This paper illustrates how these techniques can be used in resolving ambiguous situations arising in cartoons.

1. THE GENERAL ROLE OF PRAGMATICS IN NATURAL LANGUAGE UNDERSTANDING

Within linguistic theory, the study of language use can be called pragmatics. One definition of pragmatics developed by Charles Morris (1946) is that pragmatics can be characterized by the relationship between signs and their human users. Signs fall into three classes: icons, indices, and symbols. Pragmatics relates directly to signs that are indices because indices can only be understood when they are actually used. The meaning of indices can be found by describing rules for relating the sign to a context. These are pragmatic rules which are in essence "action" rules for 'finding' relationships. The set of structures developed for describing these rules are called pragmatic-semantic "trees" and divide into three categories:

1. Performatives - which describe the speaker's intention or goal in using a sentence as a question, a command, etc.

2. Presuppositions - assumptions about the context that are necessary to make that sentence verifiable, or appropriate, or both.

3. Conversational Postulates - a class of presuppositions concerning the nature of human dialogue which can be referred to as discourse codes of conduct. (Bates, 1976)

For semantic-pragmatic structures to operate, it is not enough to determine the meaning of individual words. Other types of information must be accessed. In order to select between competing meanings, knowledge is required about the grammatical functions represented by particular word orders in the natural language sentence. Also, knowledge about the "real world" (presuppositions) is needed; i.e., the context in which the utterance took place.

Along with contextual knowledge, a semantic-pragmatic structure needs to account for "speech acts" (performatives). Speech acts demonstrate the speaker's goal. They can be a command, a question, a statement, etc. In other words, the associated meaning and the implied action must be understood. The theory must account for the fact that the listener or reader of the statement understands this double "meaning". (Bates, 1976)

Next, the structure must explain the speaker's ability to understand sequences of language which should mean one thing but clearly mean another (conversational postulates). It is assumed that normal human beings who enter into a conversation have agreed to be cooperative. This means speakers will tell each other the truth, that they will only offer information assumed to be new and relevant to the [...], and that they will only [...] information which they [...]. This represents a set of standard rules. Deviations from this 'code of conduct' will be seen as violations.

2. SUPPLYING PRAGMATICS FOR COMPUTER ANALYSIS

There are several question answering systems that make use of various techniques for including pragmatic analysis in understanding natural language. As these techniques are described a cartoon will be analyzed to illustrate how pragmatic analysis could be used to disambiguate the situation. The characters that correspond to the computer and those that correspond to a human in a man-machine dialogue will be identified. The cartoons are found in the Appendix.

2.1

The CO-OP System (Kaplan, 1979) is a question answering data base system that follows the "codes of conduct" presented earlier. Its objective is to provide cooperative responses from a natural language data base query. Some examples from this system follow.

CO-OP is able to determine from a
question not only what information is required, i.e. the direct, literal, and correct response, but also that the questioner is unaware of highly pertinent facts not explicitly requested in the question. A heuristic used by the system is that knowledge of such facts frequently makes asking the question unnecessary, because they entail an answer. The system action is to ignore the question and provide the pertinent fact. For example:

    Question: How many students failed CSE110 in Spring, 77?
    System's answer is: CSE110 was not given in Spring '77.

The answer of "zero" would not have been cooperative.

The user who posed the above question presumed that the CSE110 class was taught in the Spring of 1977. The system on finding that this presumption was false responds with a "corrective indirect response" by supplying the negated presumption.

Cartoons often lead to funny results when their statements are ambiguous. In the cartoon TIGER (appendix, fig. 1) there is an example of a cooperative response in the answer to the question: "Did he catch him?". Prior to this question the human in the dialogue only knows there is a chase going on and that there are two participants: Stripe and Mrs. Parker's cat. The situation is ambiguous because we do not know who is chasing whom. Here the common human presumption is that dogs usually chase cats and therefore Stripe must be a dog. The computer system on finding this presumption to be false, could respond with a corrective indirect response: "Nope. Stripe got away" rather than with the direct answer of "no".

Another type of cooperative response the CO-OP system replies with is a suggestive indirect response. Following the "codes of conduct" it is appropriate for an answer to contain relevant information likely to be requested in a follow-up question. A heuristic used here is to change the focus of the original question and to respond with a direct answer to the original question, but with the focus changed. The focus is that aspect of the question which is most likely to shift in a follow-up question.

The BC cartoon (appendix, fig. 2) could have used a suggestive indirect response. In this cartoon, a census taker (the human) is asking questions of a subject (the computer). A common presumption here is that the first question the census taker will ask is the subject's name. The computer needs to realize the ambiguous situation created by the question: "Name, Please". If a change of focus is analyzed the question could be seen as two questions:

    1) What is your (subject's) name.
    2) What is your dog's name, please?

The computer on realizing the possible ambiguous situation could then respond:

    1) Fido is my dog's name.
    2) 'Computer' is my name.

2.2 RESTRICTED DOMAIN OF DISCOURSE

Another data base system, ROBOT (Harris, 1978), uses the data base itself to find the use of words in the question, to build expectations, and to resolve ambiguities. The system interprets input based on what makes sense within its limited world model, the data base.

In processing ambiguous statements, several interpretations may arise. A heuristic used is that unintentional interpretations of input questions are usually not false for the specific domain, but have a vacuous response (Coles, 1972). To use this heuristic these interpretations are posed to the data base as queries. If all interpretations fail to find a response to the question, then the answer is "there aren't any". This negative answer assumes that the dialogue will only be about information contained in the data base. If more than one interpretation can be answered successfully, then the system enters into a clarification dialogue, just as humans would have to do when faced with an ambiguous question. If exactly one interpretation is found, then the system responds using this interpretation for the question.

In the cartoon GARFIELD (appendix, fig. 3) the human speaker is asking Garfield to play with Nermal. The following ambiguous situation is created:

    1) "Play with Nermal" means that Nermal is a toy
    OR
    2) "Play with Nermal" means that Nermal and Garfield should play together.

The computer system could resolve this ambiguous situation by searching Toy and Friend domains for the entry Nermal. On not finding Nermal in the Toy domain but in the Friend domain the ambiguous situation is resolved.

The LIFER System (Hendrix, 1978) has capabilities for extending the natural language subset that is understood by the system. Users may employ easy-to-understand notions such as synonyms and paraphrases to extend the language. The users can then ask questions about information contained in the data base using their own natural language "style". In this way "utterances" by the speaker (user) can be understood by the listener (computer).

In the cartoon WIZARD OF ID (appendix, fig. 4) the human is stating to the Sire (computer) that "our records show there is a dip in unemployment" and is perhaps implying the question "what do we do next?". The ambiguous situation here is that "a dip" could describe either a foolish person or a downward trend in a statistic or figure. To resolve this ambiguous situation the human could have entered the paraphrase:

    "Dip in unemployment" is a paraphrase of "temporary decline in the unemployment statistic".

2.3 WORLD KNOWLEDGE WITH FRAMES OR SCRIPTS

Many other systems have included world knowledge and information related to the dialogue with a user in a frame or script. Frames, simply put, are just highly structured sets for keeping pragmatics. When an action is carried out, some canonical description is stored in the frame that will permit the program to reconstruct the context in which the event took place. Frames carry over to the subsequent statements and conventions that mark anaphora or presupposition link the program to slots in the current or past frames that will resolve the reference. Frames not only include syntactic information, e.g. subject, object, prepositional phrases, but also semantic and pragmatic facts which provide various reasons, motivations and purposes not explicitly stated.

Scripts are like frames in that they also have empty slots that are filled with the context from a dialogue or text. However, scripts provide world knowledge about common experiences or situations in terms of Schank's conceptual dependency primitives. A text is understood by mapping sentences into actions or primitive acts as described in the script. Unstated facts described in a script but not in the sentence are assumed to be true. This provides a 'background' or world knowledge for understanding and reasoning. (Schank 1975).

The BEETLE (appendix, fig. 5) cartoon can be used to illustrate the frame and the script concepts. Here the relative pronoun must be resolved in the phrase "that gun". The two possibilities are:

    1) shoot at that gun, use that gun as a target
    2) use that gun to shoot with and shoot at the original target.

The ambiguous situation can be resolved by using world knowledge about target practice. For example, it is helpful to know that targets should not be expensive, useful things, such as a gun, and that a target is not located near a human being. This type of information can be provided by demons in the case of frames or by the reason or goals statement in the case of scripts.

2.4 RECOGNITION OF HUMOR

Humor due to ambiguous statements requires the reader to recognize that an ambiguous interpretation has occurred. The humorous interpretation is the unexpected one. Through the use of pragmatic analysis, humorous interpretations can be recognized as well as generated.

SUMMARY

Cartoons can graphically represent the humor due to ambiguities in human speech. It may be possible to recognize humorous ambiguities using already existing techniques. Three levels of the use of pragmatics have been described. The first is to resolve double meaning by restricting the domain of discourse, eliminating the occurrence of double meaning. This device is employed by LIFER in restricting the language and by ROBOT by restricting the objects in the domain to only those that appear in the data base. The second level of the use of pragmatics involves making the domain larger rather than restricting it. Frames and scripts are used to involve more world knowledge in the natural language analysis so as to disambiguate based on what are common occurrences in a given situation. The third level involves an ability to deduce from complex frames and scripts a purpose and then to act in agreement with that purpose. CO-OP is capable of determining the questioner's motive, and if necessary, posing for itself a question more in keeping with the questioner's motive than the original question.

Pragmatic devices used include proscribing context, enlarging context, and deducing motivation from context.

REFERENCES

Barr, A. 1980. "Natural Language Understanding". AI Magazine Vol. 1, No. 1, Spring.
Bates, E. 1976. Language and Context: The Acquisition of Pragmatics. Academic Press, New York.
[...] Intelligence: Can Computers Think?. Boyd & Fraser Publishing Co., San Francisco.
Coles, L.S. 1972. "Syntax Directed Interpretation of Natural Language", in Representation and Meaning: Experiments with Information Processing Systems. H.A. Simon and L. Siklossy (eds.) 44-55.
Elgin, S.H. 1979. What Is Linguistics?. 2nd ed., Prentice-Hall, Inc., Englewood Cliffs, NJ.
Goldstein, I. and Papert, S. 1975. "Artificial Intelligence, Language and the Study of Knowledge". MIT AI Memo 337, March.
Grosz, B. 1980. "Utterance and Objective: Issues in Natural Language Communication". AI Magazine Vol. 1, No. 1, Spring.
Harris, L.R. 1978. "Using the Data Base Itself as a Semantic Component to Aid in the Parsing of Natural Language Data Base Queries". Dartmouth College, TR 77-2, October.
Hendrix, G., Sacerdoti, E., Sagalowicz, D., and Slocum, J. 1978. Developing a natural language interface to complex data. ACM Transactions on Database Systems 3: 105-147.
Kaplan, S. 1979. Cooperative responses from a portable natural language data base query system. Doctoral dissertation, Dept. of Computer and Information Science, University of Pennsylvania.
Mazlack, L.J. 1979. "Some Considerations in Mapping Natural Languages Queries onto Data Base [...]". Working Paper, September.
Schank, R.G. 1975. "The Structure of Episodes in Memory", in Representation and Understanding: Studies in Cognitive Science. D.G. Bobrow and A. Collins (eds.). Academic Press, New York.
Siklossy, L. and Simon, H.A. 1972. "Some Semantic Methods for Language", in Representation and Meaning: Experiments with Information Processing Systems. H.A. Simon and L. Siklossy (eds.) 44-55.
Siklossy, L. 1978. Impertinent question-answering systems: justification and theory. ACM proceedings annual conference, Washington, D.C., vol 1, 39-44.
Waltz, D. July, 1979. An English language question answering system for a large relational database. CACM vol 21 no 7, 526-539.
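
The heuristic described for ROBOT in section 2.2 (pose every interpretation to the data base as a query, then act on how many of them succeed) can be illustrated with a small Python sketch. This is not ROBOT's implementation: the toy data base, the "Toy" and "Friend" domain contents, and the function names are assumptions introduced here, using the GARFIELD example.

```python
from typing import Callable, Dict, Set

# Hypothetical toy data base: each domain maps to the entries it contains.
DATA_BASE: Dict[str, Set[str]] = {
    "Toy": {"ball", "yarn"},
    "Friend": {"Nermal", "Odie"},
}


def interpretations_of_play_with(name: str) -> Dict[str, Callable[[], bool]]:
    """Return each reading of "Play with <name>" paired with a data base query."""
    return {
        f"{name} is a toy": lambda: name in DATA_BASE["Toy"],
        f"{name} and the listener should play together":
            lambda: name in DATA_BASE["Friend"],
    }


def resolve(candidates: Dict[str, Callable[[], bool]]) -> str:
    """Pose every interpretation as a query and act on how many succeed."""
    successful = [reading for reading, query in candidates.items() if query()]
    if not successful:
        # No interpretation finds a response in the data base.
        return "there aren't any"
    if len(successful) > 1:
        # Several interpretations succeed: ask the user to clarify.
        return "clarify: " + " OR ".join(successful)
    # Exactly one interpretation succeeds: use it.
    return successful[0]


print(resolve(interpretations_of_play_with("Nermal")))
```

Because Nermal appears only in the assumed Friend domain, only the second reading survives, which mirrors the resolution described in the text. An entry present in both domains would instead trigger the clarification branch.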

[Appendix: figs. 1-5, the cartoons TIGER, BC, GARFIELD, WIZARD OF ID, and BEETLE; images not recoverable.]



"Tell me if you're guessing."
Jane Terry Nutter
Computer Science Department
SUNY at Buffalo
Buffalo, New York

This paper discusses default reasoning, distinguishing generalizations associated with defaults from both universals and statistical generalizations. I argue that conclusions based on defaults should be reported differently from conclusions which do not involve default reasoning, and that however we represent them, the related inference system must distinguish default claims from other propositions and treat them differently. Two existing analyses of default reasoning are briefly criticized in light of the distinctions presented.

1. Introduction.

A great deal of knowledge seems to take the form of generalizations: neither genuine universals, true of all things in their understood domains, nor simple statistical claims of "more than half", but claims which, although understood sometimes to fail, nevertheless warrant presumptions in the absence of conflicting information. Such generalizations are usually represented by defaults. This paper examines default generalizations, distinguishing them from universals and statistical claims, and pointing out some pitfalls their implementation presents. Especially, it behoves us to realize that answers based on default reasoning represent educated guesses; and however useful they may be, guesses cannot safely or honestly be handed out as facts.

2. Generalizations vs. universal and statistical [...]

In English, "all" rarely means "every single thing without exception", and failing to note this can produce unfortunate results [2]. To use Brachman's example, if we say that all elephants are four-legged gray mammals, and if we treat "all" as indicating genuine universality, then we have no way to talk about Clyde the unfortunate amputee elephant with only three legs. But suppose we always treat "all" as indicating a generalization [...] Clyde the three-legged elephant, but unfortunately we can talk with equal ease about Clyde the non-mammalian elephant, or even about Clyde the non-elephant Indian elephant.

Generalizations cannot be treated like statistical claims either, although the difference here is more subtle. Most people realize that over half the population is female. Yet in the absence of information concerning a person's sex, one does not typically presume that the person in question is female (indeed, the presumption tends to go the other way!). By contrast, the number of flightless birds (emus, ostriches, kiwis, penguins, baby birds, etc.) is hardly negligible. Yet we feel justified in assuming of birds in general that they fly.

Generalizations usually represent causal claims, albeit masked and incomplete ones. Most birds fly, because the features which distinguish something as a bird evolved to facilitate flight. By contrast, statistical claims are evidence for, rather than embody, causal claims. Furthermore, we accept statistical evidence as supporting causal claims only when there is independent reason to suppose that the phenomena involved are relevant to one another.

For example, I recall reading somewhere that for many years, the membership rolls of a baker's union in New York City precisely paralleled the births and deaths in a town in India. Whether this actually happened is not important here; my point is, it could well happen, and if it did, no reasonable person would take it as anything more than a striking (and somewhat humorous) [...]

The transitivity of inferences based on generalizations again distinguishes them from statistical claims. Presumability can be inherited through truth-functional inferences; but statistical relationships are far more complex, and statistical inferences follow utterly different [...]

For instance, consider the result of conjoining two statistical claims S and S'. Say the probability of S is x, and that of S' is y. Now what is the probability of S & S'? Well, let's look at some examples.

Suppose the subject is coin tossing. Say S says "Toss 1 will be heads," and S' says "Toss 2 will be tails." Then x = y = .5, and the probability of "Toss 1 will be heads and toss 2 will be tails" we know to be .25, or xy. But is this always the case? Clearly not. Let S be as before, and let S' be "Toss 1 will be tails." Now the probability of S & S' is 0. If S is the same as S', then the probability of the conjunction is the same as the probability of S.

Furthermore, statistical analyses tend to be applied to two fundamentally different sorts of situation. In the first kind, the various events are ex hypothesi independent of one another. We assume that the result of toss 1 does not affect the result of toss 2. In the second, a causal relationship is being sought or presumed. At this point, probabilities become inextricably linked to the theoretical context, and in some sense take on a different meaning. Given one set of results S, the probability of S will differ depending on the hypothesis relative to which it is computed. More importantly, what changes tends to be not the probabilities of individual occurrences, but precisely the probabilities of cooccurrences: that is, the probability of conjunctions changes, without that of the conjuncts changing. So no general rule captures the way the probability of conjunction relates to the probability of conjuncts.

3. Examples of default [...]

Suppose we are designing a "travel agent" system. The classical example of a default rule in this context is the assumption that, all else equal, all trips originate wherever the customer currently is. This seems reasonable, but the system need hardly assume it, since it can request that information with no great loss of convenience.
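
The coin-tossing argument of section 2 can be checked by direct enumeration. The following Python sketch is an illustration added here, assuming two fair and independent tosses; the helper names prob and conj are hypothetical. It shows that two claims which each have probability .5 can have a conjunction of probability .25, 0, or .5, depending on how the claims are related.

```python
from fractions import Fraction
from itertools import product

# Sample space: every outcome of two fair coin tosses, all equally likely.
OUTCOMES = list(product("HT", repeat=2))


def prob(event) -> Fraction:
    """Probability of an event given as a predicate over (toss1, toss2)."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))


def conj(a, b):
    """Conjunction of two events."""
    return lambda outcome: a(outcome) and b(outcome)


def toss1_heads(outcome) -> bool:
    return outcome[0] == "H"


def toss2_tails(outcome) -> bool:
    return outcome[1] == "T"


def toss1_tails(outcome) -> bool:
    return outcome[0] == "T"


# Each individual claim has probability 1/2 in every case below.
print(prob(conj(toss1_heads, toss2_tails)))  # independent claims
print(prob(conj(toss1_heads, toss1_tails)))  # incompatible claims
print(prob(conj(toss1_heads, toss1_heads)))  # the same claim twice
```

The three conjunctions pair claims with identical individual probabilities, so the differing results come only from how the claims relate to one another, which is the point the text makes about conjunctions.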
But consider the following "rule": within the departure time limits the customer supplies, more direct connections are to be preferred over less direct ones. If someone says, "I'd like a round trip to New York," for the system then to ask, "Where are you leaving from?" seems reasonable; for it to ask, "Would you rather get there in one hour or nineteen and a half?" does not.

Furthermore, imagine a system which mindlessly produced every set of connections from Buffalo to New York: direct, via Albany, via Houston, via Seattle, via London, via Buenos Aires.... While the list might not go on forever, it will surely go on long enough to prove inconvenient. Some presumption must be made to order the alternatives so that reasonable ones get listed early.

Yet we cannot simply add a universal rule that direct routes are to be preferred over indirect ones, because it isn't always true. For example, some people refuse to use certain airlines or airports under any circumstances. Others will want to stop over for a few hours in some intermediate city.

There is also a more general problem. All else being equal, the cheaper of two routes is usually preferred over the more expensive. While the more direct route is usually also the cheaper, it is not always so. One can currently fly from Buffalo direct to Albany, which is shorter and more direct than changing flights in New York City. But it turns out that flying via New York is cheaper. Whether the customer wants to fly direct or via New York City will now depend on which is more important to the customer, convenient scheduling or low price.

Hence the system cannot presume absolutely either that the more direct route is preferred, or that the cheaper route is. A guarded answer which presumes either, but with explicit reservations, will prove more useful than either a flat presumption which cannot be overruled (a universal) or a failure to make any presumption at all.

Other examples abound. If a customer asks to travel from New York to Cincinnati via Athens, we want the system to recognize that the customer probably means Athens, Ohio, and not Athens,

4. Problems defaults raise.

not want to recommend treatment x solely on the grounds that we don't yet know that A is exceptional. On the other hand, if the symptoms in question can themselves prove fatal, nor do we want to say we don't know anything about what to do for A.

In this kind of case, we would like the system to say something like, "Treatment x normally helps," or "Presumably treatment x helps." Even better would be an answer which directly tells the user what the counterindications are; but at the very least, a responsible system should warn the user that the information results from a presumption, and not an inference. Once the system has issued the warning, the user can then pursue it in further questions.

A further difficulty with defaults lies in deciding what it means for them to be true or false. Clearly "If Roger is a bird, then presumably Roger can fly" can be true even if Roger is a bird, but Roger can not fly. Indeed, "Presumably Roger can fly" can be true, even though "Roger can fly" is false. That is the whole point of saying 'presumably': it protects the speaker from saying something false when the facts go the "wrong" way. That is what it means to give a guarded response.

Hence the truth value of defaults cannot be a simple function of the truth values of their component propositions: default operators are not truth functional. Furthermore, defaults make sense because they reflect causal (and hence non-logical) connections among their constituents. The missing information guarantees that their content cannot be a simple function of the contents of the components. But then we should not expect to be able to give a purely logical account of defaults [4].

5. Problems with two proposed [...]

Several approaches to defaults have been suggested. Some researchers treat defaults as modalized [6,7,8]. Several problems with this approach have been pointed out already (see e.g. [3]). In addition, this approach interprets "In general, birds fly" as something like "If x is a bird and it is compatible with what we know that x flies, then x flies" [7]. But this is only true if every single bird without exception which we do not know to be flightless does in fact fly. That is, if McDermott's version of the generalization is true, it can never be the case that some bird does
Perhapsor the
even Athens,
most commonGeorgia.
kind of At the same
default takestime,
the not fly and we can not prove that it doesn't. But
form,assunption should somehow
"In the absence be that
evidence reflected in may
"p, you the this is surely not what the generalization means.
infer p" response,
[7,8]. When lestthetravellers
system is who mean
asked "p?"toand
go The fuzzy logic approach [1,5,9,10] uses a
to Athens,
finds Georgia rule,
the default learn of
it Athens,
attaintsOiio by finding
to derive 'p. continuum of truth values in the closed range <0,1>
If it failsthere.
to do so, it returns p as the answer. instead of sijiply "true" and "false". Several
Hence systans augmented by this kind of rule can questions imnediately arise. First, every
take advantage of generalizations of the kind "assertion" in the data base must have an
above. So far, so good. associatied truth value; where are we to get these
But this procedure only looks reasonable so long from? Second, how are the tjruth values of
as we deal with questions like "Can Poger the bird propositions related to those of their components,
fly?" Then, saying "Of course, he's a bird," seans and how are the truth values of conclusions related
unobjectionable — but only because nothing depends to those of the premises of the demonstration in
on the answer. Notice that if we don't care what question? Preliminary results [1] boil down to the
the answers to our questions are, there is little msurprising claim that the conclusions are no
reason to implement defaults. After all, if we better than the pranises, but also on the whole no
don't care, we can as well say "I don't know" as worse (where "better" is interpreted as numerical
either yes or no. "greater than"). It is significant that this is
But suppose that we do care what answer we get. already non-trivial to establish. Third, how do we
For instance, consider a medical diagnostic and deal with the apparent result that different
treatment-reoomnending system. Suppose that for a demonstrations of the same proposition "establish"
particular set of symptoms, treatment Ji is different truth values?
generally very beneficial, but that in the But the largest problem, in my opinion, lies in
exceptional cases treatment & invariably kills. the irresistible temptation to view these fuzzy
Now if A has the symptoms in question, surely we do truth values as probabilities. This tendency is
encouraged by the need to assign what, in context, look much like Bayesian prior probabilities to the propositions in the data base. Some kind of Bayesian analysis may prove useful in A.I. systems; but there is no "cut-rate" way of doing it. Neither fuzzy logic nor default reasoning adequately analyzes probability. Under the circumstances, it seems best to avoid a system which misleads to this extent.

6. Conclusion.

We would like some way to deal with the "funny" truth status of default rules and of conclusions drawn on the basis of default assumptions; but neither modality nor fuzzy truth values seems to capture the desired effect. Furthermore, there seems good reason to suppose that no purely logical analysis could.

But this does not rule out the possibility that logical restrictions on defaults and their consequences can be found and described, on the basis of which a system of inferences allowing default reasoning can be developed. We are currently developing a semantics for default reasoning which treats defaults as propositional operators and which we hope will provide such a basis. Once this has been done, we can hope to deal with defaults in a reasonable and useful way.

Hence an A.I. system which deals with defaults successfully must also have at least two properties which existing proposals lack. First, it must delineate the logical restrictions on defaults and their consequences without ruling out the existence of genuine exceptions, i.e., recognizing that default reasoning sometimes gives the wrong answer. In doing so, it should be careful to distinguish default generalizations both from genuine universals and from statistical generalizations. And second, when the system gives answers which are based on default reasoning, it should admit this weakness by issuing warnings with them. For without such warnings, default reasoning by any scheme is not only unsound: it is also unsafe.

7. Acknowledgments.

I would like to thank Stuart Shapiro and the members of the SNePS Research Group at SUNY/Buffalo for their many helpful comments and suggestions.

8. References.

[1] Aronson, A.R., Jacobs, B.E., and Minker, J. A note on fuzzy deduction. JACM v. 27 (1980) 599-603.
[2] Brachman, R.J. "I lied about the trees" or defaults and definitions in knowledge representation. Draft (1982).
[3] Davis, M. The mathematics of non-monotonic reasoning. A.I. v. 13 (1980) 73-80.
[4] Israel, D.J. What's wrong with non-monotonic logic? Proc. First Annual National Conference on Artificial Intelligence. American Association for Artificial Intelligence (1980) 99-101.
[5] Lee, R.C.T. Fuzzy logic and the resolution principle. JACM v. 19 (1972) 109-119.
[6] McDermott, D.V. and Doyle, J. Non-monotonic logic I. A.I. v. 13 (1980) 41-72.
[7] McDermott, D. Non-monotonic logic II. JACM v. 29 (1982) 33-57.
[8] Reiter, R. A logic for default reasoning. A.I. v. 13 (1980) 81-132.
[9] Zadeh, L.A. Fuzzy sets. Inf. Control v. 8 (1965) 338-353.
[10] Zadeh, L.A. Fuzzy algorithms. Inf. Control v. 12 (1968) 92-102.
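The default rule of Section 4 and the warning called for in the Conclusion can be illustrated with a small sketch. It is not the semantics under development by the author; it only shows a guarded answer being kept distinct from an inferred one. Two simplifications are assumed: "deriving" a proposition is reduced to looking it up in a set of known facts, and the names `Answer`, `negate`, `ask`, and `render` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Answer:
    value: Optional[bool]  # True, False, or None for "I don't know"
    guarded: bool          # True when the answer rests on a presumption


def negate(p: str) -> str:
    """Toggle a leading '~', the notation assumed here for negation."""
    return p[1:] if p.startswith("~") else "~" + p


def ask(p: str, facts: FrozenSet[str], defaults: FrozenSet[str]) -> Answer:
    """Answer "p?". Derivation is simplified to set membership (an assumption)."""
    if p in facts:
        return Answer(True, guarded=False)   # inferred, no warning needed
    if negate(p) in facts:
        return Answer(False, guarded=False)  # ~p is known, so the default is blocked
    if p in defaults:
        return Answer(True, guarded=True)    # failed to derive ~p: presume p, with a warning
    return Answer(None, guarded=False)


def render(p: str, answer: Answer) -> str:
    """Phrase the answer, marking presumptions explicitly."""
    if answer.value is None:
        return "I don't know whether " + p + "."
    claim = p if answer.value else negate(p)
    return ("Presumably " if answer.guarded else "") + claim + "."
```

With `defaults = frozenset({"flies(Roger)"})` and no fact to the contrary, `render("flies(Roger)", ask("flies(Roger)", frozenset(), defaults))` yields a response prefixed with "Presumably", so the user can tell a presumption from an inference and pursue it in further questions.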


Pragmatic Factors in Pronoun Reference Assignment

Valerie C. Abbott and John B. Black

Cognitive Science Program
Yale University, New Haven, CT 06520

Identifying factors that influence pronoun reference assignment is a challenge to anyone attempting to characterize the process of language understanding. Because a pronoun itself carries only a small part of the meaning that the understander is expected to assign to it, he or she must use contextual information to assign the pronoun an unambiguous referent. Characterizing aspects of the context which are used for this purpose is an active area of psychological research.

Many recent studies have considered the role of syntactic context, that is, the effect of structural constraints on pronoun reference in a fragment of text, typically a sentence, without recourse to constraints which might be found in the meaning of the text (Langacker, 1969; Sheldon, 1974). Shwartz (1981) has found evidence for the use of syntactic information in the resolution of anaphoric pronouns in single sentences. However, strategies based only on syntax are not sufficient to determine unambiguously the referent of all pronouns. Consequently, investigators have examined the role of semantic factors within sentences in directing the assignment of referents (Caramazza, Grober, Garvey, & Yates, 1977; Caramazza and Gupta, 1979; Ehrlich, 1980).

The studies reported here will focus on the use of pragmatic constraints in resolving anaphoric pronouns. Hirst and Brill (1980) have found that these constraints influence the time needed to assign a referent even when that referent can be unambiguously determined by syntactic rules alone. This result indicates that pragmatic context can be expected to play a significant role in reference assignment. However, the text fragments used in their study were only two sentences long, and the nature of the pragmatic considerations involved was not specified. It remains to be determined whether there are identifiable cues in longer texts which influence reference assignment of anaphoric pronouns. We will be concerned with characterizing two major sources of contextual information in paragraph-length texts, and evaluating their influence on pronominal reference assignment.

First, the presence of a clear main character may be expected to play a role in reference assignment. Black, Turner, and Bower (1979) have shown that the point of view provided by a main character has an observable effect on story understanding. In the extreme case, there may be only one character in a story. When there is more than one character, it is still likely that the main character is given primary consideration for reference assignment. This was investigated in the current experiment.

Second, Schank and Abelson (1977) have suggested that the goals and social roles of characters in stories may contribute to reference assignment. If an act is appropriate to a particular goal or role and the agent of the act is specified by a pronoun, it is likely that the pronoun will be disambiguated to the character who has the appropriate goal or role.

Since the goals the characters in a story are pursuing, the roles they are filling, and the identity of the main character can be experimentally manipulated, we can test whether these contextual cues influence pronoun reference assignment. In the experiments reported below we first test whether subjects are sensitive to these cues alone and in combination in a task requiring explicit pronoun reference assignment. Second, in a task in which reading times for lines of text containing pronouns were measured, it was determined whether these sources of pragmatic constraint influenced the difficulty of reference assignment as measured by reading time.

Experiment I: Explicit Assignment

Four simple two-character stories were written. Each story contained an anaphoric pronoun in the final sentence. Either character could be made the main character of the story, or each character might be weighted equally. Additionally, each character was given a role or a goal in the story. Preceding the clause in which the critical pronoun appeared was a phrase containing an action appropriate to the role or goal of one character or other, or an action which was equally likely to have been performed by either of the characters. For instance, in "Brushing off a table, she smiled at her friend." the action preceding the pronoun is consistent with the role of a waitress. Note that in sentences of this sort, the subject of the main clause is interpreted as the agent of the action in the preceding phrase.

Combination of these cues yields five presentation conditions.
• The main character and goal or role cue are both present and indicate the same referent.
• The main character and goal or role cue are both present and indicate conflicting referents.
• Only the main character cue is present.
• Only the goal or role cue is present.
• Neither cue is present.

Each subject was presented with two stories of the type described above, one in each of two conditions. Following each story on a separate page was a multiple choice question requiring identification of the character to whom the anaphoric pronoun referred.

The results of this experiment are summarized in Figure 1 below. When main character and role or goal cues led to assigning the same character as referent, pronoun reference was determined in accord with both by 84% of the subjects, a significant difference from chance (χ² = 10.72, p < .01). This shows that main character and role and goal manipulations are powerful enough to influence pronoun assignment when used together. In the case in which neither main character nor the phrase preceding the pronoun provided a cue concerning pronoun reference, subjects chose both characters almost equally often as the referent of the pronoun, 46% of the subjects choosing one and 54% choosing the other (χ² = 0.12, ns). When the phrase preceding the pronoun was neutral with respect to the roles or goals of both characters in the stories, but there was a main character,
this character was adopted as the referent of the pronoun by 82% (χ² = 9.02, p < .01) of the subjects. This is essentially the same level of performance as was observed with both sources of information available to the subjects.

However, when both characters were given equal weighting in the story, but the phrase preceding the pronoun was appropriate to the role or goal of one character, the referents chosen were consistent with this character for only 62% (χ² = 1.07, ns) of the subjects. This pattern of results seems to indicate that subjects are not making extensive use of information about the relationship between an action the agent of which is specified by a pronoun, and the known goals and roles of characters, in assigning the pronoun a referent.

However, this interpretation is complicated by the results of the condition in which subjects had to make a choice between an assignment to the main character of the passage, or to another character with the role or goal appropriate to the action preceding the pronoun. In this situation, subjects chose the assignment which agreed with the main character 38% of the time, and chose the assignment which agreed with the role or goal context 62% of the time. Although this result is not significantly different from chance (χ² = 1.07, ns), a difference in the opposite direction would be expected if only main character cues were influencing the choice. This result indicates that although a character's goal or role is not always sufficient to influence pronoun assignment alone, it is important when seen in combination with other information. The difference between the choice of referent in this condition and in the condition in which main character identity is the only cue available is significant (χ² = 15.47, p < .001). The utility of main character information thus seems to be dependent on the absence of conflicting information.

Figure 1: Subjects' choice of pronoun referents in percent. (* consistent = consistent with goal or role cue; consistency arbitrarily determined.)

The results of this experiment indicate that the extent to which subjects chose one referent or the other was governed by the contextual cues manipulated. The main character of the story was most effective in influencing reference assignment, with consistency of the pronoun's context with the goal or role of a character effective in nullifying this main character effect.

It is conceivable that in this experiment asking explicitly about the referent of a pronoun altered subjects' responses. Thus, it seemed desirable to obtain another measure of the difficulty of assigning referents to anaphoric pronouns in the same texts.

In the following experiment reading times for the sentences of these texts containing anaphoric pronouns were measured. It was expected that reading times would be fastest for pronouns in the condition in which there was a main character, and the phrase preceding the pronoun was appropriate to the role or goal of that character. Reading times should increase as it becomes increasingly difficult to assign a referent unambiguously to a pronoun.

Experiment II: Reading Time

Materials were the four stories used above and six additional stories of the same type written for this study. Each story could appear in any of the five conditions discussed above. The penultimate line of the story contained the action which was consistent with the role or goal of one character or the other, or with either. The final line of each story was constant over conditions and contained an anaphoric pronoun.

Each subject read the 10 stories, two in each of the five conditions. They were instructed to read the stories for comprehension. Each story was presented one line at a time on a computer terminal, subjects pressing the "Return" key when they had finished reading each line. Reading times for the final line of the story were compared between conditions.

Figure 2: Reading times for a clause containing an anaphoric pronoun

The results for the five conditions are presented in Figure 2. The reading time data is quite consistent with the data seen in Experiment I above. A comparison between the condition in which both cues are present and lead to the same choice of referent and that in which both cues are present but lead to conflicting choices shows faster reading times in the former condition (F = 4.805, p = 0.033). Having only one cue in the form of a main character leads to almost identical reading times as having both cues and results in significantly faster
reading times than the confusing condition (F = 9.487, p = 0.005). However, although there is a trend, having only the cue of consistency with the goal or role of a character does not lead to significantly faster reading times than the confusing condition (F = 3.022, p = 0.080). The condition in which neither main character nor consistency with a goal provided a cue as to the reference of the pronoun is a puzzle. Although it is not significantly faster than the confusing condition (F = 1.325, p = 0.258), it is also not significantly slower than the condition in which both cues are available (F = 0.527, p = 0.480), the condition in which only the main character is available (F = 0.608, p = 0.448), or the condition in which only consistency with a goal or role is available as a cue (F = 0.148, p = 0.705). One possible explanation is that subjects are fairly quick to realize that they have no information with which to make a decision, and proceed in hopes of obtaining the information they need in the remainder of the text. In other words, in the confusing condition, enough information is available, so an attempt is made to find the referent. This proves difficult, leading to increased reading times for such sentences. In the absence of relevant information, the attempt at resolution is deferred.

The results of these two experiments show the influence on pronoun reference assignment of manipulation of pragmatic aspects of the text in which they appear. The main character of the text, in the absence of disconfirming evidence, is quickly and reliably assigned as the reference of these pronouns. They also point out that the influence of some possible pragmatic cues cannot be characterized simply. For example, if the action of an agent represented in the text by a pronoun is consistent with the role or goal of a character, this is not sufficient to lead reliably to assignment of that character to the pronoun. However, the influence of this cue is substantial enough to lead to confusion if there is other evidence indicating another character as the referent. Additionally, it cannot be assumed that the less information available for pronoun reference assignment, the longer it will take subjects to read the sentence in which it appears. From the results of Experiment II we can see that subjects proceed rather quickly when they have no information on which to base their choice.

We are grateful to Rowell Huesmann for sponsoring this paper, and to Wendy Lehnert and Larry Birnbaum for helpful discussions regarding the research reported here. This research was supported by grants from the Systems Development Foundation and the Sloan

References

Black, J. B., Turner, T. J., & Bower, G. H. Point of view in narrative comprehension, memory, and production. Journal of Verbal Learning and Verbal Behavior, 1979, 18, 187-198.

Caramazza, A. & Gupta, S. The roles of topicalization, parallel function and verb semantics in the interpretation of pronouns. Linguistics, 1979, 17, 497-518.

Caramazza, A., Grober, E., Garvey, C. & Yates, J. Comprehension of anaphoric pronouns. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 601-609.

Ehrlich, K. Comprehension of pronouns. Quarterly Journal of Experimental Psychology, 1980, 32, 247-256.

Hirst, W., & Brill, G. A. Contextual aspects of pronoun assignment. Journal of Verbal Learning and Verbal Behavior, 1980, 19, 168-175.

Langacker, R. On pronominalization and the chain of command. In D. Reibel, S. Schane (Ed.), Modern Studies in English, Englewood Cliffs: Prentice-Hall, 1969.

Schank, R.C., and Abelson, R.P. Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum Associates, 1977.

Sheldon, A. The role of parallel function in the acquisition of relative clauses in English. Journal of Verbal Learning and Verbal Behavior, 1974, 13,

Shwartz, S. The search for pronominal referents. Technical Report 10, Cognitive Science Program, Yale University, 1981.
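The chance comparisons reported for Experiment I are of the kind given by a one-degree-of-freedom chi-square goodness-of-fit test against an even split between the two characters. The sketch below shows that arithmetic under stated assumptions: the counts are hypothetical, because the paper reports percentages rather than the number of subjects per condition, so the output is not expected to reproduce the reported statistics. The function name is also illustrative.

```python
def chi_square_vs_chance(count_a: int, count_b: int) -> float:
    """Goodness-of-fit chi-square for two choices against a 50/50 split (1 df)."""
    n = count_a + count_b
    expected = n / 2  # chance: each character chosen by half of the subjects
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Hypothetical counts: 21 of 25 subjects (84%) choose the cued character.
statistic = chi_square_vs_chance(21, 4)

# Standard critical values for 1 df: 3.841 (p = .05) and 6.635 (p = .01).
print(f"chi-square = {statistic:.2f}, exceeds .01 criterion: {statistic > 6.635}")
```

An even split gives a statistic of zero, which is the sense in which the near-equal choices in the neither-cue condition do not differ from chance.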

Topic and Comment in Spoken Sentence
Hans Brunner
University of Indiana
Chomsky (1965) has defined the Topic of a sentence as "the leftmost NP immediately dominated by S in the surface structure" and Comment as, quite simply, "the rest of the string". Others have either defined or used these two concepts to denote, among other things, the distinction between (1) "new" information and information that has already been conveyed (e.g., Clark & Haviland, 1977), (2) the notions of "psychological subject" and "psychological predicate" (e.g., Hornby, 1972), or (3) the "current" vs. "presupposed" information of a sentence (e.g., Halliday, 1967). Differing interpretations abound and, in the words of deBeaugrande (1980), "it has remained unclear precisely what phenomenon we are dealing with".

The purpose of this research was to investigate the roles of "topic" and "comment" in different semantic and syntactic contexts. To do this we used the gating paradigm, a procedure in which spoken sentences are repeatedly presented to subjects, the amount of spectral information from each constituent word being gradually increased with each successive repetition. In the first presentation of each sentence, the spectral gate size (i.e., duration from the onset) of each word was only 50 msecs. The remainder of each word was replaced with envelope-shaped noise, a procedure which eliminates the spectral information while preserving prosodic fluctuations in the intensity of the speech. Each target sentence was repeated 10 times, the gate sizes being increased in 50 msec increments across repetitions. Subjects were instructed to simply write down whatever they could understand after each presentation of the sentence. The dependent measure of interest was the amount of spectral information (i.e., the "gate size") necessary for comprehension of each word in the sentence.

This technique was applied to the current issue by transforming the syntax of simple, declarative sentences so as to vary the topicalization of subject and object nouns from one sentence version to the next. Our syntactic transformations, taken from a study by Hornby (1972), are shown below:
(1) The farmer plowed the field.
(2) The field was plowed by the farmer.
(3) It was the farmer who plowed the field.
(4) It was the field that the farmer plowed.
(5) The one who plowed the field was the farmer.
(6) What the farmer plowed was the field.

Hornby (1972) showed that the agent of a sentence serves as the topic when presented in syntactic structures with a cleft object (sentence 4), pseudocleft object (6) or in active sentences (1) and as the comment when presented either in passive sentences (2) or in sentences where the [...] cleft (3) or [...] (5) [...] vice versa. The [...] syntactic form has been underlined, above, according to this criterion. In this study we capitalized on this exchange of roles so that, when comparing the overall effects of topic vs. comment status, we would be comparing each word against different tokens of itself.

Armchair theorists have been asserting for some time now that the topic of a sentence (1) receives less intonational stress (i.e., lower amplitude and F0 and a shorter duration) in production and (2) is somehow prerequisite for correct interpretation of the comment. If this is true, then comprehension of any given word should require less spectral information when it functions as topic than when it is stretched out in time as part of the comment on what has been topicalized. Moreover, if the functionalist approach is correct, then there should be a well-ordered interaction between topicalization and syntax, with agents requiring a smaller minimal gate size in active sentences and sentences with cleft and pseudocleft objects, where they are topicalized, than in the remaining three syntactic forms, where they are part of the comment. And once again, the converse should obtain for the object of each sentence.

Neither of these predictions was supported by the results: The amount of spectral information necessary for word recognition did not decrease as a function of increasing topicalization. Moreover, there was a significant main effect of syntax (F(5,270) = 26.18), resulting from an increase in the amount of spectral information necessary for word recognition as the syntax of sentences became more complex.

These results should not be construed as evidence against the functionalist approach to sentence comprehension. Our sentences were presented out of context, in the absence of any larger text or dialogue framework. Thus, it is doubtful that the topicalized words in these stimuli really represented anything akin to "given" or "presupposed" information for the subjects. Nonetheless, these results do serve to constrain some of the notions that have been advanced about the nature of topic and comment in the processing and structure of language. They make it quite clear that "topic" and "comment" are textual, rather than syntactic or structuralist concepts. Thus, any effort to define these constructs without reference to intersentential relations simply misses the purpose of topicalization in real-time processing. However, the results also demonstrate that it is important not to lose sight of syntactic effects in text processing. The syntactic constraints of these sentences did much more than just control the focus of attention; they had profound, top-down effects on the overall speed [...]
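The gating schedule described in this paper (gates starting at 50 msec and growing by 50 msec over 10 repetitions) and its dependent measure can be made concrete with a short sketch. The scoring rule is an assumption: the minimal gate for a word is taken to be the smallest gate at which the subject's written response matches the target, which may differ from the criterion actually used in the study. Both function names are hypothetical.

```python
from typing import Dict, List, Optional


def gate_schedule(step_ms: int = 50, repetitions: int = 10) -> List[int]:
    """Gate sizes in msec for successive presentations of a sentence."""
    return [step_ms * k for k in range(1, repetitions + 1)]


def minimal_gate(target: str, responses: Dict[int, Optional[str]]) -> Optional[int]:
    """Smallest gate (msec) at which the written response matches the target.

    `responses` maps a gate size to what the subject wrote for that word,
    or None if nothing was written. Returns None if the word was never reported.
    """
    for gate in sorted(responses):
        written = responses[gate]
        if written is not None and written.lower() == target.lower():
            return gate
    return None
```

For example, a subject who writes nothing at 50 msec, "farm" at 100 msec, and "farmer" from 150 msec onward would be scored with a minimal gate of 150 msec for that word under this rule.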

On-Line Processing of Pragmatic Inferences

Colleen M. Seifert, Scott P. Robertson, and John B. Black
Cognitive Science Program
Yale University

Cognitive science researchers have proposed a wide variety of inferences and inference mechanisms that may be used in comprehending stories. Inferences are concepts, or links between concepts, which are not explicitly stated in a text but which are present in the final memory representation. Many previous psychological experiments on inferences have been unable to distinguish between inferences that are generated during comprehension (on-line) and those that are constructed later (for example, during summarization or question answering). The experiments presented here contrast four types of pragmatic inferences to determine whether they are usually generated on-line.

Pragmatic inferences are a class of inferences that result from the application of world knowledge to information in a text. Knowledge structures typically employed in the production of pragmatic inferences (especially for narratives) are goal structures, planning mechanisms, and scripts (Schank & Abelson, 1977; Wilensky, 1978). A number of psychological experiments have demonstrated the use of individual schematic structures in producing pragmatic inferences (e.g. Bower, Black & Turner, 1979; Graesser, Gordon, & Sawyer, 1979; Smith & Collins, 1981), but have not shown the on-line operation of a combination of knowledge structures involved in pragmatic inference generation. In the two studies discussed here, we will present evidence that 1) knowledge-based inferences about goals, plans, and actions are made during reading and 2) inferences about consequent or associated states of the world are not made during reading. We will also give indirect evidence for on-line forward inferencing of plans from goals.

Knowledge of goals and plans organizes otherwise disconnected text elements, and thus it is important that they be inferred early in the comprehension process (Owens, Bower, & Black, 1979; Smith & Collins, 1981). Lower level inference types, like story actions, are used to fill in information specified by already active schemata (Bower, Black & Turner, 1979). State information, however, while potentially inferable, is not predicted to be generated as part of the comprehension process. There is considerable evidence that physical states that are antecedents or consequences of actions are not a central part of narrative representations (Black, 1980; Graesser, 1981; Kemper, 1982; Lehnert, Robertson, & Black, in press; Robertson, Lehnert, & Black, 1981). For example, when someone sits down in a restaurant, information about the position of tables and chairs is not typically accessed.

To test for on-line inferences of the specified types, we measured subjects' reading times for target sentences which required a pragmatic inference for coherence. In the first experiment we wrote sixteen short (17 line) stories each containing a goal, a plan for achieving that goal, a set of connected actions, and associated states. Eight of the stories were script based (e.g. going to a restaurant, going to the movies), the other eight were plan based (e.g. robbing a store, getting directions). Each story included inference-statements which explicitly described the goal, the plan, an act, and a state. Following each of these statements was an eight-syllable target-statement which required the preceding information to be inferred if it was not already present in memory. For example, sentence 2 when read alone requires that the goal stated in sentence 1 be inferred; sentence 3 requires an inference of the plan stated in sentence 2; sentence 6 may require an action inference (sentence 4) but not a state inference (sentence 5). (Our stories were not as compact as this example suggests.)

1. John was hungry.
2. John hurried to a restaurant.
3. John ordered the special dinner.
4. The waitress brought the food.
5. John had silverware.
6. John ate his meal in a hurry.

Target-statements (e.g. sentence 3) were presented with their associated inference-statements (e.g. sentence 2) either present or absent. Each subject received stories with goal, plan, act, and state inference-statements absent, but within any one story a subject had only one high level inference type (goal or plan) and one low level inference type (act or state) left out. Subjects read the stories one line at a time from a CRT screen and their reading times for the target-statements were recorded. It was assumed that inference generation would be evident in increased reading times for the target-statements in the inference-statement absent conditions. After the reading task and a short intervening task, the subjects were given a recognition test (1-7 scale) which included the inference-statements. High recognition ratings for absent inference-statements indicate the presence of the inferences in the final story representations.

Table 1 shows the mean reading times for target-statements and mean recognition ratings for inference-statements of the different types in the present and absent conditions. The analysis of reading times showed that goal and action targets took longer to read when their inference-statements were absent, but this was not the case for plans or states. Recognition results showed a specific interaction in which states were not falsely recognized when they were left out of the stories while the other types of inference-statements were. The reading time data and recognition data together support the view that goals and actions are inferred on-line whereas states are not. Plans proved problematic and were investigated further in a second experiment.
                   Target RT             Inference Recognition
Type of        Inference  Inference      Inference  Inference
Inference       Absent     Present        Absent     Present

Goal          * 1.660      1.550          4.89       5.81
Plan            1.626      1.601          4.95       6.09
State           1.538      1.487        * 3.62       5.82
Act           * 1.595      1.448          4.75       5.06

Table 1. Mean reading times (sec.) and recognition ratings for the different inference types.

Though the reading time difference for plans was not significant in the first experiment, the high recognition rating for absent plans suggests that they were inferred at some point. A closer look at the materials revealed a possible explanation: knowledge of the goals in stories where the plan inference-statements were left out may have allowed subjects to infer the plans before their target-statements were read. For example, knowledge of the goal "John was hungry," may lead to a prototypical plan expectation, i.e. "going to a restaurant." If a prototypical plan is inferred when a goal is read, the presence or absence of the plan inference-statement would not have made any difference.

In a second experiment, prototypical plans in our materials were changed to less typical plans to minimize forward inferencing from the goals. In addition, some story titles were changed to decrease the chances of inferring a goal prior to reading the goal target-statements. Also, action inferences were not included in the second experiment since this effect had already been clearly demonstrated.

The results of the modified experiment are shown in Table 2. The reading time differences for goal and plan inferences increased and plans now became significant. We again failed to find evidence for on-line state inferences. The recognition data remained consistent with these results, showing a high false alarm rate for goals and plans, but not for states.

                   Target RT             Inference Recognition
Type of        Inference  Inference      Inference  Inference
Inference       Absent     Present        Absent     Present

Goal          * 1.764      1.613          5.28       5.94
Plan          * 1.720      1.626          5.69       6.27
State           1.536      1.490        * 3.97       5.56

Table 2. Mean reading times (sec.) and recognition ratings for the different inference types.

Taken together, these experiments support the view that some pragmatic inferences, specifically goals, plans, and actions, are made during reading while others, specifically low level states, are not. It is especially important to note that high level inferences about goals and plans are made on-line. This result is congruent with models of language comprehension that incorporate strong top down uses of pragmatic knowledge during understanding. Active goal and plan schemata serve during reading to organize otherwise disconnected concepts in the text. We also obtained indirect evidence for on-line forward inferencing of prototypical plans from goals since we were only able to demonstrate that plans were inferred in a backward manner from plan inference-statements when they were non-prototypical of an active goal.

In terms of low level actions, the results support the view that script and plan completion inferences (remember that we had both script-based and plan-based stories) found in the representation after reading are not reconstructed at test time, but are built during reading. On the other hand, there was no evidence that inferences about states of the world occur during comprehension, even though we know that they are available after comprehension and even during comprehension in response to question probes (Graesser, 1981). Of course, some types of states may be very important and reliably inferred in some texts (Owens, Bower, & Black, 1979); however, the theoretical claim is that low level states in general are inferred on-line less often than the other types of inferences studied.

This "fine tuning" of data about the types of inferences made on-line provides important constraints on inference models. Since pragmatic inferences are probable rather than necessary, and since there is so much inferential material available at any given time from world knowledge, direct measures are needed to tell when inferences are made and which types are made. Although most models of language comprehension include an inferencing component, it is important to examine how different classes of knowledge are differentially utilized by the comprehension process.

We are grateful to Arthur Graesser for sponsorship and to Brian Reiser for comments on this paper. This research was supported by grants from the Sloan Foundation and Systems Development Foundation.

References

Black, J. B. Memory for state and action information in narratives. Twenty-first Annual Meeting of the Psychonomic Society, St. Louis, Missouri, 1980.

Bower, G. H., Black, J. B., & Turner, T. J. Scripts in memory for text. Cognitive Psychology, 1979, 11, 177-220.

Graesser, A. C. Prose Comprehension Beyond the Word. New York: Springer-Verlag New York, 1981.

Graesser, A. C., Gordon, S. E., & Sawyer, J. D. Memory for typical and atypical actions in scripted activities: Test of a script pointer + tag hypothesis. Journal of Verbal Learning and Verbal Behavior, 1979, 18,

Kemper, S. Filling in the missing links. Journal of Verbal Learning and Verbal Behavior, 1982, 21, 90-107.

Lehnert, W. G., Robertson, S. P., & Black, J. B. Memory interactions during question answering. In H. Mandel, N. L. Stein, & T. Trabasso (Eds.) Learning and comprehension of text. Hillsdale, N.J.: Ablex, in press.

Owens, J., Bower, G. H., & Black, J. B. The "soap opera" effect in story recall. Memory and Cognition, 1979, 7, 185-191.

Robertson, S. P., Lehnert, W. G., & Black, J. B. Alterations in memory for text by leading questions. Paper presented at the 1982 meeting of the American Educational Research Association, New York.

Schank, R. C., & Abelson, R. P. Scripts, plans, goals, and understanding. Hillsdale, N.J.: Erlbaum, 1977.

Wilensky, R. Why John married Mary: Understanding stories involving recurring goals. Cognitive Science, 1978, 2, 235-268.
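
As a reading aid for Tables 1 and 2, the following minimal sketch tabulates the absent-minus-present reading time differences that the discussion relies on. The reading times are copied from the two tables; the names EXPERIMENT_1, EXPERIMENT_2, and absent_minus_present are hypothetical and introduced only for illustration. The sketch is purely descriptive arithmetic and does not reproduce the authors' significance analysis.

```python
# Mean target reading times in seconds, copied from Tables 1 and 2.
# Each entry is (inference-statement absent, inference-statement present).
# The dictionary and function names are illustrative assumptions.
EXPERIMENT_1 = {
    "Goal": (1.660, 1.550),
    "Plan": (1.626, 1.601),
    "State": (1.538, 1.487),
    "Act": (1.595, 1.448),
}
EXPERIMENT_2 = {
    "Goal": (1.764, 1.613),
    "Plan": (1.720, 1.626),
    "State": (1.536, 1.490),
}


def absent_minus_present(table):
    """Return the reading time increase (sec.) when the inference-statement is absent."""
    return {
        inference_type: round(absent - present, 3)
        for inference_type, (absent, present) in table.items()
    }


if __name__ == "__main__":
    for label, table in (("Experiment 1", EXPERIMENT_1), ("Experiment 2", EXPERIMENT_2)):
        print(label)
        for inference_type, diff in absent_minus_present(table).items():
            print(f"  {inference_type:<6} {diff:.3f}")
```

On these numbers the plan difference grows from 0.025 sec. in the first experiment to 0.094 sec. in the second, which matches the direction of the change reported in the text.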

Generation of Useful Problem Representations in a
Semantically Rich Domain: The Example of Physics
Joan I. Heller and F. Reif
University of California, Berkeley

The initial representation of a problem can crucially determine whether the subsequent search for its solution is easy, difficult, or even impossible. However, the processes used to generate initial problem representations, particularly in semantically rich domains, have been studied less extensively than those used for search. Accordingly, the study reported in this paper has aimed to formulate and test a model specifying how human problem solvers can generate effective initial descriptions of problems in a realistically complex scientific domain.

The preceding goal, which is prescriptive, is more general than one concerned with naturalistic studies of actual experts (Chi, Feltovich, & Glaser, 1981; Larkin, McDermott, Simon, & Simon, 1980). In particular, it focuses interest on procedures for generating good problem representations, without necessarily trying to simulate the behavior of experts and without making the assumption that experts behave optimally. From this general point of view, models of good problem description may thus be suggested by purely theoretical analyses as well as by observations of experts. (Indeed, protocol observations of experts reveal relatively little about the processes used to generate initial problem representations since these processes are usually carried out rapidly and almost automatically on the basis of much tacit knowledge.)

A prescriptive point of view, transcending naturalistic studies of expert performance, is also centrally important for attempts to improve human performance or for educational applications. Indeed, in instructional applications, students can not merely be taught to mimic expert performance which often relies heavily on the recognition of patterns acquired as a result of years of experience.

Our prescriptive interest has been specifically focused on human performance in generating effective problem descriptions. From a theoretical point of view, this emphasis allows us to presuppose complex human capabilities (such as natural-language understanding and pattern-recognition skills) while focusing attention on the more sophisticated cognitive skills needed to generate good problem representations. Furthermore, our interest has been in developing experimental approaches which (unlike some forms of computer simulation) allow direct validation of models of good human performance in problem solving tasks.

We chose to study the generation of problem descriptions in the particular domain of physics (especially within the subfield of mechanics) because this is a realistically complex domain representative of other quantitative sciences. On the other hand, this domain is sufficiently simple and well-defined that the generation of problem descriptions can be specified and studied in some detail.

Model of Problem Description

Our aim was to formulate a theoretical model specifying how a human problem solver can generate, for any problem in a particular scientific domain, a useful initial problem description facilitating the subsequent solution of the problem. This model decomposes the description process into two succes-

information about the specified situation and problem goal, introduces convenient symbolism, etc. Since the generation of this basic description is relatively straightforward, we shall not discuss it further here.

The next stage of the description procedure is more complex and involves the generation of a "theoretical description" which deliberately redescribes the problem in terms of special concepts provided by the knowledge base for the relevant domain. All the principles in the knowledge base, which are expressed in terms of these special concepts, become thus readily accessible to facilitate the subsequent solution of the problem.

The generation of the theoretical problem description is based on the following considerations. The knowledge base about any domain contains declarative knowledge specifying the particular entities of interest in this domain, the special concepts useful for describing these entities, and principles specifying relationships between these concepts. For example, in the scientific domain of mechanics, the entities of interest are particles or more complex systems consisting of such particles. The special descriptive concepts are special concepts used to describe motion (e.g., "position", "velocity", "acceleration") and special concepts used to describe the interaction between particles (e.g., "force", "potential energy",...). The principles specifying relations between these concepts are "interaction laws" (which specify how the force on one particle by another is related to the properties and positions of these particles) and "motion principles" (which specify how temporal changes of concepts describing motion are related to concepts describing interaction).

The preceding kinds of declarative knowledge in the knowledge base about a particular domain provide the basis for explicit "description rules" that specify procedures for generating a theoretical description of any situation in this domain. In particular, these description rules specify what particular kinds of entities should be described, what special concepts should be used to describe them, what properties of these concepts should be incorporated in the description, and what checks should be made to ensure that the resulting description is consistent with the principles in the knowledge base.

For example, our model for generating a theoretical description in the particular scientific domain of mechanics contains explicit rules specifying that attention is to be focused on particles or certain systems of particles (e.g., strings, solid objects, ...). The motion of each such particle is then to be described by a diagram indicating available information about its position, its velocity, and its acceleration. Similarly, the interaction of each such particle is to be described by a diagram indicating available information about all forces on this particle by other particles (with an explicit algorithm specifying how all these forces are to be identified and enumerated). Finally, the resulting description is to be checked by assessing its consistency with known motion principles (e.g., by checking that the acceleration of any particle has the same
firstto stage
and uses
des- by
as the
is expected
to following
to initial
pro- 77
(1) The resulting descriptions should be considerably more explicit than those commonly generated by actual experts. (2) Strict adherence to the description procedure should avoid most of the errors commonly committed by novices (e.g., omitting forces or introducing non-existent extraneous forces). (3) The description procedure should lead to problem reformulations which are more readily interpretable (e.g., questions about slack strings or touching objects are automatically reinterpreted as questions about forces). (4) The resulting theoretical problem descriptions should substantially facilitate the subsequent solutions of these problems.

Experimental Methods and Results

Our experimental approach for testing a prescriptive theoretical model of human performance has used the following paradigm: Design carefully controlled experimental conditions to induce individual human subjects to act in accordance with the model; then observe whether the resulting performance is effective in the predicted ways.

To implement this paradigm, we have used "external-control experiments" of the following kind. We first design a program of step-by-step directions, and associated knowledge, whereby a human subject can be guided to act in accordance with the model (e.g., directions which implement the steps of the specified description procedure). These directions are problem-independent and at an appropriate level of detail to be reliably interpretable by the subject. In the actual experiments an individual human subject is then induced to carry out a task (e.g., the description and subsequent solution of a problem) by executing the sequentially presented directions of the program implementing the model. In this process the subject is asked to talk out loud about his or her thought processes. The resulting protocol, consisting of the subject's transcribed verbal statements and written work, can then be analyzed in detail.

Figure 1 shows the experimental results obtained by such external-control experiments designed to test the proposed model for generating effective initial descriptions of mechanics problems. Each subject worked on three problems. Figure 1 shows the performance of these subjects in generating good descriptions of motions and of forces, as well as subsequently generating solutions with correct equations and correct answers.

The following are the main results obtained in these experiments: (1) The proposed model for generating initial problem descriptions is sufficient to lead subjects to generate explicit descriptions that are complete and entirely correct. In turn, these descriptions greatly facilitate the subsequent problem solutions which are then almost flawless. (2) Although subjects in these experiments possess a good knowledge of basic physics concepts and principles, a knowledge sufficient to implement the individual directions contained in the model, this knowledge is not sufficient to lead to good descriptions. These results are apparent from the much poorer performance of subjects in a comparison group working without external control of the model. (3) The main features of the model are, in fact, necessary for good performance. These results follow from experiments where subjects worked under external control of a modified model that omits certain features of the proposed model (e.g., that provides a direction to enumerate all forces, but does not provide more detailed [...] specifying how to enumerate them). (4) [...]

[Figure 1: mean number correct (0 to 3; 0 to 100%) for motion descriptions, force descriptions, equations, and answers, plotted for three groups: M, guided by model; M*, guided by modified model; C, comparison (no guidance).]

Figure 1. Results of external-control experiments.

Conclusions and Implications

The work briefly outlined in the preceding paragraphs leads to the following main conclusions.

The knowledge base for any scientific domain implies guidelines specifying how to describe effectively any situation encountered in this domain. These guidelines can be expressed in terms of explicit rules prescribing how to generate a useful initial description of any problem in the domain.

Prescriptive models of effective human performance can be usefully tested by external-control experiments in which individual human subjects are deliberately induced to act in accordance with a model and the resulting performance is then observed in detail.

The work described in the preceding paragraphs was specifically undertaken to formulate a model for generating effective initial descriptions of problems in the particular domain of mechanics. External-control experiments show that this model, when implemented by human subjects, is very successful in leading to good initial problem descriptions that facilitate the subsequent solutions of these problems.

It should be noted that these experiments demonstrate the effectiveness of the specified description rules implemented by human subjects, but were not designed to teach description skills. (Indeed, such teaching would require that control knowledge, explicitly external in these experiments, be internalized by the subjects and made habitual.) However, such a well-validated model for generating effective initial problem descriptions can be used as a basis of explicit instructional methods to teach students effective problem-description skills and thereby enhance their problem-solving abilities.

REFERENCES

Chi, M.T.H., Feltovich, P.J., & Glaser, R., Categorization and representation of physics problems by experts and novices. Cognitive Science, 1981, 5, 121-152.

Larkin, J.H., McDermott, J., Simon, D.P., & Simon, H.A., Models of competence in solving physics problems. Cognitive Science, 1980, 4, 317-345.

John Clement
Physics Department
University of Massachusetts
Amherst, Mass. 01003
Spontaneous analogies have been observed to play a significant role in the problem solutions of scientifically trained subjects [1,2]. In some cases analogies can even lead to the construction of a new mental model for understanding a problem domain. This paper describes a number of different analogical reasoning patterns that have been observed in thinking aloud protocols from expert problem solvers. The purpose of the present study is to identify, classify, and label the critical subprocesses involved in such analogical solutions. In this study each of ten subjects were given a number of problems, including the following one:

Spring Coils Problem

A weight is hung on a spring. The original spring is replaced with a spring made of the same kind of wire, with the same number of coils, but with coils that are twice as wide in diameter. Will the spring stretch from its natural length, more, less, or the same amount under the same weight? (Assume the mass of the spring is negligible compared to the mass of the weight.) Why do you think so?

Subjects were advanced doctoral students and professors in technical fields who had reputations for being creative problem solvers. Seven of the ten subjects generated spontaneous analogies in solving this problem. A spontaneous analogy occurs when the subject, without being prompted, shifts to consider a situation B which differs in a significant way from the original problem situation A, and then tries to apply findings from B to A. In solutions by analogy the two contexts being compared are often perceptually different but they are still seen to be functionally or structurally similar in some way. For example, five subjects attempted to relate the problem to the analogy of a bending rod, as in the transcript excerpt below taken from video tape.

S1: (Draws bending rod in drawing G2-B of fig.2.) My intuition about that [the rod] is that if you.. doubled the length and hung some weight on it, that.. it, would bend considerably further... it would seem that that means that um, in the original problem, the spring in picture 2 [the wider spring] is going to hang

Here S1 generates an analogy by drawing the picture of an analogous problem involving bending rods instead of stretching springs. This analogy has in fact led him to the correct answer, and provides a plausible but only partial justification for it.

[Fig. 2: analogical reasoning patterns observed in expert problem solving.]

Analysis of more complex expert protocols however, makes it apparent that analogical reasoning is not a simple, one-step process, but involves a number of different processes, shown below.

(P1) Generating the Analogy. Given the original conception A of an incompletely understood situation, the analogous conception, B, is generated, or "comes to mind";
(P2) Confirming the Analogy Relation. The analogy relation between A and B must be "confirmed";
(P3) Comprehending the Analogous Case. Conception B must become well understood, or at least predictive;
(P4) Transferring Findings. The subject transfers conclusions or methods from B back to A.

Table 1

The last three processes can occur in any order. Analogies are often proposed tentatively, and processes (P2) and (P3) especially, can be quite time consuming. We have also been somewhat surprised to find that there appear to be not one, but several ways of carrying out each of the above processes. Some of the most important of these sub-processes are shown in fig.2. The figure provides a basic typology of analogical reasoning patterns that have been observed across different subjects. This paper gives an example and brief explanation of each

PROCESSES
Associative leaps. The subject using an associative leap jumps to an analogous situation that differs in many ways from the original problem. A second subject, S2, generated evidence for several associative leaps in the spring problem when he said: "I feel as though I'm reasoning in circles and I think I'll make a deliberate effort to break out of the circle rubber bands, molecules, polyesters.." apparently attempting to link the problem to other situations he knew more about. A third subject, S3, compared the wide and narrow springs to two blocks of foam rubber, one made with large air bubbles and one made with small air bubbles in the foam. He had a strong intuition that the foam with large air bubbles would be easier to compress, and this added some support to his conjecture that the wide spring would stretch more. We hypothesize that an associative leap takes place when an established conceptual framework for situation B in long term memory is activated by an association to some aspect of the original situation A. Evidence for an associative leap occurs when the subject shifts to consider a new situation B that is obviously familiar to him or refers to "being reminded of" or "recalling" B.

Generative transformations. This method of generating an analogy occurs when the subject modifies the original problem rather than recalling a different analogous situation from memory. In other words, the subject transforms the problem by changing an aspect of it which was previously assumed to be fixed. For example, S2 refers to the rod as an "unwound spring". In this case the unwinding of the spring is considered a transformation because the subject is modifying a feature of the spring (its shape) that would ordinarily be held fixed in the problem.

It is hypothesized that a generative transformation occurs when the subject focuses on an internal representation of the existing problem situation A in working memory and changes an aspect of it to create a new but closely related situation B. Thus a generative transformation usually leads to the construction of a new problem situation B rather than activating an already constructed framework in long term memory.

This subject also generated another analogy via a transformation below while thinking about moving the weight along the spring wire:

S2: Hmmm, what if I imagined moving the weight along the spring? Now what if I recoiled the spring and made the spring twice as long...instead of twice as wide? seems to me pretty clear that the spring that's twice as long is going to stretch more.

The analogy to the thought experiment of comparing springs of different lengths suggests to him that a wider spring may stretch more than a narrow spring. Notice that the analogy was generated from the rather playful transformation of sliding the weight up and down along the spring wire.

Evidence for a third method, generating an analogy via an abstract principle, has been observed on occasion, but only infrequently [1,2].

ANALOGY EVALUATION PROCESSES

Another finding that has surprised us is the fact that rather than simply generating a single analogy, some subjects generate chains of several analogies. Two types of chains are shown as processes E1 and E2 in fig.2. These are used to evaluate analogies. Processes used to critique and evaluate analogies are at least as important in expert problem solving as processes used to generate them.

Bridging analogies. Determining a match between key relationships in cases A and B is the first and most obvious method for confirming an analogy relation [4,5,6]. However, another interesting process in the form of a "bridging analogy" may also be used. For example, S2 was concerned about the apparent lack of a match between the non-constant slope in the bending rod and the constant slope of a stretched spring. In order to evaluate the bending rod analogy, he constructed the intermediate, bridging example of a spring with square coils as shown in drawing E1-C of fig.2. This allowed him to recognize that restoring forces in the spring come from twisting in the wire as well as bending, a major breakthrough in his solution which corresponds to the way in which engineering specialists view springs. His discussion of the square spring is evidence for a cognitive bridging analogy, C, which helps him decide whether conceptual frameworks A and B are truly analogous. In this case the square spring analogy eventually acquired the role of a mental model which gave him a new understanding of how springs work.

In a question about whether one can exert a more effective force on a wheel at the top or at the axle (in pushing on the wheel of a covered wagon, for example) several subjects compared the wheel to a lever hinged on the ground (fig.3B). Pushing higher up on the lever would allow it to move a larger weight, they reasoned. Another example of a bridge, which helped one subject to confirm the appropriateness of this analogy is the spoked wheel without a rim shown in fig.3C. Although physicists usually analyze the wheel directly in terms of torques, mathematicians often do not. The reader may be interested in conjecturing about how one mathematician, S1, solved this problem via an analogy to a pulley.

Extension analogies. The diagram for process E2 in fig.2 shows an extension analogy proposed by S1 in the form of two parallel pipes. S1 was hoping to predict whether the radius/stretch relationship in the spring was linear or quadratic or cubic, and his understanding of the bending rod analogy was not sufficient to help him. So he generated a further analogy to the bending rod. In this analogy two pipes are fixed at the left side and held together in such a way that when the weight is applied to the right side, the upper pipe is stretched and the lower pipe is compressed. His analysis of this thought experiment was part of an attempt to model the bending rod in more detail and determine its length/deflection relationship so that this information could in turn be used in analyzing the spring. In such an extension analogy, a second analogous case is used to understand the first analogous case. Thus analogies can be used recursively to understand and evaluate a previous analogous case.

Extreme cases. Aiding in understanding an analogous case is also one of the uses of extreme
cases. For example, S2 generated the extreme case of a very short rod in order to confirm his prior prediction that a short rod would bend less than a long rod (process E3 in fig.2). Other methods of understanding an analogous case are to use a specific fact recalled from memory, a physical intuition, or an analysis in terms of abstract principles [2].

SUMMARY

Fig.2 illustrates several alternative subprocesses for achieving processes P1, P2, and P3 in Table 1. Together, these subprocesses constitute a collection of intuitive heuristics used by experts in solving problems via analogy. Few of these subprocesses are described by subjects explicitly as they occur (they do not have names for them.) Rather, they must be inferred from patterns in the content of the subject's investigations. Reasoning patterns G1 and G2 in fig.2 are analogy generation patterns. Pattern E1, the bridging analogy, is a method for evaluating an analogy relation. Patterns E2, the extension analogy, and E3, extreme case analysis, are methods for evaluating and improving one's understanding of the analogous case. These reasoning patterns form a set of non-deductive problem solving strategies which: (1) are quite different from traditional problem solving procedures; (2) are associated with imagery reports; and (3) are capable of generating new insights and recognitions of previously undiscovered causal factors in a problem solution.

Various "compound solutions" combining two or more of the basic processes shown in fig.2 have also been observed. Our current hypothesis is that most observable chains of reasoning using spontaneous analogies are describable as recursive combinations of these basic patterns.

In the cases of the square spring and the parallel pipes, the novelty of these cases argues that they were at least in part invented by the subject rather than recalled directly from memory. Thus, the analogies observed do not always consist of familiar cases recalled from long term memory; the analogies can also consist of invented cases constructed in working memory. Furthermore, in the square spring and parallel pipes cases, the analogy is used as a mental model which allows the subject to understand the problem situation in a new way. This type of mental model construction appears to be important in the development of creative problem solutions and may play an important role in the development of new explanatory models in science [6].

Research reported in this paper was supported by NSF Award No. SED 8016567.

REFERENCES

[1] Clement, J., Analogy Generation in Scientific Problem Solving, Proceedings of the Third Annual Meeting of the Cognitive Science Society, Berkeley, August, 1981.

[2] Clement, J., Spontaneous Analogies in Problem Solving: Part I - The Progressive Construction of Mental Models. Paper presented at AERA annual meeting, New York City, March, 1982.

[3] Clement, J., Spontaneous Analogies in Problem Solving: Part II - Generation Mechanisms, Simulation, Extreme Cases, and Model Construction, working paper, Physics Department, University of Massachusetts, Amherst, 1982.

[4] Gentner, D., The Structure of Analogical Models in Science, technical report, Bolt, Beranek and Newman, Inc., Cambridge, MA, 1980.

[5] Gick, M. and Holyoak, K.J., Analogical Problem Solving, Cognitive Psychology, 12, 306-355, 1980.

[6] Hesse, M., Models and Analogies in Science. University of Notre Dame Press, Notre Dame, 1966.

[7] Collins, A., Fragments of a Theory of Plausible Reasoning, in Waltz, D., Theoretical Issues in Natural Language Processing-2. Urbana-Champaign: University of Illinois, 1978.

RABBIT: Cognitive Science in Interface Design


Michael D. Williams
Frederich N. Tou
Richard E. Fikes
Austin Henderson
Thomas Malone

Cognitive and Instructional Sciences Group

XEROX Palo Alto Research Center
Palo Alto, California

Abstract

A new kind of user interface for information retrieval has been designed and implemented to aid users in formulating a query. The system, called RABBIT, relies upon a new paradigm for retrieval by reformulation, based on a psychological theory of human remembering. The paradigm actually evolved from an explicit attempt to design a 'natural' interface which imitated human retrieval processes.

To make a query in RABBIT, the user interactively refines partial descriptions of his target item(s) by criticizing successive example (and counterexample) instances that satisfy the current partial description. Instances from the database are presented to the user from a perspective inferred from the user's query description and the structure of the knowledge base. Among other things, this constructed perspective reminds users of likely terms to use in their descriptions, enhances their understanding of the meaning of given terms, and prevents them from creating certain classes of semantically improper query descriptions. RABBIT particularly facilitates users who approach a database with only a vague idea of what it is that they want and who thus need to be guided in the (re)formulation of their queries. RABBIT is also of substantial value to casual users who have limited knowledge of a given database or who must deal with a multitude of databases.

1.

One way to test a theory is to try to do something useful with it. We have taken a cognitive theory of human remembering together with some artificial intelligence ideas about knowledge representation and used it to design a new paradigm for database retrieval interfaces for casual users. The paradigm is called retrieval by reformulation. A small experimental system based on this new paradigm has been implemented in the Smalltalk programming language [Ingalls, 1978] using KloneTalk [Fikes, 1981] on the Xerox Dolphin and Dorado personal computers.

Part of the motivation for designing a new kind of database interface was the unsuitability of existing database interfaces for casual users. Some database interfaces (e.g., SQUARE [Boyce et al, 1975] and SQL [Chamberlin et al, 1976]) require many hours of instruction to learn; others have a syntax which users find difficult to use and understand (e.g., the boolean expressions of DIALOG [Lockheed, 1979]). Interfaces based on the relational data model [Codd, 1970] usually require the user to know in advance which tables and attributes he will be needing, while users of network databases (such as ZOG [Robertson et al., 1981]) frequently get lost during the course of their search.

To help solve these problems we looked for inspiration to theories of how people retrieve information from their own memory. We believe this approach is promising for two primary reasons: (1) to the extent that the interface between a person and his external memory is like the interface between the person and his internal memory the external memory may be easier and more natural to use, and (2) to the extent that human memory systems embody a 'solution' to the problems of retrieval from large heterogeneous databases, they may provide useful insights about how to design similar artificial systems.

We began our design process by conjecturing an interface which permitted descriptive retrieval. The basic tenet of descriptive retrieval is that people retrieve information from (their own) memory by iteratively constructing partial descriptions of the desired target item [Bobrow and Norman, 1975; Norman and Bobrow, 1979; Williams and Hollan, 1981]. The problem was that our conjectured system appeared to give us little more than the traditional boolean expression schemes such as DIALOG. We simply replaced the technical term 'keyword' with the term 'descriptor.' This led us to a re-examination of the problems inherent in boolean expression interfaces.

Upon consideration we conjectured that there were three major sources of difficulty for casual users of interfaces based upon boolean expressions of keywords: (1) the user has incomplete knowledge about the descriptive terms needed to create a query (e.g. what car colors does the database know about: red, crimson, rose, mauve?), (2) the user doesn't know what kind of attributes of the item(s) he is seeking the database recognizes (e.g. does the database even have an attribute for car color?), and (3) many users find the syntax of complex boolean expressions difficult to understand.

Yet, if people actually recall information by descriptive retrieval then they must face the same problems; they must have some trick to get by those problems. We found such a trick in retrieval by instantiation. Retrieval by instantiation postulates that the information retrieved at each iteration of the retrieval process is often in the form of an instantiation, i.e., an example item suggested (e.g., analogically or metaphorically) by the partial description [Williams, 1981]. The common consequence of such an instantiation is that one is 'reminded' of something similar to the original item [Schank, 1980; Kolodner, 1980; Bower, Turner, and Black, 1979]. We conjecture that this reminding serves to counter all three of the problems noted for boolean expression schemes. The instantiations provide a template for describing the target item, access to the descriptive terms, and can provide the basis for an incremental reconstruction of the target item that avoids much of the complexity inherent in highly structured
2. Retrieval by Reformulation

The basic principle underlying RABBIT is a new paradigm, retrieval by reformulation, for information retrieval elaborated from the notion of retrieval by instantiation. The user makes a query by first constructing a partial description of the item(s) in the database for which he is searching. RABBIT then provides a description of an example instance from the database which matches the user's partial description. Since it is unlikely that the first instance will be exactly what the user is looking for, the user can then select any of the attributes shown in the example and incorporate those descriptors, or variations of them, into his partial description, thus reformulating his initial query. At any time the user can request a new example instance, one which matches the latest version of his (partial) description, and then use the descriptors of that new instance to refine his query description still further.

Figure 1 shows RABBIT in the midst of a retrieval interaction. The interface consists of four primary window panes. The 'Description' pane specifies an implicitly defined boolean expression that appears to the user as a partial description of the item(s) he is seeking. The 'Example' pane contains an example item that matches the partial description as of the last user initiated retrieval cycle from the RABBIT defined perspective. More precisely, it contains a description, called the image, of an instance from some well-defined perspective (e.g., "STAR 8011 computer" can be viewed from the perspectives of "a manufactured product," "a computer," "an electronic device," "a piece of office equipment," and "a piece of stock in a store."). The 'Matching Examples' pane lists instances which satisfy the partial description as of the last retrieval cycle. The 'Previous Description' pane contains the description used on the last retrieval cycle which determines the perspective for presentation of the example and the list of matching examples. The example pane command pop-up menu is also displayed.

The example instance mentioned above is a central element of the interface. It serves several purposes: it functions as a template, it permits access to additional descriptors, it provides semantic resolution of potentially ambiguous terms, and it frequently serves as a counterexample.

The example instance is a template in the sense that its presentation provides a pattern for making a query using the descriptors in the instance's image. It permits access to new descriptive terms through the alternatives and describe commands elaborated below.

It also provides semantic resolution in that the context of a term such as the role name 'manufacturer' establishes and refines the term's meaning. The role name 'manufacturer:' could refer to a person or a nation or a corporation. The statement 'manufacturer: Xerox' in the context of a description of a computer product helps resolve a host of potential meanings.

The example instance is also a counterexample to the user's intentions whenever it is not exactly what the user is looking for. Rather than simply permitting the user to express his displeasure with the counterexample and have RABBIT try to guess what is wrong with it, the system tries to encourage the user to articulate what is wrong with the instance presented. The counterexample's simple presence serves to remind the user that his query description is incomplete or wrong and, in addition, points out the particular parts of his description that need correction or modification.

Finally, since the amount of information known about the retrieved instance could be considerable, the information actually presented in the image is limited to be only that information which can be inferred to be relevant based on the query description the user has given so far. (E.g., information concerning the dinner menu or house specialty of a given restaurant would be available from the perspective of "a place which serves food" but not from the perspective of "a business." So if the user had begun his query with the descriptor 'Business', then the image of the retrieved instance, even if it is a restaurant, would not, initially, include information about its dinner menu.)

The current implementation of RABBIT supports a small set of 5 basic operations for creating a query description given the descriptors provided in the image of the example instance. These operations, shown in figure 1, are require and prohibit (which specify that the given descriptor is or is not to be a descriptor of the retrieved instance, respectively), alternatives (which presents the user with a popup menu of alternative descriptors to the given one), specialize (which shows the specializations of the given descriptor), and describe (which allows the user to examine a description of a given descriptor or to describe recursively what that descriptor should be). The describe command provides the user with the capability to build embedded descriptions, an example of which appears in figure 1 with the value of the attribute 'disk' being itself a description. [Tou, 1982] and [Tou, Williams, Fikes, Henderson, and Malone 1982] contain a more complete discussion of the paradigm of retrieval by reformulation and the user interface to RABBIT.

This paradigm of retrieval by reformulation, in effect, defines a form of interaction by which RABBIT can assist casual users in formulating queries. Much of the intelligence of RABBIT comes from control of this interaction by appealing to the conceptual structure of the database.

3. Perspectives

The KL-ONE epistemology for representing knowledge [Brachman, 1979] has had a major influence on the development of RABBIT. One of the main uses of KL-ONE is the implementation of perspectives. A perspective is simply a way of describing an event or item from a particular viewpoint [Bobrow and Norman, 1975; Bobrow and Winograd, 1977; Goldstein and Bobrow, 1980; Goldstein, 1980]. In RABBIT, a perspective specifies which descriptors are included in the image of any instance presented to the user. RABBIT perspectives are dynamic in that the perspective from which the user views the instances in the database changes depending on the current partial description and on where he is within the database.

There are two distinct mechanisms RABBIT uses to construct a perspective. First it filters the attributes to be presented to a user by including only attributes implicitly acknowledged by the user. Since the partial description is a representation of the user's intent to the computer, that description is a legitimate basis for determining what information to include in the image of the example instance. In RABBIT the attributes included in the image are exactly those that belong to the instance classes occurring in the partial description. For example, if one were to see the computer described in figure 1 retrieved under the partial description 'Product' (i.e. without the descriptor 'Computer') then only the attributes 'name', 'manufacturer', and 'cost' would be presented. Once the user refines the partial description to specify that he is seeking a computer, additional attributes (e.g. 'disk:,' 'cpu:,' ...) would appear.

A second mechanism for creating perspectives actually extends the perspective of any given instance beyond attributes directly held by the object. Note in figure 1 that because the user has created an embedded description about the disk of the computer sought, aspects of the disk that the user considers important (e.g. capacity) have been compressed into the image of the computer.
Perspectives serve four main functions in the RABBIT interface:

- controlling the type and amount of information
- facilitating the user's understanding of instances
- enforcing certain kinds of semantic consistency
- organizing and managing heterogeneous data.

4. Summary

This paper has briefly described the process of designing a novel type of database interface named RABBIT. RABBIT relies on a new paradigm for information retrieval, retrieval by reformulation, derived from a cognitive science theory of human remembering together with some artificial intelligence ideas about knowledge representation. The four main ideas underlying this paradigm are:

1) retrieval by constructed descriptions
2) interactive construction of queries
3) critique of example instances
4) dynamic perspectives.

The first three of these ideas had their origins in human psychology. The development of the fourth idea, dynamic perspectives, was motivated and influenced strongly by the KL-ONE knowledge representation language.

Cognitive Science has played a crucial role in the design of RABBIT. We take the tentative success of the design as an indication of the potential role of cognitive science in the design of human-computer interfaces.

Acknowledgements

A major portion of this work was carried out by the second author under the auspices of the MIT intern program at Xerox PARC. The authors would also like to acknowledge the original stimulus for this work stemming from an exciting conference on artificial intelligence and human-computer interfaces sponsored by the Army Research Institute. In particular, we would like Stan Halpern, Janet Kolodner, and Alan Badre to know a part of what came from their efforts in putting that conference together. We would also like to thank John Seely Brown, Tom Moran, Rick Cattell, Laura Gould, and Richard Burton for their patient discussions and guidance. Each contributed crucial pieces of the puzzle many of which we are still putting together. John Seely Brown's questions in particular guided our pursuit of the use of perspectives. Finally, we would like to thank the other members of the Cognitive and Instructional Sciences Group at Xerox PARC for their continuing support and critique throughout the development of RABBIT.

Bobrow, D.G., and Norman, D.A. "Some Principles of Memory Schemata," in D.G. Bobrow and A.M. Collins (Eds.), Representation and Understanding: Studies in Cognitive Science. New York: Academic Press, 1975.

Bobrow, D.G., and Winograd, T. "An Overview of KRL: A Knowledge Representation Language," Cognitive Science, 1, pp. 3-46, 1977.

Bower, G.H., Black, J.B., and Turner, T.J. Scripts in Text Comprehension and Memory, Cognitive Psychology, Vol 11, 177-220, 1979.

Boyce, R.F., Chamberlin, D.D., King, W.F., and Hammer, M.M. "Specifying Queries as Relational Expressions: The SQUARE Data Sublanguage," Communications of the ACM 18, 11 (Nov. 1975), pp. 621-628.

Brachman, R.J., Bobrow, R.J., Cohen, P.R., Klovstad, J.W., Webber, B.L., Woods, W.A. "Research in Natural Language Understanding: Annual Report, 1 September 1978 to 31 August 1979," BBN Report No. 4274. Cambridge, MA: Bolt Beranek and Newman Inc., August, 1979.

Chamberlin, D.D., Astrahan, M.M., Eswaran, K.P., Griffiths, P.P., Lorie, R.A., Mehl, J.W., Reisner, P., and Wade, B.W. "SEQUEL 2: A Unified Approach to Data Definition, Manipulation, and Control," IBM Journal of Research and Development 20 (Nov. 1976), pp. 560-575.

Codd, E.F. "A Relational Model of Data for Large Shared Data Bases," Communications of the ACM 13, 6 (June 1970), pp. 377-397.

Fikes, R. "Highlights from KloneTalk: Display-Based Editing and Browsing, Decompositions, Qua Concepts, and Active Role-Value Maps," Proceedings of the 1981 KL-ONE Workshop, Jackson, New Hampshire, October, 1981.

Goldstein, I.P. "PIE: A network-based personal information environment," Proceedings of the Office Semantics Workshop, Chatham, Mass., June, 1980.

Goldstein, I.P., & Bobrow, D. Descriptions for a programming environment. Proceedings of the First Annual National Conference on Artificial Intelligence, Stanford, CA, August, 1980.

Ingalls, D.H. "The Smalltalk-76 Programming System: Design and Implementation," Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, Tucson, AZ: January 1978, pp. 9-16.

Kolodner, J.L. Retrieval and Organization Strategies in Conceptual Memory: A Computer Model. Research Report #187, Department of Computer Science, Yale University, New Haven, CT, 1980.

Lockheed Information Systems. Guide to DIALOG Searching, Palo Alto, CA, 1979.

Norman, D.A., and Bobrow, D.G. "Descriptions: An Intermediate Stage in Memory Retrieval," Cognitive Psychology 11 (1979), pp. 107-123.

Robertson, G., McCracken, D., and Newell, A. "The ZOG Approach to Man-Machine Communication," International Journal of Man-Machine Studies (1981) 14, pp. 461-488.

Schank, R.C. Failure-driven memory. Cognition and Brain Theory, Vol. 1, 4, 41-60, 1980.

Tou, F. RABBIT: A novel approach to information retrieval, unpublished M.S. thesis, Massachusetts Institute of Technology, Cambridge, Mass., forthcoming.

Tou, F.N., Williams, M.D., Malone, T.W., Fikes, R.E., and Henderson, A. RABBIT: an Intelligent Interface. Xerox Technical Report, forthcoming, 1982.

Williams, M.D. "Instantiation: A Data Base Interface for the Novice User," Xerox Palo Alto Research Center Working Paper, 1981.

Williams, M.D., and Hollan, J.D. "The Process of Retrieval from Very Long Term Memory," Cognitive Science 5 (1981), pp. 87-119.

Zloof, M.M. "Query by example," in Proceedings of the National Computer Conference, AFIPS Press, Arlington, Va., May 1975, pp. 431-437.

[Figure 1: screen image of the RABBIT display, with 'Description', 'Example', 'Previous Description', and 'Matching Examples' panes.]

Figure 1. Example of RABBIT display.


Allan Collins
Dedre Gentner
Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, Massachusetts 02238

A core idea in the literature on mental models (Brown, Burton, & Zdybel, 1973; deKleer, 1977, 1979; Forbus, 1981; Hayes, 1978; Stevens & Collins, 1980) is the notion of mental simulation. In all these approaches mental simulation is accomplished by dividing a system into a set of states whose transition rules from state to state are known. Given the transition rules for each state, and the topology of connections between states, it is possible to run the system with different inputs to see what happens. This provides a kind of inferential power not possible with the static data structures implied in much of the literature on frames, scripts, and semantic networks (e.g., Collins & Loftus, 1975; Minsky, 1975; Quillian, 1968; Schank & Abelson, 1977).

The Metaphor Hypothesis

In this paper we propose a specific role for metaphor in constructing runnable mental models. It can be stated as follows: Metaphors map the set of transition rules from one domain (the base) into another domain (the target) so that it is possible to construct a mental model to run simulations in the target domain. This is a special case of Gentner's (1980, 1982) more general claim that metaphor is a mapping of structural relations from a base domain to a target domain.

We can illustrate the hypothesis by showing how three metaphors can be used to construct a runnable version of the microscopic model of evaporation discussed by Stevens and Collins (1980). Then, in the next section, we compare the model derived from these metaphors with the model one of our subjects used to reason about evaporation in an experiment where we asked subjects novel questions about evaporation processes.

The first metaphor states that water molecules (or air molecules) are like billiard balls bouncing around in space. The warmer the water is, the more velocity (or greater energy) the average molecule has. The same metaphor applies to the water and air molecules in the air mass above a body of water. This model is incomplete insofar as it includes no notion of the attractive forces between different molecules and the polarity of the charges on different sides of the molecule. But as a first approximation, it is a perfectly good model.

The second metaphor states that a molecule escaping from the water is like a rocket ship escaping from earth. That is to say whether or not it actually escapes is a function of its initial velocity and its angle. In this way the model builds in a rudimentary notion of the attractive forces between molecules, by likening the notion of escape from the attraction of the other water molecules to escape from gravity. However, to understand some aspects of evaporation, this gravity notion of attraction is not enough.

The third metaphor states that the molecules in the air mass over the water can be thought of as people inside a room. As more water molecules collect in the air mass, the room becomes more crowded with water and air molecules. The warmer the air mass, the larger the room. Thus, warm air masses are less dense than cold air masses. The boundary between the air and water is the entry into the room, and if everyone crowds along that border it is hard to get in. This crowded-room metaphor leads to many correct predictions, but is wrong in some fundamental ways. In fact, the space between molecules in a cool air mass never becomes crowded. Cool air masses hold less moisture because the water molecules in them tend to lose energy with each interaction. Then the attractive forces between water molecules tend to attract the molecules back to the water surface or to form raindrops or dew.

Now we want to show how these three metaphors enable a person to construct a runnable model of evaporation processes. We would argue that people usually know certain interaction rules of billiard balls such as those depicted in Figure 1. Velocity of each ball in the interaction is represented by a vector, and the transition rule of the interaction by the large arrow. Rule 1 shows that without collision, speed and direction are maintained. Rule 2 shows a head-on collision with a non-moving ball where momentum is transferred from one ball to the other. Rules 3 and 4 show how momentum is transferred as a moving ball strikes a non-moving ball at different angles. Rules 5 and 6 show typical interactions when both balls are moving. These rules summarize one's local knowledge about how billiard balls interact.

From these local interaction rules, one can derive certain global properties of how a container full of molecules will behave. That is we can construct an aggregate model of molecular interaction (Stevens & Steinberg, 1981) based on the mechanical model of billiard-ball
interaction. The most important properties of this aggregate model are that there is variability of speed and direction of the molecules. This produces randomness of motions of the molecules, with some going toward the surface, some not. There is elasticity of interaction so that energy can be transferred from molecule to molecule, but not lost. Finally, there is no change in direction or velocity without a collision. In our view, people can either imagine molecules moving in this aggregate fashion (like seeing dust particles moving in the sunlight) or by following a single molecule moving around and encountering other molecules according to the local rules shown in Figure 1.

[Figure 1: diagrams of six billiard-ball interaction rules.]

Figure 1. Some interaction rules for perfectly elastic billiard balls.

The rocket-ship metaphor gives a simple three state description of behavior of molecules near the surface. If they have any downward component of velocity, they do not escape. If they are headed straight up, there is some minimum initial velocity they need to escape. If they are headed up at an angle to the surface, the smaller the angle the greater the initial velocity they need to escape (because of the attraction of the surface over a larger part of the trajectory). This three state model summarizes what the rocket-ship metaphor implies about the effects of the water's surface.

The crowded-room metaphor, like the billiard-ball metaphor, leads to construction of an aggregate model at the microscopic level. The model has the following behavior. The warmer the air mass, the larger the room is. As water evaporates into the air mass, it fills up with molecules. Cold air masses take less time to fill up with molecules. When the air mass is filled, then no more water molecules can get in. If the air mass does not mix completely (depending on winds), then water molecules may accumulate in the air along the water's surface and no new molecules can get in, even though the air mass is not filled. If a crowded air mass is cooled, the water molecules may be squeezed out for lack of space. These behavioral properties reflect the way air masses actually behave, even though the model is essentially incorrect.

In an earlier paper (Stevens & Collins, 1980) we described the kind of inferential power that runnable models provide for answering novel questions about the world. In order to see how subjects use models, we conducted an experiment where we asked subjects to reason about such questions.

Experiment on Mental Models

Four subjects were asked eight questions about evaporation. They were asked to explain their reasoning on each question. All were reasonably intelligent, but were novices about evaporation processes. Our analysis will center on one subject, whose model of evaporation processes was very much like the model we constructed from the three metaphors, if not exactly the same model. His view includes notions of the energy needed for molecules to escape from a body of water and the difficulty of water molecules entering a cold air mass because of the higher density. Nowhere does he mention attractive forces between molecules, which suggests that this notion is not part of his model. He seems to share a common misconception that visible clouds (such as one sees coming out of a boiling kettle) are made up of water vapor rather than recondensed liquid water. This misconception forced him into several wrong explanations.

We will present the portions of his responses to three of the questions that illustrate his use of the mental model described above.

Q2: On a cold day you can see your breath. Why?

S: I think again this is function of the water content of your breath that you are breathing out. On a colder day it makes what would normally be an invisible gaseous expansion of your breath (whatever), it makes it more dense. The cold temperature causes the water molecules to be more dense and that in turn makes it visible relative to the surrounding gases or relative to what your breath would be on a warmer day, when you don't that cold effect causing the water content to be more . . .

Q4: Which will evaporate faster, a pan of water placed in the refrigerator or the same pan at room temperature? Why?
S: When I first read that question, my initial impression, that putting a pan of hot water in the refrigerator you suddenly have these clouds of vapor in it, threw me off for a second. I was thinking in terms of there is a lot evaporation. Well I guess, as I thought through it more, I was thinking that it was an indication of more evaporation, but it was just (let us say) the same evaporation. Immediately when you put it in anyway, it was more visible. Ahmm, as I think through it now, my belief is that it would evaporate less than the same pan left standing at room temperature and my reasoning there is that the air in the refrigerator is going to be relatively dense relative to the room temperature air, because at a colder temperature again its molecules are closer together (what not), and that in effect leaves less room to allow molecules from the hot water to join the air. ...

Q5: Does the evaporation affect water ... If so, in what way, and ...

S: I guess those water molecules that do leave the surface of the water are those that have the highest amounts of energy. I mean they can actually break free of the rest of the water molecules and go out into the air. Now if they have a, if they are the ones with the most energy, I guess generally heat is what will energize molecules, then that would lead me to believe that maybe, although it may not be measurable, maybe with sophisticated instruments it is, but maybe it would be measurable after your most energetic molecules have left the greater body of water. Those that remain are less energetic and therefore their temperature perhaps less.

The subject's first two answers manifest the crowded room model: The particles in cold air are crowded together, which acts to make one's breath more visible and to make it more difficult for water molecules from a hot pan to get in. The last answer manifests the rocket ship and billiard ball models: The particles move around and those that escape are the high energy particles, leaving the low energy particles behind and hence cooling the water.

These excerpts illustrate the underlying molecular model of evaporation that the subject had, and how he used it to find answers to novel questions. His model is close to, if not the same as, the model we constructed from the metaphors in the previous section. The hypothesis of the paper is that this subject's underlying model was constructed by pasting together his models of how objects behave. While he may not have drawn upon billiard balls, rocket ships, and crowded rooms, he must have drawn upon some such objects in order to create the model he was using. Based on this model, he was able to deal quite successfully with the questions, even though his model was incorrect in several ways.

Acknowledgments

This research was supported by the Personnel and Training Programs, Psychological Sciences Division, Office of Naval Research, under contract number N00014-79-C-0338, Contract Authority Identification Number NR 154-428. We thank Michael Williams for an engaging conversation that led to the metaphor hypothesis of this paper, and to Ken Forbus for his comments on a draft of the paper.

References

Brown, J. S., Burton, R. R., & Zdybel, F. A model-driven questioning-answering system for mixed-initiative computer-assisted instruction. IEEE Transactions on Systems, Man, and Cybernetics, 1973, 3, 248-257.
Collins, A., & Loftus, E. F. A spreading activation theory of semantic processing. Psychological Review, 1975, 82, 407-428.
de Kleer, J. Multiple representations of knowledge in a mechanics problem solver. Proceedings of the Fifth International Joint Conference on Artificial Intelligence. Cambridge, Mass.: MIT, 1977, 299-304.
de Kleer, J. The origin and resolution of ambiguities in causal arguments. Proceedings of the Sixth International Joint Conference on Artificial Intelligence. Tokyo, Japan: 1979, 197-203.
Forbus, K. D. A study of qualitative and geometric knowledge in reasoning about motion. Cambridge, Mass.: MIT AI Technical Report No. 615, 1981.
Gentner, D. Studies of metaphor and complex analogies: A structure-mapping theory. Paper presented at the A.P.A. Symposium on Metaphor as Process, Montreal, September 1980.
Gentner, D. Are scientific analogies metaphors? In D. S. Miall (Ed.), Metaphor: Problems and perspectives. Brighton, Sussex: Harvester Press, Ltd., 1982.
Hayes, P. J. Naive physics: Ontology for liquids. Unpublished manuscript, 1978.
Minsky, M. A framework for representing knowledge. In P. H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill, 1975.
Quillian, M. R. Semantic memory. In M. Minsky (Ed.), Semantic information processing. Cambridge, MA: The MIT Press, 1968.
Schank, R. C., & Abelson, R. P. Scripts, plans, goals and understanding. Hillsdale, N.J.: Erlbaum, 1977.
Stevens, A. L., & Collins, A. Multiple conceptual models of a complex system. In R. E. Snow, P. Federico, & W. E. Montague (Eds.), Aptitude, Learning, and Instruction (Vol. 2). Hillsdale, N.J.: Erlbaum, 1980.
Stevens, A. L., & Steinberg, C. A typology of explanations and its application to intelligent computer aided instruction (BBN Report No. 4626). Cambridge, MA: Bolt Beranek and Newman Inc., March 1981.

Bi-Directional Inference
Stuart Shapiro, Joao Martins and Donald McKay*
Department of Computer Science
State University of New York at Buffalo
4226 Ridge Lea Road, Amherst, NY 14226
*(current address: Research & Development Activity, Special Systems Division, Federal and Special Systems Group, PO Box 517, Paoli, PA 19301)

This work was supported in part by the National Science Foundation under Grants MCS78-02274 and MCS80-06314 and by the Instituto Nacional de Investigacao Cientifica (Portugal) under Grant No. 20536.

Abstract

Inference can be viewed as a search through a space of inference rules. Backward and forward inference differ in the direction of the search: backward inference searches from goals to ground assertions; forward inference searches from ground assertions to goals. This paper describes an inference procedure, called bi-directional inference, which limits the number of inference rules searched. Bi-directional inference results from the interaction between forward and backward inference and loosely corresponds to bi-directional search. We show through an example that, when used throughout a session of related tasks, bi-directional inference sets up a conversational context and prunes the search through the space of inference rules by ignoring rules which are not relevant to that context.

1. Introduction

Bi-directional inference (BDI) combines forward inference (FI) and backward inference (BI) to limit the search through a space of inference rules by establishing a context on the basis of an ongoing session. We use the term "bi-directional inference" because the resulting search loosely corresponds to bi-directional search (Kowalski, 72; Pohl, 71).

The benefits of BDI become clear during an extended session in which the user asks questions and adds assertions all of which are related. BDI sets up a conversational context and prunes the space of inference rules searched (either during BI or FI) by ignoring rules which are not relevant to the context.

In BDI there are two sets of inference frontiers, one growing from the assertions added in FI and the other growing during BI from the questions asked. Whenever two frontiers meet some answers are produced.

BDI has been implemented in SNIP, the SNePS Inference Package. We present examples of BDI and compare the results obtained using BDI with the results obtained using BI or FI only. Although SNIP has a much richer rule syntax than used in these examples (Shapiro, 79a, 79b) they suffice to illustrate BDI.

2. Basic notions of SNIP

SNIP relies on a declarative representation of inference rules (SNePS semantic network (Shapiro, 79a)). Every rule may be used both in FI and BI. When a rule is used, it is activated, remaining that way until explicitly de-activated by the user. The activated rules are assembled into an active connection graph (acg) (McKay and Shapiro, 81), a collection of MULTI processes (McKay and Shapiro, 80) which carry out the inference. The acg also stores all the results generated by the activated rules. If during some deduction SNIP needs some of the rules activated during a previous deduction, it uses their results directly instead of rederiving them. The acg that is built for one query or assertion is not discarded after the query has been answered or the assertion "fully" understood by making all possible inferences from it. Rules of the network remain active, allowing a dynamic context to be constructed. The dynamic context is the collection of rules which have been activated. In addition, the active rules are more prominent: when searching for inference rules to be used, if any previously activated rules are appropriate then only those rules will be considered and no other rules will be activated. Hence rules apparently irrelevant to the current dynamic context are ignored.

3. Backward Inference

We present an example of BI, explaining very briefly how acg's work. A complete explanation can be found in (McKay and Shapiro, 81).

Suppose that SNIP is being used as a database retrieval system for some company interested in recruiting computer science (CS) majors. The recruiting policies of the company are stored as rules in the database (Lines 1-4, Fig. 1). The company's database also contains a list of top schools and a list of the CS majors at different schools (Lines 5-10, Fig. 1).

∀(x,y)[Planning-to-visit(x) & CS-major-at(y,x) -> Good-prospect(y)]
∀(x,y)[Top-school(x) & CS-major-at(y,x) -> Good-prospect(y)]
∀(x)[Good-prospect(x) -> Send-literature-to(x)]
∀(x)[Good-prospect(x) & Graduating(x) -> Invite-for-interview(x)]

Figure 1
Initial database
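As an illustration of how the recruiting policies of Figure 1 might be written down as data, the following is a minimal Python sketch. It is not SNIP's representation (SNIP stores rules in a SNePS semantic network); the names Atom, Rule, RULES, FACTS, is_var and match are hypothetical, and the convention that lowercase terms are variables is an assumption made only for this example. The facts listed are limited to ones the text itself mentions.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    predicate: str
    args: tuple  # assumption: lowercase terms are variables, capitalized terms are constants


class Rule(NamedTuple):
    antecedents: tuple
    consequent: Atom


def is_var(term):
    """A term is treated as a variable when it starts with a lowercase letter."""
    return term[0].islower()


# The four recruiting rules (Lines 1-4 of Figure 1).
RULES = [
    Rule((Atom("Planning-to-visit", ("x",)), Atom("CS-major-at", ("y", "x"))),
         Atom("Good-prospect", ("y",))),
    Rule((Atom("Top-school", ("x",)), Atom("CS-major-at", ("y", "x"))),
         Atom("Good-prospect", ("y",))),
    Rule((Atom("Good-prospect", ("x",)),),
         Atom("Send-literature-to", ("x",))),
    Rule((Atom("Good-prospect", ("x",)), Atom("Graduating", ("x",))),
         Atom("Invite-for-interview", ("x",))),
]

# A few ground assertions named in the text (not the complete Figure 1 list).
FACTS = {
    Atom("Top-school", ("MIT",)),
    Atom("Top-school", ("CMU",)),
    Atom("CS-major-at", ("Don", "SUNY")),
}


def match(pattern, fact, bindings=None):
    """Return a substitution extending `bindings` that maps pattern onto fact, or None."""
    if pattern.predicate != fact.predicate or len(pattern.args) != len(fact.args):
        return None
    result = dict(bindings or {})
    for p, f in zip(pattern.args, fact.args):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result
```

Matching CS-major-at(y,x) against CS-major-at(Don,SUNY) yields a substitution binding y to Don and x to SUNY, which is the kind of information the text describes as flowing along the edges of an acg.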
Every year the company updates its database with the names of all students graduating in CS and all the schools that the company will visit during that year (Fig. 2). The company then uses SNIP to find out which CS majors should be invited for interviews, which ones should be sent the company's literature, etc.

Planning-to-visit(SUNY)
Planning-to-visit(CMU)
Graduating(Ted)
Graduating(Don)
Graduating(John)

Figure 2
Information updating the database

We now consider the acg describing the reasoning of SNIP when it is asked who should be invited for an interview.

An acg is represented as rectangles and circles. Each rectangle represents a rule instance (a deduction rule together with a substitution for the variables in the rule); the antecedents appear to the left of the double line and the consequents to the right. Circles (called goal nodes) represent goals to be proved. Rule instances and goal nodes are connected by directed edges. Substitutions flow through the edges. Rule instances and goal nodes can be viewed as producers of formulas sent out on the edges leaving them and as consumers of formulas coming in on the edges pointing to them. Some edges have switches (represented by square brackets) which have the effect of renaming the variables in the substitutions flowing through them. For ease of reference, rule instances have labels of the form An (where n is an integer). Those labels are used for notational convenience only and have no relation with the way acg's work.

Initially, a request is created which contains the atomic formula being sought. The rule instance labeled A1 in Figure 3 represents the request to deduce all instances of the atomic formula Invite-for-interview(who). The next step is to create a goal node for the atomic formula. A goal node (G1) is added below the instance being sought. One of the jobs of the goal node is to match its atomic formula against the network to find all formulas which unify with it. If there are ground instances the goal node produces them immediately. For every matching formula in consequent position of some rule, a new rule instance is added to the acg. The other job of the goal node is to remember all substitutions it receives (these substitutions are represented enclosed in curly brackets next to the goal node). When a goal node receives a new substitution, it sends it to all rule instances to which it points. In this case, G1 can't find any ground instances of Invite-for-interview(who) but the rule ∀(x)[Good-prospect(x) & Graduating(x) -> Invite-for-interview(x)] may be used to derive such instances. Rule instance A2 is thus created by G1 (Fig. 3). Notice that the variable 'x' in A2 should be bound to 'who' in A1 when an answer is produced by A2. For this reason a switch ([x/who]) is inserted in the link between A2 and G1 and has the effect of translating between variable contexts. Switches are computed by the network matching function (Shapiro, 77) which was used by G1. For details of how this is done see (McKay and Shapiro, 81).

Figure 3
acg for backward inference

Goal nodes G2 and G3 are created for the antecedents of A2. G2 finds two rules which can produce instances of Good-prospect(x) and creates the corresponding rule instances (A3 and A4, Fig. 3). G3 finds three ground instances of Graduating(x), namely Graduating(Ted), Graduating(Don) and Graduating(John). The substitutions {Ted/x}, {Don/x} and {John/x} are stored by G3 and sent to its consumer (A2).

Goal nodes are created for the antecedents of A3 and A4. G4 finds two top schools (MIT and CMU), and sends the substitutions to A3. G5 finds the CS majors at different schools, informing both A3 and A4. A3 deduces that both Ted and Anna are good prospects. A4 deduces that both Don and Ted are good prospects after receiving from G6 the information that SUNY and CMU will be visited.

The information about good prospects flows through the acg reaching A2 which deduces that both Ted and Don (good prospects who are graduating) should be invited for interviews and the answer is finally produced by A1.

Notice that BI tries to get each answer in all possible ways, and so the same answer can be produced several times. In this particular case the answer Good-prospect(Ted) was produced twice, by rule instances A3 and A4.

4. Forward Inference

In this section we discuss the results obtained if the company chooses to use FI. We will assume that the information represented in Figure 1 is stored in the database and that FI is done with the information represented in Figure 2.

Doing FI with Planning-to-visit(SUNY) generates the acg of Fig. 4: rule instance A1 is created along with goal nodes for its antecedents (G1 and G2). G1 is immediately satisfied, and G2 finds CS-major-at(Don,SUNY), sending to A1 the substitution {Don/y}. Notice that G2 is performing some amount of BI, reflecting a characteristic of SNIP in which BI and FI are closely interconnected. A1 deduces Good-prospect(Don), creating rule instances A2 and A3 to do further FI. A2 deduces its consequent but A3 doesn't since Graduating(Don) is not in the database yet.
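The forward-inference step with Planning-to-visit(SUNY) can be imitated with a naive forward chainer. The sketch below is only an approximation under stated assumptions: it does not build an acg, it has no rule activation, and it simply re-applies every rule until nothing new is derived. The tuple encoding, the lowercase-variable convention, and the names RULES, match, satisfy, substitute, forward_chain and db are all hypothetical.

```python
# Rules as (antecedents, consequent); assumption: lowercase terms are variables.
RULES = [
    ([("Planning-to-visit", "x"), ("CS-major-at", "y", "x")], ("Good-prospect", "y")),
    ([("Top-school", "x"), ("CS-major-at", "y", "x")], ("Good-prospect", "y")),
    ([("Good-prospect", "x")], ("Send-literature-to", "x")),
    ([("Good-prospect", "x"), ("Graduating", "x")], ("Invite-for-interview", "x")),
]


def is_var(term):
    return term[0].islower()


def match(pattern, fact, env):
    """Extend env so that pattern equals fact, or return None if impossible."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    env = dict(env)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env


def satisfy(antecedents, facts, env):
    """Yield every environment that satisfies all antecedents against facts."""
    if not antecedents:
        yield env
        return
    for fact in facts:
        new_env = match(antecedents[0], fact, env)
        if new_env is not None:
            yield from satisfy(antecedents[1:], facts, new_env)


def substitute(atom, env):
    return (atom[0],) + tuple(env.get(t, t) for t in atom[1:])


def forward_chain(facts, new_fact):
    """Add new_fact and return the set of assertions newly derived as a result."""
    facts = set(facts) | {new_fact}
    derived = set()
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in RULES:
            # list(...) finishes the search before the fact set is modified.
            for env in list(satisfy(antecedents, facts, {})):
                conclusion = substitute(consequent, env)
                if conclusion not in facts:
                    facts.add(conclusion)
                    derived.add(conclusion)
                    changed = True
    return derived


# Only facts named in the text are used here.
db = {("Top-school", "MIT"), ("Top-school", "CMU"), ("CS-major-at", "Don", "SUNY")}
derived = forward_chain(db, ("Planning-to-visit", "SUNY"))
print(sorted(derived))
# [('Good-prospect', 'Don'), ('Send-literature-to', 'Don')]
```

As in the text, Send-literature-to(Don) is derived but Invite-for-interview(Don) is not, because Graduating(Don) has not yet been asserted. Unlike SNIP, this sketch reconsiders every rule on every pass, which is exactly the undirected behavior that rule activation is meant to avoid.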
Figure 4
acg for forward inference

After entering all the information of Figure 2, SNIP has deduced (acg not shown) that Don, Ted and Anna should be sent literature and that Don and Ted should be invited for interviews. In other words, all possible inferences were made, even if the user was only interested in some of them. FI does not take the user's interests into account, filling the database with assertions which may never be used.

5. Bi-directional Inference

In this section, we introduce BDI and show that it establishes conversational contexts, focusing SNIP's inferences within those contexts and thereby limiting the space of rules searched. BDI results from the interaction of FI and BI and can be obtained either by doing BI following FI or by doing FI following BI. We consider each of these cases in turn.

5.1. Backward Inference Following Forward Inference

Suppose that the user says "I am planning to visit SUNY, who shall I invite for an interview?". In this context, by asking 'Invite-for-interview(who)?' the user wants to consider only the CS majors from SUNY. We show how FI can be used to set up the 'SUNY context' which is then used to answer the user's query. In a pure BI system, finding the CS majors from SUNY who should be invited for an interview requires finding the intersection between all CS majors from SUNY and all persons who should be invited for an interview (or, in some systems, generating all of one and testing each to see if it satisfies the other).

The user begins by doing a small amount of FI with Planning-to-visit(SUNY). The amount of inference can be defined by the number of network pattern matches performed. Let us assume, for the sake of argument, that by "small amount of FI" we mean that FI is only allowed two network matches. The first match finds the rule ∀(x,y)[Planning-to-visit(x) & CS-major-at(y,x) -> Good-prospect(y)], setting up the rule instance A1 (Fig. 5) and the second match is used by G2 to look for instances of CS-major-at(y,SUNY), finding CS-major-at(Don,SUNY). This is enough to deduce Good-prospect(Don) but nothing can be done with this because finding unactivated rules requires a match. Therefore, the inference stops, leaving behind the active rule instance A1 (Fig. 5).

Figure 5
acg for bi-directional inference

If the user now asks the question 'Invite-for-interview(who)?' rule instances (A2 and A3) and goal nodes (G3, G4 and G5) are created (Fig. 5) as discussed in section 3. Here, however, goal node G4 finds that there is an active rule that can produce instances of Good-prospect(x), namely A1. Instead of doing a network pattern match to find additional rules, it uses rule instance A1 immediately. The substitution {Don/y} flows through the acg producing the answer Invite-for-interview(Don). In this case the CS majors from other schools were not even considered since SNIP had set up the "SUNY context".

Suppose that CS-major-at(Don,SUNY) were not in the network and thus rule instance A1 could not produce any answer even though instances of Invite-for-interview(who) could have been derived for CS majors of other schools. Following the query 'Invite-for-interview(who)?', SNIP would return an "I don't know" answer. This, at first glance, seems to be wrong. However, taking into account that the user only wants to consider the CS majors from SUNY this makes perfect sense, showing a feature of BDI in which derivable instances which are irrelevant to the context are effectively ignored by SNIP.

5.2. Forward Inference Following Backward Inference

Suppose that the database contained the information of Figure 1 and the user asked who should be invited for an interview. SNIP builds an acg as shown in Figure 3, except that goal nodes G3 and G6 have no stored data. The acg produces no answers since the information in the database is insufficient. If the user now does FI with any of the propositions of Figure 2, the waiting goal nodes are found. Whenever a new assertion is produced for FI, and a goal node already exists that wants it, no network match is done to find additional relevant rules. For example, if Graduating(Ted) is entered, SNIP tells the user to invite Ted for an interview, and ignores that Send-literature-to(Ted) could also have been derived, since presumably the user was not interested in this latter proposition. Again, BDI takes into account the conversational context, ignoring whatever is irrelevant to the active context.

6. Conclusions

We presented an overview of BDI, pointing out the two characteristics required by a system to make the BDI behavior possible:
1. Every rule may be used both in FI and BI.
2. There is a distinction between rules which have been activated and rules which haven't.
Relying on these two characteristics, when SNIP (a system which uses BDI) searches for rules to be used, it looks for activated rules first, and only if it fails to find any activated rule are non-activated rules considered. In addition, as a matter of efficiency, activated rules remember all the results produced, not solving the same problem twice. The resulting inference loosely
corresponds to a bi-directional search. We say 'loosely corresponds' because not only may there be several bi-directional searches going on in parallel (one for each question asked) which can intersect each other, but also there are two levels of search, the first through the activated rules, and the second, which is tried only after failure of the first, through the non-activated rules.

The example presented, although very small and simplistic, shows that BDI effectively prunes the search through the space of inference rules by focusing the system's attention towards the interests of the user.

In BDI, some of the disadvantages of pure FI and pure BI do not exist. One of the disadvantages of pure FI is that it may fill the database with derived propositions which may never be used. We showed that BDI ignores some derivations which do not interest the user. One of the disadvantages of BI is that all apparently relevant rules are tried, regardless of the actual data. We showed that BDI ignores inactive rules in favor of rules activated by previous (forward or backward) deduction.

7. Acknowledgements

Many thanks to Terry Nutter, Ernesto Morgado, Jeannette Neal and the other members of SNeRG (the SNePS Research Group) for their comments on earlier versions of this paper and for their general discussions while the research was in progress.

8. References

1. Kowalski, R., And-or Graphs, Theorem-proving Graphs and Bi-directional Search, in Machine Intelligence 7, Meltzer and Michie (eds.), Halsted Press, 1972.
2. McKay D. and Shapiro S., "MULTI - a LISP Based Multiprocessing System", Proc. 1980 LISP Conference, pp. 29-37.
3. McKay D. and Shapiro S., "Using Active Connection Graphs for Reasoning with Recursive Rules", Proc. IJCAI-81, pp. 368-374.
4. Pohl I., Bi-directional Search, in Machine Intelligence 6, Meltzer and Michie (eds.), American Elsevier, 1971, pp. 127-140.
5. Shapiro S., "Representing and Locating Deduction Rules in a Semantic Network", in Proc. Workshop on Pattern-Directed Inference Systems, SIGART Newsletter 63, 1977, pp. 14-18.
6. Shapiro S., "The SNePS Semantic Network Processing System", in Associative Networks, N.V. Findler (ed.), Academic Press, 1979a, pp. 179-203.
7. Shapiro S., "Numerical Quantifiers and their use in Reasoning with Negative Information", Proc. IJCAI-79, 1979b, pp. 791-796.
8. Shapiro S. and McKay D., "Inference with Recursive Rules", Proc. First AAAI Conference, 1980, pp. 151-153.


John M. Carroll and Robert Mack

IBM Thomas Watson Research Center
Yorktown Heights, New York 10598

Learning to use a word processor provides a study of real complex human learning that is fundamentally "active", driven by the initiatives of the learner. People learn by actively trying things out, by reasoning, and by referring to prior knowledge. Our view is that these are natural, albeit demanding, strategies for people to adopt when confronted by a learning task of non-trivial complexity. What is especially noteworthy in the present case is that the learners we have studied are almost entirely innocent with respect to computer technology. In the context of learner innocence, we argue, these "natural" strategies entrain severe and wide ranging learning problems. Analysis of these problems, in turn, suggests research directions for the analysis of real human learning within Cognitive Science and practical directions in which computer word processing systems, and the educational technologies that support their training and use, might evolve.

In this research project, ten office temporaries spent four half-days learning to use one of two possible word processing systems in our laboratory. These people were highly experienced in routine office work, but quite naive with respect to computers in general and word processing in particular. We asked them to imagine a scenario in which a word processing system had recently been introduced to their office and they had been asked to be the first to learn it (to then pass this knowledge on to colleagues). The point was that they were to learn to use the system using the training materials that accompany it as their only resource.

Our method involved prompting learners to "think aloud" as they worked through the training materials. They were to report questions that were raised in their minds, plans and strategies they felt they might be considering or following out, and inferences and knowledge that might have been brought to awareness by on-going experiences. We remained with the learners, to keep them talking and to intervene if at any time it appeared that a problem was so grave that a learner might leave the experiment if we did not help out. Our prompting remained non-directive, and indeed once learners got going we needed to prompt very infrequently. Our analysis consisted first of an enumeration of "critical incidents", constrained by the consensus of the experimenters, which were cataloged and classified in various ways. The chief goal of this was to form a picture of the typical experience of a learner, and it is this induced "prototype" learning experience to which we will refer in what follows.

Learning by doing.

Our learners relentlessly wanted to learn by trying things out rather than by reading about how to do them. Half of our learners tried to sign on to the word processor before reading how to do so. In part this was impatience: they were reluctant to read a lot of explanation or get bogged down following meticulous directions. But it also devolved from mismatched goals: Learners wanted to discover how to do specific things at particular times, and this did not always accord with the sequence in which topics were treated in the manual.

Learning by trying things out according to a personal agenda of needs and goals is not merely a preference. Learners who try to follow out manual instructions are often unable to do so. The instruction sequences are fragile in the sense that it is easy to get side-tracked and there is no provision in them for recovery. One example is a learner who inadvertently paginated (reformatted) a document at the beginning of an exercise on revising documents. This not only rearranged the lines in the file to make right margins even, it also stored the document away. The learner had not yet learned how to retrieve documents and the manual itself provided no recovery information for this (or any other) type of error. Accordingly, she was forced to try to discover how to retrieve the document on her own.

Once the document was restored, she was faced with an equally staggering problem: the pagination operation had rearranged her file so that the revising instructions no longer matched it. An experienced user who understood reformatting could have reinterpreted the instructions and adapted them to this rearranged text. But this learner had no idea what she had done, and thus was puzzled by the fact that the instructions seemed to be wrong. The fragility of instruction sequences, coupled with the propensity of learners to try to recover by initiating exploratory forays, can result in problem tangles: Learners, who may not even fully understand the individual operations, have little basis for appreciating the subtle interdependence of clusters of word processor operations. They find themselves in distorted or even unrecognizable problem situations.

When learners do not, or cannot, follow directions the problems that arise can result in their losing track of what they are trying to do. It is likely, of course, that this loss of task orientation contributes to the overall failure of learning, as indicated by the trouble all learners had applying their learning experiences to the routine typing "transfer task" after training. None of the learners were able to type, revise, and print a simple one page letter without some trouble with each of these basic skills.

What is more surprising perhaps is that even when learners were able to successfully follow instruction sequences out, they still seemed to experience a loss of task orientation, as evidenced by comments like: "What did we do?", "I know I did something, but I don't know what it is!" or "I'm getting confused because I'm not actually doing anything except following these directions." For these subjects, the overall orientation toward accomplishing meaningful tasks (e.g., type a letter, print something out) has been subverted by a narrower orientation toward following out a sequence of instructions.

Learning by thinking.

Just as learners take the initiative to try things on their own, so also are they active in trying to make sense of their experience with the word processor. Learning passively by rote assimilation of information is atypical. Rather, learners actively try to develop hypotheses about why it operates the way it does. These quests after meaning can be triggered by new and salient facts. They can be forced by discrepancies between what is expected and what actually happens. They can be structured by the learner's personal agenda of goals and queries, referred to as new problems arise. In each case, learners' lack of knowledge about word processing makes it difficult for them to reason out coherent solutions that accurately represent the objective operation of the system.

For example, learners have no basis for recognizing and ruling out irrelevant connections: their interpretations of word processing systems are often influenced by spurious connections between what they think they need and what they perceive. In one case, a learner tried to decide if a "File" command had stored a document file away. It was not stored because the command was entered in a text input mode where all typed strings are interpreted as text, and not executed as commands. But she assumed that the file had been stored, and adduced evidence to confirm this premise. For example, at one point she notices a status message "INPUT MODE 1 FILE" which indicates that she is in the text input mode. However, the word "file" matched her file command, and this was enough to suggest some kind of feedback that her "File" (as in store document) command had worked.

In such cases, reasoning appears to consist in adducing factual support to a premise the learner would like to hold as true. The learner above began with the hypothesis that she had stored the document file away, and sought evidence to confirm that this was the case. Her adduction here was incorrect because she did not know which facts were relevant to verifying the premise. In other cases, reasoning appears to consist in abducing a hypothesis when it, together with other assumptions the learner may already hold, is consistent with some fact or observation. One learner tried to move the cursor in a protected area of the display. When this locked the keyboard, she hypothesized that this fact meant that she was at the right place on the screen to do what she set out to do.
Learners also set goals which they actively pursue by trying to solve problems. They are hampered in this by their innocence of the appropriate problem space, or domain of possible actions and interpretations relevant to accomplishing goals and addressing queries. Accordingly, their strategies are often local and fragmentary; they have difficulty integrating information or other experiences, and in formulating their concerns in ways that map transparently onto system functions. When learners cannot solve problems or answer questions, they add them to a personal agenda of goals and queries as they go along. As new opportunities arise, learners return to these standing queries and try to resolve them.

Learning by knowing.
To this point, we have argued that a new user of a word processing system relies on active exploration and ad hoc reasoning as learning strategies. However, not all possibilities are explored and not all hypotheses that could be reached are reached. What constrains these strategies is a sense of what could be appropriate, and this devolves from prior knowledge on the part of the learner: knowledge about devices "like" word processors (e.g., typewriters), knowledge about office routine and work in general, even knowledge culled from interacting with the word processor up to that point in time.

Our learners were unable to resist referring to their prior knowledge about typewriters as a basis for interpreting and predicting experience with word processors. One came to a halt as she read an instruction in the manual which said "Backspace to erase." It seemed that she could not interpret this instruction for, as she pointed out, BACKSPACE does not erase anything. She had irresistibly availed herself of her knowledge of how backspacing works on a typewriter, unable to even consider that this knowledge might be inappropriate for the present case. Other learners tried to use the SPACE and RETURN keys to move the cursor; these keys insert spaces and blank lines on the word processor, but merely move the typing point on a typewriter.

Our learners were experienced with conventional office work: typing letters, filing, etc. Their knowledge about how these routine tasks are organized in the office creates expectations in them about how analogous tasks ought to be performed in the "office of the future" (as represented by the word processor in our laboratory). Thus, one response to a letter-revision task is to retype. This is striking since it is the capability of the word processor to store and retrieve documents (for revision, among other things) that is its fundamental advance over previous office technologies.

As a learning experience progresses, the learner is acquiring and organizing new bits of knowledge. The ultimate goal, and the final measure of success in the learning situation, is that of assembling these pieces into a coherent fabric, an understanding of the word processor. Along the way, any prior bit of knowledge is available for use as a basis for expectations concerning successive interactions with the system. One system we studied seemed to flaunt inconsistency in similar operations. Thus, to delete a word, one positions the cursor under the word's initial character and keypresses WORD DELETE. However, to underscore a word, one positions the cursor under the final character of a word and keypresses WORD UND. This inconsistency caused one learner to misexecute one and then the other of these two operations in a dismal cycle of negative transfer.

Summary.
Perhaps the most apt description of the world of the new user of a word processing system is that often quoted phrase of William James: "a bloomin' buzzin' confusion." People in this situation see many things going on, but they do not know which of these are relevant to their current concerns. Indeed, they do not know if their current concerns are the appropriate concerns for them to have. The learner reads something in the manual, sees something on the display, and must try to connect the two, to integrate, to interpret. It would be unsurprising to find that people in such a situation suffer conceptual, or even physical, paralysis. They have so little basis on which to act.

And yet people do act. Indeed, perhaps the most pervasive tendency we have observed is that people simply strike out into the unknown. If the rich and diverse sources of available information cannot be interpreted, then some of these will be ignored. If something can be interpreted (no matter how specious the basis for this interpretation), then it will be interpreted. Ad hoc theories are hastily assembled out of these odds and ends of partially relevant and partially extraneous generalization. And these "theories" are used for further prediction. Whatever initial confusions get into such a process, it is easy to see that they are at the mercy of an at least partially negative feedback loop: things quite often get worse before they get better.

What's wrong? We would argue that the learning practices people adopt here are typical, and in many situations adaptive. The problem in this particular learning situation is that new learners of word processors are innocent in the extreme. "Word processor", so far as we know, is not a natural concept. People who do not know about word processors have little, possibly nothing, to refer to in trying to actively learn to use such things. Innocence turns reasonable learning strategies into learning problems.


Edwina L. Rissland*
Department of Computer and Information Science
University of Massachusetts
Amherst, MA 01003

*Supported in part by the National Science Foundation under grant IST-80-173t3. Opinions expressed in this report are those of the author and do not necessarily reflect views of the U.S. Government.

Abstract

In this paper, we discuss the use of examples in the law, in particular "hypotheticals" in contract law. We present a framework for representing examples, show how this can be used to generate new hypotheticals, and discuss their role in the dialectic of refining or learning legal doctrine.

1. Introduction

Examples are important in many disciplines like mathematics, law and linguistics. They are central to reasoning and learning processes such as induction, concept formation, rule refinement and theory formation [Hawkins 1980; Kuhn 1970; Lakatos 1976; Lenat 1977; Polya 1965; 1968; Rissland 1978, 1982; Soloway 1978; Winston 1975].

In the law, where much reasoning is done by example [Levi 1949] and analogy [Berman 1968], examples (i.e., cases) are indispensable. Examples force one to consider possibilities and nuances. In teaching a legal doctrine, they are used to point out its "gaps, conflicts and ambiguities" [Kennedy 1980]. They are used in restatements of the law, which are compendia of legal doctrine in the form of principles, examples and references, e.g., Restatement, Second, Contracts [1981]. They are critical to the "realist number" which shows that the law is much more than a set of clearcut concepts and rules [Llewellyn 1931], as the formalists of this century and before had hoped.

2. Epistemological Considerations

The examples in the law that we consider are of two types: (1) "real" cases, i.e., cases actually litigated; and (2) "hypothetical" cases ("hypotheticals" or "hypos"). Both types can be represented by a frame-like data structure [Minsky 1975] and the frames can be linked together by various types of relations. In describing frames for cases, we are laying out a conceptual framework to represent the knowledge used by students and teachers of the law.

The frame for a real case includes the following slots: Title, Citation, Date, Fact Situation, Process History/Outcomes, Arguments, Opinions, Links to other cases, Links to legal doctrine/rules/statutes. A slot can have a simple filler, as in the Title, Citation or Date slots, or a complex one, as in the Opinions slot, which can be structured into main, concurring, and dissenting opinions. Links to other cases include "procedural history" links, like affirmed, reversed, amended, and "substance" links, like criticised, distinguished, explained, harmonized, etc., which describe how the courts through their opinions related the cases.

Hypotheticals can also be represented by a frame. The most important features of a hypo are the Fact Situation, the Arguments that interpret the fact situation with respect to particular legal doctrines, and the links to other hypos and real cases. Thus the frame for a hypo is like that for a real case. The links between a hypo and a real case include "abstracted from", "particularized from", "generalized from".

One can also make a taxonomy of cases in the law, much as in mathematics [Rissland 1978]. Such a taxonomy is not explored here, but the categories might include:
1. standard cases (typically found in the casebooks);
2. landmark cases that have far reaching effects;
3. first impression cases that bring up an issue for the first time;
4. counter cases that show the limits of or the invalidity of a rule or doctrine;
5. anomalous cases that don't seem to fit in.

While we have used some of the link types used in LEXIS [Sprowl 1976] and legal digests and case citators like Shepard's Citations, the framework and taxonomy we have described could be used to design a legal data base that reflects more of the structure of the law than those currently in use.

3. Hypotheticals in Contract Law

In contract law, one master question is "Which promises should the law enforce?", where enforcement means either making the promisor fulfill his promise to the promisee (i.e., "specific performance") or making the promisor pay "damages" to the promisee for his breach [Fuller and Eisenberg 1981; Knapp 1976].

There are several ways of dealing with this question. The "gift-consideration" distinction tries to relate enforceability with the "consideration" given by the promisee in return for the promise [Section 17, Restatement, Second, Contracts]:

"...the formation of a contract requires a bargain in which there is...consideration..."

Another approach is that of "reliance", in which the (typically injurious) reliance of the promisee upon the promise is highlighted [Section 90, Restatement, Second, Contracts]. A third is the use of "formalities" like the legal seal [Section 96]. Each of these ways of looking at the master question emphasizes different aspects of a promise and each has its own strengths, weaknesses, inconsistencies and ambiguities.

The following is a set of hypos (actually just the fact situations) typical of those used in law school to: (1) point out the gift-consideration distinction; (2) show doctrinal weaknesses and ambiguities; and (3) show possible conflicts between doctrines such as consideration and reliance. The hypos are really caricatures of the real case of Dougherty v. Salt, decided by the N. Y. Court of Appeals in 1919, which is a standard case in first year Contract Law (e.g., see [Fuller and Eisenberg 1981]).

In each of the hypos, one is to ask, "Is this promise enforceable?" In other words, if the promisor breaches, ought the promisee be awarded damages or performance?

Hypo1:
Facts: Aunt Tillie says, "Charlie, you are such a nice boy; I promise to give you [...]

Hypo2:
Facts: Same as Hypo1 with the addition that Charlie says, "Dear Aunt Tillie, I can't take something for nothing, let me give you my third grade painting."

Hypo3:
Facts: Same as Hypo2 except Charlie offers to mow Tillie's lawn.

Hypo4:
Facts: Same as Hypo2 except that Charlie's last name is Picasso.

Hypo5:
Facts: Same as Hypo1 with the addition that Aunt Tillie's assets are in ruin and that keeping her promise to Nephew Charlie means her own children starve.

Hypo6:
Facts: Same as Hypo1 with the addition that Charlie makes an unreturnable deposit on a new car.

If one argues from the standpoint of consideration doctrine, Hypo1 is a paradigmatic example of a pure gift, "a gratuitous promise", which would not be enforceable. Hypo2 is an attempt to make Hypo1 look enforceable under consideration doctrine. Hypo3 is another attempt to alter Hypo1 into an enforceable promise. Hypo4 is used to point out that one is making value judgements on the consideration, per contra the doctrine that one should not inquire into the adequacy of the consideration. Hypo5 introduces an emotional "heart rendering" aspect to show there are limits and exceptions to consideration doctrine, such as duress. Hypo6 introduces an element of reliance which leads to conflicting outcomes from reliance and consideration argumentation.

4. A Frame for Promise Hypos

In applying the framework of Section 2 to the domain of contract law, we used the following facets in the sub-frame for the fact situation of a "promise" case:
1. the status of the PROMISOR
2. the subject matter of the PROMISE
3. the status of the PROMISEE
4. the RETURN ACTION by the promisee
5. the RELATION between the promisor and promisee

The full frame of the case would also include:
1. ARGUMENTS for various outcomes of the hypo according to various doctrines;
2. further NOTES/DISCUSSION of the hypo, such as historical significance;
3. RELATIONS to other cases (real and hypothetical).

Each of these major sub-blocks has facets; those for the PROMISOR and PROMISEE are similar; those for PROMISE and RETURN somewhat so. The PROMISOR and PROMISEE can be further described by such attributes as PERSONAL [...] can be further broken down. For instance, PERSONAL STATUS includes SEX, AGE, MARITAL STATUS (these are for largely traditional, historical and common law reasons related to the once unequal status of women under the [...]

The description of the PROMISE includes the subject matter of the promise and conditions on it. The RETURN action of the promisee can be: (1) no action; (2) forbearance (i.e., refraining from doing an act, like suing); (3) an action. An action itself has aspects like: (1) the action benefits or does not benefit the promisor; (2) the action leaves the promisee worse off/better off/the same. One can also structure the RELATION facet of the promise situation, for instance according to whether it is familial (e.g., father-daughter) or non-familial (e.g., debtor-creditor, friends or neighbors).

The following is a fact situation sub-frame instantiated for the first Aunt Tillie - Nephew Charlie hypo:

PROMISOR: Aunt Tillie
    PERSONAL STATUS: female, elderly, widow
    PERSONAL ATTRIBUTES: kind, rich
    INTENTIONS: the best
PROMISE: $10,000
    CONDITIONS: none
PROMISEE: Charlie
    STATUS: male, young
RETURN: none
RELATION:
    FAMILIAL: Aunt-Nephew
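The instantiated sub-frame above can be written down as a small data structure. The following is only a minimal sketch, assuming Python dataclasses as the frame mechanism; the class names Party and FactSituation, the field names, and the helper has_no_return are hypothetical and are not taken from the paper. It encodes the Aunt Tillie - Nephew Charlie fact situation and derives Hypo2 from Hypo1 by filling the RETURN slot.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Party:
    """PROMISOR or PROMISEE sub-block (field names are illustrative)."""
    name: str
    personal_status: List[str] = field(default_factory=list)
    personal_attributes: List[str] = field(default_factory=list)
    intentions: Optional[str] = None


@dataclass(frozen=True)
class FactSituation:
    """Fact situation sub-frame of a 'promise' case."""
    promisor: Party
    promise: str
    promisee: Party
    conditions: Optional[str] = None
    return_action: Optional[str] = None  # None stands for "RETURN: none"
    relation: Dict[str, str] = field(default_factory=dict)


def has_no_return(fs: FactSituation) -> bool:
    """True when the promisee gives nothing back (the 'pure gift' pattern)."""
    return fs.return_action is None


# Hypo1, using the slot fillers listed in the instantiated sub-frame.
hypo1 = FactSituation(
    promisor=Party("Aunt Tillie", ["female", "elderly", "widow"],
                   ["kind", "rich"], "the best"),
    promise="$10,000",
    promisee=Party("Charlie", ["male", "young"]),
    relation={"FAMILIAL": "Aunt-Nephew"},
)

# Hypo2: same as Hypo1, except Charlie offers his third grade painting.
hypo2 = replace(hypo1, return_action="third grade painting")
```

Keeping the frames immutable and deriving each variant with replace mirrors the way the hypos are stated as "Same as Hypo1 with the addition that ...", and leaves the original frame untouched.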
5. Generating Hypotheticals

It is apparent that one can generate new hypotheticals (that is, their frames) by changing slot fillers in a hypothetical frame. Since the possible fillers for a slot can often be arranged in hierarchies, many modifications can be described in terms of super, sub and sibling node substitution and thus lead to modifications affecting generality and specificity. For instance, generalizing Tillie and Charlie to abstract individuals A and B results in the following:

Hypo1': A promises B $10,000.

Making another change gives:

Hypo1": "JR" promises B $10,000.

In the last, knowing that "JR" (as in Ewing) often has bad intentions creates a hypo very different in "feeling" from the "Aunt Tillie - Nephew Charlie" or "A promises B" hypos; the "JR" hypo introduces questions of "good/bad faith".

Elaborating the description of any of the elements of the fact situation is another way of creating a new hypo. For instance, elaborating "Aunt Tillie" to "old, senile Aunt Tillie" and "Charlie" to "manipulative, black-sheep-of-the-family Charlie" gives a very different character to the hypo.

6. Computer-generated hypos

We are currently investigating the generation of hypotheticals using the CEG (Constrained Example Generation) method of "retrieval plus modification", in which a new example is generated by retrieving a known example (that comes close to what is wanted) and then modifying it to meet the current requirements [Rissland and Soloway 1980, Rissland 1982]. So far, we have been dealing only with constraints such as "more/less general/specific" and "different but of the same class" (e.g., familial). Higher level constraints are "heart rendering" and "more/less surprising" (e.g., against one's default assumptions).

We are experimenting with ways to generate three or four sentence long hypos similar to those found as exercises in casebooks and as illustrations in the Restatements. To produce the English text from the frame, we are currently using stereotypical precanned text templates and then filling in the templates with information from the hypo frame. An example of such a template filled in the most general way is:

"A promises B X in return for Y."

More sophisticated (longer and subtler) hypos will need more sophisticated text generation such as McDonald's MUMBLE [McDonald 1981].

7. Summary and Conclusions

We have been studying the structure of legal knowledge, specifically real and hypothetical cases, using a structural approach of frames and relations, and how one generates hypotheticals. We have actually experimented with our ideas in the domain of Contract Law, and we feel that these methods are easily transferable to other domains such as Property and Torts.

We feel our work contributes to: (1) a better understanding of the use, structure and generation of examples in general and legal hypotheticals in particular; (2) epistemological analysis of legal domains; (3) legal data base design; (4) hypothetical generation for teaching and ICAI (Intelligent Computer Assisted Instruction) systems.

8. References

Berman, H. J., "Legal Reasoning". In International Encyclopedia of the Social Sciences.

Fuller, L. L., and M. A. Eisenberg, Basic Contract Law. West Publishing Co., Minn., 1981.

Hawkins, D., "The View from Below". For the Learning of Mathematics, Volume 1, No. 2, FLM Publishing Association, Quebec, Canada, November 1980.

Kennedy, D., "Utopian Proposal". Draft memo, Harvard Law School, 1980.

Knapp, C. L., Problems in Contract Law. Little, Brown and Co., 1976.

Kuhn, T. S., The Structure of Scientific Revolutions. Second Edition. University of Chicago Press, 1970.

Lakatos, I., Proofs and Refutations. Cambridge University Press, London, 1976.

Lenat, D. B., "Automatic Theory Formation in Mathematics". Proc. IJCAI-77.

Levi, E. H., An Introduction to Legal Reasoning. University of Chicago Press.

McDonald, D. D., "Language Production: The source of the dictionary." In The Nineteenth Annual Meeting of the Association for Computational Linguistics, Stanford University, 1981.

Minsky, M. L., "A Framework for Representing Knowledge". In The Psychology of Computer Vision, Winston (ed), McGraw-Hill, 1975.

Polya, G., Mathematical Discovery, Volume II. Wiley, New York, 1965.

Polya, G., Mathematics and Plausible Reasoning, Volumes I and II. Princeton University Press, 1968.

Restatement, Second, Contracts. American Legal Institute, Philadelphia, 1981.

Rissland, E. L., "Constrained Example Generation". Submitted for publication.

Rissland, E. L., "Understanding Understanding Mathematics". Cognitive Science, Vol. 2, No. 4, 1978.

Rissland, E. L., and E. M. Soloway, "Overview of an Example Generation System". In Proc. First National Conference on Artificial Intelligence, Stanford, August 1980.

Soloway, E. M., "Learning = Interpretation + Generalization: A Case Study in Knowledge-Directed Learning". COINS Technical Report 78-13, University of Massachusetts, 1978.

Sprowl, J. A., A Manual for Computer-Assisted Legal Research. American Bar Foundation, Chicago, 1976.

Winston, P. H., "Learning Structural Descriptions from Examples". In The Psychology of Computer Vision, Winston (ed), McGraw-Hill, 1975.
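The slot-filler substitution of Section 5 and the template filling of Section 6 can be illustrated together. What follows is a hedged sketch only: it is not the CEG system, the dictionary-based frames and the names RELATION_PARENT, generalize, siblings, modify and render are all hypothetical, and the small RELATION hierarchy is assembled from the familial and non-familial examples mentioned in Section 4.

```python
from typing import Dict, List, Optional

# Hypothetical filler hierarchy for the RELATION slot (child -> parent).
RELATION_PARENT: Dict[str, str] = {
    "aunt-nephew": "familial",
    "father-daughter": "familial",
    "debtor-creditor": "non-familial",
    "friends": "non-familial",
    "neighbors": "non-familial",
}


def generalize(filler: str) -> Optional[str]:
    """Super-node substitution: the more general filler, if one is known."""
    return RELATION_PARENT.get(filler)


def siblings(filler: str) -> List[str]:
    """Sibling-node substitution: 'different but of the same class'."""
    parent = RELATION_PARENT.get(filler)
    if parent is None:
        return []
    return sorted(f for f, p in RELATION_PARENT.items()
                  if p == parent and f != filler)


def modify(frame: dict, slot: str, filler) -> dict:
    """Return a copy of the frame with one slot filler changed."""
    new_frame = dict(frame)
    new_frame[slot] = filler
    return new_frame


def render(frame: dict) -> str:
    """Fill the precanned template; missing slots default to A, B, X, Y."""
    promisor = frame.get("promisor", "A")
    promisee = frame.get("promisee", "B")
    promise = frame.get("promise", "X")
    ret = frame.get("return", "Y")
    if ret is None:  # an explicit None means there is no return action
        return f"{promisor} promises {promisee} {promise}."
    return f"{promisor} promises {promisee} {promise} in return for {ret}."


# Retrieve a known example, then modify it.
hypo1 = {"promisor": "Aunt Tillie", "promisee": "Charlie",
         "promise": "$10,000", "return": None, "relation": "aunt-nephew"}
hypo1_prime = modify(modify(hypo1, "promisor", "A"), "promisee", "B")
hypo1_double_prime = modify(hypo1_prime, "promisor", '"JR"')
```

Here render({}) yields the most general filling of the template, while render(hypo1_prime) and render(hypo1_double_prime) yield the texts of Hypo1' and Hypo1". Higher level constraints such as "more/less surprising" are outside the scope of this sketch.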

Learning Recursive Procedures
by Middleschool Children*

Yuichiro Anzai
Carnegie-Mellon University & Keio University
Yuzuru Uesato
Keio University

*Thanks are due to John Anderson, Robin Jeffries and Herbert Simon for their comments on this work. Please address correspondence to Yuichiro Anzai, Faculty of Science and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku, Yokohama.

Introduction

Recursion is a recurrent theme in human thinking. It has been around for a long time in some fields related to cognitive science: for instance, it has appeared in information-processing models of cognition, in the theory of computation, in cognitive and developmental psychology, and in teaching computer programming to novices.

Intuitively, recursive formulation may lead to understanding of potentially infinite phenomena in compact, finite terms. On the other hand, since recursive definition involves top-down, tightly connected organization of knowledge, it may not be easy to learn, or to apply to the formulation of complex problems. These expectations, however, are less well examined experimentally. Besides, there are some other points, such as memory load for executing recursive procedures, the firmly established character of recursive functions in the theory of mathematics, or practical application to teaching computer programming, which make recursion an interesting theme for cognitive science. As one topic related to recursion, this paper discusses the question of whether recursive procedures are cognitively difficult to learn, based on a rule induction experiment conducted on middleschool children. It concludes that recursive procedures may be acquired based on learning of the corresponding iterative procedures.

Learning Recursive Procedures

A recursive function treated here is simply a function whose definition includes the function itself. As a simple but representative example, we use exclusively in this paper the factorial function "fact" defined on $N$, the set of positive integers, as follows:

$\mathit{fact}(n) = \mathit{fact}(n-1) \times n$ for any $n \in N$, $n > 1$, and $\mathit{fact}(1) = 1$.

The above definition is recursive, but of course fact can be defined iteratively:

$\mathit{fact}(n) = 1 \times 2 \times \cdots \times n$ for any $n \in N$.

The above two kinds of definitions are functionally equivalent, but have many cognitively different points. Let us consider below only the point relevant here: how people acquire the recursive procedure for computing factorials, based on example data. First, suppose that a student is given an iterative sequence of data for factorials:

$\mathit{fact}(1) = 1$, $\mathit{fact}(2) = 1 \times 2$, $\mathit{fact}(3) = 1 \times 2 \times 3$.

It may be easy for him to generalize the above simple patterned sequence, and to obtain the general iterative definition, $\mathit{fact}(n) = 1 \times 2 \times \cdots \times n$ $(n \in N)$. Note that the induced definition itself can easily be interpreted to provide procedures (multiplications) for actual computation.

On the other hand, suppose that the student tries to induce the factorial function based on the following recursively generated data:

$\mathit{fact}(3) = \mathit{fact}(2) \times 3$, $\mathit{fact}(2) = \mathit{fact}(1) \times 2$, $\mathit{fact}(1) = 1$.

In this case, although the data, if regarded as declarative, can be generalized formally to generate $\mathit{fact}(n) = \mathit{fact}(n-1) \times n$ $(n \in N)$, the student needs to consider all the subformulas, $\mathit{fact}(k) = \mathit{fact}(k-1) \times k$ $(k = 2, \ldots, n-1)$, to actually compute $\mathit{fact}(n)$: the data allow direct generalization by converting, for example, 3 to $n$ and 2 to $n-1$, but he needs to organize the given segments of data to acquire the recursive computational procedure. This may be much more difficult than in the iterative case.

However, we can advance our speculation one more step. The student, while he is engaged in the task of inducing the factorial from the iterative data, might notice the regularity of the embedded pattern in the data. The left column of Fig. 1 illustrates it for an iterative data set. If this kind of structural embedding were discovered, acquisition of the iterative definition of the factorial may result in learning the nested procedural structure of the factorial. Then, if the nested structure as shown in the left of Fig. 1 resides in memory, and if recursive data are presented, the data may match the nested structure fairly easily as shown in Fig. 1. Thus, the recursive procedure may be learned by the successive presentation of the iterative and recursive data sets in this order.

[Fig. 1. Nested structure embedded in an iterative data set and its relation to the corresponding recursive data set]

The preceding simple discussion gives us the hypothesis that if a student knew nothing of the factorial function or the concept of recursion, he would find it easier to learn the iterative procedure for the factorial than the recursive one, but after he had learned it, he would already be ready to assimilate the recursive procedure. In the following rule induction experiment, we examine this hypothesis by using middleschool children.

Experiment

Subjects and procedure

88 middleschool children (age about 14) participated in the experiment. The rule to be induced was the numerical function for computing factorials of positive integers. Two kinds of formats for example data were considered. One was the iterative format, and the corresponding to-be-induced function was called WHITE in the experiment. The other was the recursive format, and the corresponding function was named BLACK.

For each format, a sequence of three data sets was prepared. The first data sets for WHITE and BLACK were given as follows:

First data set for WHITE
Let us think about the following computation for a given number. The answer to the computation is called "WHITE". For example, "WHITE of 2" is computed as follows:
(1) Start with 1.
(2) Multiply 1 by 2. The result is 2.
"WHITE of 2" is 2.
Now, compute "WHITE of 4". (Write the computation and the answer.)

First data set for BLACK
Let us think about the following computation for a given number. The answer to the computation is called "BLACK". For example, "BLACK of 2" is computed as follows:
(1) "BLACK of 2" is "BLACK of 1" multiplied by 2.
(2) "BLACK of 1" is 1.
"BLACK of 2" is 2.
Now, compute "BLACK of 4". (Write the computation and the answer.)

In each of the above data sets, two segments of information, (1) and (2), for the factorial of 2, and the value of it were supplied. A problem was provided at the last line, which was to compute the factorial of 4. If a subject gave the correct answer to the problem, then he was considered to have acquired a factorial-computing procedure, iterative or recursive, depending on which data set, WHITE or BLACK, was presented to him.

The second data sets for WHITE and BLACK included three segments of information, and were designed as shown below:

Second data set for WHITE
Let us think about the following computation for a given number. The answer to the computation is called "WHITE". For example, "WHITE of 3" is computed as follows:
(1) Start with 1.
(2) Multiply 1 by 2. The result is 2.
(3) Multiply 2 by 3. The result is 6.
"WHITE of 3" is 6.
Now, compute "WHITE of 5". (Write the computation and the answer.)

Second data set for BLACK
Let us think about the following computation for a given number. The answer to the computation is called "BLACK". For example, "BLACK of 3" is computed as follows:
(1) "BLACK of 3" is "BLACK of 2" multiplied by 3.
(2) "BLACK of 2" is "BLACK of 1" multiplied by 2.
(3) "BLACK of 1" is 1.
"BLACK of 3" is 6.
Now, compute "BLACK of 5". (Write the computation and the answer.)

The third data sets, each of which contained four segments of information, were defined in a similar manner.

The subjects were divided into two groups called G1 (n = 45) and G2 (n = 43). The group G1 was given the data in the order of W-1, W-2, W-3, B-1, B-2 and B-3, where W-i and B-i denote the i-th data set for WHITE and BLACK respectively. On the other hand, G2 was given the data in the order of B-1, B-2, B-3, W-1, W-2 and W-3. Both groups were given five minutes for each data set, which was ample for middleschool children. The data sheets were collected from the subjects for each data set, and no direct feedback of answers was given.

Results and discussion

The results are tabulated in Table 1. The more data sets presented, the greater the number of subjects who answered correctly, both for WHITE and BLACK. The percent correct was larger for G1's WHITE (60% for the third set) than for G2's BLACK (33% for the third set), but even the latter gave fairly good performance. Also, if the data for BLACK were presented after WHITE, as for the group G1, the performance was better than in the opposite order: G1 for BLACK gave 16%, 29% and 64% correct for the data sets with two, three and four segments of information, but G2 for BLACK provided 0%, 14% and 33%, which were relatively smaller. On the other hand, the performance for WHITE was similar for the two groups, regardless of the order of presentation.

The result is thus generally in agreement with our expectations. It was easier for the children to work on the iteratively generated data sets, but acquisition of the recursive procedure was facilitated by learning the iterative one.

Also, note that the WHITE data for G1 and G2 show a similar tendency, and the BLACK data for the two groups provide a different sort of similar tendency: the rate of increase of the percent correct decreased for the WHITE data with respect to the number of presented data sets, while it increased for the BLACK data. This particular trend may have reflected the subjects' relative difficulty in discovering regularity in a small number of information segments in a recursive data set.

Table 1
Percent correct (%) for the induction experiment

                        G1                G2
Data set no.       WHITE   BLACK     BLACK   WHITE
1                   11      16         0       9
2                   42      29        14      30
3                   60      64        33      47
No. of subjects         45                43

(For almost all the subjects, if a subject gave the correct answer for the i-th data set, he was also correct for all the j-th sets, where $i < j$.)

Thus, we think that recursive computation may be apparently difficult for children to learn, but also that it may be acquired by inducing the nested structure, and interpreting it as a procedure, based on the recursive data. Let us provide one possible mechanism that generates the gross characteristics of the experimental results, which is essentially similar to the one briefly described in the previous section. Suppose that the third data sets for WHITE and BLACK given in the experiment were represented as follows:

WHITE
(equal (times 1 2) 2)
(equal (times 2 3) 6)
(equal (times 6 4) 24)
(equal (white 4) 24)

BLACK
(equal (black 4) (times (black 3) 4))
(equal (black 3) (times (black 2) 3))
(equal (black 2) (times (black 1) 2))
(equal (black 1) 1).

Assume that by successively embedding the segments in the WHITE data set, we obtained the nested formula:

(equal (white 4) (times (times (times 1 2) 3) 4)).

Note that if we identify (times (times 1 2) 3) with (black 3), and also identify "white" with "black", then the formula matches the first segment in the above BLACK set:

(equal (black 4) (times (black 3) 4)).

This kind of correspondence holds also for the first and second data sets. Generalization at this point, which yields the correspondence between (times (times (... (times (times 1 2) 3)...) n-1) n) and (black n), provides the procedural basis for the recursive definition of the factorial function, which is based on nested arithmetic calculation.
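The correspondence between the nested WHITE formula and the recursive BLACK definition can be made concrete in a few lines. This is a minimal sketch under stated assumptions, not the authors' model: the function names white, black, nested_white and evaluate are hypothetical, and nested tuples stand in for the (times ...) lists shown above.

```python
def white(n):
    """Iterative procedure: start with 1, then multiply by 2, 3, ..., n."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def black(n):
    """Recursive procedure: BLACK of n is BLACK of n-1 multiplied by n."""
    if n == 1:
        return 1
    return black(n - 1) * n


def nested_white(n):
    """Embed the WHITE segments into one nested formula.

    nested_white(4) is ("times", ("times", ("times", 1, 2), 3), 4),
    i.e. (times (times (times 1 2) 3) 4) in the notation of the text.
    """
    expr = 1
    for k in range(2, n + 1):
        expr = ("times", expr, k)
    return expr


def evaluate(expr):
    """Compute the value of a nested (times left right) formula."""
    if isinstance(expr, int):
        return expr
    _, left, right = expr
    return evaluate(left) * evaluate(right)
```

For n greater than 1, nested_white(n) equals ("times", nested_white(n - 1), n), so identifying the inner formula with "BLACK of n-1" gives exactly the first segment of the BLACK data set; evaluating either form gives the same factorial.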
Discussion

The relation between conceptual and procedural understanding in problem solving has raised many issues that are complex but central for cognitive science. At some deeper level of understanding, a person can both handle knowledge procedurally and appreciate it declaratively. Recursion provides a simple example of this matter: since it is usually formulated in a compact form, its declarative representation may be simpler than the corresponding iterative form. But such a declarative representation must be accompanied by procedural knowledge for actual computation, and this knowledge might be cognitively complex. The argument presented in this paper suggests that such knowledge can be acquired not directly, but by working on iterative data.

An example of the process of learning a recursive strategy by discovering a nested structure in knowledge of results obtained by weaker, nonrecursive strategies was presented in Anzai & Simon (1979). The strategy acquisition process reported there is essentially similar to the recursion learning process discussed in this paper: the thesis shared by the two studies is that complex recursive procedures for solving a problem may be acquired by working on the problem, using already available, nonrecursive strategies.

Which way of learning, by discovery or by instruction, is better has long been a controversial problem in instructional psychology. Learning by doing, which is along the line discussed here and in Anzai & Simon, is basically a process of learning by discovery. In this regard, as suggested in this paper, recursive procedures may be learned by discovery. Recursive computation may be intrinsically more difficult than iterative computation, since execution of recursive procedures may require more memory resources. But this does not mean that they cannot be acquired by discovery.

However, of course we do not deny the possibility of learning recursive procedures by top-down instruction. The two ways of learning are actually complementary in the real world, and both ways may play important and intertwined roles. Also, we should be cautious when we try to extend the consideration to more complex domains such as computer programming, because a complex task necessarily involves many different cognitive subprocesses, and it is not always easy to extract from them only the part played by recursion.

Reference

Anzai, Y. & Simon, H. A. 1979 The theory of learning by doing. Psychological Review, 86, 124-140.
Prior Knowledge Occupies Cognitive Capacity in
Chess Problem Solving, Reading, and Thinking
By Bruce K. Britton and Abraham Tesser
Abstract
Prior knowledge was varied in problem solving, thinking, and reading tasks in three experiments. The hypothesis was that the prior knowledge used in a cognitive task uses capacity in the same limited capacity active processing system that is used to process the ongoing task. In a reading experiment, prior knowledge about a target page was manipulated by controlling the preceding pages. In an experiment dealing with problem solving in the context of a chess game, prior knowledge was controlled by comparing experts with novices. In a third study subjects thought about personality descriptions of persons and groups, and about women's fashions and football plays; it was assumed that persons have more prior knowledge concerning the personality of persons than the personality of groups, that women have more prior knowledge about women's fashions, and that men have more prior knowledge about football. In all experiments, use of cognitive capacity in task performance was observed with a secondary task technique.
The results of all three experiments were consistent with the hypothesis that prior knowledge uses capacity in the active processing system. The prior knowledge hypothesis is consistent with some aspects of current cognitive theory but not consistent with others. The results also suggest a fundamental and unexpected limit on the cognitive processing of experts.
Information processing theories of cognitive processing often assume that memories of prior experience are stored over the long term in a relatively inactive state. They also assume that the cognitive task that is undergoing processing at a particular time is being processed in an active processing system, which some models identify as a working memory or short term memory store. When stored prior knowledge is to be used in the performance of a particular cognitive task, the prior knowledge is brought from the inactive state into an active state. In this active state the prior knowledge can be effectively used in performing the ongoing cognitive task.
In the standard model (e.g., Atkinson & Shiffrin, 1968) this change of state of prior knowledge is usually represented in a flow chart as an arrow leading from a long term memory store (the inactive memory) to a short term or working memory (the active processing system). Other models of cognitive processing include a similar assumption; although the metaphor of a spatial transfer of information is not always used, some change in the state of activation of the prior knowledge is expressed with other metaphors.
The active processing system is widely believed to be limited in capacity (Broadbent, 1958, 1971; Navon & Gopher, 1979; Norman & Bobrow, 1975; Posner, 1978). If the active system is limited in capacity, then it is plausible to deduce that any prior knowledge that is active in it will use some of the limited capacity. This paper reports three tests of the hypothesis that the prior knowledge used in an ongoing task uses cognitive capacity in the same active processing system that is used to perform the ongoing task. This will be referred to as the prior knowledge hypothesis.
The prior knowledge hypothesis has not been included conventionally among the explicit assumptions of cognitive processing models. Perhaps this is because the standard and related models [...] a small limit on the capacity of short term memory, with estimates ranging from 2 chunks up to 20 (Lachman, Lachman & Butterfield, 1978). It appears that with even a 20 unit limit, a body of prior knowledge of a size or complexity that approached that limit (for example, the chess knowledge of an expert chess player), if transferred to a short term store, would occupy so much of it that little or no capacity would be left over for performing the ongoing cognitive task. The result would be error, delay or failure on the task. Cognitive psychologists may have believed that this outcome did not seem likely to occur, and so the prior knowledge hypothesis may not have seemed easily compatible with models that include a small limit on the capacity of the active processing system. Other cognitive models are less explicit about the capacity of the active processing system, so evidence that large bodies of activated prior knowledge use capacity would be less critical for them.
Because the hypothesis that prior knowledge uses capacity in the active processing system has not been prominent in cognitive theory, the consequences of it have not been thoroughly worked out, and some of them turn out to be interesting. One set of consequences is related to the use of cognitive capacity by persons who do or do not have prior knowledge about a particular cognitive task, i.e., experts and novices. The cognitive programs of experts and novices have been investigated by protocol analysis techniques (e.g., Ericsson & Simon, 1980), but these techniques do not provide data on capacity usage. In the present experiments the secondary task technique was used. This technique was designed to provide data on capacity usage. The prediction of the prior knowledge hypothesis is that experts will use more capacity than novices when they are performing cognitive tasks for which the experts have activated large amounts of prior knowledge. Apparently this prediction has not been tested previously. To test this prediction of the prior knowledge hypothesis, in two of the experiments reported here, 'experts' on chess, and on football, women's fashions and implicit personality theory were observed as they processed problems in their special topics and in topics in which they were not experts. Use of cognitive capacity was measured with a secondary task technique. In a third experiment, differences in prior knowledge about a text topic were induced in readers and the use of capacity was observed in reading later parts of the text.
Another interesting consequence of the prior knowledge hypothesis is that it suggests the existence of a potential limitation on the cognitive processing of experts. If an expert has an extremely large amount of activated prior knowledge for a particular task, the knowledge will presumably use a correspondingly large amount of capacity. If the prior knowledge uses enough capacity, the capacity available for the ongoing cognitive task will be reduced: this follows from the assumption of a limited capacity. A straightforward prediction is that the ongoing task will be performed more slowly by such an expert with a very large amount of prior knowledge than by a person with less prior knowledge (assuming the prior knowledge is adequate to perform the task). In extreme cases of prior knowledge, so much active capacity may be occupied that the expert may not be able to complete the task at all.
[...] knowledgeable experts to solve problems that are solvable in less time by somewhat less knowledgeable experts, (2) the decreases in scholarly productivity that are sometimes reported anecdotally when scholars reach extremely high levels of expert knowledge about their special subject, (3) the incubation effect in problem solving, in which problem solvers who take time off from a thoroughly studied problem, presumably allowing some prior knowledge to be deactivated, report that when they return to the problem, they have an increased chance of solution, (4) the reduction of usable cognitive capacity that may be associated with aging individuals, who presumably have large amounts of prior knowledge. A possible qualification of this extension of the prior knowledge hypothesis is that experts seem likely to be able to chunk their knowledge more efficiently than novices, and chunks would presumably occupy less capacity. But in a very high level chunk, the usable information may not be visible on the surface. In order to reach a level of information that actually can be used in the performance of the ongoing task, the chunk may have to be unpacked to the point where usable information is revealed (Estes, 1972; Johnson, 1972). The unpacking process may use additional capacity that the less expert can avoid. It should be noted that such extreme cases of prior knowledge were not included in the present studies. The levels of prior knowledge used in the present studies may be regarded as intermediate in size between the levels of novices and those of high level experts, and decreases in performance of the ongoing task were not expected.
It is well to state at the outset what conclusions can be drawn from the various possible outcomes of the tests proposed here. If the prior knowledge is not shown to use capacity, that is consistent with the hypothesis that the cognitive task is performed in one active system, and the prior knowledge is active in a quite different system that does not share capacity with the first. If prior knowledge is shown to use capacity, that is consistent with the hypothesis that both the cognitive task and the prior knowledge are using capacity in the same active processing system.
The results of all three experiments were that subjects took longer to react to secondary task probes in the high prior knowledge conditions. Thus, the results of these experiments were all consistent with the hypothesis that the prior knowledge that is used in an ongoing cognitive task occupies capacity in the same limited capacity system that is used to perform the cognitive task.
There are several aspects of the cognitive handling of prior knowledge that may make use of capacity. First, the retrieval of the bodies of knowledge from inactive memory may use capacity. The retrieval process presumably includes both search and decision components. Such a retrieval process may only occur once, at the beginning of the involvement of prior knowledge in the ongoing task, or it may be going on more or less continuously during performance of the task. Multiple retrievals would use capacity over a longer span of time than would a single retrieval episode.
Second, once a particular body of knowledge has been confidently located, its change of state from an inactive to an active status may use capacity. Third, once that activation has occurred, the maintenance of the activated state may be necessary, at least if the active state has rapid decay properties like those of conventional short term stores. The maintenance may be continuous, it may be periodic, as if the activation is regularly 'refreshed,' or it may be intermittent and dependent on the time course of use of the knowledge in the task. Fourth, the elements of the activated body of knowledge themselves are likely to occupy capacity, and the more extensive the knowledge is, the more elements it has, and the more capacity it can be expected to occupy.
Finally, the use of prior knowledge in the performance of the cognitive task may require additional cognitive operations that use capacity. These may involve the unpacking of chunks, searches through them, and decision processes associated with their use in the ongoing task. Or the prior knowledge may be in the form of programs of cognitive operations that are to be carried out as part of the cognitive task. Such programs enable additional operations, and these may use capacity.
The results reported here clarify the interpretation of some previous research on the use of cognitive capacity in reading. In a series of investigations of the influence of text characteristics on the use of cognitive capacity in reading, it was found that easy passages used more capacity than difficult ones (Britton, Westbrook, & Holdredge, 1978), where ease and difficulty were defined by cloze tests and ratings. This finding has been replicated (Britton, 1980; Britton, Zeigler, & Westbrook, 1980). It has been pointed out by Anderson and Armbuster (in press) that the easy passages used in those studies were about topics for which readers are "more apt to have available schemata or perspectives . . . than are those from the difficult passages." (p. 15). This interpretation is similar to the notion, based on the present results, that the readers had prior knowledge about the easy passages. The results of Britton, Graesser, Glynn, Hamilton, and Penland (in press) on genre differences can be interpreted along the same lines, as can the results of Britton, Westbrook, Holdredge and Curry (1979) that passages with more discourse level meaning (but identical to passages with less discourse level meaning) used more capacity.
Some limitations of these conclusions should be noted. First, they may only apply to complex bodies of prior knowledge, and probably not to isolated individual units. For such units, the retrieval, activation, maintenance and use of the knowledge may require so few cognitive operations that no observable capacity is used. Also, if the use of the prior knowledge is very highly practiced it may use less capacity (Shiffrin & Schneider, 1977; Schneider & Shiffrin, 1977).
Second, there appears to be a special case of combinations of prior knowledge and cognitive task for which prior knowledge will probably reduce use of capacity. These are tasks for which the completed solution of the task is already stored in memory and is easily accessible. For example, if the subject is asked to multiply 37 x 8, many mental operations will be carried out to arrive at the correct answer of 296. But if the subject is immediately asked again to multiply 37 x 8, the prior knowledge of the answer will be retrieved from memory, and the effect will be to reduce the number of mental operations and so the use of capacity.
Dynamic Construction of Finite Automata
From Examples Using Hill-Climbing
Masaru Tomita
Computer Science Department
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213

Abstract

The problem addressed in this paper is heuristically-guided learning of finite automata from examples. Given positive sample strings and negative sample strings, a finite automaton is generated and incrementally refined to accept all positive samples but no negative samples. This paper describes some experiments in applying hill-climbing to modify finite automata to accept a desired regular language. We show that many problems can be solved by this simple method.

1. Introduction

Consider the following problem:

Describe the property that all strings in the right-list have but no string in the wrong-list has. Does a string (1 1 0 1) have this property? You may answer the question by using any of the following: English, a regular expression, or a finite automaton.^

right-list                               wrong-list
()                                       (1 0)
(1)                                      (1 0 1)
(0)                                      (0 1 0)
(0 1)                                    (1 0 1 0)
(1 1)                                    (1 1 1 0)
(0 0)                                    (1 0 1 1)
(1 0 0)                                  (1 0 0 0 1)
(1 1 0)                                  (1 1 1 0 1 0)
(1 1 1)                                  (1 0 0 1 0 0 0)
(0 0 0)                                  (1 1 1 1 1 0 0 0)
(1 0 0 1 0 0)                            (0 1 1 1 0 0 1 1 0 1)
(1 1 0 0 0 0 0 1 1 1 0 0 0 0 1)          (1 1 0 1 1 1 0 0 1 1 0)
(1 1 1 1 0 1 1 0 0 0 1 0 0 1 1 1 0 0)

^ The answer is strings over (1 + 0)* without odd number of consecutive 0's AFTER odd number of consecutive 1's. Therefore (1 1 0 1) has the property.

It might be possible to construct the machine by a "typical" schema-filling method (i.e., finding a rough property in the samples first, comparing these strings carefully). However, in this paper, we try to construct the machine directly by searching in the problem space (i.e., a set of all finite automata) using hill-climbing, rather than by analyzing the samples carefully.

One of the biggest advantages of hill-climbing is its simplicity, that is, we do not have to know our problem space well, while a "typical" schema-filling method requires us to provide all possible schemas, and therefore to know everything about our problem space.

We shall see that hill-climbing works much better than expected in our problem space, and in fact solved most of the problems.

1.1. The finite automata used in this paper

We restrict our problem domain to be only over {1,0}*. Furthermore, since every non-deterministic finite automaton has an equivalent deterministic finite automaton (see [7]), we deal only with deterministic finite automata, that is, there is at most one 1-arrow and one 0-arrow from each state. Thus, in this paper, the terms "finite automaton", "automaton" or "machine" all mean "deterministic finite automaton". Given a string s, if there is a sequence of transitions on s from the initial state to any of the final states, then s is accepted by the machine, otherwise s is rejected. For example, the machine of the sample problem is shown in figure 1.

Figure 1: The machine of the sample problem

Each machine with n states is denoted by the following form:

((A_1, B_1, F_1)(A_2, B_2, F_2) ... (A_n, B_n, F_n)).

Each (A_i, B_i, F_i) corresponds to the state i, and A_i and B_i indicate the destination states of the 0-arrow and the 1-arrow from the state i, respectively. If A_i or B_i is zero, then there is no 0-arrow or 1-arrow from the state i, respectively. F_i indicates whether state i is one of the final states or not. If F_i is equal to 1, the state i is one of the final states. The initial state is always state 1. For instance, figure 1 is represented as follows:

((1 2 1)(3 1 1)(4 0 0)(3 4 1)).

1.2 The problem

We now are ready to describe the problem precisely. Given a right-list (a set of positive sample strings) and a wrong-list (a set of negative sample strings), we can think of the following three tasks:

1. To find a machine that accepts all strings in the right-list but none in the wrong-list.
2. To find a machine with n states that accepts all strings in the right-list but none in the wrong-list.
3. To find the machine with fewest states (simplest machine) that accepts all strings in the right-list but none in the wrong-list.

The first task is trivial because one can easily construct a trivial machine that accepts exactly all strings in the right-list but nothing else. We
call the second task construction of finite automata, and the third task simplification of finite automata.

1.3. Sample Problems

Throughout this paper, we consider the particular seven problems shown in figure 2.

Figure 2: Sample Problems (right-list and wrong-list for each of Problems 1 to 7)

The solutions of these problems are:

1. 1*
2. (1 0)*
3. any string without an odd number of consecutive 0's AFTER an odd number of consecutive 1's.
4. any string without more than 2 consecutive 0's.
5. any string of even length which, making pairs, has an odd number of (0 1) or (1 0)'s.
6. any string such that the difference between the numbers of 1's and 0's is 3n.
7. 0*1*0*1*.

We also consider the inverse problem of those in figure 2. The inverse problems are created by exchanging the right-list and the wrong-list. We use these 14 problems in our experiments and refer to the inverse problem of problem 1 as 1-.

2. Construction of Finite Automata

In this section, we describe an experiment in constructing a finite automaton with n states from a given right-list and a wrong-list, using hill-climbing. In particular, we let n equal 8. We shall see that each of the 14 problems can be solved in at most a few thousand steps.

2.1. Algorithm

The hill-climbing algorithm of this experiment is shown in figure 3.

Figure 3: Flowchart of the Hill-Climbing (boxes: M := random; M' := mutate(M); M := M')

We first construct a random machine with 8 states. We next make a copy of this machine, where the copy is slightly altered from the original by an operator mutate. We compare the new machine with the original by an evaluation function E. The better machine is called current generation and we make a copy of this machine, and so forth. The worse machine is simply discarded. The operator mutate and the evaluation function E are defined more precisely in the following.

Operator mutate: Taking a machine ((A_1, B_1, F_1) ... (A_8, B_8, F_8)) as its argument, the operator mutate chooses one digit randomly, and replaces it by another digit. That is, the mutation in our algorithm is randomly one of the following: delete an arrow, insert an arrow, change the destination of an arrow to another destination, make a non-final state a final state, and make a final state into a non-final state.
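The machine notation and the hill-climbing loop can be illustrated with a short Python sketch. It is not the program used in the experiments: the names FIGURE_1, RIGHT, WRONG, accepts, mutate, evaluate and hill_climb are hypothetical, RIGHT and WRONG hold only a few of the sample strings from section 1, and keeping a mutant whose score ties the current one is an assumption. The function evaluate follows the paper's evaluation function E, which returns r - w (right-list strings accepted minus wrong-list strings accepted) and never less than 0.

```python
import random

# A machine is a list of (A, B, F) triples, one per state, in the paper's
# notation: A and B are the targets of the 0-arrow and the 1-arrow
# (0 means "no arrow"), F is 1 for a final state. State 1 is initial.
FIGURE_1 = [(1, 2, 1), (3, 1, 1), (4, 0, 0), (3, 4, 1)]

# Illustrative subset of the sample problem's right-list and wrong-list.
RIGHT = [(1,), (0,), (0, 1), (1, 1), (0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
WRONG = [(1, 0), (1, 0, 1), (0, 1, 0), (1, 0, 1, 0), (1, 1, 1, 0), (1, 0, 1, 1)]


def accepts(machine, string):
    """Run the machine from state 1; a missing arrow rejects the string."""
    state = 1
    for symbol in string:
        a, b, _ = machine[state - 1]
        state = a if symbol == 0 else b
        if state == 0:
            return False
    return machine[state - 1][2] == 1


def mutate(machine, rng):
    """Return a copy in which one randomly chosen digit is replaced."""
    n = len(machine)
    i, k = rng.randrange(n), rng.randrange(3)
    allowed = [0, 1] if k == 2 else list(range(n + 1))
    new_digit = rng.choice([d for d in allowed if d != machine[i][k]])
    triple = list(machine[i])
    triple[k] = new_digit
    return machine[:i] + [tuple(triple)] + machine[i + 1:]


def evaluate(machine, right, wrong):
    """E = r - w, floored at 0."""
    r = sum(accepts(machine, s) for s in right)
    w = sum(accepts(machine, s) for s in wrong)
    return max(r - w, 0)


def hill_climb(right, wrong, n_states=8, max_steps=5000, rng=None):
    """Return (machine, steps); steps < max_steps means a solution was found."""
    rng = rng or random.Random()
    current = [(rng.randint(0, n_states), rng.randint(0, n_states), rng.randint(0, 1))
               for _ in range(n_states)]
    for step in range(max_steps):
        if evaluate(current, right, wrong) == len(right):
            return current, step
        candidate = mutate(current, rng)
        # Assumption: a mutant that ties the current machine is kept.
        if evaluate(candidate, right, wrong) >= evaluate(current, right, wrong):
            current = candidate
    return current, max_steps
```

The figure 1 machine accepts every string in RIGHT and none in WRONG, so evaluate(FIGURE_1, RIGHT, WRONG) equals len(RIGHT). A call such as hill_climb(RIGHT, WRONG, rng=random.Random(0)) returns the last current generation together with the number of steps taken.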
Evaluation Function E: The evaluation function E takes a machine as its argument and returns r - w, where r is the number of strings in the right-list accepted by the machine, and w is the number of strings in the wrong-list accepted by the machine. If r - w < 0 then it returns 0.

2.2. Result

We show in this section the result of our experiments. We first show in figure 4 the trace of the experiment of problem 3, to see how our algorithm gradually refines a random machine into the desired machine.

0 <= A_i <= 8; 0 <= B_i <= 8; and 0 <= F_i <= 1.

Figure 4: Sample Trace of Problem 3 (columns: current generation M, E, G)

Each line corresponds to the current generation M. The column E indicates E(M), and G indicates the cumulative number of steps. The final machine of this trace accepts all strings in the right-list but none in the wrong-list of problem 3 (figure 5).

Figure 5: The final machine of problem 3

We show the result of the 13 other problems in figure 6, only by their final machines.

Figure 6: The Result of Construction (final 8-state machine and number of steps for each of the 13 other problems)

2.3. Discussion

To see how effectively our algorithm has performed, we compare our method with an exhaustive search. There are (9 x 9 x 2)^8 = about 5 x 10^17 machines in our problem space. We now want to know the number of the desired machines in our problem space, so that we can calculate the expected number of steps until the exhaustive algorithm finds the first desired machine. This can be done by the following "sampling" method: take one machine in the problem space randomly, and test if this machine is the desired machine; repeat this procedure 100000 times.

We show the expected number of steps using the exhaustive search calculated by this procedure in figure 7. Although the exhaustive search works better on "easy" problems, it is obvious in general that our hill-climbing works much better than the exhaustive search.

Figure 7: The number of Steps to get the desired machine

Problem   Hill-Climbing   Exhaustive-Search
P1        98              33
P2        134             > 50000
P3        [illegible]     [illegible]
P4        [illegible]     [illegible]
P5        [illegible]     [illegible]
P6        277             50000
P7        208             90000
P1-       300             187
P2-       39              1862
P3-       1939            > 50000
P4-       246             > 50000
P5-       1844            > 50000
P6-       888             > 50000
P7-       3726            > 50000

3. Simplification of Finite Automata

In the previous section, we saw that our hill-climbing method successfully produced a machine that accepts all positive sample strings but no negative sample strings. However, the final machine of the result of problem 2, for example, does not accept exactly our desired regular set (1 0)*. For instance, it does accept a string (1 1 0 0), which is not in (1 0)*. We therefore want the machine to be "generalized" so that it accepts exactly (1 0)*. In fact, the final machines of all problems except problem 1, 3 and 7, need to be generalized.

We define the generality of a machine in terms of its simplicity. The simplicity of a machine is determined by the number of states the machine has, and if two machines have the same number of states, a machine with fewer arrows and final states is simpler.

Our task is to simplify the machines we have obtained in the previous section, so that the machines become the simplest or the most general.
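The simplicity ordering just defined can be written as a sort key, sketched here in Python. Two points are assumptions rather than statements from the paper: only states reachable from state 1 are counted as states of the machine, and "fewer arrows and final states" is read as a smaller sum of the two counts. The names reachable, simplicity, FIGURE_1 and TWO_STATE are hypothetical.

```python
def reachable(machine):
    """States reachable from the initial state 1 by following arrows."""
    seen, stack = {1}, [1]
    while stack:
        a, b, _ = machine[stack.pop() - 1]
        for target in (a, b):
            if target != 0 and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def simplicity(machine):
    """Smaller is simpler: (number of states, arrows + final states)."""
    states = reachable(machine)
    arrows = sum((machine[s - 1][0] != 0) + (machine[s - 1][1] != 0) for s in states)
    finals = sum(machine[s - 1][2] for s in states)
    return (len(states), arrows + finals)


# Machine of figure 1 and a hypothetical 2-state machine for (1 0)*.
FIGURE_1 = [(1, 2, 1), (3, 1, 1), (4, 0, 0), (3, 4, 1)]
TWO_STATE = [(0, 2, 1), (1, 0, 0)]

print(simplicity(FIGURE_1))   # (4, 10)
print(simplicity(TWO_STATE))  # (2, 3)
```

Because Python compares tuples element by element, simplicity(m1) < simplicity(m2) means that m1 has fewer states, or the same number of states and fewer arrows plus final states. Padding a machine with unreachable states leaves its key unchanged under the reachability assumption.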
We call this simplifying task simplification of finite automata, and it can also be done by using the hill-climbing method as in the previous section.

3.1. Algorithm

The algorithm of the simplification is essentially the same as the algorithm described in the previous section. The major differences are as follows: the evaluation function E(M) returns a higher value if the machine M is simpler; if M does not accept some strings in the right-list, or does accept some strings in the wrong-list, E(M) returns minus infinity; the algorithm starts with the result of the previous experiment instead of a random machine.

3.2. Result

The final machines of these experiments are shown in figure 8.

Figure 8: The Result of Simplification (final machine and number of steps for each problem; three entries are marked <NOT-SIMPLEST>)

3.3. Discussion

We compare our method with an exhaustive search. The exhaustive search generates all machines in the order of simplicity, and the first machine that accepts all strings in the right-list but none in the wrong-list is considered the simplest machine. Thus we can calculate the expected number of steps until the exhaustive search finds the desired machine.^ The result is shown in figure 9.^ The symbol "---" indicates that the algorithm fails to find the simplest machine. This can happen when the hill-climbing algorithm climbs a "local hill".

Figure 9: The Number of Steps to obtain the simplest machine

Problem   Hill-Climbing   Exhaustive
P1        98              4
P2        141             170
[remaining rows illegible; other legible entries in the Exhaustive column: 8524, 553933, 46593884]

^ Let n be the number of states of the desired simplest machine. Then the expected number of the steps S_n is: [...] where U_j is the number of all possible machines with j states, that is, [...]

^ The number of steps using hill-climbing in this figure is the sum of the number of steps to construct the 8 state machine and the number of steps to simplify it into the simplest machine.

4. Concluding Remark

A new approach to construction of finite automata from given examples has been shown to work successfully, although it could not find the simplest machines for some problems. To avoid climbing a "local hill", it might be possible to apply adaptive search ([6], [2]) instead of our simple hill-climbing.

Although our problem domain has been regular languages, we might be able to extend it to context-free languages by constructing Push-Down automata (finite automata with stack, see [7]) using a similar method.

Acknowledgements

I would like to thank Herbert A. Simon and Jaime Carbonell for supervising this work; Masakazu Nakanishi, Yuichiro Anzai, Pat Langley and Takeo Kanade for thoughtful comments on an earlier version of this work; and Cynthia Hibbard for helping to produce this document.

References

[1] Biermann, A.W. and Feldman, J.A. On the Synthesis of Finite-State Acceptors. AI Memo 114, Stanford University, April, 1970.
[2] Cavicchio, D.J. Adaptive Search Using Simulated Evolution. PhD thesis, University of Michigan, 1970.
[3] Feldman, J.A. First Thoughts on Grammatical Inference. AI Memo 55, Stanford University, August, 1967.
[4] Feldman, J.A.; Gips, J.; Horning, J.J.; and Reder, S. Grammatical Complexity and Inference. AI Memo CS125, Stanford University, June, 1969.
[5] Fogel, L.J.; Owens, A.J.; and Walsh, M.J. Artificial Intelligence Through Simulated Evolution. Wiley, New York, 1966.
[6] Holland, J.H. Adaptation in Natural and Artificial Systems. The University of Michigan Press, 1975.
[7] Hopcroft, J.E. and Ullman, J.D. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
[8] Lindsay, R.K. Artificial Evolution of Intelligence. Contemporary Psychology 13(3), March, 1968.
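The exhaustive-search figures of sections 2.3 and 3.3 can be checked with a few lines of Python. The count of machines with j states follows from the notation of section 1.1 (each state has A and B in 0..j and F in 0..1). The two equations in the footnote to section 3.3 are missing from this copy, so expected_exhaustive_steps below is an assumed reconstruction and not the paper's formula: it counts all machines with fewer than n states, plus half of the n-state machines divided by the (n - 1)! renamings of the non-initial states. It was chosen only because it reproduces the legible Exhaustive entries 4, 170, 8524, 553933 and 46593884 of figure 9. The function names are hypothetical.

```python
from math import factorial


def machines_with(j):
    """Number of j-state machines: A and B range over 0..j, F over 0..1."""
    return (2 * (j + 1) ** 2) ** j


def expected_exhaustive_steps(n):
    """Assumed reconstruction of S_n (see the caveat above)."""
    smaller = sum(machines_with(j) for j in range(1, n))
    return smaller + machines_with(n) / (2 * factorial(n - 1))


# Size of the 8-state search space of section 2.3, (9 x 9 x 2)^8.
print(f"{machines_with(8):.2e}")

for n in range(1, 6):
    print(n, round(expected_exhaustive_steps(n)))
```

Under this reading, the five values printed for n = 1 to 5 are 4, 170, 8524, 553933 and 46593884, and the 8-state space has about 4.7 x 10^17 machines, in line with the "about 5 x 10^17" quoted in section 2.3.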

Brian J. Reiser
John B. Black
Robert P. Abelson

Cognitive Science Program

Yale University

An important aspect of both comprehension and learning is the utilization of one's own past experiences to understand a current situation. In fact, being reminded of an experience often occurs in the process of retrieving generalizations from memory, suggesting that memories of personal experiences should be encoded in terms of the generic knowledge structures that are utilized in comprehension. Retrieval of these memories should therefore reflect the organization of generic knowledge (Schank, 1982). This paper explores the use of one such knowledge structure in the recall of past experiences.

Schank (1982) proposed that Memory Organization Packets (MOPs) represent knowledge about common activities. A MOP is represented as a sequence of generalized scenes, each of which consists of actions to accomplish a subgoal of the activity. For example, the RESTAURANT MOP would contain the scenes Being-seated, Ordering, Eating, and Paying. Generalized scenes can be referenced by more than one MOP. The generalized Paying scene contains the information that is true of paying in general, regardless of context. Each MOP consists of the generalized scenes that occur in that context, augmented by context-specific knowledge, a specification of how those scenes are modified (colored) for the particular situation. Each of the MOPs that refer to the Paying scene (e.g., MOVIE, GROCERY-STORE, RESTAURANT) must contain the information necessary to construct a specific colored version of that scene.

An experience typically contains many differences from the generalizations stored in generic knowledge structures. Schank (1982) argued that these deviations connect the contextualizing knowledge structure and memory for the individual experience. The connection serves as a retrieval index for the experience (Kolodner, 1980; Schank, 1982).

We propose that retrieval of an experience involves two types of processing: (1) Establishing the context: The context necessary for retrieval will be provided by the specific knowledge structures that were utilized to guide behavior in the experience. (2) Finding an index: A retrieval index describing the deviation from the generic structure provides a link to an individual experience. For example, the concept "restaurant" plus the index "I ate too much lasagna and felt sick" might retrieve a particular restaurant experience.

The importance of a search context has been suggested by previous researchers (Norman & Bobrow, 1979; Williams & Hollan, 1981), but it is necessary to examine whether there are any functional differences between classes of knowledge structures in memory retrieval (Reiser & Black, 1982). Our hypothesis is that establishment of a MOP as the context will figure more importantly in the search process than other types of structures, such as generalized scenes. The unique aspects of adults' experiences are more likely to be deviations from context-specific knowledge (specified by a MOP) than from the more abstract knowledge represented in generalized scenes. Furthermore, retrieval of even those experiences which are stored as scene-deviations will require the utilization of a MOP to reconstruct the context-specific aspects of the experience. For example, one might remember not being able to find the right credit card while paying at a cash register, but initially fail to recall where the incident occurred, what was being paid for, etc. If a context such as DEPARTMENT-STORE or RESTAURANT could be retrieved, it would provide cues for reconstructing other aspects of the experience. Our view may be contrasted with the position that experiences are stored as arbitrary associations between concepts in networks, with no functional differences between different types of concepts in memory retrieval.

We examined the roles of MOPs and generalized scenes in memory retrieval in two autobiographical memory experiments. If it is generally necessary to retrieve a MOP structure to access a memory, then retrieval cues which do not specify a MOP should be inferior. If one is asked to remember a restaurant-paying experience, retrieval would be more efficient if the processing begins with the RESTAURANT MOP, rather than the generalized Paying scene. In addition, specification of the MOP containing a scene should lead to faster retrieval than specification solely of the scene.

Experiment 1

Subjects saw a pair of phrases separated by a 5 second delay, then recalled a personal experience that fit the two phrases. One of the phrases named a MOP, and the other phrase referred to a scene; the order of presentation of the phrases was varied. The MOP cue named a common activity (took a ride on a train, went out drinking). The scene cue described an action sequence that could occur in a number of different contexts. Two types of Scene phrases were used. Regular Scene cues described actions that are a normative component of an activity (picked out what you wanted, paid at the cash register), while Failure Scene cues described the failure of some goal of a scene (didn't get what you asked for, couldn't find a seat). All scene cues were carefully worded so as not to reveal any particular context.

Forty MOP and scene combinations were constructed
from twenty MOP, ten Failure Scene, and ten Regular Scene phrases. Each MOP was paired with both a Regular Scene and a Failure Scene cue; and each scene was paired with two MOPs:

1a. MOP + Failure Scene: went out drinking; didn't get what you asked for
1b. MOP + Regular Scene: went out drinking; paid at the cash register
2a. MOP + Failure Scene: had your hair cut; didn't get what you asked for
2b. MOP + Regular Scene: had your hair cut; paid at the cash register

Each subject received ten combinations involving each type of scene cue, so that the MOP phrase was presented first for half of the trials for each type of combination. Each MOP and scene were used only once for a given subject. (For example, a subject received items 1a and 2b, or items 1b and 2a.)

Subjects were instructed to recall an experience that fit the combination of the two phrases presented on each trial, and indicate whether they could remember such an experience by pressing either the Yes or No key. We emphasized that the memory be a specific experience, but that it was not necessary to recall all of the details of the experience before responding. After each Yes response, subjects wrote a brief description of the experience. Retrieval times were measured from the presentation of the second phrase until the button press.

Table 1 presents the mean retrieval times for the Yes responses for 32 Yale undergraduates. Subjects recalled experiences more quickly when the MOP cue appeared first [min F'(1,U) = 7.98, p < .01]. Secondly, Regular Scene trials yielded faster retrieval times than Failure Scene trials [min F'(1,45) = 9.48, p < .05]. The order of presentation equally affected the two scene types [interaction F < 1].

                        MOP First   Scene First   Mean
MOP + Regular Scene       4.203        6.492      5.348
MOP + Failure Scene       5.986        8.394      7.120
Mean                      5.094        7.443      6.269

Table 1: Retrieval Times (in seconds) for Exp. 1

The faster retrieval times when the MOP cue was presented first confirm the prediction that a MOP structure provides the context necessary to retrieve an experience. When the scene cue appears first, extra processing is required to reconstruct a MOP context, slowing retrieval. An alternative explanation is that when the scene cue is first, an episode is retrieved, but it may not match the MOP that is presented later. In contrast, when the MOP is first and a memory is retrieved, it is much more likely to match the scene cue. Hence, the scene first trials would be slower, because sometimes the retrieved episodes must be discarded and memory search resumed. However, this alternative explanation fails to account for the Failure Scene results. It assumes that memories retrieved with MOPs are likely to fit the scenes, while memories retrieved with scenes may not fit the MOPs. This is true for the Regular Scenes, since restaurant experiences typically contain a Paying scene, but paying is experienced in contexts other than restaurants. However, this is not true for Failure Scenes, since an episode retrieved from a MOP cue would not be particularly likely to fit the given Failure Scene description. Thus, the results are better explained by a model in which retrieval of the MOP is an essential stage in remembering an individual experience.

Since the MOP provides the context for retrieval, the scene cue provides a constraint on the use of the experiences that are stored with the MOP. Each MOP contains a pool of available indices that specify very salient experiences in that context. Subjects search that pool of indices to discover whether any of those experiences could fit the scene cue. For the Regular Scene trials, the subject is relatively free in drawing from this pool of indices: one must be sure only that the experience that is retrieved can be reconstructed to include the necessary scene. However, when a Failure Scene is presented, the use of available indices is severely constrained, since an index must be found that retrieves an experience containing the particular type of goal-failure that is described in the scene cue. This requires careful consideration of the pool of indices, and perhaps some inferencing about the reasons that such a goal failure would arise, thus adding extra processing to the memory retrieval. Therefore, subjects are slower to remember an experience for those trials involving Failure Scene cues.

Experiment 2

If constraining the target experience to a particular MOP context facilitates retrieval of an experience, then subjects should find it easier to remember an experience when given both a MOP and a scene (presented simultaneously) than when presented with a scene alone. However, if activation of a context is a simple matter of retrieving associations of a scene, then there should be little difference between presentation of a MOP and scene combination and the scene in isolation.

The facilitative nature of the MOP was tested in a second experiment by comparing retrieval times for three types of cues: (1) Scene alone, (2) MOP alone, (3) MOP + Scene combination. All MOP + Scene combinations from Experiment 1 were used; in addition, each MOP and each scene phrase was presented alone. Each subject received 10 trials of each cue type. (These trials were blocked by condition, to guard against the MOP of one trial facilitating the scene of the next trial.) The instructions differed slightly from Experiment 1. Subjects were told to recall an experience that fit the presented description consisting of one or two phrases. Since the materials in the three conditions necessarily differed in length, both reading and response times were collected for each trial. Subjects first indicated when they had read the cue, and then responded to indicate whether they remembered an experience that fit the cue. Retrieval times were measured from the subject's reading time button press until the memory retrieval response.

Table 2 presents the mean retrieval times for Yes responses in the three conditions for 36 Yale undergraduates. As predicted, subjects were able to
retrieve an experience more quickly when both a MOP and scene were presented, than when the scene was presented alone [min F'(1,42) = 3.53, p < .10; F(1,35) = 8.43, p < .01 for subjects; F(1,18) = 8.08, p < .05 for items]. Subjects were faster to respond to Regular than Failure Scenes, but this difference was only marginally significant [F(1,35) = 3.08, p < .10 for subjects; ns for items].

Scene Type        Scene Alone   MOP + Scene   MOP Alone
Regular Scene        5.296         3.383
Failure Scene        5.292         4.307
Mean                 5.294         3.846        2.154

Table 2: Retrieval Times (in seconds) for Exp. 2

Since the MOP provides a better search context than the generalized scene, the combination is a better retrieval cue than the scene alone. Subjects are slower to respond to the combinations than to the MOPs alone, because the scene cue provides an extra constraint on the use of the indices that are stored with the MOP. The subject must be sure that the recalled experience includes the specified scene of the MOP when given a MOP + Scene combination, but any of the indices may be used when given the MOP alone.

Conclusions

The different structures we have discussed may be considered in terms of the amount of constraint they place on the search space, i.e., the set of experiences potentially satisfying the cue. A MOP constrains the set more than a generalized scene, since the scene can occur in multiple contexts. A MOP is somewhat less constraining than a MOP + Scene combination, since the combination specifies a particular segment of the event sequence. In addition, Failure Scenes are more constraining than Regular Scenes, since they specify a particular type of occurrence within a given scene.

Our results suggest that a MOP constitutes the optimal level of specificity for a memory cue. Generalized scenes are not constrained enough, since they become better cues when combined with a MOP, and the scene slows retrieval when presented before the MOP. Once a MOP has been accessed, constraints on the use of indices may increase retrieval time, since the most accessible indices may not retrieve experiences that satisfy the given cue. Thus, subjects are slower to remember an experience that satisfies a Failure Scene cue than a Regular Scene cue, and are slower to recall an experience that satisfies both a MOP and a scene cue than one that satisfies only the MOP cue.

In summary, we have argued that knowledge structures may be functionally distinguished by their effectiveness in providing a search context. Accessing a MOP is an essential part of retrieving a past experience from memory, since it provides an optimal search context, and can generate context-specific indices to retrieve memories stored with a scene. Specifying the activity type by naming a MOP is facilitative, but constraining the type of experience that occurred in that context may require extra processing to generate appropriate indices. We suggest that research on the use of memory in naturalistic tasks should focus on considerations of how the content of a generic memory structure is utilized to find and reconstruct a memory for a specific experience.

References

Kolodner, J. L. Retrieval and organizational strategies in conceptual memory: A computer model. Technical Report 187, Department of Computer Science, Yale University, 1980.

Norman, D. A., & Bobrow, D. G. Descriptions: An intermediate stage in memory retrieval. Cognitive Psychology, 1979, 11, 107-123.

Reiser, B. J., & Black, J. B. Processing and structural models of comprehension. Text, 1982, in press.

Schank, R. C. Dynamic memory: A theory of reminding and learning in computers and people. Cambridge, MA: Cambridge University Press, 1982, in press.

Williams, M. D., & Hollan, J. D. The process of retrieval from very-long term memory. Cognitive Science, 1981, 5, 87-119.
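
The cue-constraint argument in the Reiser, Black, and Abelson paper above (a generalized scene spans several contexts, a MOP narrows the search to one context, and a Failure Scene narrows it further) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' model or any implemented system: the class names, the `search_space` function, the GROCERY-STORE and MOVIE scene lists, and every stored experience except the lasagna and credit-card examples are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Experience:
    """One remembered episode, indexed by how it deviated from the norm."""
    description: str
    scene: str      # scene in which the deviation occurred
    failure: bool   # True if the deviation was a goal failure


@dataclass
class MOP:
    """An activity context: generalized scene names plus the experiences
    indexed under that context (layout assumed for illustration)."""
    name: str
    scenes: List[str]
    experiences: List[Experience] = field(default_factory=list)


def search_space(mops: List[MOP], mop: Optional[str] = None,
                 scene: Optional[str] = None,
                 failure: bool = False) -> List[Experience]:
    """Return the experiences that could satisfy a cue.

    A scene cue without a failure only requires that the context contains
    that scene; a failure cue requires an index recording a goal failure
    in that scene.
    """
    found = []
    for m in mops:
        if mop is not None and m.name != mop:
            continue
        if scene is not None and scene not in m.scenes:
            continue
        for e in m.experiences:
            if failure and not (e.failure and e.scene == scene):
                continue
            found.append(e)
    return found


mops = [
    MOP("RESTAURANT", ["Being-seated", "Ordering", "Eating", "Paying"], [
        Experience("ate too much lasagna and felt sick", "Eating", False),
        Experience("couldn't find the right credit card", "Paying", True),
        Experience("was brought the wrong dish", "Ordering", True),  # hypothetical
    ]),
    MOP("GROCERY-STORE", ["Shopping", "Paying"], [                    # hypothetical scenes
        Experience("dropped a jar in the aisle", "Shopping", False),
        Experience("forgot my wallet at the register", "Paying", True),
    ]),
    MOP("MOVIE", ["Paying", "Watching"], [                            # hypothetical scenes
        Experience("the film broke halfway through", "Watching", True),
    ]),
]

print(len(search_space(mops, scene="Paying")))                    # scene alone
print(len(search_space(mops, mop="RESTAURANT")))                  # MOP alone
print(len(search_space(mops, mop="RESTAURANT", scene="Paying")))  # MOP + Regular Scene
print(len(search_space(mops, mop="RESTAURANT", scene="Paying", failure=True)))  # MOP + Failure Scene
```

The sketch models only the size of the candidate set, not retrieval time. In the reported data the most constrained cue is not the fastest: MOP-alone cues were answered more quickly than MOP + Scene cues, which the paper attributes to the extra processing needed to find an index that satisfies the added constraint.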

Personal Memory, Generic Memory, and Skill: A Re-Analysis of the Episodic-Semantic Distinction
William F. Brewer
Department of Psychology
University of Illinois
603 E. Daniel Street
Champaign, Illinois 61820
The purpose of this paper is to propose that human memory must be analyzed into three basic types: personal memory, generic memory, and skills. This analysis will only deal with productive memory systems and so will not cover recognition memory. After the classification is presented, it will be used as a framework to examine the initial work of Ebbinghaus (1885) and the episodic-semantic distinction proposed by Tulving (1972).

In order to make the distinction between the three types of memory clear, consider the following example: An undergraduate goes to the psychology building for a psychology experiment. He finds his way to the correct room, hesitates a minute, knocks on the door, and goes inside. He sees the experimenter and a memory drum in a small bare room. After some preliminary instructions, he is given a number of trials on a long paired-associate list. One of the items on the list is the pair DAX-FRIGID. After the experiment is over he breathes a sigh of relief and leaves the experimental room. This one event can be used to illustrate the three types of memory:

Personal memory. If, the next day, the undergraduate were asked, "Do you remember the psychology experiment you were in yesterday?" he might say something like: "Sure, I remember walking down to the room from the elevator. I remember feeling nervous as I stood there in front of the door. I remember opening the door and seeing the experimenter standing behind the table. I remember being surprised she was a woman. She had a white laboratory coat on, etc." If he were asked, "Was anything going through your mind while you were telling me all this?" the undergraduate might say something like "Yes, I was seeing in my mind's eye much of what I told you. I could see the door, the expression on the experimenter's face when I opened the door, etc." It is this type of memory that will be called personal memory in this paper.

Generic memory. If, some months later, the undergraduate were asked, "Do you remember that you were in a verbal-learning experiment several months ago?" he might say, "Yes." If asked, "Was anything going through your mind while you were giving me this answer?" he might say, "No, I just knew that I had been in the experiment. There were four experiments required for the course: two were filling out social psychology questionnaires, one was a perception experiment, and the other one was the verbal-learning experiment." This is an example of the type of memory that will be called generic memory.

Skill. If, some days later, the undergraduate were asked, "When I give you a nonsense syllable you tell me what word followed. DAX?", he will probably say "FRIGID." If asked, "Was anything going through your mind when you gave the answer?" he might say, "No, I had practiced the list so many times I just knew what the response was." This is an example of rote memory, one type of skill.

This example was intended to provide an intuitive understanding of the distinction between the three types of memory. The next section attempts to give a general description of each type. This approach to human memory is an attempt to give a psychological version of the relevant philosophical works on memory in the last 70 years (Bergson, 1911; Russell, 1921; Furlong, 1951; von Leyden, 1961; Malcolm, 1963; Locke, 1971).

Personal memory. A personal memory is a recollection of a particular episode in the past of an individual. Personal memory is (always?) experienced in terms of some type of mental imagery, predominantly visual. It usually also includes nonimaginal information. The image is experienced as the representation of a particular time and location. The personal memory episode is accompanied by a propositional attitude that 'this occurred in the past' and is accompanied by a belief that the remembered episode was personally experienced by the individual. A personal memory is also frequently accompanied by a belief that it is a veridical record of the past episode. Personal memory statements frequently fit the linguistic frame: "I remember X." Thus, in the above example: "I remember the expression on the experimenter's face."

Generic memory. A generic memory is the recall of some item of general knowledge. Generic memory is not experienced as having occurred at a particular time and location and is not accompanied by a belief that the information was personally experienced by the individual. Generic memory statements frequently fit the linguistic frame: "I remember that X." Thus, in the earlier example: "I remember that I was in a verbal learning experiment." Semantic memory is the subclass of generic memory which involves the memory for abstract propositional information, for example: 'good is the opposite of bad' or 'the speed of light is a constant.' The operation of semantic memory does not typically carry along with it an experience of mental imagery. Thus when asked, "What is the opposite of good?" the correct answer is given without report of any mental imagery. Perceptual memory is the subclass of generic memory which involves the memory for perceptual information, for example: a map of the United States or the Statue of Liberty. The operation of generic perceptual memory does typically involve mental imagery. Thus, if asked, "Is Oklahoma to the south of Kansas?" or "Which hand of the Statue of Liberty holds the torch?", most individuals will report a "generic" mental image. These generic images are not typically experienced as involving a particular time and location. The similarities and differences between a generic perceptual memory and a personal memory can be examined by the following exercise. Recall the center of your university campus (i.e., form a mental map); now recall your most recent walk across that campus. The first is a generic perceptual memory; the second is a personal memory.

Skill. A skill is the ability to perform a given sequence of motor or cognitive actions. A practiced skill is typically not accompanied by mental imagery. There are a number of subtypes of skill that need to be distinguished. Motor skills refer to the ability to carry out a sequence of motor actions. This type of memory underlies the ability to ride a bike or hit a tennis ball. Rote skills refer to the ability to repeat a sequence of linguistic objects. This type of memory underlies the ability to repeat the alphabet or give one's social security number. Cognitive skills refer to the ability to carry out some sequence of cognitive operations. This type of memory underlies the ability to take the square root of a number or to make the verb agree in number with the subject in a spoken sentence. Many statements involving skills fit the linguistic frame: "I remember how to do X." Thus, "I remember how to ride a bike, how to say the alphabet, how to take a square root." In the next section of the paper the framework developed above is used.

Ebbinghaus. Ebbinghaus' 1885 monograph showed that it was possible to carry out experiments on human memory. However, in addition to this powerful achievement his work also served to limit the experimental investigation of memory to a particular subclass of memory, that of skill. In the initial pages of the 1885 monograph Ebbinghaus contrasts personal memory with skills. He apparently chose to focus on skill memory for methodological reasons (i.e., no need to use introspective data). In fact, within the area of rote skills, he chose the savings method over the recall procedure because he felt there might still be an important phenomenal component to recall tasks, whereas with the savings method he would just be comparing (behavioral) performance measures. This initial methodological decision by Ebbinghaus had an enormous impact on psychology: for 85 years in psychology the study of memory was the study of rote skills.

Tulving. In the late 1960's a few psychologists were able to break out of the Ebbinghaus focus on skills and began to carry out experiments on semantic memory (e.g., Collins & Quillian, 1969). In a seminal paper Tulving (1972) pointed out the fundamental difference in this type of experiment and formulated the distinction between semantic memory and episodic memory. The definition of semantic memory outlined above essentially follows Tulving's usage. However, Tulving's restriction of this type of memory to linguistic knowledge seemed too narrow, so I adopted the term generic memory for the larger class and the term semantic memory for the propositional subclass (see Hintzman, 1978, and Schonfield & Stones, 1979, for similar arguments).

The construct of episodic memory, as used by Tulving, is harder to deal with. When it is defined in abstract terms, it seems close to personal memory as outlined above. Thus, Tulving states that episodic memory "stores information about temporally dated episodes or events and temporal-spatial relations among these events" (p. 385) and proposes that statements from episodic memory refer to "a personal experience that is remembered in its temporal-spatial relation to other such experiences" (p. 387). However, the examples given by Tulving suggest that things are not that simple. Thus, one of the 4 examples of episodic memory was the statement, "Last year, while on my summer vacation, I met a retired sea captain who knew more jokes than any other person I have ever met" (p. 386). Taken at face value this appears to be an example of generic memory as the term has been used in this paper. A clear example of a personal memory would have been a statement such as, "I remember sitting on the stool at the bar, drinking a hot toddy while he told the traveling sailor joke, etc." One of the other examples suggests a more fundamental difficulty. "I know ... the word that was ... list ..." [the rest of this passage is illegible in the source] ... example of generic memory ("I remember that DAX was the word paired with FRIGID") or an example of a rote skill (given DAX the subject says "FRIGID"). The latter interpretation is supported by Tulving's statement that the typical memory experiment in psychology is an episodic memory task (p. 390). Thus, the term episodic memory as used by Tulving apparently includes personal memory, plus semantic memories about autobiographical information, plus skills. In sum, the analysis presented here suggests that the distinction between semantic and episodic memory be replaced by the more analytic distinction between personal memory, generic memory, and skill.

Research on personal memory. The classification of memory into three basic types has powerful implications for empirical research. It is clear that the important topic of personal memory has been little studied by experimental psychologists (probably because of the residual restrictions left by Behaviorism). At Illinois we are currently trying to ask some of the relevant questions: What are the basic parameters of personal memory? (Brewer, in preparation) Are personal memories veridical? reconstructed? (Brewer, in preparation) How are generic memories derived from personal memories? (Brewer & Dupree, in preparation) What are the phenomenal properties associated with the different types of memory? (Brewer & Pani, in progress).

References

Bergson, H. Matter and memory. London: Allen & Unwin, 1911.

Brewer, W.F. Autobiographical memory. In preparation.

Brewer, W.F., & Dupree, D.A. Memory for episodic and generic information. In preparation.

Brewer, W.F., & Pani, J.R. Phenomenal reports in tasks involving personal memory, generic memory and skills. In progress.

Collins, A.M., & Quillian, M.R. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 240-247.

Ebbinghaus, H. Memory. New York: Dover, 1964 (original German edition 1885).

Furlong, E.J. A study in memory. London: Thomas Nelson, 1951.

Hintzman, D.L. The psychology of learning and memory. San Francisco: Freeman, 1978.

von Leyden, W. Remembering. New York: Philosophical Library, 1961.

Locke, D. Memory. Garden City, NY: Anchor Books, 1971.

Malcolm, N. Knowledge and certainty. Englewood Cliffs, NJ: Prentice-Hall, 1963.

Russell, B. The analysis of mind. London: Allen & Unwin, 1921.

Schonfield, D., & Stones, M.J. Remembering and aging. In J.F. Kihlstrom and J.E. Frederick (Eds.), Functional disorders of memory. Hillsdale, NJ: Erlbaum, 1979.

Tulving, E. Episodic and semantic memory. In E. Tulving and W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.

Temporal Judgments about Natural Events
Norman R. Brown
Lance J. Rips
Steven C. Shevell
University of Chicago

The information one remembers about the time of an event is rarely as precise as one would like. For a few consequential events, exact dates can sometimes be recalled; for example, one might remember that John Kennedy's assassination took place on November 22, 1963 or that Pearl Harbor was attacked on December 7, 1941. But aside from these blockbuster events and from recurrent events like birthdays and holidays, exact and explicit dates are usually unavailable. Even fairly important events, such as Spiro Agnew's resignation or the DC-10 crash in Chicago, which could hardly have escaped our notice at the time of their occurrence, now are difficult to date accurately. Things could be otherwise. Events could be logged in memory in the way they are recorded in almanacs, and in this case determining when an event occurred would amount to simple table lookup. But since access to specific remembered dates is uncommon for ordinary events, it is of interest to examine the more indirect means that people use in reckoning how long ago such events happened.

With a few brave exceptions (e.g., Linton, 1975), previous research on temporal memory has been limited to the study of short intervals (on the order of minutes or hours) and to brief events (usually words or syllables) presented to the subject in the laboratory. Examples are the "time perception" experiments of Fraisse (1963) and Ornstein (1969), and the literature on recency judgments in list learning (e.g., Hacker, 1980). Our investigation focuses on people's accuracy in dating natural events over longer intervals. Like the earlier research, however, we employ experimental methods to test individuals' memory for such facts. In this respect, our studies parallel many current investigations of spatial [...] and cognitive maps.

The Accessibility Principle

Consider an event such as the Chicago crash, for which no exact date is retrievable. How could one go about estimating its relative time of occurrence? One possibility is based on the obvious fact that, generally speaking, the longer an event is retained in memory, the less one can remember about it. Thus, given events that are equivalent in other respects, the event about which one remembers most is likely to be the one that happened most recently. We call this rule the "Accessibility Principle," since it asserts that the more accessible the information about an event, the more recent that event will seem. Of course, this principle is hardly foolproof. Factors like the initial salience of an event or its similarity to other events can influence the amount of information retained about it, beyond any effect of sheer passage of time. There is even evidence that, under certain conditions, recallable information can actually increase with delay (Erdelyi & Kleinbard, 1978). Nevertheless, the Accessibility Principle may still be useful as a rough guide to the time of an event, even though subject to error from variables like salience (as we demonstrate below).

We view the Accessibility Principle as a close kin to the Lack of Knowledge Inferences described by Collins (1978) and to the Availability Heuristic of Tversky and Kahneman (1973). The difference is that while Lack of Knowledge and Availability are used to draw conclusions about frequency or probability, the Accessibility Principle yields conclusions about the age of unique events. In the former case, one reasons that since one can't remember the event well, it probably happened infrequently or not at all. In the latter case, one reasons that since one can't remember the event well, it probably happened long ago.

Subjective Age of Paired Events of the 1970's

A straightforward prediction of the Accessibility Principle is that events that are retrospectively vivid and memorable should seem more recent than events that are not (other things being equal). Consider, for example, the DC-10 crash in Chicago and the DC-10 crash in Antarctica of about the same period. Since the DC-10 crash in Chicago is comparatively more memorable than the one in Antarctica, the Chicago crash should be judged more recent, even though, in point of fact, it happened six months earlier (May 25, 1979 vs. November 28, 1979).

We tested this prediction in an experiment using 19 pairs of events like the two DC-10 crashes that were matched as closely as possible for actual time of occurrence and for the content of the events themselves. The pairs included sports and cultural events (e.g., Saul Bellow wins the Nobel Prize vs. Burton Richter wins the Nobel Prize) as well as standard news stories, all of which occurred between 1973 and 1980. Within each pair, one of the events was designated as more memorable than the other on the basis of ratings collected from two judges, neither of whom was aware of the hypothesis under investigation. A complete list of the pairs, together with their true dates and memorability status, is given in Table 1. In the experiment proper, the 38 individual events were read to subjects in random order, and the subjects were asked to respond to each with a number that best represented how recently the event happened. The numbers were chosen from a 0-to-9 scale, with high values corresponding to recent events and low values to old ones. We informed subjects before the start of the experiment that all of the events took place after 1970. Since the 15 subjects were of college- or graduate-student age, all of them had lived through the time of the target incidents.

Mean recency ratings from these subjects are also displayed in Table 1. Although on average the true date of the memorable events is slightly earlier than that of the less memorable ones (a difference of .05 years), subjects' ratings place the memorable events later. The overall mean rating for the memorable events is 5.7, whereas the mean for the less memorable events is 5.1. These ratings differed significantly when either subjects or event pairs are considered a random effect [for subjects, F(1,14) = 20.43, p < .01; for events, F(1,18) = 4.58, p < .05; however, quasi-F(1,25) = 4.01, .05 < p < .10]. As an example of this
Accessibility outcome, 80% of the subjects rated the Chicago DC-10 crash as occurring after the Antarctica crash, despite the fact that the opposite order is the correct one. Table 1 also reveals a number of exceptions to the Accessibility predictions, although in most cases these are from pairs in which the difference in memorability is small. As one would expect, the correlation between memorability and recency ratings is significant for these stimulus items [r(36) = .38, p < .05].

Recall and Perceived Age of Events in 1982

Although our prediction was confirmed that more accessible events seem more recent, measurement of accessibility (the memorability ratings) was fairly indirect for the events of the first study. In a second experiment, we have evaluated accessibility more directly by measuring subjects' recall of events, rather than relying on ratings. We predict that the larger the number of propositions about an incident that a subject can recall, the more recent that incident will seem. In this new experiment, the basic recency judgments and recall protocols were obtained from separate subject groups. Notice, however, that the act of recall may itself make the associated events more accessible. For this reason, it is of interest to compare recency ratings from subjects who have just completed recalling the events and recency ratings from subjects who have not engaged in recall. If recall increases accessibility, then ratings of recency-after-recall should be systematically greater than ratings of recency-without-recall.

The target events in this study were 40 headline-type incidents that were culled from the front pages of the Chicago Tribune and the New York Times between January 4 and January 11, 1982. This collection of events included items such as: Richard Allen resigns as National Security Advisor, the first U.S. test-tube baby leaves the hospital, and the U.S. drops its anti-trust suit against IBM. Since we were interested in tracking the relationship between recency and recall at different intervals after the events took place, we tested several independent groups of subjects: one Recall and one Recency group during the week immediately following the last target event, a second pair of Recall and Recency groups during the week beginning 15 days after the last event, and a third pair 60 days after the last event. To assess our hypothesis that recall increases apparent recency, we also asked subjects in the 60-day Recall group for recency ratings after they had completed their recall protocols. Recency ratings were elicited in a way similar to that of the first experiment (except that the subjects were told that the events happened in the 1980's rather than the 1970's). Recall subjects were given the same event names (e.g., Richard Allen resigns) and were asked to write down all of the facts they could remember directly related to the named events. The recall score for each incident was calculated as the average number of true atomic propositions recalled about it (see Kintsch, 1974). Stricter scoring methods (e.g., counting only directly relevant true propositions) yielded the same pattern of results. Fifteen subjects participated in each of the Recall and Recency groups.

The main results from this second study are given in Table 2 in the form of Spearman correlations between recency ratings and recall scores. Also shown in Table 2 are the correlations between recency and the events' true dates. Two facts about these data stand out. First, as the Accessibility Principle predicts, recall and recency are significantly correlated at each of the three intervals. Data from the first interval are especially interesting since they are least likely to be influenced by media retellings and follow-up reports. Second, and somewhat surprisingly, the number of propositions recalled is a better predictor of recency than the actual date of occurrence at all three intervals. In addition, a trend in the rating data followed the prediction that subjective recency would increase following recall. The average recency rating after recall was 5.7 for subjects in the 60-day Recall group; however, the average rating from the 60-day Recency group was 5.3. But although this trend was significant when tested over events [F(1,39) = 13.07, p < .01], it was nonsignificant when tested against subjects [F(1,28) = 1.28, p > .10].

Implications

According to the Accessibility Principle, the apparent age of an event depends upon the amount of information about it that one can bring to mind. This principle gained credence from the results of our first study, in which more memorable events were rated as taking place more recently than similar events of approximately equal objective age. The second experiment strengthened the case for Accessibility by demonstrating that the number of facts recalled about an event is a powerful predictor of its subjective time of occurrence. We have little doubt that other cognitive processes can also affect temporal judgments for natural events like these. As we have acknowledged, certain influential or recurrent events may be tagged with dates; the time of lesser events may be estimated through their causal connections to these influential ones. Still, a glance at the items in Table 1 suggests that causal links to datable events may not always be present, and in these circumstances, the Accessibility Principle may be the dominant method for temporal judgments.

The Accessibility hypothesis bears an analogy to classical strength theories of time perception, which predict that the strength of the memory trace at the time of test determines the apparent age of the associated event (see the references cited by James, 1890, pp. 632-633, and more recently, Hinrichs, 1970, and Morton, 1968). Pure strength theories, however, have not fared especially well in tests involving multiple list learning (Hintzman & Block, 1971; Flexser & Bower, 1974). By implication, these earlier results suggest that the mechanism responsible for our accessibility effects is not as simple as a unidimensional quantity connected to one's memory for an event. Our experiments leave the exact nature of the underlying mechanism as an open question. Nevertheless, the similarity mentioned above between the Accessibility Principle, the Availability Heuristic, and Lack of Knowledge Inferences may indicate that we are tapping part of a very general and complex inductive procedure.

Acknowledgments

We thank Martin Ringle and David Zager for their advice and assistance. We also acknowledge the Sloan Foundation for its support of this research.

References

Collins, A. Fragments of a theory of plausible reasoning. In D. L. Waltz (Ed.), Theoretical issues in natural language processing-2. New York: Association for Computing Machinery, 1978.
Erdelyi, M. H., & Kleinbard, J. Has Ebbinghaus decayed with time?: The growth of recall (hypermnesia) over days. Journal of Experimental Psychology: Human Learning and Memory, 1978, 4, 275-289.
Flexser, A. J., & Bower, G. H. How frequency affects recency judgments: A model for recency discrimination. Journal of Experimental Psychology, 1974, 103, 706-716.
Fraisse, P. The psychology of time. New York: Harper & Row, 1963.
Hacker, M. J. Speed and accuracy of recency judgments for events in short-term memory. Journal of Experimental Psychology: Human Learning and Memory, 1980, 6, 651-675.
Hinrichs, J. V. A two-process memory strength theory for judgments of recency. Psychological Review, 1970, 77, 223-233.
Hintzman, D. L., & Block, R. A. Repetition and memory: Evidence for a multiple trace hypothesis. Journal of Experimental Psychology, 1971, 88, 297-306.
James, W. The principles of psychology. Vol. 1. New York: Holt, 1890.
Kintsch, W. The representation of meaning in memory. Hillsdale, N.J.: Erlbaum, 1974.
Linton, M. Memory for real-world events. In D. A. Norman and D. E. Rumelhart, Explorations in cognition. San Francisco: Freeman, 1975.
Morton, J. Repeated items and decay in memory. Psychonomic Science, 1968, 10, 219-220.
Ornstein, R. E. On the experience of time. Baltimore: Penguin, 1969.
Tversky, A., & Kahneman, D. Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 1973, 5, 207-232.

TABLE 1
Stimulus Events, True Dates, and Mean Recency Ratings, Experiment 1

Event Pairs                                                       Date     Recency Rating

 1. Reagan and Bush nominated by the Republican convention.       7/80     8.2
    Carter and Mondale nominated for a second term by the
      Democratic convention.                                      8/80     7.5
 2. Dustin Hoffman won an Academy Award for Kramer vs. Kramer.    4/80     7.8
    Sally Field won an Academy Award for Norma Rae.               4/80     7.2
 3. A DC-10 crashed in Chicago.                                   5/79     7.1
    A DC-10 crashed in Antarctica.                                11/79    5.5
 4. Lord Mountbatten assassinated in Ireland.                     8/79     5.9
    U.S. Ambassador Adolph Dubs assassinated in Afghanistan.      2/79     6.7
 5. The Supreme Court affirmed a lower court decision ordering
      California Medical School to admit Allan Bakke.             6/78     6.5
    The Supreme Court ruled that labor unions could distribute
      material of a political nature at an employment site.       6/78     5.1
 6. David Berkowitz was arrested on a murder charge.              8/77     6.7
    Gene Leroy Hunt was arrested on a murder charge.              4/78     5.3
 7. West German terrorists hijacked a Lufthansa airliner.         10/77    6.0
    An alleged bank robber, Thomas Hannan, hijacked an airplane
      in Nebraska.                                                10/77    4.1
 8. Roots won an Emmy Award.                                      9/77     6.5
    Eleanor and Franklin won an Emmy Award.                       9/77     6.1
 9. Annie opened on Broadway.                                     4/77     6.1
    The Gin Game opened on Broadway.                              [...]    3.5
10. Saul Bellow won a Nobel Prize in literature.                  [...]    5.4
    Burton Richter won a Nobel Prize in physics.                  10/76    4.3

TABLE 1 (cont.)

Event Pairs                                                       Date     Recency Rating

11. Bruce Jenner won an Olympic Gold Medal in the decathlon.      7/76     5.1
    Evelin Schlaak won an Olympic Gold Medal in the discus throw. 7/76     3.9
12. Mao Tse-tung died.                                            9/76     4.8
    Chou En-lai died.                                             1/76     5.4
13. Muhammad Ali KOs Joe Frazier.                                 10/75    4.3
    Muhammad Ali KOs Jean-Pierre Coopman.                         2/76     4.6
14. E. L. Doctorow's Ragtime published.                           7/75     4.7
    Irving Stone's The Greek Treasure published.                  9/75     5.0
15. Linda Ronstadt's Heart Like a Wheel won a Gold Record.        1/75     6.3
    John Denver's An Evening with John Denver won a Gold Record.  2/75     4.6
16. Aristotle Onassis died.                                       3/75     4.8
    H. L. Hunt died.                                              11/74    5.1
17. Steve Garvey wins baseball's Most Valuable Player award.      11/74    5.3
    Jeff Burroughs wins baseball's Most Valuable Player award.    11/74    4.1
18. Patty Hearst kidnapped.                                       2/74     4.1
    J. Reginald Murphy, editor of the Atlanta Constitution,
      kidnapped.                                                  2/74     5.1
19. Spiro Agnew resigned as Vice Pres.                            10/73    3.0
    Nelson Rockefeller resigned as Governor of New York.          12/73    4.5

Note. The first member of each of the pairs was rated as the more memorable. The standard error of the above means is .46.

TABLE 2
Spearman Correlations between Recency Estimates, True Dates, and Number of Recalled Propositions, Experiment 2

Recency Rating      Number of Propositions      True Date

+0 Days             .80***                      .18
+15 Days            .69***                      .41**
+60 Days            .68***                      .34*

*p < .05
**p < .01
***p < .001
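
As a quick arithmetic check on Table 1, the following Python sketch recomputes the two overall means reported for Experiment 1 and counts the pairs in which the more memorable event received the higher recency rating. The ratings are transcribed by hand from Table 1; the variable and function names are illustrative only and are not part of the original study.

```python
# Mean recency ratings transcribed from Table 1 (Experiment 1).
# Each tuple is one event pair, in table order:
# (rating of the more memorable event, rating of the less memorable event).
TABLE_1_RATINGS = [
    (8.2, 7.5), (7.8, 7.2), (7.1, 5.5), (5.9, 6.7), (6.5, 5.1),
    (6.7, 5.3), (6.0, 4.1), (6.5, 6.1), (6.1, 3.5), (5.4, 4.3),
    (5.1, 3.9), (4.8, 5.4), (4.3, 4.6), (4.7, 5.0), (6.3, 4.6),
    (4.8, 5.1), (5.3, 4.1), (4.1, 5.1), (3.0, 4.5),
]


def summarize(pairs):
    """Return both column means and the number of pairs that fit the prediction."""
    n = len(pairs)
    mean_memorable = sum(memorable for memorable, _ in pairs) / n
    mean_less_memorable = sum(less for _, less in pairs) / n
    # A pair fits the Accessibility prediction when the more memorable
    # event was rated as more recent than its partner.
    consistent_pairs = sum(1 for memorable, less in pairs if memorable > less)
    return mean_memorable, mean_less_memorable, consistent_pairs


mean_memorable, mean_less_memorable, consistent_pairs = summarize(TABLE_1_RATINGS)
print(f"memorable: {mean_memorable:.1f}, less memorable: {mean_less_memorable:.1f}")
print(f"pairs fitting the prediction: {consistent_pairs} of {len(TABLE_1_RATINGS)}")
```

With the values as transcribed, the means round to 5.7 and 5.1, which agrees with the figures given in the text, and the more memorable event has the higher rating in 12 of the 19 pairs.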

Psychological Issues Raised by
an AI Model of Reconstructive Memory
Janet L. Kolodner
Department of Computer Science
Georgia Institute of Technology
Atlanta, Georgia 30332
Lawrence W. Barsalou
Department of Psychology
Emory University
Atlanta, Georgia 30332
1. Introduction

This paper presents some psychological implications of an AI model of reconstructive memory. Psychologists have characterized human memory as reconstructive for years (e.g., [1], [6]). AI simulation of reconstruction goes further since computer implementation requires explicit specification of processes and representations. The particular AI model we consider here is Kolodner's [5] E-MOP based model, implemented in a computer program CYRUS. The model has three interrelated components: a retrieval process, an underlying memory organization, and processes for developing memory organization with the encoding of new events. The retrieval process was designed to imitate reconstructive retrieval strategies observed in people. The memory organization both supports and causes reconstructive retrieval. Processes for developing memory organization build new knowledge structures (i.e., learn) as new events are encoded. These new knowledge structures enable subsequent reconstructive retrieval of the new [...]

One important reason such a model should be of interest to psychologists is that it makes claims about human memory organization and processes. These claims stem from the process of simulating human reconstructive memory. Because the available model was incomplete, building CYRUS required filling it in on the basis of intuition. We now ask whether the added assumptions that fill holes in psychological accounts are psychologically [...]

2. The E-MOP model

2.1 Memory Organization

A memory organization for reconstructive retrieval must both support and cause reconstruction. It must generate clusters in recall, locate and develop retrieval cues, cause confusions in recall and recognition, and emulate other characteristics of human remembering. Kolodner's memory organization uses conceptual categories called Episodic Memory Organization Packets, or E-MOPs (similar to Schank's MOPs [7]) that organize episodes in memory. A central assumption is that there is one E-MOP for each type of activity a person may be involved in, where type is defined as events that achieve a similar goal. Diplomats are involved in "diplomatic meetings", "diplomatic trips", "negotiations", and "state dinners". Each individual event is stored in the E-MOP(s) it fits into.

E-MOPs incorporate both episodic and generic memory (commonly, but incorrectly called semantic memory). The generic component consists of generalizations describing most of its members (i.e., some members may exhibit violations of these "norms"). Most "diplomatic meetings" discuss an international contract, for example, but a particular meeting might be called to plan an international event. An E-MOP's second component is its organization of member episodes. Episodes are organized based on how they differ from the E-MOP's norms. An episode is indexed and retrieved from an E-MOP by its relevant differences. When more than one episode has the same difference, a new sub-MOP is formed based on their similarities and differences. The figure below illustrates this organization:

'diplomatic meetings' (MOP1)
    norms: the actor is Cyrus Vance
           participants are foreign diplomats
           topics are international contracts
           participants talked to each other
           goal was to resolve disputed contract
    diffs: participants
               Begin (MOP2)
               Dayan
               Gromyko
           topic
               SALT
               Camp David Accords (MOP3)
               Jerusalem

    Begin (MOP2)
        norms: participants include Begin
               topic concerns Israel and Arabs
               specialization of MOP1
        diffs: topic
                   Jerusalem: EV3
                   Camp David Accords (MOP4)

    Camp David Accords (MOP3)
        norms: topic is the CDA
               participants are Israeli
               specialization of MOP1
        diffs: participants
                   Begin (MOP4)
                   Dayan: EV4

The norms are features characteristic of diplomatic meetings. The episodes are indexed according to their similarities and differences in topic and participants. Meetings with the same topic or participants form E-MOPs whose norms are composed of their similarities. Below these norms, a set of similar instances are subsequently differentiated by their differences (i.e., mapped into indices). CYRUS organizes E-MOPs in three ways: (1) hierarchically, as just described; (2) by causal, temporal, and containment relationships between normative features; (3) by these same relations between indices.
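
The hierarchical organization described above can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not the CYRUS implementation: episodes are flat feature dictionaries, an index is a (feature, value) pair on which an episode departs from the norms, and the names `EMOP`, `add`, and `retrieve` are invented for this example. The sample episodes are hypothetical and only loosely modeled on the diplomatic-meetings figure.

```python
from dataclasses import dataclass, field


@dataclass
class EMOP:
    """Toy E-MOP: generic norms plus indices on how episodes differ from them."""
    norms: dict
    # Maps a (feature, value) difference to one episode (a dict) or to a sub-MOP.
    indices: dict = field(default_factory=dict)

    def add(self, episode):
        """Index an episode under every feature on which it departs from the norms."""
        for feature, value in episode.items():
            if self.norms.get(feature) == value:
                continue  # normative features are not used as indices
            key = (feature, value)
            existing = self.indices.get(key)
            if existing is None:
                self.indices[key] = episode
            elif isinstance(existing, EMOP):
                existing.add(episode)
            elif existing != episode:
                # Two episodes share this difference: build a sub-MOP whose
                # norms are the features the two episodes have in common.
                shared = {f: v for f, v in existing.items() if episode.get(f) == v}
                sub_mop = EMOP(norms=shared)
                sub_mop.add(existing)
                sub_mop.add(episode)
                self.indices[key] = sub_mop

    def retrieve(self, cue):
        """Return episodes that match the cue and are reachable via indices it names."""
        found = []
        for key in cue.items():
            target = self.indices.get(key)
            if target is None:
                continue  # the cue names a feature that is not an index here
            candidates = target.retrieve(cue) if isinstance(target, EMOP) else [target]
            for episode in candidates:
                matches = all(episode.get(f) == v for f, v in cue.items())
                if matches and episode not in found:
                    found.append(episode)
        return found


# Hypothetical episodes for illustration.
meetings = EMOP(norms={"actor": "Cyrus Vance"})
ev3 = {"actor": "Cyrus Vance", "participants": "Begin", "topic": "Jerusalem"}
ev4 = {"actor": "Cyrus Vance", "participants": "Dayan", "topic": "Camp David Accords"}
ev5 = {"actor": "Cyrus Vance", "participants": "Begin", "topic": "Camp David Accords"}
for ev in (ev3, ev4, ev5):
    meetings.add(ev)

begin_mop = meetings.indices[("participants", "Begin")]
print(begin_mop.norms)
print(meetings.retrieve({"participants": "Begin", "topic": "Jerusalem"}))
```

In this sketch the two meetings with Begin collapse into a sub-MOP whose norms are their shared features, and a cue naming both Begin and Jerusalem reaches `ev3` by passing through that sub-MOP. Storing the same episode under several indices is a deliberate simplification; it mirrors the idea that one episode can be reached through any of its differences.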
The work of the first author was partially supported by NSF under grant No. IST-8116892.

2.2 Maintaining memory organization over time

Encoding new episodes requires both [...] the old organization to some extent and accommodating it to the new input. When a new episode is indexed identically to an old one, a new E-MOP must be formed to subsume them. That is, E-MOP formation is triggered by "reminding" [7], which occurs when the new episode retrieves the other similar one. The new E-MOP's norms are the similarities between the two items, and its indices are their differences. Because generalizations about a kind of event based on only two items may be inaccurate, subsequent episodes encoded with this E-MOP are used to refine these norms. If a feature not a norm for the first two episodes turns out to be normative for most others, what was initially an index can become a norm. Similarly, if a false generalization were made, norms for the first two instances can be relegated to indices.

2.3 Retrieval

Retrieval cues are abstracted from requests to remember an event. Such requests can be partial or complete specifications of the event to be retrieved. A request specifies an E-MOP to be searched and which indices within the E-MOP are to be traversed to find the event. An important assumption is that an E-MOP index cannot be traversed unless it is specified. In this way, retrieval is directed by the information in the request and further information that can be derived from it.

Since this process can fail in several ways, reconstructive strategies are proposed to deal with various types of failure. First, if the information in a request does not specify an E-MOP to be searched, then one or a small set of E-MOPs must be chosen. This process sees if any of the features stated in the request have E-MOPs associated with them (i.e., schema triggering).

A second type of failure stems from E-MOPs being untraversable unless their indices have been specified. A retrieval cue may specify features that don't correspond to E-MOP indices. Or, a retrieval cue may be so general that it doesn't specify enough features to direct traversal processes to a unique item. In that case, plausible features corresponding to E-MOP indices must be inferred from the given retrieval cues. A "meeting with Menachim Begin" might plausibly have taken place in Jerusalem. The strategies which make these inferences capitalize on an E-MOP's norms and knowledge about plausible relationships between different event features. Once such information has been specified, the corresponding indices are traversed. Interestingly, both types of strategies mentioned so far can lead to retrieval confusions and false starts.

A third type of strategy derives from the relationships between events in memory. Individual events refer to other events they are related to. If an event related to the requested event can be better specified, the related event can be used to further specify the requested one. To recall a particular museum visit, for example, one might attempt to recall the trip it was part of.

3. Psychological Issues

This model stems from observations of how people remember, and what they forget. Although the processes and organization used to construct a complete model of reconstruction seem to work, are they really psychologically valid? One aspect of the model that has received empirical investigation to a large extent is reconstructive retrieval strategies [8]. People appear to elaborate upon requests to remember in many of the ways CYRUS does. Nevertheless, many issues remain untouched or at least require further attention. How does the organization of a set of events constrain the manner in which people elaborate on retrieval cues? That is, to what extent are such strategies content-dependent? Do people use the elaboration strategies used by CYRUS? Do they use others? What strategies are used most often, in what order, and for what reason? Similar to the content-dependence issue is the context-dependence issue. To what extent is elaboration affected by immediately previous searches for other events? for the same event? Given retrieval failure while searching an organized set of events, how do people select new parts of the organization for search? How does a retrieval access change organization? How sensitive is retrieval to incorrectly specified cues? Is the model too dependent on correctness?

Perhaps the most central issues the model raises are: How are events organized in memory? And how does this organization change over time? CYRUS assumes that events are the fundamental organizing units in memory. Is this true of human memory? If not, then what are the fundamental units? There may be several types of organizing principles. Others to be considered are: location (e.g., a local restaurant or bar); time (e.g., Christmas, summer); participants (e.g., Nixon, a spouse, a close friend). If there are several ways events are organized, what determines which will apply to a given set of events? The content of the events? The goal the organization will serve? Perhaps several organizations simultaneously exist over a set of events.

A related issue concerns knowing what feature(s) should be used to discriminate two episodes sorted to the same E-MOP. There may be numerous features that distinguish two events, but only those that will be useful in the later evolution of generic knowledge should be chosen for indexing. How can such features be chosen? Another related issue is how many indices are grown each time reminding occurs. Another central issue concerns E-MOP construction. Is a new E-MOP constructed every time someone is reminded of an old event by a current one? To what extent is generic structure automatically acquired from and imposed on events? Or is conscious attention necessary to abstract normative information from previous events, organize it into E-MOPs, and apply it to new events? Human data may be informative on these points.

In the proposed memory organization, MOPs and their sub-MOPs form hierarchies in which common properties are stored once at the highest possible point in the hierarchy. This economy of storage parallels what psychologists call "cognitive economy" [2]. To date, it appears that the organization of semantic memory (i.e., lexical meaning) violates cognitive economy [3]. But to what extent is this violation true of other types of generic knowledge? Does human organization of events reflect cognitive economy? Or do people have much looser, less integrated and non-inclusive organizations for events?

An important aspect of E-MOPs is that they combine "episodic" and "generic" memories. This implies that episodic and generic memory are not separate entities but are intimately connected. If this is so, what exactly is the connection? When does episodic information (e.g., E-MOP indices) become generic (e.g., E-MOP norms or frame information)? When and how does generic information become confused with episodic information to
generate confusions? In E-MOPs, both happen as generalizations are refined and corrected.

There are a number of topics not covered in the original model which are nonetheless important to a theory of human memory organization and retrieval. One such issue concerns the roles of automatic versus conscious processes that encode information into memory. Temporal, spatial, and frequency information appear to be automatically acquired; even without knowing they are doing it, people encode these fundamental aspects of events [4]. In contrast, the acquisition of content information often seems to depend more on the use of conscious attention. When such information doesn't receive attention, the information is not acquired. How do these two types of processes interact to store events? Conscious attention may be responsible for the construction, organization, and reorganization of generic structure, since it usually contains content information. Automatic processes may be responsible for the strengthening of generic knowledge and the integration of spatial and temporal information into it. Finding algorithms for these latter phenomena and interfacing them with content-oriented processes appears to be an interesting problem.

A related issue is the role of similarity among events. This factor can facilitate people's memory performance on some occasions and interfere with it on others. Observing such phenomena in people's memory for events may further constrain the way in which we view generic knowledge of events and its use during retrieval. In E-MOPs, when a property doesn't correlate with other

tion of generic information from old E-MOPs to new ones. Generic knowledge associated with a particular E-MOP might be useful, however, in creating a new related E-MOP or in understanding something in a similar E-MOP. To what extent does "generic" structure generalize from one set of events to another? Must a completely new structure be built for each new set or does transfer occur? What procedures transfer the structure of an old E-MOP to a new one? How can knowledge in one E-MOP (e.g., for Vance) be used to understand something about a related referent (e.g., Haig)?

4. Future Directions

We are currently designing experiments that we hope will help answer the questions above. The experiments, no doubt, will raise additional questions. As a joint Artificial Intelligence and Psychology project, we will address these questions in the same way we have found it profitable to consider their ancestors: by building computer programs and by collecting human data.

[1] Bartlett, R. (1932). Remembering: A Study in Experimental and Social Psychology. Cambridge University Press, London.
[2] Collins, A.M., & Quillian, M.R. (1969). Retrieval time from semantic memory. JVLVB (8), 240-247.
[3] Conrad, C. (1972). Cognitive Economy in semantic memory. JEP (92), 149-154.
[4] Hasher, L. & Zacks, R.T. (1979). Automatic and effortful processes in memory. JEP:G (108), 356-388.
[5] Kolodner, J. L. (1980). R