Printed in the Netherlands
University
Abstract
Indirect requests vary in politeness; for example, Can you tell me where
Jordan Hall is? is more polite than Shouldn't you tell me where Jordan Hall
is? By one theory, the more the literal meaning of a request implies personal
benefits for the listener, within reason, the more polite is the request. This
prediction was confirmed in Experiment 1. Responses to indirect requests
also vary in politeness. For Can you tell me where Jordan Hall is?, the
response Yes, I can - it's up the street is more polite than It's up the street.
By an extension of that theory, the more attentive the responder is to all of
the requester's meaning, the more polite is the response. This prediction was
confirmed in Experiments 2, 3 and 4. From this evidence, we argued that
people ordinarily compute both the literal and the indirect meanings of
indirect requests. They must if they are to recognize when the speaker is
and isn't being polite, and if they are to respond politely, impolitely, or
even neutrally.
When people make requests, they tend to make them indirectly. They
generally avoid imperatives like Tell me the time, which are direct requests,
in preference for questions like Can you tell me the time? or assertions like
I'm trying to find out what time it is, which are indirect requests. The
curious thing about indirect requests is that they appear to have one meaning
too many. Can you tell me the time?, as a request, has the indirect meaning
I request you to tell me the time. Yet it also possesses the literal meaning
I ask you whether you have the ability to tell me the time. If the speaker
is merely requesting the time, why the extraneous question about ability?
How does it figure in the listener's understanding of that request? It was
these two questions that prompted the present study.
*This research was supported in part by Grant MH-20021 from the National
Institute of Mental Health, the Center for Advanced Study in the Behavioral
Sciences, and a National Endowment for the Humanities Fellowship. We thank
Eve V. Clark and Ellen M. Markman for their helpful advice in the writing of
this paper and Susan L. Lyte for carrying out most of the experiments.
Dale H. Schunk is now at the University of Houston. Requests for reprints
should be sent to Herbert H. Clark, Department of Psychology,
Stanford University, Stanford, CA 94305, U.S.A.
These questions suggest two general kinds of processes by which an
indirect request might be understood. The first kind, which we will call
idiomatic processes, creates one and only one meaning - the indirect meaning.
Can you tell me the time?, used as a request, would be understood directly
and solely as Please tell me the time. At no point would the listener create
and use the literal meaning Do you have the ability to tell me the time?
The second kind of process, which we will call multiple-meaning processes,
creates both the literal and the indirect meanings, though not necessarily one
after the other. By this kind of process Can you tell me the time? would be
understood as involving both a question (Do you have the ability?) and a
request (Please tell me the time).
Each kind of process is needed in certain clear cases. An idiomatic process
is probably required for How do you do?, which is a question indirectly used
as a greeting. Although the historical vestiges of the literal question (How
are you?) are still present, the question no longer has any force; it isn't
answered sensibly by Fine, thank you. On the other hand, a multiple-meaning
process is probably required for the use of It's late, isn't it? to request the
time. There seems to be no way of figuring out the request without knowing
what the speaker meant literally. However, on the continuum from frozen
idioms like How do you do? to novel requests like It's late, isn't it?, there are
intermediate cases in which a sentence is conventionally used for an indirect
purpose. For these, either kind of process might apply.
For conventional indirect requests like Can you tell me the time?, which
kind of process is used? Within linguistics, the earliest proposals by Sadock
(1970) required an idiomatic process, but more recent ones, by Searle
(1975) and Morgan (1978) for example, require a multiple-meaning process.
Within psychology, Schweller (1978) and Gibbs (1979) have proposed
idiomatic processes, but Clark & Lucy (1975) and Clark (1979) have
proposed two different processes of the multiple-meaning variety. Thus,
there is an issue here to be resolved.
The feature that makes the multiple-meaning processes distinctive is their
assumption that literal meaning plays a role in comprehension. But if it does,
what is that role? For indirect requests, one answer has been offered by
Lakoff (1973, 1977) and by Brown & Levinson (1978): The literal meaning
is important in conveying politeness. As requests for the time, May I ask you
what time it is? is ordinarily more polite than Won't you tell me what time it
is? Since the two requests have the same indirect meaning, the reason must
lie in their literal meanings. The literal meaning of the first, roughly I
request permission to ask you what time it is, presumes very little on the
requestee and offers him the power to grant permission. The literal meaning
of the second, roughly I ask you if you do not intend to tell me what time
it is, presumes a good deal on the requestee and expresses a not-so-hidden
criticism. By this logic, conventional indirect requests get their politeness
rather directly from the literal meanings.
In a roundabout way, responses to indirect requests may get their politeness
from the literal meanings too. When Ann asks Bob Can you tell me the
time?, Bob might ordinarily respond with a single move, It's six. But if he
wanted to be especially polite, it is our intuition that he would add a first
move, as in Yes, I can - it's six. Let us call Yes, I can the literal move, and
It's six the indirect move. If we assume that Bob couldn't give the literal
move without computing the literal meaning, then he must have taken in
Ann's request by a multiple-meaning process. But are responses with both
moves actually more polite, and if so, why?
In this paper, then, we will investigate two issues jointly. The first is comprehension. Does literal meaning play a role in the understanding
of indirect
requests, and if so, what? The second issue is politeness: What makes some
indirect requests, and some responses, more polite than others? In the first
half of the paper, we will take up the politeness of indirect requests, and in
the second half, the politeness of their responses.
Table 1. Requests (Experiments 1, 3, and 4)
Columns: Descriptive category, Request type
Descriptive categories: 1. Permission; 2. Imposition; 3. Ability; 4. Memory;
5. Commitment; 6. Obligation (Shouldn't ... Hall is?)
literal yes answer for compliance, and others take a no. We will use the first
few words of each request as its abbreviation, like May I ask you? for May
I ask you where Jordan Hall is?
Since all 18 requests have the same indirect meaning, their differences lie
in the literal meanings. Indeed, these requests can be ordered, on a priori
intuitive grounds, for how much their literal meanings, if taken seriously,
would benefit B or reduce the costs to B. Note that all of them have one
cost in common. They impose on B by asking a question he must answer
with yes or no. Otherwise, the requests can be sorted into six broad categories (see Gordon & Lakoff, 1971; Searle, 1975), as shown in Table 1.
These categories can be ordered approximately
for their benefit to B.
1. Permission. With the literal meaning of May I ask you where Jordan
Hall is?, A is offering B the authority
to grant her permission to make her
request. This is obviously a great benefit to B. He now has a higher status,
or authority,
than he had the moment before, and the status entitles him
to give permission to A even to make a rather trivial request. Such a benefit
makes this and the other two requests in this category particularly polite.
2. Imposition.
With the literal meaning of Would you mind telling me
where Jordan Hall is?, A is no longer offering B the full authority to permit
her to ask him for the wanted information.
Still, she is offering him the
authority
to say that her request imposes too much. This benefits B. A is
thereby admitting that she is imposing on him, and the admission benefits
B too. So Would you mind? should be relatively polite too, although not as
polite as May I ask? and its kind. The authority to grant permission, on the
face of it, benefits B more than the mere chance to say that the task is too
imposing.
3. Ability. When A says Can you tell me where Jordan Hall is?, she is
literally asking B to say whether or not he has the ability to tell her where
Jordan Hall is. By giving him the opportunity to deny this ability, the
question both benefits and costs B a little bit. It benefits him by allowing
him to avoid the embarrassment of being asked a request he couldn't comply
with. But it costs him a little by suggesting that he may not be competent to
comply. Compared to May I ask? and Would you mind? with their great
benefits to B, Can you tell me? should be less polite. In so far as the other
three ability requests reflect the same rationale, they should be similar in
politeness. We will take up this qualification later.
4. Memory. The literal meaning of Have I already asked you where Jordan
Hall is? makes a subtle demand on B. It asks him whether or not he can
remember whether A asked him earlier for the location of Jordan Hall.
Most of the time he won't find this literal demand easy to fulfill, and
anyway, why should he be expected to keep track of what he has told her
Experiment 1
Method
Thirty Stanford University undergraduate students rated the politeness
of 54 requests, three of each of the 18 types of requests in Table 1.
The 54 sentences used each requested different information. The information
was ordinary but fictitious everyday information of a relatively simple kind
about who someone was, what something was, or where or when something
happened. There was one each of these three kinds of content for each of the
18 types of requests. Examples: May I ask you where you bought your jacket?
and Did you tell me who went to the party last night? These 54 requests were
typed in random order, 18 to a page, on three mimeographed sheets, which were
stapled in random order for each student. The students wrote their ratings
next to each request.
The students were instructed to rate each request on the following scale:
1 - very polite; 2 - fairly polite; 3 - somewhat polite; 4 - neither polite
nor impolite; 5 - somewhat impolite; 6 - fairly impolite; and 7 - very
impolite. They were either paid $2.50 or given credit for a course
requirement, and were the same students who participated in Experiment 4.
They completed Experiment 4 first and then Experiment 1, all within an hour.
Results
The ratings of politeness turned out very much as predicted. This can be seen
in Table 2, which lists the mean rating for each type of request and for each
category. These means were submitted to an analysis of variance in which
both subjects and items were random effects (Clark, 1973). It showed that
the means differed reliably from one another, F(17, 71) = 15.66, p < 0.001.
The mean ratings for the six categories of requests were expected to order
themselves from permission to obligation, and except for a minor reversal,
they did: 2.16, 3.04, 3.85, 3.80, 4.20 and 5.77. These ratings are significantly
correlated with the predicted rank order (Abelson & Tukey, 1963),
F(1, 71) = 166.08, p < 0.001. The predicted rank order accounts for 57% of
the variance among the 18 means. If instead of taking all the means we
consider only the two most polite forms within each category, the ordering
is still as predicted, except for a different minor reversal: 1.94, 3.04, 2.92,
3.50, 3.82, and 5.77.
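The two "minor reversals" can be located directly from the category means quoted above. The following is only an illustrative sketch: it uses a plain Spearman rank correlation as a simple stand-in, not the Abelson & Tukey contrast test reported in the text, and the function names are hypothetical.

```python
# Category means quoted in the text (1 = very polite, 7 = very impolite).
CATEGORIES = ["Permission", "Imposition", "Ability",
              "Memory", "Commitment", "Obligation"]
ALL_FORMS = [2.16, 3.04, 3.85, 3.80, 4.20, 5.77]
TWO_MOST_POLITE = [1.94, 3.04, 2.92, 3.50, 3.82, 5.77]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(predicted, observed):
    """Spearman rho for untied data: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(predicted)
    d_squared = sum((p - o) ** 2
                    for p, o in zip(ranks(predicted), ranks(observed)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def adjacent_reversals(means):
    """Neighbouring categories whose means violate the predicted order."""
    return [(CATEGORIES[i], CATEGORIES[i + 1])
            for i in range(len(means) - 1) if means[i] > means[i + 1]]


predicted_order = list(range(1, len(CATEGORIES) + 1))
for label, means in [("all forms", ALL_FORMS),
                     ("two most polite", TWO_MOST_POLITE)]:
    print(label, adjacent_reversals(means),
          round(spearman_rho(predicted_order, means), 3))
```

On these figures the first ordering reverses only Ability and Memory, and the second only Imposition and Ability.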
The three subsidiary predictions were also generally upheld. Conditional
modal verbs raised politeness an average of 0.54 units, F(1, 71) = 5.87,
p < 0.001. The increase was 0.17 units for may/might, 0.59 units for can/
Table 2.
Category      Request-type means        Category mean
Permission    2.00, 1.87, 2.62          2.16
Imposition    3.31, 2.77                3.04
Ability       3.22, 2.63, 5.58, 3.98    3.85
Memory        3.48, 3.51, 3.99, 4.24    3.80
Commitment    4.24, 3.39, 4.41, 4.76    4.20
Obligation    5.17 (Shouldn't)          5.77
The politeness of responses
Just as there are many ways of making requests, so there are many ways of
responding to them. For A's request Can you tell me the time?, B could
respond in any of these ways, among others: six; six o'clock; it's six; it's
six o'clock; yes, six; yes, it's six; sure, it's six; and yes, I can, it's six. How
does B choose? One way is by the seriousness of A's literal meaning (Clark,
1979). If B understands A to have intended the literal meaning of her
request to be taken seriously, then to be cooperative he should include a
literal move such as yes or sure or yes, I can. If the literal meaning was
intended merely pro forma, he needn't include such a move. Another way
is by how polite he wants to be. Some of these responses seem more polite
than others. These differences, we propose, reflect the costs and benefits
theory of politeness as applied to responses. The more B's response raises the
benefits or lowers the costs to A, within limits, the more polite B is. The
question is how A is benefitted by B's response.
We propose an attentiveness hypothesis: The more attentive B is to all
aspects of A's request, within reason, the more polite B is. For indirect
requests for information, there are at least four ways B can benefit A.
(1) Precision: B should provide the requested information as precisely as
required. In the time example, It's six would be more polite in most contexts
than It's late afternoon. (2) Clarity: B should express the requested
information clearly. It's six o'clock, for example, is clearer without being
unnecessarily wordy or redundant than Six, where ellipsis could interfere
with A's comprehension of the information. (3) Completeness: B should
take seriously the literal meaning, as well as the indirect meaning. Ordinarily,
that means including a literal move, making Yes, it's six more polite than a
mere It's six. Other times, including a literal move may lead to less
politeness, as we shall show. (4) Informality: B should put A at ease by not
being too formal, or too informal, for the occasion. In casual conversations among
acquainted peers, Sure, it's six might well be more polite than Yes, it's six.
B should ordinarily be much less polite when he doesn't comply with A's
request. To be attentive to A's request is, ideally, to comply with it. There
are, however, several ways in which B can mitigate the negative consequences
of not complying. (5) Apologies: B should apologize for not complying. In
the time example, I'm sorry, I can't would be more polite than a simple
I can't. (6) Explanations: B should explain why he is not complying.
Responses that contain a good reason, like I can't, I don't have a watch,
would be more polite than ones without, like I can't. Apologies and
explanations benefit A in different ways. Apologies place B in a deferential
position and give A the benefit of increased status. Explanations tell A that
B isn't refusing to comply merely to snub, put down, or otherwise do in A.
Explanations lower the cost to A of B's refusal.
Experiments 2, 3, and 4 test several aspects of the attentiveness hypothesis.
Experiment 2 explores the range of factors involved, while Experiments
3 and 4 examine more closely how politeness is related to literal meaning.
Experiment 2
Method
Students were asked to rank order for politeness three to five alternative
responses to each of eight requests. The eight requests are shown in Table 3.
For each we composed two sets of three to five responses. One set consisted
of compliant responses, and the other set of refusals to comply. These sets
are also listed in Table 3. In composing the responses we tried to find ones
that sounded as natural as possible.
We constructed two different questionnaires. Each one contained the
eight requests typed four to a page in random order on two mimeographed
sheets. Under each request were three to five responses also in random order.
For one questionnaire, four of the requests were followed by compliant
responses, and the other four by non-compliant responses. For the other
questionnaire, that assignment was reversed. For each response set separately,
Table 3. Mean politeness (Experiment 2)
Request / Response / Mean rank
1.63
1.94
2.56
3.75
1.07
1.93
3.07
3.93
1.13
2.00
2.87
4.00
No, I'm sorry,
No, I can't.
No.
I can't.
1.00
2.00
3.00
Sure, here.
Yes, I can. Here it is.
Yes, here it is.
Here it is.
Here.
1.81
2.19
2.31
3.94
4.15
1.60
1.60
2.93
3.87
Sure,
Yes,
Yes,
Tom
1.61
2.27
2.33
3.73
4.
No, I'm sorry.
No, I couldn't.
No.
5. Could you tell me what time you close?
I couldn't.
Mean rank
1.25
1.94
2.81
1.87
2.07
2.07
3.80
1.13
2.00
2.88
1.40
1.87
2.13
No, I wouldn't.
No, I won't.
No.
2.00
2.00
2.06
1.07
2.20
2.93
3.80
1.06
2.19
2.81
3.94
_______
it is.
1.69
1.81
2.50
3.81
1.07
2.07
3.33
3.53
______.__-__
the students ranked each response for politeness by writing 1 next to the
most polite response, 2 next to the next most polite response, and so on
down to, at most, 5. They were not to give ties. One questionnaire was
completed by 15 students and the other by 16 students, all Stanford
University undergraduates who were either paid or given course credit. The task
took less than 15 minutes.
Results
The mean rank for each response is shown in Table 3. Within each set the
responses are listed from most to least polite. The differences within each
set were tested by the Friedman analysis of variance by ranks (Siegel, 1956).
Of the 16 analyses, 14 were significant at the 0.001 level and one at the
0.01 level. The only set not significant
was the set of noncompliant
responses to Would you tell me your name? We will take up the most robust
average of 1.03 ranks. When the two other pairs of responses with and
without explanations
are included in this comparison, explanations
had an
edge of 1.25 ranks.
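For readers unfamiliar with the Friedman analysis of variance by ranks used in these Results, the statistic can be sketched as follows. The rankings below are invented toy data, not the students' rankings, and the function names are hypothetical; the formula is the standard untied form, chi-square_r = 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1), referred to chi-square with k - 1 degrees of freedom.

```python
def mean_ranks(rank_rows):
    """Mean rank given to each response across raters (as listed in a table)."""
    n = len(rank_rows)
    k = len(rank_rows[0])
    return [sum(row[j] for row in rank_rows) / n for j in range(k)]


def friedman_statistic(rank_rows):
    """Friedman chi-square for n raters who each rank the same k items.

    Assumes every row is a complete ranking 1..k with no ties.
    """
    n = len(rank_rows)
    k = len(rank_rows[0])
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(k)]
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3 * n * (k + 1)


# Hypothetical data: four raters rank four responses (1 = most polite).
toy_rankings = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
print(mean_ranks(toy_rankings))
print(friedman_statistic(toy_rankings))
```

The more the raters agree on the order of the responses, the larger the statistic; with complete disagreement the rank sums are equal and the statistic is zero.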
Discussion
The attentive response, these data tell us, is a polite response. For Can you
tell me what time it is?, B could reply simply Six. He will be more polite,
however, if he: (1) makes his information clearer with It's six; (2) answers
the literal question with Yes, or more clearly with Yes, I can; and (3) softens
the formality of this literal answer with Sure. If he intends not to comply,
he will be more polite if he: (4) apologizes with I'm sorry; and (5) gives an
explanation with I don't have a watch. Each added move signals more
concern with A's full request. Some of them are attentive to the indirect
meaning, and others to the literal meaning.
If to be polite B has to be attentive to A's literal meaning, then he must be
computing both the literal and the indirect meaning. He must be using a
multiple-meaning process, not an idiomatic process. Is this conclusion
justified? Not completely. It might be argued that just as there are conventional
ways of making indirect requests, there are conventional ways of responding
to them politely. The link between the two is historically based but by now
entirely conventional. By this argument, B could be using an idiomatic
process. However, in Experiment 1, we found reasons for doubting such an
idiomatic hypothesis for indirect requests, and the same reasons should
make us suspect the idiomatic hypothesis for responses. Experiments 3 and
4 were designed to dissect this argument more incisively.
Experiment 3
The politeness of a response need not work the same way for every indirect
request. For example, while a literal move may add politeness for one
indirect request, it may not do so for another. In this experiment we will
take up two factors that should affect response politeness. We will use the
18 request types in Table 1.
The first factor is conventionality. Indirect requests, according to Clark
(1979), Morgan (1978), and Searle (1975), differ in how conventionally
they are used for making requests. Although Can you tell me the time? and
Is your watch still working? can both be used in the right circumstances for
requesting the time, the ordinary, usual, or conventional form for that
purpose is Can you? and not Is your watch? These two indirect requests differ
in conventionality, and so do the 18 requests in Table 1.
The politeness of a response should depend on conventionality. According
to Clark (1979), the conventionality of an indirect request is one piece of
information B uses in deciding whether or not to take that utterance as a
request. Because Can you? is highly conventional as a request, B can be
fairly confident that it is indeed being used to request the time and not
merely to ask a question, and hence that he is expected to comply. By the
attentiveness hypothesis, it would be impolite of him not to comply. But
because Is your watch? is not conventional as a request, he cannot be so
confident that it is being used as a request and that he is expected to comply.
This utterance may not be a request at all, so it wouldn't be so impolite to
answer it literally and do nothing more. The prediction, therefore, is this:
The more conventional the indirect request, the more polite B is to provide
the requested information. This prediction is tested in Experiment 3.
The second factor is the politeness of the literal move of the response.
For each request in Experiment 2, a response with a literal move (e.g., Yes,
I can) was more polite than a response without. But how much politeness
should a literal move add? That depends, we propose, on what the literal
move asserts. Compare Can you tell me? and May I ask you? from Table 1.
In response to the first, the literal move Yes, I can is really an abbreviation
of the assertion I can tell you where Jordan Hall is. In response to the
second, the literal move Yes, you may is an abbreviation for You may ask
me where Jordan Hall is. Of these two assertions, the first would ordinarily
be more polite among peers. The second presumes B has the authority to
permit or forbid A's asking where Jordan Hall is, whereas the first doesn't
presume much at all. When the literal moves to the 18 requests in Table 1
are each spelled out this way, they will vary in how polite they are judged
as assertions. We propose that the more polite the assertion, the more
politeness that literal move should add to the response as a whole. This prediction
is also tested in Experiment 3.
Experiment 3 is therefore divided into three parts. In Experiment 3a,
people were asked to rate the 18 requests in Table 1 for conventionality. In
Experiment 3b, other people were asked to rate the assertions corresponding
to the literal moves in responses to these same requests for politeness. And
in Experiment 3c, still other people rated the full responses themselves for
politeness.
Experiment 3a
The 18 requests
Table 4.
Category      Request-type mean ranks     Category mean
Permission    8.6, 8.5, 7.6               8.2
Imposition    7.2, 9.6                    8.4
Ability       2.2, 2.5, 13.3, 3.8         5.4
Memory        15.0, 11.3, 13.7, 17.3      14.3
Commitment    6.8, 3.4, 12.4, 12.6        8.8
Obligation    15.2 (Shouldn't)            15.2
Note - Rank 1 is most conventional, and rank 18 least conventional.
The mean ranks of the 18 requests are listed in Table 4. The student
raters were highly consistent in their rankings. Kendall's coefficient
of concordance W was 0.76, p < 0.001. There was an average rank order
correlation of 0.73 between any two student raters.
The most conventional of the requests in Table 4 are Can you?, Could
you?, Would you?, and Do you know?, in which the category of ability
dominates. These requests are of middling politeness in Experiment 1. This
suggests that even though these mean ranks correlate 0.51 with the polite-
Experiment 3b
Corresponding to the literal moves in the responses to the 18 requests in
Table 4 are the 13 assertions in Table 5. As we stipulated in Experiment 3c,
May I? and Might I? both had the literal move Yes, you may; Can you?,
Could you? and Can't you? all had Yes, I can; and Will you?, Would you?,
and Won't you? all had Yes, I will. That is why there are five fewer
assertions than requests. Each assertion was typed on a separate file card, and
the deck was shuffled and presented to each of ten Stanford University
students with these instructions: On each card there is a different statement
a person might make in the middle of an ordinary conversation. Some of
these statements are polite things to say to someone in the middle of a
conversation and others are not so polite. We would appreciate your rank
ordering these 13 statements from most to least polite. Just put the cards
in the order you think is most to least polite to say to someone in the middle
of a conversation.
Table 5.
Assertion
Mean rank
Permission
10.5
9.6
Imposition
3.8
Ability
3.6
6.1
Memory
Commitment
Obligation
Note - Rank 1 is most polite, and rank 13 least polite.
1.5
6.1
11.4
3.2
12.1
1.6
7.1
7.8
The mean ranks of the 13 assertions are listed in Table 5. The raters were
highly consistent in their rankings. Kendall's coefficient of concordance W
was 0.73, p < 0.001; there was an average rank order correlation of 0.70
between any two students.
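The two consistency figures are linked by a standard identity: for m raters with untied rankings, the average Spearman correlation over all pairs of raters equals (mW - 1) / (m - 1). A minimal sketch, with hypothetical function names, shows how W is computed from a matrix of rankings and how the reported W = 0.73 for ten raters yields the reported average of 0.70.

```python
def kendalls_w(rank_rows):
    """Kendall's coefficient of concordance for m raters ranking n items.

    Assumes each row is a complete ranking 1..n with no ties.
    """
    m = len(rank_rows)
    n = len(rank_rows[0])
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12.0 * s / (m * m * (n ** 3 - n))


def mean_pairwise_rho(w, m):
    """Average Spearman correlation between pairs of raters, from W."""
    return (m * w - 1.0) / (m - 1.0)


# Experiment 3b: W = 0.73 from ten raters.
print(round(mean_pairwise_rho(0.73, 10), 2))  # prints 0.7
```

W runs from 0 (no agreement) to 1 (identical rankings), whereas the average pairwise correlation can fall slightly below zero, which is why the two figures differ.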
These rank orders make good sense. The more an assertion benefits and
doesn't cost A, the more polite it ought to be. So when B says that he has
the ability to provide the wanted information, or that it wouldn't be
difficult for him to do so, that should benefit A a great deal without any cost.
These indeed were the two most polite categories. On the other hand, telling
A that he intends to give the information regardless of her wishes, or that he
is obligated to give it to her, or that she has his permission to ask him for
it, or that she has forgotten to ask for it - all these cost A, and the assertions
should be correspondingly less polite. Indeed, they were.
Experiment 3c
Method
Thirty students were each given 54 pairs of requests and responses and were
asked to rate the politeness of each response on a 1 to 7 scale.
The 54 requests were the same as those used in Experiment 1, with three
examples for each of the 18 types of requests in Table 1. For each request
we composed three plausible responses. One had a full literal move followed
by the requested information; a second had only a half literal move, either
yes or no, whichever was appropriate for compliance; and a third consisted
of the requested information alone. The three responses to Could I ask you
who ate all the eggs? were: (1) Yes, you can. It was my boyfriend. (2) Yes.
It was my boyfriend. (3) It was my boyfriend. These will be called the full,
half, and null literal responses, respectively. As mentioned earlier, we used
the indicative can, will, and may instead of the subjunctive could, would,
and might for the literal moves, except for Would you mind? and Would it
be too much trouble?, where we retained would.
The 54 responses each student rated consisted of one full, one half, and
one null literal response to each of the 18 types of request in Table 1. The
assignment of the full, half, and null responses to the 54 requests was
counterbalanced in a Latin square design over three groups of ten subjects
each. The 54 requests paired with their responses were typed in random
order 18 to a page, the request on one line and its response on the next,
and the pages were shuffled for each student.
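One way to realize such a counterbalancing scheme is sketched below. The rotation rule is an assumption made for illustration, since the text does not say which Latin square was used, and all names in the code are hypothetical.

```python
RESPONSE_TYPES = ["full", "half", "null"]
N_REQUEST_TYPES = 18   # request types in Table 1
N_EXAMPLES = 3         # examples of each type, giving 54 requests
N_GROUPS = 3           # groups of ten students each


def response_type(example, group):
    """3 x 3 Latin square: shift the response type by one step per group.

    ASSUMPTION: a simple cyclic rotation; the authors' actual square is
    not given in the text.
    """
    return RESPONSE_TYPES[(example + group) % len(RESPONSE_TYPES)]


def build_design():
    """Map (group, request_type, example) to the response type shown."""
    return {(g, t, e): response_type(e, g)
            for g in range(N_GROUPS)
            for t in range(N_REQUEST_TYPES)
            for e in range(N_EXAMPLES)}


design = build_design()
# Within a group, each request type appears once with each response type.
print(sorted(design[(0, 0, e)] for e in range(N_EXAMPLES)))
# Across groups, each individual request appears once with each response type.
print(sorted(design[(g, 0, 0)] for g in range(N_GROUPS)))
```

Under this rotation every student sees one full, one half, and one null response per request type, and every request is paired with each response type in exactly one group.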
Table 6.
Request type / Response type (Full, Half, Null) / Means / Category means
Permission
2.61
2.80
2.93
3.30
2.90
3.21
3.83
3.63
3.60
3.18
3.11
3.27
3.19
Imposition
2.80
2.70
3.51
3.20
4.03
4.00
3.47
3.30
3.38
Ability
2.53
2.83
2.87
2.87
3.30
3.13
3.20
3.21
3.90
4.20
4.13
3.13
3.16
3.39
3.40
3.29
3.31
Memory
3.17
3.23
2.93
4.07
4.30
4.10
3.93
4.13
2.90
2.80
3.10
3.10
3.60
3.67
3.80
3.90
3.68
3.58
3.50
3.96
3.22
3.17
3.31
3.39
3.68
Commitment
3.57
3.40
3.63
3.67
3.17
3.03
3.03
3.17
Obligation
Shouldn't
3.21
3.33
4.10
3.57
3.57
2.98
3.26
3.92
3.38
Overall means
3.27
As predicted, the mean response politeness for the 18 request types
(column 4 in Table 6) correlated very highly with the mean conventionality
for the same 18 requests (Table 4). The correlation was 0.72, min F(1, 76)
= 19.40, p < 0.001. The variance in response politeness not accounted for by
conventionality was not significant, min F(16, 76) = 1.13. Although the
correlation between response politeness and request politeness (Table 2)
was a moderate 0.42, when conventionality was partialled out, this
correlation reduced to a negligible 0.09. There was virtually no correlation, 0.19,
between response politeness and the politeness of the literal assertion (Table
5). The main predictor of response politeness was conventionality: the more
conventional the request, the more polite it was for B to provide the wanted
information.
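The partialling step can be reproduced from the reported correlations with the standard first-order partial correlation formula. The sketch below rests on one assumption: that the 0.51 correlation mentioned under Experiment 3a is the correlation between conventionality and request politeness. The function name is hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_response_request = 0.42      # response politeness vs request politeness
r_response_convention = 0.72   # response politeness vs conventionality
r_request_convention = 0.51    # ASSUMPTION: the 0.51 cited under Experiment 3a

residual = partial_correlation(r_response_request,
                               r_response_convention,
                               r_request_convention)
print(round(residual, 2))  # prints 0.09
```

Under that assumption the formula returns the 0.09 reported in the text, which suggests that request politeness predicts response politeness only through its overlap with conventionality.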
Overall, the half and full literal moves - for example, Yes and Yes, I can
- each added politeness to the response with no literal move. The half
literal moves added an average of 0.67 units, and the full literal moves
another 0.29 units. Both increases were significant, min F(1, 75) = 16.91,
p < 0.001, and 2.97, p < 0.05, respectively. These data reinforce
Experiment 2 in showing that the more complete the literal move in general,
the more polite the response.
The politeness added by the full literal move, however, varied from
0.06 units for Do I know? to 1.37 units for Can you tell me? and Could
you tell me? As predicted, this variation was highly correlated with the
politeness of the assertion made by the literal move (see Table 5). The
correlation was 0.73, which is highly significant, F(1, 17) = 19.39, p <
0.001. The conventionality of the request, however, was also moderately
correlated, 0.43, with the increase in politeness from the literal move,
F(1, 17) = 3.48, ns. With both assertion politeness and conventionality as
predictors, the multiple correlation is 0.81.
Which part of the full literal move accounts for these variations in added
politeness - the affirmation or denial yes or no, or the elliptical assertion
I can, You may, or whatever? Let us call these two parts yes/no and
assertion fragment. The increase from the yes/no alone correlated a negligible
0.22 with assertion politeness. But the increase from the assertion
fragment correlated 0.70 with assertion politeness. This correlation is only
slightly less than the 0.73 correlation for the increase from the full literal
move. The correlations for conventionality follow the same pattern, being
0.12 and 0.42, respectively. It is the assertion fragment, then, that seems to
account for how much politeness is added by the full literal moves.
Discussion
According to these results, the politeness of responses to indirect requests
fits the attentiveness hypothesis. First, the more conventionally a sentence is
used for making requests, the clearer it should be that A wants certain
information, and the more polite B should be to provide it. That was confirmed.
For example, giving the requested information was more polite for the
conventional Can you tell me? than for the less conventional Have I already
asked you? Second, the more polite it is to assert what is literally being
asked, the more polite it should be to add the literal move. This too was
confirmed. Adding a pleasant Yes, I can in response to Can you tell me?
increased politeness more than did adding an insulting No, you don't in
response to Do I know?
Literal moves like Yes, I can and No, you don't, we noted, divide into two
parts - the yes/no and the assertion fragment. It was largely the assertion
fragment that governed how much politeness was added. There are two
possible reasons for this. The most obvious is that I can and You don't are
clearer than the bare yes or no about what B is asserting with the literal
move. A less obvious reason is that yes and no alone may be ambiguous.
Yes in response to Can you tell me? might indicate either Yes, I can tell
you, which is the assertion fragment, or Yes, I'll tell you if you like,
which is not. The second sense indicates a mere intention to comply, which
shouldn't vary so much from one request to the next.
These findings implicate literal meaning even more than before. If B wants
to respond to A's indirect request politely, he must hear at least the
literal form of her request. Without that, he has no way of figuring out which
literal move to include. But to account for Experiment 3, he must truly
understand her literal meaning. He needs this in order to decide
whether or not it would be polite to include the literal move. In short, he
is required to use a multiple-meaning rather than an idiomatic process.
Experiment
request, like Can you tell me?, she is very likely signalling that she doesn't
intend the literal meaning to be taken seriously - it is merely pro forma -
and so B isn't expected to deal with it explicitly. But when she uses a less
conventional form, like Have I already asked you?, she may well intend the
literal meaning to be taken seriously, and if B is to be polite, he ought to
deal with it explicitly. This theory leads to a straightforward prediction:
The less conventional the request, all other things being equal, the more
likely B will take the literal meaning seriously and the more likely he will
include the literal move.
But as we showed in Experiment 3, it isn't always so polite to include
the literal move, since this may make B sound presumptuous or superior.
It wouldn't be particularly polite to tell A that she doesn't know where
Jordan Hall is, which is what the literal move for Do I know? would do.
Accordingly, the more polite the literal move is, the more likely it should
be included. But these considerations come into play when B is thinking
of including the literal move anyway. That is, the predictions based on
politeness of the literal move should merely modify the predictions based
on conventionality that we just presented.
Finally, there is the ellipsis of the response. A complete sentence like
It is up the street is ordinarily deemed more polite than an incomplete
one like Up the street (see Experiment 2). If people trying to be polite
know this, then they ought to turn incomplete sentences like Up the street
into complete ones like It is up the street.
Method
Thirty Stanford University undergraduates were each given 54 requests
paired with responses that provided only the information requested.
Example:
A. Can you tell me where your parents are sitting?
B. They're in the front row.
For half the students, all of B's responses were expressed in complete
sentences, as in this example. For the other half, all of them were expressed in
fully appropriate but incomplete sentences, such as In the front row. The
students were asked simply to revise each response to make it more polite
and to write their revision on the blank line below B's response. The 54
requests were the same as those used in Experiments 1, 3a, and 3c. They
were typed, in the format just given, six to a page on nine mimeographed
sheets in random order, and the nine pages were given to each student in a
random order.
The most obvious outcome was that there was an almost universal tendency
to fill out the information requested. Fully 92% of the incomplete sentences
given to the one group of students were turned into complete sentences. And
although the complete sentences given to the other group of students could
have been turned into perfectly acceptable incomplete sentences (by revising,
for example, They're in the front row to In the front row), only 2% of them
were. Indeed, the sentences for both groups of students tended to be filled
out with material that was redundant with the request. Pronouns tended
to be turned into complete noun phrases, as when They're in the front row
was revised to My parents are in the front row, and missing verb phrases
tended to be filled in, as when My roommate did was revised to My roommate
cut my hair. There was a strong consensus that to be more polite, one
should be clearer and more explicit about the information provided.
Otherwise, the two groups of students didn't differ reliably, and so for the
remaining discussion they will be combined.
Table 7. The most frequent literal moves and the percentage of people
supplying a literal move in responding to 18 types of requests (Experiment 4)

Category     Request type   Most frequent literal moves               Percentage
                            Half           Full                       literal moves
Permission                  Sure.                                     49
                            Sure.                                     56
                            Yes.                                      41
Imposition                  Not at all.    No, I wouldn't.            51
                            Not at all.    Of course, it wouldn't.    82
Ability                     Sure.          Sure I can.                48
                            Yes.           Yes, I can.                33
                            Sure.          Sure I can.                68
                            Yes.           Yes, I do.                 52
Memory                      No.                                       64
                            No.                                       66
                            No.                                       61
                            Yes.                                      54
Commitment                  Yes.           Yes, I will.               41
                            Sure.          Sure, I could tell you.    48
                            Sure.          Sure, I'll tell you.       52
                            Sure.          Yes, I do.                 56
Obligation   Shouldn't      Yes.           Yes, I should.             59
Although the bare responses presented to the students did not contain
literal moves, many of their revisions did. Each of the 1620 revisions
(30 students x 54 requests) was checked for this feature, and the percentage
for each request type is shown in Table 7. These percentages provide rather
striking confirmation of our predictions. First, there was a 0.57 correlation
between the percentages of literal moves in Table 7 and the conventionality
ranks of each request type from Experiment 3a (Table 4). This correlation
accounted for a highly significant proportion of the variance among the
percentages in Table 7, F(1,42) = 11.72, p < 0.005. Second, there was a -0.24
correlation between these percentages and the politeness ratings of the
corresponding literal moves from Experiment 3b (Table 5). This correlation,
however, is spuriously low because of the correlation between conventionality
and politeness themselves. With conventionality partialled out, as our
prediction requires, the correlation between the percentages in Table 7 and
the politeness ratings of the literal move rises in magnitude to -0.50.
This too accounts for a significant proportion of the variance,
F(1,42) = 6.08, p < 0.05. The variance not accounted for by these two factors
is not significant, F(15,42) = 1.23. In short, the less conventional
the request, the more literal moves were added, and then the more polite the
literal move, the more often it was added.
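
The partialling step described here can be illustrated with the standard first-order partial correlation formula. What follows is only a sketch under stated assumptions: the function name `partial_corr` is ours, and the correlation between conventionality and politeness (`r_yz`) is a hypothetical placeholder, since its value is not reported in this passage. Only the 0.57 and -0.24 figures come from the text.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# x = percentage of literal moves, y = politeness rating of the literal move,
# z = conventionality rank of the request type.
r_xy = -0.24  # reported in the text
r_xz = 0.57   # reported in the text
r_yz = 0.30   # HYPOTHETICAL value, chosen only for illustration

print(round(partial_corr(r_xy, r_xz, r_yz), 2))
```

In this sketch, a positive correlation between the two predictors makes the partial correlation more negative than the raw -0.24, which is the direction of change the text reports.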
There was other evidence that the students were sensitive to the literal
meanings of the requests, some of it so obvious that it hardly needs to be
pointed out. In Table 7 are listed the most frequent half and full literal
moves that turned up in the revisions. These show that the literal moves the
students selected were chosen because they were appropriate to the literal
meanings of the requests. Consider the half moves first. Most of the requests
- 13 of them - were answered with yes or sure. The five that were answered
no were just the ones for which a negative answer was appropriate. And
among these five, only Would you mind? and Would it be too much trouble?
were provided with Not at all, which wouldn't have been appropriate as
literal answers to the other three. Then consider the full moves. In them the
use of can, may, will, do, didn't, haven't, wouldn't, and shouldn't was
always appropriate to the literal question asked. May I ask you? was
answered with you may and not I will, while Will you tell me? was answered
with I will and not you may. Yet the auxiliary verb in the question - can,
may, haven't, and the like - is not always appropriate for a literal move of
compliance. Accordingly, Might I ask you? was answered with you may,
not you might, and Would you tell me? with I will, not I would. The
students didn't turn the literal questions into answers by a mechanical
algorithm. They chose literal moves appropriate to what they intended to
convey.
This conclusion is even more evident in the literal moves not listed in
Table 7. Consider those for the permission requests. Generally, it isn't
terribly polite to assert You may ask me where Jordan Hall is. To soften
its authoritarian tone, the students used marks of reassurance - of course,
certainly, and sure - fully 64% of the time. Nor is it very polite, for the
memory requests, to assert I haven't told you where Jordan Hall is. To
soften this move, the students often used such hedges as I may have
forgotten to, I don't think I have, and I'm not sure. These relieve the implicit
criticism that is otherwise conveyed by a bald no. For the imposition
requests, on the other hand, it is all right to assert It wouldn't be too much
trouble to tell you where Jordan Hall is, but even better to be more
insistent, as many students were in such moves as No trouble at all, Certainly
not, and Of course not. The critical point is that there are several ways of
hedging, softening, and strengthening literal moves, and they are not
interchangeable. Which way is appropriate depends on the meaning of that
particular literal move.
These findings argue even further for a multiple-meaning process, since
the literal meaning of the request was used in so many ways. It was used
initially by the students in deciding whether or not to make a literal move.
Then it was used in selecting the right form of that move and in deciding
how to strengthen or soften that move appropriately. It seems difficult to
account for this constellation of decisions with a process that used the
indirect meaning and nothing more.
General Discussion
It is time now to draw out the three main threads that have been running
through these experiments: the politeness of requests, the politeness of
responses to requests, and understanding indirect requests.
The politeness of indirect requests
you mind? She can give B the chance to say that he is unable to carry out
the request, as in Can you tell me? And so on. These devices are graded in
their costs and benefits, and their politeness follows suit.*
This neat picture is complicated by conventionality. If literal meaning
were the sole determinant of politeness, then Can you tell me? and Are you
able to tell me?, whose literal meanings are roughly synonymous, ought to
be equally polite. But they aren't. While both of them ask B whether or not
he has the ability to give the wanted information, Are you able to tell me?
signals that A more likely intends the question to be taken seriously and
expects B to respond with a literal move (Clark, 1979, Experiment 3). As
such, its literal meaning is a deliberate request for another piece of
information, which should cost B something. So Are you able to tell me? should
be slightly less polite than Can you tell me? Similar logic applies to the other
categories of request types too.
In an informal experiment similar to Experiment 1, we asked ten students
to rank order for politeness the following indirect requests (each of which
was completed with where Candlestick Park is):
1.
2.
3.
4.
5.
6.
7.
8.
2 The request forms we used, of course, can take on ironic, sarcastic, or even
impudent meanings when uttered in just the right contexts. In assuming requests
among acquainted peers, the students in our experiments appear also to have
assumed ordinary contexts in which the requests have their usual meanings. It
is an important question, however, when and how these requests take on ironic,
sarcastic, or impudent meanings.
times composed by people who had not computed the literal meaning. On
these occasions, the requests were understood in the same idiomatic way we
suggested How do you do? is ordinarily understood.
The critical question for indirect requests, then, is under what conditions
could an idiomatic process be used. Such a process requires two things.
First, it requires the form of the indirect request to be conventional enough
to be recognized as a request. This requirement is satisfied by many indirect
requests (see Clark, 1979). Indeed, the same requirement is needed in a
multiple-meaning process to account for how seriously the literal meaning is
to be taken. Second, it requires that, on the occasion on which the request is
uttered, politeness and other things associated with the literal meaning do
not matter to the listener. For indirect requests, it isn't obvious whether
this second requirement is ever satisfied.
Politeness almost always matters - if only by default. In our experiments,
it mattered a great deal since that was what the students were asked to judge.
But in ordinary circumstances, it matters too. People appear to have strong
expectations in each kind of circumstance about the forms of request A
would ordinarily use. When asked for the time, for example, B might expect
the highly conventional Can you tell me the time?, which asks about his
abilities. When A uses a form he does not expect, regardless of how
conventional it is, he takes her as signalling, by her contrast in form, a
contrast in meaning. If she had used Would you tell me the time?, querying his
conditional intentions instead, he should see that she had perhaps expected him
to tell her the time and was wondering why he hadn't. Unlike the contrast in
meaning between the idioms Hi and How do you do?, the contrast here is
signalled by the difference in literal meaning. Our conjecture is this: Any
contrast with the default, or expected, form of request indicates a contrast
in meaning; if B is ever to recognize that contrast, it must be on the basis
of the literal meaning via a multiple-meaning process.
Even aside from politeness, highly conventional forms of indirect requests
are not interchangeable from one situation to the next. In asking B for his
middle name, for example, A could use the highly conventional Could you
tell me your middle name? but not the equally conventional Do you know
your middle name? The second request is odd because of its literal meaning,
which supposes that B might not know his middle name. There are probably
subtle contrasts like this between virtually any two indirect requests that
can be made in a particular circumstance. To show that B uses an idiomatic
process in any of these circumstances, we would have to show that he is
indifferent to subtle distinctions conveyed by the literal meanings - for
example, that he isn't stopped for even the slightest moment by the oddness
of Do you know your middle name? Such a hypothesis should be difficult to
prove.
Thus, the idiomatic processes, however promising they look at the outset,
should not be assumed too readily. In one field experiment (Clark, 1979,
Experiment 1), 50 merchants were telephoned and asked Could you tell me
the time you close tonight? Only four of them, or 8%, included a literal
move in their response. One might be tempted to conclude that the other
92% had used an idiomatic process. Yet in another field experiment (Munro,
1977), students on the UCLA campus were approached and asked Could you
tell me the time?, virtually the same request. Of these, 57% included a literal
move, presumably because the face-to-face situation led them to be more
polite. One might now be tempted to conclude that people use an idiomatic
process except when they anticipate they will have to be particularly polite.
But if politeness is an inherent part in every interchange of this sort, as it
seems to be, it is more parsimonious to conclude that people use a
multiple-meaning process regardless.
References
Abelson, R. P., & Tukey, J. W. (1963) Efficient utilization of non-numerical
information in quantitative analysis: General theory and the case of simple
order. Annals of Mathematical Statistics, 34, 1347-1369.
Bolinger, D. L. (1975) Aspects of language (2nd ed.). New York, Harcourt Brace
Jovanovich.
Brown, P., & Levinson, S. (1978) Universals in language usage: Politeness
phenomena. In E. Goody (Ed.), Questions and politeness. Cambridge, Cambridge
University Press, pp. 56-324.
Clark, E. V., & Clark, H. H. (1979) When nouns surface as verbs. Lang., 55,
767-811.
Clark, H. H. (1973) The language-as-fixed-effect fallacy: A critique of
language statistics in psychological research. J. verb. Learn. verb. Behav.,
12, 335-359.
Clark, H. H. (1979) Responding to indirect speech acts. Cog. Psychol., 11,
430-477.
Clark, H. H., & Lucy, P. (1975) Understanding what is meant from what is said:
A study in conversationally conveyed requests. J. verb. Learn. verb. Behav.,
14, 56-72.
Gibbs, R. W. (1979) Contextual effects in understanding indirect requests.
Discourse Processes, 2, 1-10.
Goffman, E. (1955) On face-work: An analysis of ritual elements in social
interaction. Psych., 18, 213-231.
Goffman, E. (1967) Interaction ritual: Essays on face-to-face behavior. Garden
City, NY, Anchor Books.
Gordon, D., & Lakoff, G. (1971) Conversational postulates. In Papers from the
Seventh Regional Meeting, Chicago Linguistic Society, pp. 63-84.
Green, G. M. (1975) How to get people to do things with words: The
whimperative question. In P. Cole and J. L. Morgan (Eds.), Syntax and
semantics, Vol. 3: Speech acts. New York, Seminar Press, pp. 107-141.
Heringer, J. (1972) Some grammatical correlates of felicity conditions and
presuppositions. Working Papers in Linguistics (The Ohio State University),
11, 1-110.
Lakoff, R. (1973) The logic of politeness; or, minding your p's and q's. In
Papers from the Ninth Regional Meeting, Chicago Linguistic Society,
pp. 292-305.
Lakoff, R. (1977) What you can do with words: Politeness, pragmatics, and
performatives. In A. Rogers, B. Wall, and J. P. Murphy (Eds.), Proceedings of
the Texas Conference on Performatives, Presuppositions, and Implicatures.
Arlington, Va., Center for Applied Linguistics, pp. 79-105.
Summary
Indirect requests can be phrased more or less politely. For example, Can you
tell me where Jordan Hall is? is more polite than Shouldn't you tell me where
Jordan Hall is? One theoretical approach proposes that the more the literal
meaning of the request implies personal benefits for the listener, within
reason, the more polite the request. This prediction is confirmed by
Experiment 1.
Responses to indirect requests also vary in politeness. For Can you tell me
where Jordan Hall is?, the response Yes, I can - it's up the street is more
polite than It's up the street. An extension of the theory predicts that the
more attentive the responder is to all of the meanings implied by the request,
the more polite the response. Experiments 2, 3, and 4 confirm this prediction.
From this evidence, we propose that people compute both the direct and the
indirect meanings of indirect requests. This is necessary for recognizing
when the speaker is or is not being polite, and for responding politely,
impolitely, or neutrally.
Cognition, 8 (1980) 145-174
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
of Chicago
WILLIAM TURNBULL
Simon Fraser University
Abstract
Previous studies of semantic memory have overlooked an important distinction
among so-called property statements. Statements with relative adjectives
(e.g., Flamingos are big) imply a comparison to a standard or reference
point associated with an immediate superordinate category (a flamingo is big
for a bird), while the truth of statements with absolute adjectives (e.g.,
Flamingos are pink) is generally independent of such a standard. To examine
the psychological consequences of this distinction, we asked subjects in
Experiment 1 to verify sentences containing either relative or absolute
adjectives embedded in either predicate-adjective (PA) constructions (e.g.,
A flamingo is big (pink)) or predicate-noun (PN) constructions (e.g., A
flamingo is a big (pink) bird), where the predicate noun was the immediate
superordinate. Reaction times (RTs) and errors for relative sentences decreased
when the superordinate was specified, but remained constant for absolute
sentences. These data also suggest that the truth value of relative sentences
depends, not just on the superordinate, but also on a more global standard
for everyday, human-oriented objects. Experiment 2 extends these results in
showing that ratings of the truth of relative sentences are a function of the
difference in size between an instance and its superordinate standard (e.g.,
between the size of a flamingo and that of an average bird) and the difference
between the instance and the standard for everyday objects. Experiment 3
replicated these findings using reaction time as the dependent measure.
L. J. Rips and W. Turnbull
Two major views have evolved about the way we remember the properties
of common objects. On one hand, most theories of semantic memory
(e.g., Anderson, 1976; Collins & Loftus, 1975; and Kintsch, 1974) represent
properties as unitary mental predicates. According to such theories, we are
able to recall property information - for example, that flamingos are pink -
because a predicate for pink is attached to the concept flamingo in
long-term memory. A predicate may not be stored directly with every concept to
which it applies, and in these cases, recall of the property may require
memory search. Nevertheless, the predicates themselves are atomic, having
no underlying semantic structure.
On the other hand, theories of mental comparison (e.g., Moyer, 1973;
Paivio, 1975) imply that property information is calculated rather than
simply stored and retrieved as a unit. For example, these theories claim that
in order to decide whether a flamingo is larger than an eagle, we compare
their respective values along a mental scale for size. While we could store
the complex predicate is-larger-than-an-eagle with flamingo, this possibility
is seen as unlikely for both empirical and theoretical reasons. (Consider the
enormous number of comparative statements we know to be true - flamingos
are also larger-than-turnips, larger-than-clothes-pins, and so on.) Although
some comparison theories allow relations to be stored intact (see Banks,
1977), property information is generally computed rather than pre-stored
(in the terminology of Smith, 1978).
There are several factors that could account for this difference in the way
properties are characterized. First, different constructions are involved since
in semantic memory experiments subjects are asked to verify sentences
containing simple (one-place) predicates such as Flamingos are pink, while in
mental comparison experiments subjects verify two-place relations (Flamingos
are larger than eagles). It may be that one-place predicates are pre-stored,
and two-place predicates computed. But a second, and perhaps more
important, difference is the type of properties that have been employed.
Semantic memory has focused on absolute adjectives (e.g., those denoting
color, such as pink), while mental comparison has employed relative
adjectives such as large (Katz, 1972, pp. 254-261).
To see why the relative/absolute distinction might be important, we begin
by describing some linguistic and logical differences between these adjective
types. Next, we consider possible psychological mechanisms for representing
this distinction. Finally, we report three experiments whose goal is to
examine these mechanisms. While the experiments are solidly in the semantic
memory tradition in using sentences of the form S-V-Adj, we explore the
differences between relative and absolute properties in Experiment 1 by
varying the adjective (e.g., Flamingos are pink versus Flamingos are large).
In Experiments 2 and 3 we show that symbolic distance effects, like those
found in studies of mental comparison, can also be obtained in a semantic
memory context.
(1) a. A grasshopper is a large insect.
    b. A grasshopper is a large animal.
(2) a. A grasshopper is a green insect.
    b. A grasshopper is a green animal.
Despite the fact that grasshoppers are insects and insects animals, being a
large insect does not mean being a large animal; the attribution of a relative
property like large does not automatically generalize to superordinates
(Vendler, 1968, p. 96). However, absolute adjectives like green do permit
generalization of this sort, so that (2a) entails (2b) and, in addition, any
other sentence in which a more inclusive superordinate (e.g., object) is
substituted for the predicate noun.
One possible way to explain this difference is to assume that relative
adjectives convey an implicit reference to a norm or standard associated
with the modified noun. (This idea is traceable to Leibniz - see Wierzbicka,
1972 - and appears in the work of many modern semanticists, e.g., Bierwisch,
1967; Fillmore, 1971; Katz, 1972; Langford, 1942; Ross, 1930; Sapir, 1944;
Vendler, 1968). Large insect in (1a) means that the designated insect is larger
than some normal size for insects. Since what is the normal size for insects
will be different from the normal size of animals and other objects, a
creature that's a large insect (i.e., large for insects) may be small relative to
other animals or objects. For this reason, (1b) does not follow from (1a).
On the other hand, standards for absolute adjectives do not shift (or do not
shift as much) from one noun class to another. So while objects may be
more or less green, what is green about insects will be green with respect to
other things as well. This means that (2b) will be true on the basis of (2a).
For convenience in discussing relative adjectives, let's call the implicit
norm the reference point, and the associated category the reference
class with respect to the adjective in question. For example, in the phrase
large insect, insect provides the reference class and the normal size of insects
provides the reference point from which largeness is determined.
We note that the reference class is not always explicitly mentioned when a
relative adjective is used. To see how the reference class is determined in
such situations, we can examine the sentences in (3) and (4):
(3) a. This insect is small.
    b. Insects are small.
(4) a. This insect is a small insect.
    b. Insects are small animals.
When a singular term is the sentence subject, as in (3a), the noun itself
provides the appropriate reference class for the predicate adjective, so that
(3a) is synonymous with (4a). This is not true, however, when the subject is an
unmodified plural noun, as in (3b), as is clear from the fact that (3b) is not
equivalent to Insects are small insects. According to a proposal by Bierwisch
(1971) and Katz (1972), the appropriate reference class in this situation is
the immediate superordinate of the subject. Assuming animal to be the
immediate superordinate of insect, we predict that (3b) means the same as
(4b). Since this proposal for determining the implicit reference class will be
important in what follows, we will label it the Immediate Superordinate
hypothesis.
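
The rule for choosing the implicit reference class in sentences like (3a) and (3b) can be written out as a small sketch. The taxonomy table, the function names, and the naive pluralization below are all assumptions made for illustration; none of them comes from the paper.

```python
# HYPOTHETICAL taxonomy: each noun maps to its immediate superordinate.
IMMEDIATE_SUPERORDINATE = {"insect": "animal", "flamingo": "bird"}


def reference_class(noun: str, singular_term: bool) -> str:
    """Return the implicit reference class for a relative predicate adjective."""
    if singular_term:
        # Singular-term subject, as in (3a): the noun itself is the class.
        return noun
    # Unmodified plural subject, as in (3b): use the immediate superordinate.
    return IMMEDIATE_SUPERORDINATE[noun]


def explicit_paraphrase(noun: str, adjective: str, singular_term: bool) -> str:
    """Spell out the reference class, turning (3a)/(3b) into (4a)/(4b)."""
    ref = reference_class(noun, singular_term)
    if singular_term:
        return f"This {noun} is a {adjective} {ref}."
    # Naive "-s" plural; good enough for the nouns in this sketch.
    return f"{noun.capitalize()}s are {adjective} {ref}s."


print(explicit_paraphrase("insect", "small", True))   # This insect is a small insect.
print(explicit_paraphrase("insect", "small", False))  # Insects are small animals.
```

The only design point is that the subject's form, not the adjective, selects the reference class, which is what the Immediate Superordinate hypothesis claims for unmodified plural subjects.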
This account of relative and absolute adjectives skims over many details
that would be required by a formal semantic theory (see R. Clark, 1970;
Cresswell, 1976; Kamp, 1975; Parsons, 1972; Wallace, 1972; and Wheeler,
1972, for attempts at such a theory). For example, it may be an
oversimplification to assume a strict dichotomy between relative and absolute
adjectives, since it is difficult to tell for many items in which class they
belong (extreme adjectives like gigantic and minuscule provide examples - see
Higgins, 1976; Huttenlocher & Higgins, 1971). Intuitively, there appears
to be a continuum
between relative and absolute types (Miller & Johnson-
(5) a. An insect is small.
    b. An insect is six-legged.
The Pre-storage model, on the other hand, verifies both sentences in the
same way (by retrieving predicates of insect) and therefore does not predict
a difference in time to confirm them.
Unfortunately, though, differences in frequency, imageability, and the
like confound the comparison of absolute and relative adjectives in sentences
such as (5a) and (5b), so a more indirect approach is necessary. One possible
test of the models that gets around this problem makes use of sentences of
the form An S is an Adj P, where the predicate noun P is the immediate
superordinate of the subject noun S. For example, corresponding to the
predicate-adjective sentences in (5), we have the following predicate-noun
sentences:
(6)
To see why sentences of this type are helpful, consider first the Computation
model. According to this approach, verifying both (5a) and (6a) means
retrieving the reference class animal. The two sentences differ only in that
(6a) specifies this class explicitly, while (5a) does not. Reading time for (6a)
will be longer than for (5a) because of this extra word. But this disadvantage
for (6a) may be offset if mentioning the reference class decreases the time
needed to access it. However, verification of (5b) or (6b) does not require a
reference class since the adjective six-legged is absolute. Adding the
superordinate in (6b) merely increases the number of words to be processed,
so (6b) provides no advantage over (5b). Thus, the Computation model predicts
an interaction between the syntactic form of the sentences
(predicate-adjective versus predicate-noun) and adjective type (relative
versus absolute).
By way of comparison, the Pre-storage model does not predict an interaction
for the sentences in (5) and (6). According to this model, retrieval
of the superordinate is not needed to determine the truth of either (5a) or
(5b) since both the relative predicate (small) and the absolute predicate
(six-legged) are stored with insect. Consequently, adding the superordinate
in (6a) and (6b) is redundant and should slow processing by an equal
amount.
Since these predictions are independent of factors like frequency, they
seem worth testing, and we proceed to do so in the following experiment.
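
The contrast between the two models' predictions can be made concrete with a toy calculation. This is a sketch only: the millisecond values, the `Params` fields, and the function names are hypothetical and were chosen to show the predicted pattern, not to reproduce any data. Main-effect differences between relative and absolute adjectives (frequency, imageability, and so on) are deliberately left out, since the argument turns on the interaction alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """HYPOTHETICAL timing parameters, in milliseconds."""
    base: float = 1000.0          # time to verify a predicate-adjective (PA) sentence
    extra_word: float = 50.0      # cost of reading the added predicate noun (PN form)
    access_saving: float = 80.0   # Computation model only: faster access to a named reference class


def predicted_rt(model: str, adjective: str, form: str, p: Params = Params()) -> float:
    """Predicted verification time for one cell of the 2 x 2 design."""
    rt = p.base
    if form == "PN":
        rt += p.extra_word
        # Only the Computation model lets the explicit superordinate speed
        # access to the reference class, and only for relative adjectives.
        if model == "computation" and adjective == "relative":
            rt -= p.access_saving
    return rt


def interaction(model: str, p: Params = Params()) -> float:
    """(PN - PA) for relative adjectives minus (PN - PA) for absolute ones."""
    rel = predicted_rt(model, "relative", "PN", p) - predicted_rt(model, "relative", "PA", p)
    ab = predicted_rt(model, "absolute", "PN", p) - predicted_rt(model, "absolute", "PA", p)
    return rel - ab


print(interaction("computation"))  # nonzero: form x adjective-type interaction
print(interaction("prestorage"))   # zero: both adjective types slowed equally
```

Whatever values are chosen for the parameters, the Pre-storage contrast stays at zero, while the Computation contrast equals minus the access saving.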
Experiment
reference class for relative sentences like (5a). Although they may not be the
most direct superordinates in a scientific taxonomy, nevertheless, they
appear to be the ones subjects would naturally use in verifying relative
statements. This assumption seems to us to preserve the spirit of the Immediate
Superordinate hypothesis.
To decide whether such sentences were true or false, we need to know
whether the normal value of the subject noun exceeds that of the superordinate with respect to the given adjective. We determined this by asking
another group of subjects to compare the referents of the two nouns along
a set of relative properties including size, width, thickness, height and length.
Method
Superordinate generation task
In the first preliminary study, we presented subjects with a list of nouns and
asked them to write below each a one-word category in which objects of that
type belonged. Subjects were told that for the word water they might write
liquid, and for steak, meat or food (neither of these examples appeared in
the experimental list).
To compose the lists, we began with a set of 426 nouns, most of them
drawn from Battig and Montague's (1969) category norms. Nouns were
chosen from 24 of the Battig-Montague categories (e.g., birds, flowers,
vehicles, etc.) that could plausibly be modified by both absolute and relative
adjectives denoting physical properties. In addition to the items from the
norms, we used nouns from four categories of our own: building (e.g.,
skyscraper), car (e.g., Cadillac), rodent (e.g., rat), and road (with the
instances drawn from the local area). We sampled from 3 to 31 items from each
category, attempting to eliminate unfamiliar or ambiguous items and to
maximize the range of properties among the items represented. Because of the
large number of items, we divided them randomly into two lists of 213 each.
The items on each list were themselves randomly ordered and typed in an
eight-page booklet. A blank line appeared beneath each item on which the
subject wrote his response.
We tested 22 subjects in a single group, half of them receiving one of the booklets and half the other. Subjects were asked to complete the task in an hour, and to help them keep pace, a signal was given after each eighth of an hour had elapsed. The subjects were recruited by an advertisement in the University of Chicago student newspaper and were undergraduate or graduate students or nonstudents of comparable age. All of them were native speakers of English, and each received two dollars for his participation. (Subjects in the remaining parts of the experiment were drawn from the same subject pool, but none was involved in more than one part.)
Rating task
Sentence verification task
The experimental procedure was identical for the PA and PN groups. A subject initiated a trial by pressing the central button of a three-button response panel with his right index finger. This button-press brought a fixation point into view on the left side of the tachistoscope field where it remained for two seconds. At the end of this interval, the stimulus sentence appeared automatically with its first letter in the position previously occupied by the fixation point. We had instructed the subject to read the sentence and to decide whether it was true or false. To register his decision, he pressed one of the two outer buttons of the response panel with his right index finger. For half the subjects in each group, the right-most button was labeled True and the left-most button False, while for the remaining subjects these positions were reversed. Subjects were instructed to execute their response as quickly as possible, but without making any mistakes. The second button-press ended the sentence presentation and stopped a clock that had been activated by the onset of the sentence. In the interval between trials (approximately 10 seconds), the experimenter informed the subject of his reaction time and of the accuracy of his response. The experimenter then recorded this information, replaced the stimulus card, and signaled that the subject could begin the next trial. At the very beginning of the experimental session, the subject was given 12 practice trials (6 true and 6 false ones) to acquaint him with the procedure. The practice sentences were of a variety of syntactic types, some similar to those of the experimental sentences; however, there was no overlap in the semantic content of the experimental and practice sentences.
Half the PA and half the PN sentences contained relative adjectives, taken from the six pairs of polar adjectives listed above. For each pair (e.g., large-small), we selected two of the superordinates (e.g., instrument and fruit) that had been rated with them, and for each superordinate, two items (e.g., flute and xylophone for instrument, and plum and grapefruit for fruit), one of which had been rated greater than the average and the other less than the average category member. We created an octet of sentences by combining these items with the two adjectives in both PA and PN form (e.g., A plum is (a) small (fruit), A plum is (a) large (fruit), A grapefruit is (a) small (fruit), and A grapefruit is (a) large (fruit)). There were a total of 12 octets, and within each, four of the sentences were PA and four were PN. Within these two syntactic types, two of the sentences were true and two were false. On the 11-point scale, the mean rating for the greater-than-average items was 6.5 and the mean for the less-than-average items was 3.5, SE = 0.10 (recall that 5.0 had been designated as the size of an average member of the category). Median word frequency was 3.5 and 2.5 tokens per million words for the greater-than-average and less-than-average items, and 212 tokens per million for the relative adjectives (Kučera & Francis, 1967).
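The octet construction described above can be illustrated with a short sketch. It is not the authors' procedure for preparing the stimulus cards; the function names make_octet and article are hypothetical, and the example simply reuses the plum-grapefruit items quoted in the text.

```python
from itertools import product


def article(word):
    """Pick 'a' or 'an' from the first letter (a rough heuristic, assumed here)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def make_octet(category, items, adjectives):
    """Build the eight sentences for one category:
    2 items x 2 adjectives x 2 syntactic forms (PA and PN)."""
    octet = []
    for item, adjective in product(items, adjectives):
        subject = f"{article(item).capitalize()} {item}"
        # Predicate-adjective form: "A plum is small."
        octet.append(("PA", f"{subject} is {adjective}."))
        # Predicate-noun form: "A plum is a small fruit."
        octet.append(("PN", f"{subject} is {article(adjective)} {adjective} {category}."))
    return octet


octet = make_octet("fruit", ["plum", "grapefruit"], ["small", "large"])
for form, sentence in octet:
    print(form, sentence)
```

Running the sketch for each of the two superordinates assigned to an adjective pair gives the two octets for that pair, and six adjective pairs then give the 12 octets mentioned in the text.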
The remaining sentences were formed in a similar way from the absolute adjective pairs fragrant-odorless, airborne-flightless, dark-pale, curved-straight, shiny-lusterless, and hilly-flat. The same 12 superordinate categories were employed with the absolute adjectives as with the relative adjectives, two of them being assigned to each adjective pair (e.g., tool and instrument were assigned to curved-straight). As before, two items were chosen from each category so that one item was true of the first member of the adjective pair and the other item was true of the second (e.g., pliers was chosen as the curved tool and screwdriver as the straight tool, while tuba was the curved instrument and piccolo the straight one). Octets of sentences were again generated by combining the two instances in each category with the two adjectives (e.g., A piccolo is (a) straight (instrument), A piccolo is (a) curved (instrument), A tuba is (a) straight (instrument), and A tuba is (a) curved (instrument)). Though drawn from the same categories, the individual items were different from those used with the relative adjectives. The median word frequency of these nouns was 5.5 tokens per million words, and the median frequency for absolute adjectives was 5.0.
Each sentence was typed in lower case Orator letters on a white 6 × 9 inch card. The length of the PA sentences varied from 32 to 65 mm (2.0 to 4.1 degrees of visual angle) while the length of the PN sentences varied from 55 to 85 mm (3.4 to 5.3 degrees). The sentences measured 3 mm (0.2 degrees) vertically. PA and PN sentences were separately randomized at the beginning of the experiment, and each set was reshuffled after it had been presented to a subject.
Forty-eight subjects participated in the sentence verification task, half in the PA and half in the PN group. All of the subjects were right-handed members of the subject pool described above. The experiment took about 45 minutes to complete (including a short break after the first 48 sentences), and subjects were paid two dollars each.
Our principal interest is in mean correct RTs and error rates for relative and
absolute adjectives. Figure 1 presents these data separately for the PA and
PN groups. We note first of all the large error rates in this experiment,
averaging 19.3%. An error rate of this size is, of course, unlikely to be caused
The Computation model predicts that the difficulty in verifying sentences with relative adjectives should be greater for PA than for PN constructions. In line with this prediction, error rates increased from 21.3% for relative PN sentences to 28.6% for relative PA sentences. At the same time, errors were very nearly constant for absolute adjectives across the PN condition (13.3%) and the PA condition (14.1%). The RTs exhibited a similar trend. With relative adjectives, subjects took 1920 msec to verify PN sentences, but 1980 msec to verify PA sentences. However, with absolute adjectives, RTs were about equal: 1958 msec for PN sentences and 1955 msec for PA sentences. To assess these effects, we carried out analyses of variance on the errors and RTs with both subjects and sentence octets serving as random effects (H. Clark, 1973; Winer, 1971, p. 375). In these analyses, the interaction between syntactic form and adjective type was significant in the error data, but not in the RTs. (For errors, SE = 1.5%, F(1,31) = 5.35, p < 0.05, where F is the quasi-F ratio, Winer's F. For RTs, SE = 31 msec, F(1,24) = 1.36, p > 0.10.)
Error rates were larger for sentences with relative adjectives (24.9%) than for those with absolute adjectives (13.7%), SE = 1.6%, F(1,13) = 12.96, p < 0.01. The Computation model easily accommodates this difference since it employs a more complex (and, presumably, a more error-prone) process in handling relative properties. However, as we remarked earlier, this difference is confounded by imagery and other variables. Moreover, no comparable difference appeared in the RTs. Subjects took 1950 msec to verify relative adjectives and 1956 msec to verify absolute adjectives, SE = 30 msec, F < 1. There was no significant main effect of syntactic form in either the error data (SE = 1.9%, F(1,51) = 2.51, p > 0.10) or the RTs (SE = 85 msec, F < 1). Neither dependent measure showed a reliable effect of the sentence's truth (SE = 0.9%, F < 1 for the errors, and SE = 13 msec, F(1,27) = 1.70, p > 0.10 for the RTs) nor any interaction of truth with syntactic form or adjective type.
Although the Computation model is consistent with these data, the high error rates are grounds for suspicion, and they prompt us to take a closer look at the relative items, where most of the errors arise. One cause of the errors becomes apparent if we put ourselves in the place of subjects verifying the following sentences:
(7) a. A spruce is tall.
b. A dogwood is tall.
c. A poinsettia is tall.
d. A petunia is tall.
Since a spruce is taller than the average tree and a poinsettia taller than the average flower (as determined by our ratings), (7a) and (7c) should be true according to Bierwisch's and Katz's theory (the Immediate Superordinate hypothesis), which we used to construct our stimuli. Similarly, since dogwoods are shorter than average trees and petunias are shorter than average flowers, (7b) and (7d) should be false. But while this analysis seems quite reasonable for (7a) and (7d), there is something odd about affirming that poinsettias are tall while denying that dogwoods are tall. To put this another way, the Immediate Superordinate hypothesis stipulates that the truth value of the examples in (7) should be the same as that of the corresponding PN sentences in (8):
(8) a. A spruce is a tall tree.
b. A dogwood is a tall tree.
c. A poinsettia is a tall flower.
d. A petunia is a tall flower.
But intuitively, the truth value of (8b) and (8c) is more clear-cut than that of their counterparts in (7b) and (7c).
Consistency effects
These observations suggest that the large number of errors for relative PA sentences may have been due to faulty linguistic analysis rather than to subjects' mistakes. One possible source of difficulty for (7b) is that while dogwoods are shorter than average trees, they are nevertheless taller than most objects with which people typically interact (including people themselves). In the same way, poinsettias in (7c) are tall flowers, but are short for most everyday objects. If subjects apply this alternative reference point in deciding on the truth of (7b-c), we would expect their decision to differ from the experimentally defined answer. In (8b-c), however, the reference class is explicitly provided, and subjects' responses should coincide with our appointed answer. Use of a human reference point for relative adjectives has been discussed by Suzuki (1970), who notes that a sentence like Giraffes have long necks is often understood to mean that giraffes have longer necks than people. A similar anthropomorphic standard may apply to the relative adjectives in our own PA sentences (see also Miller & Johnson-Laird, 1976, p. 324). In the remainder of the paper, we refer to this standard as the object reference point, meaning the normal size of everyday, human-oriented objects.
Sentence sets like (7) allow us to examine our data for the effects of this alternative reference point. For Sentences (7a) and (7d), a decision based on the immediate superordinate will be the same as one based on average objects, since spruces are tall and petunias are short with respect to both standards. We will therefore label such instances as consistent items. However, as we have seen, dogwoods and poinsettias are tall with respect to one reference point and short with respect to the other, and we will call instances of this type inconsistent items.
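The consistent-inconsistent distinction can be stated as a simple rule on two ratings. The sketch below assumes each item has one rating relative to its immediate superordinate and one relative to everyday objects, both on the 0 to 10 scale with 5.0 as average. The numerical ratings are invented for illustration and are not values from the norming studies.

```python
MIDPOINT = 5.0  # "average" on the 0-10 rating scale


def classify(superordinate_rating, object_rating, midpoint=MIDPOINT):
    """Label an item consistent when both ratings fall on the same side
    of the scale midpoint, and inconsistent otherwise."""
    above_superordinate = superordinate_rating > midpoint
    above_object = object_rating > midpoint
    return "consistent" if above_superordinate == above_object else "inconsistent"


# Hypothetical (superordinate, object) height ratings for the items in (7).
ratings = {
    "spruce": (7.5, 9.0),      # tall tree, tall object
    "dogwood": (3.5, 8.0),     # short tree, tall object
    "poinsettia": (7.0, 2.5),  # tall flower, short object
    "petunia": (3.0, 1.5),     # short flower, short object
}

labels = {noun: classify(*pair) for noun, pair in ratings.items()}
print(labels)
```

Under these assumed ratings, spruce and petunia come out consistent and dogwood and poinsettia inconsistent, matching the classification given in the text.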
For sentences containing relative adjectives, 11 of 12 octets in our experiment contained one consistent and one inconsistent item (where consistency was determined by ratings to be described in Experiment 2). Figure 2 exhibits RTs and error rates for these relative octets, with consistent and inconsistent items plotted separately. Looking first at the error rates, we find an increase for inconsistent items from 24.8% in PN sentences to 36.9% in PA sentences. Errors on consistent items increase only slightly from 17.6% for PN sentences to 19.9% for PA sentences. Although this interaction is not significant (SE = 3.3%, F(1,22) = 2.59, p > 0.10), these data suggest that the PA-PN difference observed in Figure 1 for relative adjectives is largely attributable to inconsistent items. The same conclusion can be drawn from the RTs. Inconsistent items exhibit an increase from 1934 msec (PN sentences) to 2074 msec (PA sentences), while consistent items increase from 1896 msec (PN) to only 1905 msec (PA), SE = 40 msec, F(1,49) = 3.17, 0.05 < p < 0.10.
2 Feedback during the experiment may have caused subjects to change their response criteria in the direction of the Immediate Superordinate hypothesis. Thus, the obtained error rates may be conservative estimates of the proportion of trials on which subjects' judgments differed from the experimentally defined correct answers. Underestimates, however, are unlikely to affect the conclusions that we draw from these data. (See Experiment 3 for a different approach to the feedback problem.)
Figure 2. Reaction time and error rate for sentences containing relative adjectives as a function of syntactic form and consistency, Experiment 1. [Axes: syntactic form (Predicate Adjective, Predicate Noun); curves: Inconsistent, Consistent.]
Implications
In the light of the consistency effects, our brief for the Computation model appears weaker than at first. Relative PA sentences were indeed difficult to verify in this experiment, but the difficulty was largely due to inconsistent items. Sentences with consistent items resemble sentences with absolute adjectives in showing little, if any, difference in errors or RTs between their PN and their PA versions (compare the slopes for absolute sentences in Figure 1 to those for consistent relative sentences in Figure 2). The Computation model has trouble explaining this resemblance. Of course, these results are also no comfort to the Pre-storage model, although the problem with this theory is of the opposite sort. Since this model treats relative and absolute adjectives identically, it can predict the data for consistent relative and absolute sentences, but founders in explaining the inconsistent items.
Our results suggest that a revised Computation model should incorporate two comparisons in handling relative sentences, one to the superordinate and the other to the object reference point. The simplest assumption is that both comparisons are carried out for PA sentences, while only the superordinate comparison is executed for PN sentences. But in most serial or parallel models, this would predict faster RTs for PN than for PA sentences, a difference that amounts to only 9 msec for consistent items in Figure 2. A second possibility is that both comparisons are performed for PA and PN sentences alike. With respect to consistent items, these two comparisons would produce the same outcome (a consistent item, by definition, exceeds both reference points or falls short of both), and no further processing would be needed for either syntactic form. But for inconsistent items, the outcomes differ, forcing a subject to choose between them or to combine them according to some decision rule. We can imagine that such a decision is easier for a PN sentence, since its superordinate signals that the result of the object comparison can be ignored. If so, this would account for the PA-PN difference for inconsistent items. Of course, this modified Computation model is little more than a description of the data, but as we shall see, it yields predictions that will prove useful in the following experiment.³
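The second possibility can be laid out as a small procedural sketch. It is only one reading of the verbal description: the function name verify_relative, the numerical heights, and in particular the rule used to settle a conflict in PA sentences are assumptions, since the text leaves that decision rule open.

```python
def verify_relative(instance, superordinate_norm, object_norm, has_superordinate):
    """Two-comparison sketch for a sentence like 'An X is (a) tall (S)'.

    Returns (decision, extra_step): the truth decision and whether an
    additional conflict-resolution step was required.
    """
    exceeds_superordinate = instance > superordinate_norm
    exceeds_object = instance > object_norm

    # Consistent item: both comparisons agree, so no further processing.
    if exceeds_superordinate == exceeds_object:
        return exceeds_superordinate, False

    # Inconsistent item in a PN sentence: the explicit superordinate signals
    # that the object comparison can be ignored.
    if has_superordinate:
        return exceeds_superordinate, False

    # Inconsistent item in a PA sentence: the subject must choose or combine.
    # Following the superordinate here is a placeholder assumption.
    return exceeds_superordinate, True


# Hypothetical heights in metres; the norms are illustrative only.
TREE_NORM, OBJECT_NORM = 15.0, 1.5
print(verify_relative(20.0, TREE_NORM, OBJECT_NORM, has_superordinate=False))  # spruce, PA
print(verify_relative(6.0, TREE_NORM, OBJECT_NORM, has_superordinate=True))    # dogwood, PN
print(verify_relative(6.0, TREE_NORM, OBJECT_NORM, has_superordinate=False))   # dogwood, PA
```

Only the last call, an inconsistent item in a PA sentence, incurs the extra step, which is the pattern the modified model uses to account for the PA-PN difference.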
Multiple reference points can also help us bail out the Pre-storage model.
Along these lines, we can continue to assume that the property list of a
consistent item contains a single predicate for each relative property (e.g.,
3 The consistent items are something of a problem for this second model as well. We earlier assumed that the superordinates in PN sentences facilitate access to the corresponding concept. To explain the data for consistent items, we must either assume that this advantage is canceled by superordinate reading time (b = d > 0 in Footnote 1) or that both effects are nil (b = d = 0). Alternatively, we could posit a compromise Computation-Pre-storage model in which absolute and consistent relative information is stored and inconsistent relative information computed. But if anything, this latter model is more ad hoc than the one outlined above, and we stick to the former alternative in the discussions that follow.
Experiment 2
The main feature that distinguishes the Computation model from the Pre-storage model is its extra comparison step. Previous studies have identified factors that affect this step, and if we can show that these factors also affect verification of sentences with relative adjectives, we will have obtained some prima facie support for the Computation model. This is the strategy that we pursue in the experiments reported below, using symbolic distance as the critical factor. In Experiment 2 we look for evidence of this effect in ratings of the truth of relative sentences, while in Experiment 3 we use reaction time data.
Symbolic distance predictions
Method
We began with 172 of the items from Experiment 1 for which there had been good agreement about immediate superordinates (one noun from the original set was inadvertently omitted). For each of these items, separate groups of subjects were asked to provide ratings of the following variables: (a) the size of the items with respect to an average member of their immediate superordinate (e.g., the size of apples with respect to the average fruit); (b) the size of the items with respect to average objects; (c) the truth of PA sentences of the form Is are big (e.g., Apples are big); (d) the truth of PN sentences of the form Is are big Ss (e.g., Apples are big fruits). The first two of these measures were used to determine symbolic distance. The truth ratings in (c) and (d) serve as dependent variables.
For the size ratings of Task (a), we followed the procedure described in Experiment 1 (see the section entitled Rating Task). The procedure for Task (b) was somewhat similar; however, in this case, subjects received a dittoed set of instructions together with a computer-generated list consisting of noun-adjective pairs (e.g., apple-big). The instructions asked subjects to compare each item to an object of average size with respect to the indicated property. The subject used an 11-point scale for his response, with 0 designated much less than average, 5 average, and 10 much more than average. All of the 172 nouns were paired with the adjective big, but a number of these nouns were repeated with other relative adjectives. These additional pairs were used to examine the consistency of the items in Experiment 1, as described in the previous Results and Discussion section. Altogether the list contained 306 pairs. Order of the pairs on the list was randomized in a new order for each subject. (This was true as well for the lists associated with Tasks (c) and (d) below.)
In the remaining tasks, subjects evaluated the degree of truth for a set of PA sentences (Task (c)) or PN sentences (Task (d)). All 172 nouns appeared with the adjective big, but as in Task (b), some of the nouns were repeated with other adjectives. The ratings were made on the usual 11-point scale, with 0 denoting definitely false and 10 denoting definitely true.
Forty-eight subjects provided these ratings, twelve in each task. The subjects were part of the same population as those in Experiment 1, but had not taken part in the earlier study. They were tested in groups of from 1 to 12 individuals and were paid $2.00 for an hour-long session.
Experiment 3
Method
We again used the adjective big to test the above predictions. To one group of subjects (analogous to the PA group of Experiment 1) we presented plural nouns (e.g., apples) singly on a CRT. On each trial, the subject was to decide whether the sentence frame ____ are big would be true if the noun was substituted in the blank. A second, PN, group viewed the same nouns, this time accompanied by their immediate superordinates (e.g., apples-fruits), and they were asked to determine the truth of the sentence ____ are big ____ when the instance filled the first slot and the superordinate the second (the frames themselves did not appear during the trial).
To select the stimuli, we employed the ratings collected in Experiments 1 and 2. From the pool of 172 nouns used in the second experiment, we selected 112 according to the following criteria. First, for each item, both the predicate-adjective and predicate-noun sentences that contained it received a mean truth rating greater than 5.00 (True items) or both received a mean rating less than 5.00 (False items). This rule was adopted to simplify the analysis of the results, since the truth of a given item is fixed across PA and PN groups. The True items and False items were then separately classified as large or small with respect to rated superordinate and object size. This classification produces eight categories (e.g., true, small superordinate size, large object size; false, large superordinate size, small object size), and the final set of items was chosen with an equal number of instances (viz., 14) in each category. For the True instances, mean superordinate size was 8.24 for large items and 6.26 for small ones; mean object size was 6.52 for large and 4.72 for small items on the 0 to 10 scale (SE = 0.24). For False instances, mean superordinate size was 4.38 for large and 2.52 for small items, while object size was 3.56 for large and 1.77 for small items (SE = 0.23). Median word frequency for the True instances was 9.5 tokens per million words for small superordinate size, 5.5 for large superordinate size, 4.5 for small object size, and 26 for large object size. The corresponding frequencies for False instances were 0, 2, 0, and 3.5 tokens per million. To these critical items we added 34 fillers so that for most (9 of 16) superordinate categories, half of the instances in each were True and half were False. Over the entire set of 146 items there were also an equal number of True and False instances.
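The selection rule and the resulting design counts can be checked with a brief sketch. The function truth_class and the example ratings are hypothetical. Only the 5.00 criterion, the eight categories, the 14 items per category, and the 34 fillers come from the text.

```python
from itertools import product

MIDPOINT = 5.0  # criterion on the 0-10 truth-rating scale


def truth_class(pa_rating, pn_rating):
    """Return 'True' or 'False' when the PA and PN mean truth ratings agree
    about the midpoint, and None when the item would be excluded."""
    if pa_rating > MIDPOINT and pn_rating > MIDPOINT:
        return "True"
    if pa_rating < MIDPOINT and pn_rating < MIDPOINT:
        return "False"
    return None


# Truth x superordinate size x object size gives the eight categories.
cells = list(product(["True", "False"], ["small", "large"], ["small", "large"]))
ITEMS_PER_CELL = 14
FILLERS = 34

critical = len(cells) * ITEMS_PER_CELL  # critical items
total = critical + FILLERS              # full stimulus set
print(len(cells), critical, total)

# Hypothetical ratings: agreeing pairs are kept, a mixed pair is dropped.
print(truth_class(7.2, 6.1), truth_class(2.0, 3.5), truth_class(6.0, 4.0))
```

The counts reproduce the 112 critical items and the 146-item stimulus set reported above.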
This set of instances was presented to subjects four times in successive blocks of trials, two during a first day's session and two on a following day, with stimulus order randomized anew at each presentation. The procedure during a trial was similar to that of Experiment 1, but with a few minor changes. The subject was seated this time at a CRT terminal with a response apparatus that consisted of a button at his left and three buttons about 18 mm apart on his right. He initiated the trial by pressing the left-hand button with his left index finger, and for a 2 sec interval thereafter he saw the word ready presented on the screen about 400 mm away. At the end of the warning interval, the ready signal was replaced by either a single instance (for the PA group) or a superordinate-instance pair (for the PN group) with the superordinate just above the instance. The subject made his true or false decision by moving his right index finger from the center of the three buttons on the right to one of the neighboring buttons. The position of the True button was at the right of center for half the subjects in each group and at the left for the other half. The response terminated the display and was followed by a 2 sec period in which the reaction time for that trial (but no indication of accuracy) was presented to the subject as feedback. At the end of a session (i.e., two blocks of trials) the experimenter informed the subject of both his mean reaction time and error rate. Delaying accuracy feedback until the end of the session was intended to discourage rote learning of the assigned truth value, while at the same time encouraging correct responses. The experiment was preceded by a 20-trial practice session during which subjects were asked to press the appropriate button in response to the word true or false.
The PA and PN groups consisted of eight subjects each. These subjects were right-handed and belonged to the same subject pool as those of Experiments 1 and 2; however, none of them had been involved in the previous experiments. They received $4.00 apiece for participating, plus a 50 cent bonus for each block in which their error rate was less than 10 per cent. Subjects received an average of $4.47.
Table 1. Mean Reaction Time (msec) and Percent Errors (in Parentheses) for Predicate Adjective (PA) and Predicate Noun (PN) Sentences in Experiment 3, by Truth, Superordinate Size, and Object Size.

                      True PA Sentences           False PA Sentences
                      Object Size                 Object Size
Superordinate Size    Small         Large         Small         Large
Small                 1175 (31.?)   1003 (9.6)    922 (2.0)     1142 (18.5)
Large                 1080 (13.6)   988 (11.6)    984 (5.6)     1189 (17.8)

                      True PN Sentences           False PN Sentences
                      Object Size                 Object Size
Superordinate Size    Small         Large         Small         Large
Small                 735 (27.2)    712 (18.1)    691 (3.8)     688 (12.7)
Large                 656 (6.0)     634 (5.6)     716 (16.5)    763 (21.?)
truth, examined above, should be larger for the PN than for the PA group, while the object size by truth interaction should show the reverse effect. Turning to the data, we find that the relevant difference in reaction time for superordinate size is not reliable, though it shows a trend in the predicted direction (SE(subjects) = 29 msec, SE(items) = 24 msec, min F < 1). The errors show a somewhat stronger effect, with the interaction increasing from 4.6% for the PA group to 13.7% for the PN group; however, this effect is only marginally significant by the min F test (SE(subjects) = 2.6%, SE(items) = 2.0%, min F(1,34) = 3.73, 0.05 < p < 0.10). The object size predictions, however, are clearly confirmed since the size of the interaction is 172 msec for the PA group and only 22 msec for the PN group (SE(subjects) = 27 msec, SE(items) = 24 msec, min F(1,38) = 9.32, p < 0.01). The error rates show a parallel difference, though in this case not a significant one (SE(subjects) = 2.8%, SE(items) = 2.0%, min F(1,31) = 2.15, p > 0.10).
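The reported sizes of the object size by truth interaction can be recovered from the cell means in Table 1 under one plausible definition of the contrast. In the sketch below, the contrast is taken to be half the difference between the object size effect for False items and that for True items, averaged over superordinate size. This definition is an assumption, as are the readings 1003, 735, and 712 for three table cells that are hard to read in the available copy.

```python
from statistics import mean

# Cell means (msec) as read from Table 1. Each pair is
# (small superordinate, large superordinate).
# The values 1003, 735 and 712 are assumed readings of illegible cells.
rt = {
    "PA": {
        ("True", "small"): (1175, 1080),
        ("True", "large"): (1003, 988),
        ("False", "small"): (922, 984),
        ("False", "large"): (1142, 1189),
    },
    "PN": {
        ("True", "small"): (735, 656),
        ("True", "large"): (712, 634),
        ("False", "small"): (691, 716),
        ("False", "large"): (688, 763),
    },
}


def object_size_by_truth(cells):
    """Half the difference between the object size effect (large - small)
    for False items and the same effect for True items."""
    m = {key: mean(values) for key, values in cells.items()}
    false_effect = m[("False", "large")] - m[("False", "small")]
    true_effect = m[("True", "large")] - m[("True", "small")]
    return (false_effect - true_effect) / 2


pa_interaction = object_size_by_truth(rt["PA"])
pn_interaction = object_size_by_truth(rt["PN"])
print(pa_interaction, pn_interaction)
```

With these readings the contrast is about 172 msec for the PA group and about 22 msec for the PN group, in agreement with the values given in the text.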
Taken together, the above results provide rather strong support for the Computation model. Moreover, by partitioning the items, we have been able to show effects of both superordinate and object size when these factors vary orthogonally.
General Discussion
Computation versus Pre-storage Models
The Computation model evolved from the basic idea that the truth of sentences with relative adjectives is determined by mental comparison. For the sentence Spruces are tall, this would mean comparing the stored height of spruces with that of its immediate superordinate tree. However, the results of Experiment 1 led us to modify this assumption by suggesting that two comparisons were involved - one to the normal value of the superordinate and the other to a normal value for everyday objects. Experiments 2 and 3 lent some support to this prospect. RTs, errors, and truth ratings all showed effects of symbolic distance to both the superordinate value and the object value.
The Pre-storage model stacks up less well against the evidence. While it was able to explain the results of the first experiment on the assumption that two relative properties are stored, it ran into difficulties in accounting for the symbolic distance effects in Experiments 2 and 3. Of course, our results do not imply that relative properties are never pre-stored; what the evidence rules out is pre-storage for all relative properties of common object concepts.
Although the results favor a Computation approach, there are a number of residual problems with such a model that we should consider carefully. One of these concerns the inefficiency of Computation, for it seems redundant to calculate the truth of a relative sentence in the elaborate manner that the model dictates. Why not store the result of an initial computation once and for all so that it can be referred to as needed? The question of efficiency, however, depends on the relative costs attached to storage and processing. If storage consumes a large share of the system's resources, it may prove more efficient to store a minimal amount of information. By analogy, mental arithmetic would be computationally easier if one memorized the multiplication table for all pairs of numbers less than 100. The fact that few of us do so indicates that computational simplicity must trade against storage economy. Furthermore, while storage is not out of the question for the kinds of sentences considered here, we should remember that relative information is used in other ways as well - for example, to compare two instances (A spruce is taller than a refrigerator) or to compare an instance to a metric reference point (A spruce is more than six feet tall) or to a contextually established reference point (Spruces are the tallest trees on this block). Since there is an unlimited number of such propositions, not all of them can be pre-stored. Given that computation is needed in these cases, it would not be surprising if a similar process were applied to sentences such as those considered here.
One can grant the plausibility of a computation process, however, and still object to the model outlined above. In particular, the idea of two distinct comparisons seems odd, since in functional terms a single comparison would be easier to perform and would simplify communication about relative facts. It may be possible to formulate the Computation model in a way that omits the object comparison and that is still consistent with the experimental results. For example, we can suppose that instead of the double comparisons, a subject weights the result of a single superordinate comparison by the absolute size of the instance, with instances at the extremes of the size continuum receiving high weight. But of course in these terms, the question then becomes why any weighting is needed in determining the sentence's truth value. Note, too, that something akin to an object reference point is still required in this alternative model to decide what constitutes the upper extreme of a dimension like size that is unbounded above.
A second possibility is that the object comparison be explained away as an artifact of the experimental situation. In all of the studies reported here, subjects received a randomized set of instances drawn from a variety of categories, and the range of instances may itself provide a context against which any given instance will appear big (tall, thick, and so on). We can think of this as a type of adaptation level that could be absent in more naturalistic settings.
While we have no firm evidence against the adaptation theory, Experiment 3 provides some suggestive data. If the effect of object size is due to adaptation, we should find that this effect increases over the four blocks of trials. But in fact, the opposite trend appears in the results: the crucial interaction of object size and truth decreases steadily (though not significantly) across blocks. Moreover, there are independent reasons why an object reference point might be important. First, for very atypical instances, the immediate superordinate may be uncertain or inaccessible. Second, even if the immediate superordinate is obvious, its reference point may not be. For example, the superordinate size of a category like weapon will vary greatly depending on which instances we are willing to include in this category (the size will be quite large if such things as missiles are included). Both problems can be avoided by using the object reference point.
Properties in Semantic Memory
While our experiments have tried to determine the status (pre-stored or computed) of relative properties, we have simply assumed that absolute properties are pre-stored. However, it is possible to challenge this assumption, and
in fact, there are several good reasons for doing so.
First, if we consider properties like being non-pink or being-a-resident-of-a-state-beginning-with-l,
then it becomes clear that not all absolute properties can be pre-stored. Non-pink is an absolute
property if pink is, but it is unlikely that concepts such as grass and snow contain non-pink in
their property lists. While we could memorize the fact that grass is non-pink, we need not do so,
but can infer it from other sources of information.
Second, it is easy to imagine how even common absolute properties could be computed rather than
stored. In answering the question Is a banana yellow? we may compare the hue of bananas with some
prototypical yellow in much the same way as we would compare the size of a banana in determining
whether it is big. Te Linde and Paivio (1979) have obtained clear distance effects when subjects
must determine the similarity between color chips and a named color. Stephens (Note 1) has also
found distance effects for absolute properties of named objects (by asking questions like Which is
more yellow - a lemon or a banana?) that parallel those for relative properties.
These possibilities suggest that the substantive difference between relative and absolute
adjectives may depend, not on whether they are computed or pre-stored, but on the kind of
computation involved. In this respect, the notion of two reference points provides one way that
this difference might be framed. As a first approximation, we can suppose that adjectives vary in
the importance attached to the superordinate and object points during the comparison. Relative
adjectives would depend most on the superordinate point but, for reasons described above, would
be influenced to a lesser extent by the object point. Absolute adjectives, on the other hand,
would be dominated mainly by the object point so that judgments would be indifferent to category
membership of the modified noun (cf. Wheeler, 1972). In this way, we can account for the logical
distinctions proposed by Katz (1972) and Vendler (1968) and, at the same time, explain our
intuition of a continuum between absolute and relative adjectives, as discussed earlier.
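The two-reference-point idea can be illustrated with a small sketch. It is only one possible reading of the proposal: the weighted-sum form, the class name `Adjective`, the function `judgment`, and all weights and reference values are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass
class Adjective:
    name: str
    w_super: float   # hypothetical weight on the superordinate reference point
    w_object: float  # hypothetical weight on the object reference point


def judgment(adj: Adjective, value: float, super_ref: float, object_ref: float) -> float:
    """Weighted sum of the two signed comparisons; a positive result favors 'true'."""
    return adj.w_super * (value - super_ref) + adj.w_object * (value - object_ref)


# Illustrative weights only: a relative adjective leans on the superordinate point,
# an absolute adjective on the object point.
big = Adjective("big", w_super=0.8, w_object=0.2)
yellow = Adjective("yellow", w_super=0.0, w_object=1.0)

# The same instance value judged against a small and a large superordinate reference.
big_in_small_category = judgment(big, 5.0, super_ref=3.0, object_ref=6.0)
big_in_large_category = judgment(big, 5.0, super_ref=8.0, object_ref=6.0)
yellow_in_small_category = judgment(yellow, 5.0, super_ref=3.0, object_ref=4.0)
yellow_in_large_category = judgment(yellow, 5.0, super_ref=8.0, object_ref=4.0)
```

Under these assumed weights the judgment for big changes sign when the superordinate reference changes, while the judgment for yellow is unaffected by category membership. Intermediate weights would give the continuum between absolute and relative adjectives.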
However, viewing adjectives in this way leads us to a number of difficult questions. Clearly, not
all properties can be computed, since if this were true there would be nothing for the comparison
process to operate on. But while some core of data must be present to make the computations
possible, it appears to be a very difficult task to get at these core properties. Perhaps there
is some underlying level of analysis in which all properties are pre-stored. But it is equally
possible that pre-storage occurs with just a few landmark instances. For example, in determining
whether an object X is big, we could try to recall its relation to some other object Y that we
have already determined to be big. If X and Y share the same superordinate, and if we can show
that X contains Y, or that Y is a part of X, or that X completely occludes Y when X is immediately
in front of Y, then we can deduce that X is also big. Such a process may be less elegant than a
simple comparison, but it is not out of the running (see Banks, 1977).
Another question concerns the ultimate grounds for the distinction between relative and absolute
adjectives. Why, for example, are color terms absolute and dimensional adjectives relative? The
difference apparently does not lie in our ability to distinguish variation in the corresponding
qualities, for we can certainly discern degrees of yellowness. Number of underlying dimensions is
also immaterial since big, which depends on three dimensions, is no less a relative adjective
than tall, which depends on one. One possibility is that the difference has less to do with the
type of attribute than with its distribution among objects. For relative adjectives, variability
of the corresponding property may be greater between superordinate classes than within them, so
that a comparison to the superordinate reference point will convey valuable information. For
absolute adjectives, variability may be equally great within as between superordinates, so that
such a comparison is irrelevant. This question is far from settled, however, and the distinction
may depend also on the integrality of the property (Garner, 1974), the salience of its component
dimensions (Kamp, 1975), or the way in which the reference point changes with exposure to new
instances (Wheeler, 1972).
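The distributional account can be made concrete by comparing between-class and within-class variability directly. The sketch below is an illustration under stated assumptions: the function name `variance_components`, the use of population variance, and both data sets are invented for the example and are not norms or results from these studies.

```python
from statistics import mean, pvariance


def variance_components(classes):
    """classes maps each superordinate to a list of property values for its instances.

    Returns (between, within): the variance of the class means and the
    average variance inside the classes.
    """
    class_means = [mean(values) for values in classes.values()]
    between = pvariance(class_means)
    within = mean([pvariance(values) for values in classes.values()])
    return between, within


# Hypothetical size values: classes differ far more from each other than internally.
size_like = {"insects": [1, 2, 3], "buildings": [100, 120, 140]}
# Hypothetical hue values: each class spans nearly the whole range.
hue_like = {"fruit": [10, 50, 90], "vehicles": [12, 48, 92]}

size_between, size_within = variance_components(size_like)
hue_between, hue_within = variance_components(hue_like)
```

On this reading, a property behaves like a relative adjective when the between-class component dominates, because knowing the superordinate then says a good deal about the expected value, and like an absolute adjective when it does not.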
Finally, it is important to realize that absolute and relative adjectives do not exhaust the
range of adjectives in English. For example, we have not considered fictionalizing adjectives
like mythical that map real entities into imaginary ones or negators like fake or pseudo that
signal non-membership in a given category (R. Clark, 1970). Adjectives like these probably call
for a very different kind of analysis than the one offered above. However, these items take us
further from the traditional view of adjectives as properties stored with the nouns they modify,
and in this way, they echo the message of the preceding studies.
References
Anderson, J. R. (1976) Language, memory, and thought. Hillsdale, N.J., Lawrence Erlbaum Associates.
Banks, W. P. (1977) Encoding and processing of symbolic information in comparative judgments. In
G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 11), New York, Academic Press.
Banks, W. P., and Flora, J. (1977) Semantic and perceptual processes in symbolic comparisons.
J. exp. Psychol.: Human Perception and Performance, 3, 278-290.
Battig, W. F., and Montague, W. E. (1969) Category norms for verbal items in 56 categories.
A replication of the Connecticut category norms. J. exper. Psychol. Mono., 80, (3, Pt. 2).
Bierwisch, M. (1967) Some universals of German adjectivals. Found. Lang., 3, 1-36.
Bierwisch, M. (1971) On classifying semantic features. In D. D. Steinberg and L. A. Jakobovits
(Eds.), Semantics: An interdisciplinary reader in philosophy, linguistics, and psychology.
Cambridge, Cambridge University Press.
Clark, H. H. (1973) The language-as-fixed-effect fallacy: A critique of language statistics in
psychological research. J. verb. Learn. verb. Behav., 12, 335-359.
Clark, R. (1970) Concerning the logic of predicate modifiers. Nous, 4, 311-355.
Cohen, J., and Cohen, P. (1975) Applied multiple regression/correlation analysis for the
behavioral sciences. Hillsdale, N.J., Lawrence Erlbaum Associates.
Reference Notes
1. Stephens, D. (1978) Processing of pictures versus words in a comparative judgment task.
Unpublished manuscript, University of Chicago.
Cognition, 8 (1980) 175-185
©Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
3
RHIANON ALLEN**
The Graduate Center of CUNY
ARTHUR S. REBER
Brooklyn College of CUNY
Abstract
Very long term memory for abstract materials was examined by recalling
subjects who had served in a synthetic grammar learning experiment two
years earlier. In that study (Reber & Allen, 1978) we differentiated among
several cognitive modes of acquisition, their resultant memorial representations, and their associated decision processes. Two years later and without
any opportunity for rehearsal or relearning, subjects still retain knowledge
of these grammars to a remarkable degree. Although some differences have
become blurred with the passage of time, the form and structure of that
knowledge and the manner in which it is put to use remain strikingly similar
to the original. That is, differences traceable to acquisition mode and conditions of initial training can still be observed. As in the original study, these
results are discussed within the general context of a functionalist approach
to complex cognitive processes.
This paper is a report of rather remarkably persistent long term memory for highly abstract and
complex materials; specifically, the knowledge of the grammatical structure of two artificial
languages after a two year hiatus.
In researching the area of very long term memory we were struck by the lack of attention which
has been paid to memories of this kind. For the most part, the study of long term memory has
dealt with real world knowledge
Each acquisition mode results in a particular form of memorial representation and an attendant set of operations for making decisions. Let us review
each.
(a) Explicit
rule induction
memory
strategy
This acquisition mode consists of the unconscious abstraction of the underlying rule system
inherent in the exemplars presented during learning. Characteristic of this mode is that little
or no specific concrete information about the actual learning items is retained, and decisions
about the well-formedness of test strings are made largely on an intuitive basis. Although there
was evidence that some learning of this kind accompanied the PA procedure, the abstraction
strategy was strongly associated with the OBS training procedure which, unlike PA, has no
specific task demands. The advantage in dealing with old strings found with the analogy strategy
was totally absent here; all strings from the learning set are dealt with as if they were novel
strings.
In this study, then, we are looking for evidence with respect to three important issues in the
study of very long term memory. First, can a body of unconscious knowledge be retained for an
extended period of time without the opportunity for rehearsal? Second, how important is the mode
of acquisition of original knowledge in determining what is retained? Third, how closely does
the form of two-year-old knowledge resemble that of original knowledge?
Method
Subjects
materials
The stimuli used were the letter strings from the two tests for well-formedness in the original
study. In that experiment, the knowledge of grammatical structure acquired during learning was
evaluated by presenting each subject with a set of 100 strings of letters (actually only 50
distinct items were used, each being presented twice), one-half of which conformed with the rules
for letter order (the grammatical strings) and one-half of which contained one or more violations
of those rules (the nongrammatical strings). Details of these test items are given in Reber and
Allen (1978). For our purposes here note that five of the grammatical strings had been used as
part of the original learning
occurred
20 grammatical
strings
Procedure
Prior to testing, subjects were told which grammar they would be responsible for and asked in all
cases to respond "yes" or "no" depending upon whether or not, as best as they could recall, each
item conformed to the rules of that grammar. All subjects were reminded that half of the items
were acceptable and half were not. There was no opportunity for relearning or refamiliarization
with the materials. No other information about the materials or the task was given; no mention
was made of the repetition of test items or about the existence of the old items; no feedback
about the correctness of their responses was given; and no reference was made to the fact that
these test strings were the same ones which had been used two years ago. Both grammars were
tested in exactly the same manner, each time reminding the subject about the procedure used to
learn that particular grammar two years ago. After completing the well-formedness test, subjects
were asked to provide an estimate of how well they thought they had done by estimating how many
of the 100 items they classified correctly.
Counterbalancing and notation
The order of running was counterbalanced with four of the subjects first tested with the strings
based on the grammar learned by the PA procedure two years earlier (denoted as PA-1st subjects)
and the remaining four subjects beginning testing with the one learned with the OBS procedure
(OBS-1st subjects). Following testing on the first grammar, subjects proceeded directly to the
task for the other grammar (denoted as PA-2nd and OBS-2nd). Note that subjects referred to as
OBS-2nd are the same subjects as PA-1st and similarly for the PA-2nd and OBS-1st subjects.
All subjects were run in the same order condition as two years ago. For example, PA-1st subjects
here are the same subjects as those who were described as PA-1st in Reber and Allen (1978). This
point will be important later since we will report on some effects that can be traced back to the
order of running in the initial training sessions.
Results
Introspections
At the outset, only one or two subjects thought that they were now capable of performing above
chance on this task. However, as testing continued all reported that they were, to their
surprise, becoming more and more aware of their ability to make accurate decisions and all but
one of the subjects estimated their performance to be above chance. However, unlike two years
ago, the overall correlation between estimated and actual performance was not significantly
different from zero: our subjects knew they were performing above chance but they had no accurate
sense of just how well they were doing.
The pattern of justifications offered two years ago had revealed some strong differences in the
types of reasons given for making decisions following the two learning procedures. Here, no
differences were observed. For both grammars we received a mixture of justifications like, "I'm
just guessing", "This one somehow feels right (or wrong)", "I think I remember this one", and so
forth. However, the frequency of such justifications was very low. Unlike two years ago where
roughly 40% of all responses could be justified, a concrete reason for a decision in this
follow-up was a relatively rare event. This apparent loss of conscious contact with at least some
sense of what is known probably accounts for the lack of confidence that subjects had in their
knowledge and the generally poor ability to assess actual performance.
Finally here, virtually all subjects felt that the task became easier as testing proceeded and
they thought their performance improved consistently. Although there was a trend in this
direction over the full course of testing, it failed to reach significance, F(3, 21) = 1.68. The
sense of increased performance over trials probably has more to do with a refamiliarization with
the task than with an actual increase in the amount of recalled knowledge.
Probability of a correct response (Pc)
Table 1 gives the mean Pc values for the grammatical and non-grammatical items for the grammars
learned under each condition in both the original experiment and the follow-up. The single most
interesting value in this experiment is the overall Pc for the follow-up of 0.667. With chance at
0.5 and Pc > 0.6 needed for significance for an individual subject, this value demonstrates that
sufficient knowledge of these grammars has survived the two year hiatus for our subjects to
reliably distinguish well-formed from non-well-formed strings. However, it is also clear that
there has been a decrease in overall performance; the difference between the Pc values from the
original and follow-up testing sessions is significant,4 F(1,7) = 26.2, p < 0.005.
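A criterion of this size can be checked with an exact binomial calculation. The sketch below is an assumption about how such a cutoff might be derived, not the authors' stated procedure: it treats a subject's 100 responses as independent guesses with p = 0.5 (which ignores that each of the 50 items was presented twice) and uses a two-tailed 0.05 level. The function names are illustrative.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """Probability of k or more correct responses out of n under pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_correct_for_significance(n, alpha=0.05, two_tailed=True):
    """Smallest number correct whose upper-tail probability meets the criterion."""
    threshold = alpha / 2 if two_tailed else alpha
    for k in range(n + 1):
        if upper_tail(n, k) <= threshold:
            return k
    return None


# Assumed setting: 100 test responses per grammar, chance at 0.5.
cutoff = min_correct_for_significance(100)
```

Under these assumptions the cutoff falls just above 60 correct out of 100, which agrees with the Pc > 0.6 criterion quoted in the text.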
Table 1.

                     Original task                            Follow-up task
Item Status          Observation  Paired Associates  Means    Observation  Paired Associates  Means
Grammatical          0.845        0.710              0.778    0.678        0.650              0.664
Nongrammatical       0.775        0.780              0.778    0.690        0.650              0.670
Means                0.808        0.740              0.778    0.684        0.650              0.667
4 Wherever statistical comparisons are drawn between the original and follow-up studies, the data
from the two subjects not recalled have been discarded. All tests to follow, therefore, utilize a
completely within subjects design with eight subjects. The deletion of these two original
subjects seems not to have resulted in any systematic loss of data.
Representativeness
The issue here concerns the extent to which subjects' knowledge of structure is an accurate
reflection of grammatical structure as displayed in the original learning stimuli. We had noted
two kinds of non-representativeness in the original experiment: the explicit rule induction
strategy occasionally led subjects to articulate rules which were simply incorrect and the
analogy strategy often led subjects to consistently misclassify items on the grounds that
candidates for analogy-by-similarity were not in memory. The existence of non-representativeness
is determined by analyzing the pattern of responses to the two presentations of each test item,
comparing (by a χ² test) the number of repeated misclassifications (EE) to the number of single
misclassifications (CE and EC). Table 2 shows all four possible patterns from each learning
procedure and order of running.
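The comparison of repeated with single misclassifications can be sketched as a goodness-of-fit test. The text does not spell out the null model, so the one used here is an assumption: if errors were unsystematic, the three error patterns (CE, EC, EE) would be equally frequent, making EE one third of all error patterns. The function name and the reading of the Table 2 cell counts (rows as order of running, columns as CC, CE, EC, EE) are likewise assumptions.

```python
CRITICAL_CHI2_DF1_05 = 3.84  # chi-square critical value for df = 1, alpha = 0.05


def repeated_error_chi_square(ce, ec, ee):
    """Chi-square (df = 1) comparing EE against the pooled single errors CE + EC.

    Assumed null model: EE, CE and EC are equally likely, so EE is expected to be
    one third of all error patterns and CE + EC two thirds.
    """
    total = ce + ec + ee
    expected_ee = total / 3
    expected_single = 2 * total / 3
    single = ce + ec
    return ((ee - expected_ee) ** 2 / expected_ee
            + (single - expected_single) ** 2 / expected_single)


# (CE, EC, EE) counts as read from Table 2; the CC cell is not needed here.
table2_errors = {
    "OBS run 1st": (24, 36, 31),
    "OBS run 2nd": (35, 26, 35),
    "PA run 1st": (20, 32, 55),
    "PA run 2nd": (29, 40, 25),
}
chi_squares = {label: repeated_error_chi_square(*cells)
               for label, cells in table2_errors.items()}
flagged = [label for label, value in chi_squares.items()
           if value > CRITICAL_CHI2_DF1_05]
```

Under this assumed null model the statistic exceeds the critical value only for the subjects who learned by paired associates and ran first, where repeated errors outnumber what unsystematic responding would predict.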
Table 2.

Training Condition       CC     CE     EC     EE
Observation
  Run 1st                109    24     36     31
  Run 2nd                104    35     26     35
Paired Associates
  Run 1st                93     20     32     55
  Run 2nd                106    29     40     25
test items
After original learning all subjects had performed equally effectively when assigning grammatical
status to old test items and to novel grammatical items. In the follow-up, however, we now
observed a significant learning procedure by old/new status interaction, F(1,7) = 10.62,
p < 0.025. Specifically, PA subjects now perform significantly poorer on novel grammatical items
than they do on old items, F(1,7) = 6.15, p < 0.05; OBS subjects show no significant difference.
This result clearly indicates that there is retention of specific learning set materials after a
two year lag, and that it is associated with the acquisition mode that most strongly directed
subjects' attention to the physical features of the stimulus material.
In summary, there is no doubt that knowledge of these grammars has survived remarkably well. Some
of it is in an abstract form and some in reasonably concrete form, and these memorial forms
correspond quite closely with the memory systems of two years ago. Moreover, as indicated by the
analysis of the response patterns to old and novel items and by the emergence of
non-representativeness in the PA-1st subjects, both the beneficial and detrimental impacts of
these memorial forms can still be felt.
Discussion
To return to the original questions of robustness, form and mode of acquisition, it seems quite
remarkable that information gained over the course of a 10 to 15 minute exposure to an artificial
language can be retained for as long as two years without intervening exposure or rehearsal. Even
two years after learning, all subjects are significantly above chance at assigning grammatical
status to test items. But it is not the case that all types of knowledge are equally robust.
Explicit, conscious knowledge in particular appears to be relatively fragile in nature.5 From a
levels of processing point of view as
5 Rather, we should say that explicit knowledge is fragile without rehearsal. It seems an obvious
point that if a rule (e.g., a chess move) is rehearsed periodically, it will be remembered -
perhaps indefinitely. The important notion here is that the other two modes are robust without
rehearsal.
References
Brooks, L. R. (1978) Nonanalytic concept formation and memory for instances. In E. Rosch and
B. B. Lloyd (Eds.), Cognition and categorization. Hillsdale, N.J., Lawrence Erlbaum Associates.
Burtt, H. E. (1941) An experimental study of early childhood memory. J. genet. Psychol., 58,
435-439.
Craik, F. I. M. and Lockhart, R. S. (1972) Levels of processing: A framework for memory research.
J. verb. Learn. verb. Beh., 11, 671-684.
Craik, F. I. M. and Tulving, E. (1975) Depth of processing and the retention of words in episodic
memory. J. exper. Psychol.: General, 104, 268-294.
Kolers, P. (1976) Pattern analyzing memory. Science, 191, 1280-1281.
Posner, M. I. (1973) Cognition: An introduction. Glenview, Ill., Scott, Foresman and Co., Chap. 7.
Reber, A. S. and Allen, R. (1978) Analogic and abstraction strategies in synthetic grammar
learning: A functionalist interpretation. Cog., 6, 189-221.
Reber, A. S. and Glick, J. A. Implicit learning and stage theory. Int. J. Beh. Devel., in press.
Wickelgren, W. A. (1972) Trace resistance and the decay of long term memory. J. math. Psychol., 9,
418-455.
This research concerns long term memory for abstract material. The subjects of the experiment had
taken part, two years earlier, in a synthetic grammar learning experiment. In the course of that
research (Reber and Allen, 1978) several modes of cognitive acquisition were identified, together
with the memory representations they induced and the decision processes associated with them. Two
years later, with no possibility of rehearsal or relearning, the subjects remembered these
grammars remarkably well. Although certain nuances had faded with time, the form and structure of
the knowledge and the ways in which it was used remained very comparable with the originals. The
variations noted in acquisition mode and in initial training could still be observed. As in the
first study, these results are discussed in the general context of a functionalist approach to
complex cognitive processes.
Cognition,
©Elsevier
4
- Printed in the Netherlands
of Hawaii
ERAN ZAIDEL
University of California, Los Angeles and California Institute of Technology
Abstract
The growth in children's ability to perform the task of separating the sounds
of words from their meanings was investigated by asking children between
3;3 and 6;3 to select homonyms from pictures. The results show a growth in
ability with age, with a jump at 4;4. An investigation of the developmental
changes in the strategies employed shows that the task is cognitively
complex. Performance in the younger children is more hampered by a
resource-limited inability to cope with many cognitive factors all at once
than by lack of ability to do the linguistic aspects of the task. These cognitive factors include access to vocabulary, rehearsal of intermediate results,
and implementation of a search strategy.
Introduction
In English, with its phonologically-based writing system (as opposed, for example, to the Chinese
ideographically-based system), reading readiness must depend in part on an ability to separate
the sounds of words from their meanings. At what point in their linguistic development are
English-speaking children able to effect this separation? Is there a clearly marked
*We thank Deborah Burke for advice on test design, Leslie L. Wolcott for drawing of the test
materials, Susan Fischer and Danny Steinberg for help in statistics, and the All Saints Day Care
Center for making subjects and facilities available. Thanks are also due to H. and V. Wayland for
support of the first author during this study. This work was also supported by NIMH grant
MH-03372 and NSF grant BNS76-01629 to Prof. R. W. Sperry, by USPH awards MH-00179 and RR07003,
and by NSF grant BNS78-24729 to E. Zaidel.
Method
Subjects
Homonym sets
Table 1.
Semantic
ring (jewel)
glasses (drink)
nail (metal)
ring (bell)
glasses (specs)
nail (finger)
necklace
cups
hammer
bat (baseball)
bow
(arrow)
horn (instrument)
trunk (elephant)
tie (cravat)
bear
night
palm (hand)
spring
(metal)
-_
bat (mammal)
bow (ribbon)
horn (animal)
trunk (chest)
tie (package)
bare
knight
palm (tree)
spring (season)
Homonym
Homonym
Semantic
Rhyme
Allit
Repeat
order
Training
1.
2.
3.
4.
5.
6.
7.
8.
9.
mitt
gu
drum
hippo
jacket
lion
day
foot
screw
swing
glad/girl
hat
hoe
corn
skunk
spider
knot
tusk
suitcases
sew
clothes
queen
bush
fall
cry
pear
kite
bomb
ring
back
bone
horse
train
tire
barrel
knife
P_OJ
2.
6.
5.
I.
1.
4.
3.
9.
8.
1 illustrates
Testing Procedures
The children were tested one at a time in a small room or office at the
school (whichever happened to be free at the time). Sessions took varying
lengths of time depending on the ages of the children: some of the younger
children took 45 minutes while some of the older children finished in 15. All
sessions were tape recorded.
Figure 1.
Sample items from the homonym test: Find two pictures that sound the same but mean different
kinds of things. Left: first pass; right: repeat pass (see text).
1. Prenaming
2. Homonyms
Scoring
As soon as possible after testing, the tapes from each session were reviewed and any verbal
comments made by the child were transcribed onto a new score sheet along with a copy of the
pointing responses and timing information noted at the time of testing. Each child's responses
were scored according to the following rules:
1. Correct responses.
a) overt: if the correct pair was indicated and the child could say the word.
b) passive: if the correct pair was pointed to and, although the child would not say the word,
s/he did Task 3 correctly.
2. Errors
a) semantic (S)
b) phonological (P), including
(1) rhyme (R)
(2) alliteration (A)
c) random association (X), if the child pointed to a pair that was neither correct nor S nor P
(i.e., association between the phonological and semantic distractor items, or between the
non-target homonym and the semantic distractor).
d) no response (-), when the child refused to point to a pair and either said nothing or said
"I can't" or "I don't know".
e) phonological inventions (I), where the children either tried to invent rhymes or alliterations
that were not words used in the prenaming or else tried to force homonymy through neutralization
or by brute force relabelling (see Discussion under Development of strategies for finding
homonyms).
3. Errors were scored as originally designed unless a verbalization indicated that some other
strategy was being used, e.g., Knight (with sword), knife was scored as A (alliteration), but if
the child said "knife, sword", it was scored as S (semantic).
4. No response (-) was counted as an error only on the first request for each task, but refusal
to make another try after a child had made at least one response was not counted as a further
error (since it was assumed that after one overt attempt, no further response simply indicated
that the child had no better guess to offer).
5. Any given pair was only counted once even if it was pointed to more than once.
6. If a child indicated a pair but rejected this choice himself, it was not counted.
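The counting rules (4 to 6) can be summarized in a short sketch. It is an illustration only: the `Attempt` record, the `tally` function and the category codes as strings are hypothetical conveniences, and the sketch assumes that the list passed in holds one child's successive attempts on one item within one task.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Attempt:
    pair: Optional[FrozenSet[str]]  # the two pictures pointed to; None = no response
    category: str                   # "overt", "passive", "S", "P", "X", "I" or "-"
    self_rejected: bool = False     # the child withdrew the choice


def tally(attempts: List[Attempt]) -> Dict[str, int]:
    """Count scorable responses for one item within one task."""
    counts: Dict[str, int] = {}
    seen = set()
    for index, attempt in enumerate(attempts):
        if attempt.pair is None:
            # Rule 4: a refusal is an error only on the first request.
            if index == 0:
                counts["-"] = counts.get("-", 0) + 1
            continue
        if attempt.self_rejected:
            continue  # Rule 6: self-rejected choices are not counted.
        if attempt.pair in seen:
            continue  # Rule 5: a given pair is counted only once.
        seen.add(attempt.pair)
        counts[attempt.category] = counts.get(attempt.category, 0) + 1
    return counts


lion_bear = frozenset({"lion", "bear"})
example = tally([
    Attempt(lion_bear, "S"),                                        # counted
    Attempt(lion_bear, "S"),                                        # repeat, ignored
    Attempt(frozenset({"bear", "pear"}), "P", self_rejected=True),  # ignored
    Attempt(None, "-"),                                             # later refusal, ignored
])
refusal_only = tally([Attempt(None, "-")])
```

Rule 3 (rescoring an error when the child's verbalization reveals a different strategy) is left to whoever assigns the `category` value, since it depends on the transcript rather than on the pointing response.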
There are several possible ways of assessing each child's basic homonym ability due to the facts
that (1) each child was encouraged to keep
Performance and Age
1. Results
Table 2.
Age group means for the 4 homonym scorings (maximum for each = 18). Significant differences
between scorings computed from correlated-sample t-test.

            Oldest    Middle    Youngest    All
H0          10.7      9.0       2.5         7.4
            *         **                    **
H1          13.7      10.5      3.5         9.2
            *         **        **          **
H2          15.6      13.4      6.6         11.9
                      *         *           **
Hp          16.5      15.1      8.4         13.3

* p < 0.01.
** p < 0.001.
H0 = overt homonyms
H1 = overt homonyms
H2 = overt homonyms
Hp = overt and passive
the middle group (p < 0.001), a lesser level for the oldest group (p < 0.01) and were not
significant at all for the youngest group. Increases in scores from Task 1 (H1) to Task 2 (H2)
were significant for all three groups, whereas adding in passive scores made significant
differences only for the two younger groups (p < 0.01) (see Table 2).
Even though most of the youngest children could find at least one or two homonyms, there was a
clear jump in ability at the boundary between the youngest and middle age groups (4;4 years).
This was indicated by a maximum in the value of F [F = 42.6, df = (1,28)], signalling a maximum
of certainty in a score difference, when the children were ranked by age and one-way analyses of
variance were performed on the H, scores of older versus younger age groups as the boundary
between the two groups was systematically increased. A second maximum value for F [F = 19.2,
df = (1,28)] occurred with the boundary between the two groups set at 5;2 years, setting off the
8 oldest children as the most able group.
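The boundary search described above can be sketched as a sliding two-group analysis of variance. This is a minimal illustration under stated assumptions: the function names, the minimum group size of two, and the ten scores are hypothetical (the study had 30 children, giving df = (1,28)), and no data from the study are reproduced.

```python
from statistics import mean


def f_two_groups(a, b):
    """One-way ANOVA F for two groups, with df = (1, len(a) + len(b) - 2)."""
    grand = mean(a + b)
    ss_between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    if ss_within == 0:
        return float("inf")
    return ss_between / (ss_within / (len(a) + len(b) - 2))


def best_boundary(scores_by_age, min_group=2):
    """scores_by_age: one score per child, ordered from youngest to oldest.

    Returns (cut, F), where children [0:cut] form the younger group and F is the
    largest value found as the boundary is moved up through the age ranking.
    """
    results = []
    for cut in range(min_group, len(scores_by_age) - min_group + 1):
        younger, older = scores_by_age[:cut], scores_by_age[cut:]
        results.append((cut, f_two_groups(younger, older)))
    return max(results, key=lambda result: result[1])


# Hypothetical scores with a jump between the fifth and sixth child.
hypothetical_scores = [2, 3, 2, 4, 3, 10, 11, 12, 10, 13]
cut, f_value = best_boundary(hypothetical_scores)
```

A secondary peak, like the one reported at 5;2, would appear as a local maximum in the list of F values rather than in the single best cut returned here.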
Since a one-way analysis of variance on the differences in scores between
Pass 1 and Pass 2 (originals versus repeats) was not significant [F = 1.43,
df = (1,58)],
these scores have been combined into a total score for each
child (maximum = 18). The lack of change from originals to repeats shows
that the children evoked the names of the concepts rather than having
learned to associate them by rote with specific pictures. The children were
not told that the second set of homonyms involved the same words as the
first set.
2. Discussion
Since the children were always asked to verbalize the homonyms for each
pair they chose, it was very clear whether or not they really had found a pair
and thus their scores were not compared to chance. Even for the children
who had the hardest time, most of them were able to find at least one or two
homonym pairs and in each case it was very clear that they understood what
their goal was and were aware that they had solved that particular problem.
The expected increase with age in ability to find homonyms was clearly indicated by every
statistical test we made. More interesting is the relationship between age and scoring that can
be seen in Table 2 and which was supported by the t-tests: the older the children, the better
they did on their first try (H0), while the younger the children, the more they benefited from
more tries and passive scoring. In particular, the oldest children did best within Task 1 with
their biggest increase in scores from H0 to H1, while the younger two groups profited most by
moving to Task 2 with their biggest increase from H1 to H2.
Although no significant interaction between homonym ability and sex was found for any score
separately, it is worth noting that while the girls
of Strategies
By Age Group
Table 3.
Age group means for strategies on Task 1, first tries. Significant differences in each strategy
are shown between youngest and middle groups and youngest and oldest groups (bottom row) as
computed by t-tests.
Oldest
5;1-6;3
Middle
4;3-5;l
Youngest
3;3-4;5
Ho
CP
107
(59%)
(Z%,
**
Cl%)
(*)
25
(14%)
**
:3y,)
(*)
39
(22%)
197$h)
(&
::O%,
*
&
(*)
$983
*
(2:2%)
*
42
(23%)
43
(24%)
*p < 0.01.
**p < 0.001.
(*)p < 0.05.
H0 = number of overt homonyms found.
Cp = number found passively.
P = phonological errors.
S = semantic errors.
X = random associations.
- = no response.
12 (7%)
22 (12%)
logically-based guesses (P). This, however, may be artifactual and due to ceiling effects in the
oldest group.
Figure 2.
Percent responses for each strategy used on Task 1, first tries. Top: by age group; bottom: by
ability group. Symbols as in Table 3.
[Figure: panels for the Oldest, Middle and Youngest age groups (top) and for the Best (A),
Middle (B) and Worst (C) ability groups (bottom); categories H0, Cp, P, S, X, -.]
Table 4.
Ability group means for strategies on Task 1, first tries. Significant differences in each
strategy are shown between best and middle groups, middle and worst groups, and best and worst
groups (bottom row) as computed by t-tests. Symbols as in Table 3.
P
Ho
CP
A: best
4;s6;3
129
(72%)
**
(:%,
8: middle
4;3-5;6
69
(38%)
**
(24%)
58
(32%)
C: worst
3;3-5;2
24
(13%)
**
12
(7%)
(*)
49
(27%)
(*I
**
19
(11%)
fF2%)
(*I
46
(26%)
*
7
(4%)
*
ffl
%)
:122%)
*
associations
(X) tend to
3. Discussion
There were several other readily observable strategy differences between the groups. In both the
youngest (Fig. 2, top) and least able (Fig. 2, bottom) groups, passive responses were extremely
common: not only when these children picked correct pairs, but also when they picked
phonologically and semantically associated pairs, they tended not to want to say any words aloud
when asked "What's the word?" In fact in Task 1, 81% of the passive responses were made by
children in the youngest age group. (For further discussion of passive vocabulary, see below.)
Another clear developmental difference was a shift in the type of semantic responses given. The
youngest children not only indicated many semantically associated pairs for which they refused to
verbalize, but when they did say the words, they tended to label the individual members of a
class rather than giving a single superordinate class label (for instance, pointing to lion and
bear and saying "lion, bear" rather than "animal", which would at least have used the same word
for both pictures). 77% of such class membership responses were made by the youngest group
whereas superordinate class responses were quite evenly distributed across the age groups
(youngest 31%, middle 35%, oldest 35%).
A cognitively-based difference that separated one group from another
was apparent in the type of searching strategy employed. Thus, while the
most able children tended to scan each array of 4 pictures silently (though
often subvocalizing, as evidenced by lip movement), smile, point to the right
pair and say the words, the youngest seemed to just pick two pictures. If
these were wrong, they often picked the other two pictures and then gave
up. The intermediate children, however, seemed to be on the way to developing
a systematic search strategy without having quite gotten there. First,
they tended to want to name all 4 pictures aloud without making any
choices. Then they often seemed to pick out one picture which served as a
focus for their comparisons and would systematically pair it with each of the
other 3, indicating that each such pairing was a guess at the right answer. If
they happened to choose one of the homonym pictures as their focus, this
strategy was often successful. If, however, they picked a non-homonym as
focus, they often could not find the homonym even though they applied
the correct labels to the pictures: although they said the correct words
aloud, they seemed not to be able to carry the sounds over from one comparison
to the next. A shift to Task 2, however, in which one of the homonyms
was indicated by the investigator, seemed to help these children get
unstuck from that first choice of focus. This phenomenon was much more
common among the older two groups of children, occurring only rarely
among the youngest. It may reflect local rigidity associated with flexibility
in another cognitive locus. It is as if the child has a limited resource for
flexible open-ended search which s/he can apply to the search for focus or
to the search for identical labels but not simultaneously to both. (See
Norman and Bobrow, 1975, for a discussion of resource limitations.)
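The difference between the fixed-focus strategy of the intermediate children and an exhaustive comparison of all pairs can be made concrete with a small sketch. It assumes that each picture is reduced to the spoken label a child would give it and that an array holds exactly one pair with identical labels; the function names and the example array are hypothetical illustrations, not materials from the study.

```python
from itertools import combinations


def exhaustive_search(labels):
    """Compare every pair of labels; return the first index pair whose
    labels are identical, or None if no such pair exists."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            return (i, j)
    return None


def focus_search(labels, focus):
    """Pair one fixed focus picture with each of the others and never
    change the focus, as the intermediate children seemed to do."""
    for j in range(len(labels)):
        if j != focus and labels[j] == labels[focus]:
            return tuple(sorted((focus, j)))
    return None


# Hypothetical 4-picture array: two pictures share the spoken label "bat",
# the other two are distractors.
array = ["bat", "horse", "bat", "drum"]

exhaustive_result = exhaustive_search(array)
focus_results = {f: focus_search(array, f) for f in range(len(array))}
focus_success_rate = sum(r is not None for r in focus_results.values()) / len(array)
```

Under these assumptions the exhaustive comparison always finds the pair, whereas the fixed-focus search succeeds only when the focus happens to be one of the two homonym pictures, that is, for 2 of the 4 possible choices of focus.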
An increase in ability to deal with the phonological nature of the problem
was also evident, being more pronounced in the ability grouping than in
the age groups. The children acted very much as if there was a hierarchy of
strategies at their disposal, and if a higher strategy didn't work they would
fall back on a lower one. The apparent sequence was: get it right, make a
phonological choice, make a semantic choice of the inclusive kind, make a
semantic choice of the associative kind, guess randomly. (Giving up could
occur at any point - how soon a child refused to try any more seemed to
depend on the individual's personality.) The older children had more control
over the higher end of the sequence - the most able children almost always
found the right answers and when they had trouble they would fall back on
P or S almost equally often (see Fig. 2, bottom). The least able children,
who had great difficulty, seemed also to use P and S about equally often,
but the middle group used P much more often than S (again, see Fig. 2,
bottom). This is because, aware of the phonological nature of the problem,
some of them used every trick they could muster to find two words that
sounded alike, including hunting for rhymes (by means of both real words
and invented nonsense words) and forcing identity of sound between two
words (invention (I) errors). For example, K.B. (5;1) was a prolific rhymer,
suggesting arrow/brarrow, drum/turn, horse/morse and mitten/kitten,
among others.
There were two different ways in which identity of sound was forced:
through brute force relabelling and through phonological neutralization.
Brute force relabelling occurred when a child pointed to two non-homonymous
pictures and applied the word for one of them to both. E.N. (4;3)
did this some 10 times, e.g., pointing to hoe, bow-and-arrow, but saying
arrow, arrow.
Difficulty
and Homonym
Performance
Whole Group
Pt
Ho
HI
-0.84
Oldest
PO.66
**
**
-0.85
-0.71
**
**
H2
PO.78
m-o.73
HP
PO.78
**
m-o.70
**
**
pt
Ph
**
i PO.80
I *
1 PO.82
;
I
,
,
Middle
Youngest
Ph
ph
Pt
ph
pt
-0.67
(*)
-0.72
(*)
-0.71
(*)
-0.80
* _
-0.77
*
-0.76
(*)
\ ,
-0.85
*
PO.67
-0.76
(*)
(*)
Significant differences: **p < 0.001; *p < 0.01;
Pt = prenaming score on total set of pictures.
Ph = prenaming score on set of homonym pictures.
Other symbols as in Table 2.
-0.69
(*)
-0.4;
0.76
-0.58
(*)
I
I
,
,
~0.61
-0.55
-0.62
-0.01
_!
PO.13
1
0.47
PO.36
PO.29
0.30
the middle group, and not correlated at all for the youngest group except for
P, with H, (Table 5). Thus, although prenaming proficiency has something
to do with homonym finding ability, it does not tell the whole story,
especially for the youngest children. It is as if vocabulary proficiency releases
resources for searching and matching. When all of the component prerequisites
for the task (searching, matching, vocabulary proficiency) are mature
enough, growth in ability in any one area releases cognitive resources to
improve performance in the whole task.
It also seemed likely that if a child did not know one or both of a given
pair of homonym words at the prenaming stage, s/he would have difficulty
finding that particular pair in the homonym test. Therefore, we looked at
how well the three age groups did at finding homonyms contingent upon
whether they did or did not have prenaming success with the homonym
words. This analysis showed that the oldest children did quite well even
when they had vocabulary
difficulty (91% of the homonyms in this case).
Both the oldest and middle groups did well when they had no vocabulary
difficulty (95% and 96%, respectively,
of these homonyms).
The youngest
children only got 58% of items where they had no vocabulary
problems,
42% when such problems existed. And again, when there was no vocabulary
difficulty,
the older 2 groups had few passives (1% and 4%) while the
youngest had 19%. When, however, there had been vocabulary problems, the
middle group went up to 17% passives and the youngest to 24%. Thus, the
youngest children seem to be relatively unable to take advantage of exposure
to difficult vocabulary
items at the prenaming stage as shown by their
increase in homonym errors for just those words (23% to 34%). The middle
children, on the other hand, could utilize at least some of the prenaming
information
as evidenced by the increase in passive responses (4% to 17%).
And the oldest children seem to have taken such good advantage of their prenaming problems that their homonym performance dropped very little when
they had vocabulary
difficulty (96% to 91%). The fact that the youngest
children did find 42% of these homonyms
where prenaming difficulty
occurred shows that mastery of vocabulary as measured by success in prenaming is by no means necessary for success in homonym finding.
Although the prenaming scores do correlate fairly well with the homonym
scores, the interaction
between the two tasks seems much more complex.
Indeed, the prenaming task was designed to associate particular labels with
particular pictures in the minds of the children before they were confronted
with the homonym sets, and judging from the children's homonym scores on
those items for which they had vocabulary
difficulty,
the prenaming task
seems to have functioned
much as it was intended to (although it did not
work perfectly
since the children did not always remember
the desired
labels). In particular,
any vocabulary
items which a child knew to some
extent but had temporarily
forgotten were likely to be reinforced,
often to
the point where finding the homonym
was a possibility, passively if not
actively. In addition, the pressure to perform well probably further enhanced
this reinforcing effect.
An interesting phonological difficulty arose for some of the children when
the alliteration happened to phonologically contain the whole target word
as its first part. This happened with the words tie and tire, and bear/bare and
barrel. Somehow these were much more confusing than minimal pairs such
as bat and back, horn and horse, or night/knight and knife.
A final question that needs to be discussed with respect to the effects of
vocabulary on homonym performance is that of homonymy versus polysemy.
That is, is there any evidence that any of these pairs of words were stored in
the children's lexicons as two sub-meanings to a single entry rather than as
two separate entries which happened to sound alike? Of all the homonym
pairs, only tie (a string) and necktie seemed to be at all polysemous. (One
child, age 4;10, spontaneously remarked, "You tie something around your
neck an' you tie on your shoe, too.") This did not, however, seem to be the
case for all the children.
The ability to find homonym pairs depends then, not only on an understanding
of the nature of the task involved, but also on having access to the
phonological representations of the critical words in order to be able to
compare them for identity. Active (productive) versus passive (only receptive)
knowledge of words probably has its effect here - in the case of passively
correct choices the children seemed to be able to hear enough of the relevant
words in their heads to make their decisions but were not sure enough
of the words to want to say them out loud. The tendency of the middle
children to want first to name all 4 pictures aloud before making any choices
also seems to relate to the need to be able to hear the words in order to
compare them. When a child's control over pronunciation is not fully developed,
it is unclear whether his difficulties with pronunciation will tend to carry
over into his phonological comparisons or not. The child who had the least
success in finding homonyms was a boy (4;0) whose phonological development
was very slow. According to his teacher, this trait ran in his family and
was always eventually outgrown. How much of his difficulty with homonyms
was due to this developmental characteristic is unclear, but probably the
effect was not negligible.
The homonym test calls for the coordination of a number of cognitive
prerequisites. These include the ability (1) to understand the task, i.e.,
what "sound the same" means, (2) to conduct an exhaustive search through
the set of alternative pictures, (3) to access the phonological representations
of the critical words, (4) to rehearse a label while searching for others to
match with it, (5) phonologically to match two labels once found, (6) to
cycle through alternative labels for a picture in cases of phonological
mismatch. Inefficient processing or immaturity in any of the component
processes or in the ability to coordinate them could result in failure to perform
the task. Maturation of some component processes can release resources for
processing others. Thus, the younger children were particularly limited by
mastery of vocabulary - a problem which hardly affected the older children.
That improvement in ability to find homonyms is a function of maturation
rather than learning is shown by the fact that exposure to one exemplar
of a homonym (Pass 1) did not result in improved performance on exposure
to a second exemplar of the same homonym pair (Pass 2 - viz. the fact that
the overall scores on the two passes were not significantly different). And
yet there is a sharp improvement in recognizing homonyms at age 4;4
without any special training. Thus, the resource limitations affecting
performance on this task would seem to be biologically determined rather than
learning-dependent.
Summary
In our investigation of pre-school children's ability to find homonyms, we
have found not only that children over 4;4 years of age had considerably
more success than their juniors, but also that success at solving this problem
depended on a complex interaction of cognitive and linguistic development.
Thus, even though children were able to deal with the linguistic aspects of
the problem, the fact that they had not yet developed an efficient search
strategy could, if they were unlucky in their choice of a focus for comparisons,
cause insurmountable problems. And, on the other hand, even if a
search strategy was well developed, linguistic problems could cause a
particular pair to be missed. The youngest children had both cognitive and
linguistic problems; the middle children were learning to deal with both -
sometimes difficulties arose in one area, sometimes in the other. The
most able children had their searching strategies well developed and only
rarely had linguistic difficulties.
a As noted in Homonym Performance and Age, even the children who had the hardest time were
able to find one or two homonym pairs and in these cases it was clear that they knew they had solved
the problem and found two words that sounded the same. Thus, when they had difficulties with the
other pairs, it was not because of problems with component (1) alone, but rather mainly with the
cognitive components of search (2) and rehearsal (4) and/or the linguistic components of access to and
phonological representation of vocabulary (3), (5), and (6).
The linguistic abilities needed for finding these homonyms were of two
kinds: lexical and phonological.
If a child had no lexical access to a
particular vocabulary
item, s/he could not use it in the task. If such access
was only passive (receptive), it might be sufficient to allow the child to find
the homonym
but insufficient
for the child to want to risk producing the
word. Such passive success was most common among the youngest children.
The oldest children were the most lexically facile - if they happened to
forget the particular label associated with a picture at prenaming, they were
able to try out several names for each picture.
Phonological ability here refers to the capability of separating sound from
symbol and then manipulating
that sound by comparing it with the sounds
of other words. The youngest children showed relatively little evidence of
having developed such abilities - they tended to fall back on semantic association as a criterion for similarity. The intermediate
children, however, had
developed
a fair repertory
of phonological
manipulations
they could perform. Since they were not as efficient as the most able group, they made
numerous
guesses, looking for rhymes and alliterations,
inventing them if
they had to, or trying in some way to force identity of sound.
The ability to recognize
phonological
similarity would seem to be a
necessary if not sufficient prerequisite
for learning to read via phonological
decoding. Indeed, the disconnected
left hemisphere
is proficient
in both
recognizing homonymy
and in translating graphemes to phonemes, whereas
the right hemisphere is not proficient in either. The improvement
in ability
to recognize
homonyms
between 4 and 6 years apparently
reflects left
hemisphere maturation
(Zaidel and Peters, 1979) - if so, then age 5 seems a
natural biological (rather than purely cultural) starting point for learning to
read. And yet the fact that the oldest group in our experiment
did not
precisely
consist of the most able homonym
finders should be kept in
mind: some children simply had their act together
(both cognitive and
linguistic) at an earlier age than others.
Cognition, 8 (1980) 209-225
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
Discussion
ERIC WANNER*
Sussex University
In a recent issue of Cognition, Lyn Frazier and Janet Dean Fodor proposed
a new two-stage parsing model, dubbed the Sausage Machine (Frazier and
Fodor, 1978). One of the major results which Frazier and Fodor bring
forward in support of their proposal concerns a parsing strategy which,
following Kimball (1973), they call Right Association. The centerpiece of
their argument concerns an interaction between this parsing strategy and
another one, which they call Minimal Attachment. Frazier and Fodor
(henceforth FF) provide interesting evidence that the language user makes
tacit use of both strategies to resolve temporary syntactic ambiguities that
arise during parsing. FF then proceed to argue that the existence of these
strategies, as well as the apparent interaction between them, can be fully
explained if we assume that the language user's parsing system is configured
along the lines of the Sausage Machine. In FF's view, the Augmented
Transition Network (ATN) runs a very poor second to the Sausage Machine, for
according to FF's argument, it is impossible even to describe the two parsing
strategies within the ATN framework. In effect then, FF are claiming that
the Sausage Machine achieves explanatory adequacy in this case while the
ATN fails to reach the level of descriptive adequacy.
These are strong and potentially important claims. If correct, they
obviously provide grounds for pursuing parsing models built along the lines
of the Sausage Machine rather than the ATN. However, when FF's arguments
are examined at close range, the comparison between parsing systems comes
out rather differently than they claim. In particular, it appears that the
Sausage Machine explanation of Right Association and its interaction with
Minimal Attachment is empirically incorrect. The inadequacy of this
explanation completely cancels the Sausage Machine's ability to describe the
interaction between strategies that FF have observed. This follows because
*Reprint requests should be sent to Eric Wanner, Harvard University Press, 79, Garden Street,
Cambridge, Mass. 02138, U.S.A.
FF aspire to an explanation that renders independent description of the
parsing strategies unnecessary. The Sausage Machine contains no apparatus
for describing strategies. Hence, the failure to achieve explanatory adequacy
automatically entails descriptive failure as well. In contrast, and in
contradiction of FF's negative claim, the ATN can provide a perfectly general
description for each strategy in terms of scheduling principles that constrain the
order in which arcs in an ATN grammar are attempted. Moreover, when
these scheduling principles are coupled with an ATN version of the grammar
FF tacitly employed to generate their pivotal cases, FF's observations about
the interactions between strategies are completely accounted for. Thus,
although the ATN framework does not provide an explanation for either
parsing strategy, it appears to achieve descriptive adequacy. Moreover, the
descriptive framework of the ATN makes it possible to discern just what
phenomena require explanation and to speculate in a reasonable way about
the explanatory principles that underlie the parsing strategies FF have
discovered.
(1) Tom said that Bill had taken the cleaning out yesterday.
(2) Joe called the friend who had smashed his new car up.
In (1), yesterday can be attached as an adverbial modifier either to the
topmost S in the phrase marker (Tom said ...) or to the embedded S (Bill had
taken ...). Similarly, in (2), up can be attached as a particle to the verb in the
topmost S (called) or to the verb in the embedded S (smashed). In both
sentences, the lower of the two possible attachments seems to be preferred
by most people and Frazier (1978) has provided experimental evidence for
the reliability of this preference.
According to FF, this type of bias can be adequately described by
Kimball's principle of Right Association, which dictates that an ambiguous
constituent should be attached into the phrase marker as a right sister to
existing constituents and as low in the tree as possible (p. 294). The Right
Association strategy applies in the obvious way to make the correct
predictions about the language user's preferences in sentences (1) and (2). But
what explains the existence of this particular strategy? Why should the
language user be uniformly biased toward low right attachment as opposed
to (say) high right attachment?
According to FF, the Sausage Machine can
supply the answer. Their story begins with the observation
that the ten-
Joe bought
Joe bought
[Phrase markers (5) and (6): tree diagrams not recoverable from the source. Surviving terminal words: Joe bought the book for.]
because its limited window prevents it from seeing the higher attachment
possibility. Note also that this account automatically explains why Minimal
Attachment prevails over Right Association in (4). Since there is no
independent statement of Right Association in the parser there is no conflict to be
explained. In short sentences like (4), the PPP will see both attachment
possibilities. Therefore, there will be no bias towards low right attachment
and the Minimal Attachment strategy prevails by default. On the basis of
this demonstration, FF claim to have achieved, at least in one important
instance, their announced goal of showing that the parser's decision
preferences can be seen as an automatic consequence of its structure
(p. 297).
There are, however, serious problems with this claim. If the preference for
low right attachment sets in ... at some distance just because of the PPP's
limitation to a six word window, then this limitation ought to operate
uniformly in all cases. Just as the preference for low right attachment
dissolves as sentence (3) is shortened into sentence (4), so it should also
dissolve as sentences (1) and (2) are shortened. But it does not. Sentence sets
(7) and (8) represent progressive shortenings of sentences (1) and (2):
(7) (a) Tom said ...
    (b) Tom said ...
    (c) Tom said ...
    (d) Tom said ...
    (e) Tom said ...
    (f) Tom said ...
(8) (a) Joe called ...
    (b) Joe called ...
    (c) Joe called ...
    (d) Joe called ...
    (e) Joe called ...
    (f) Joe called ...
Notice that as these sentences shrink, there is no noticeable tendency for the
preference for low right attachment to diminish. Indeed, informants to
whom I have given just the (f) versions uniformly report a preference in
favor of the analysis in which the final word is attached to the lower of the
two clauses.* But neither (f) version is more than six words long. Both (f)
sentences can fit comfortably within the PPP's window. Hence the PPP
could readily see both clauses as candidates for possible attachment.
Therefore, the structure of the PPP cannot provide any explanation of the
language user's continued preference for low right attachment in these short
sentences.3
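The length argument rests on a simple count of words against the six word window. A minimal sketch of that count follows, assuming that words are whitespace-separated tokens (a simplification) and using only sentences whose full text is given in this paper: (1), (2), and sentence (c) of footnote 3. The function name is a hypothetical illustration.

```python
WINDOW = 6  # window size in words, as stated in the text


def fits_window(sentence, window=WINDOW):
    """Return True if the sentence has no more words than the window holds.
    Words are taken to be whitespace-separated tokens (an assumption)."""
    words = sentence.rstrip(".").split()
    return len(words) <= window


sentences = {
    "(1)": "Tom said that Bill had taken the cleaning out yesterday.",
    "(2)": "Joe called the friend who had smashed his new car up.",
    "footnote 3 (c)": "Women men girls love meet die.",
}

fits = {label: fits_window(text) for label, text in sentences.items()}
```

On this count the full sentences (1) and (2) exceed the window, while a six word sentence such as (c) falls entirely within it, which is the situation the argument attributes to the (f) versions.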
*Some informants find the higher attachment in (8f) ungrammatical, presumably because it requires
an intransitive interpretation of smashed. However, these informants all prefer the low right
attachment in (8e) where there is no possible confounding from ungrammaticality of either attachment.
3The same sort of argument can be brought to bear upon some of FF's other arguments for the
explanatory power of the PPP's limited window. For example, FF argue that the multiply embedded
sentence (a) is easier than the identically embedded sentence (b) because its major constituents
(marked here by brackets) are approximately the length of the PPP's window:
(a) [The very beautiful young woman] [the man the girl loved] [met on a cruise ship in Maine]
    [died of cholera in 1972].
(b) The woman the man the girl loved met died.
But again it is possible to construct an equivalent sentence which is short enough to fall entirely
within the PPP's window yet is very difficult to comprehend:
(c) Women men girls love meet die.
To summarize, it now appears that contrary to the Sausage Machine
prediction, Right Association is not limited to cases of distant attachment.
Moreover, the Sausage Machine offers no explanation of why the language
user appears to follow the Right Association strategy in some short
sentences (7f and 8f) but not others (4). Accordingly, it seems clear that the
Sausage Machine's putative explanation of the behavior of the Right Association
strategy is simply incorrect. There is nothing about FF's observations which
would require a parser with properties (A) and (B). However, it remains to
be seen whether a parser like the ATN, which has neither two stages nor a
limited input window, can give a satisfactory account of the behavior of
Right Association and Minimal Attachment, as well as their somewhat
puzzling interaction.
terms of concepts like lowest rightmost node in the phrase marker, the parser's
structural preferences would have to be built in separately for each type of phrase
and each sentence context in which it can appear. Evidence that the human
sentence parser exhibits general preferences based on the geometric arrangement of
nodes in the phrase marker indicates that its executive component does have access
to the results of its prior computations. Its input at each choice point must consist
of both the incoming lexical string and the phrase marker (or some portion thereof)
which it has already assigned to previous lexical items (p. 294).
It is difficult to determine, in general, whether the ATN will eventually
require the addition of something like property (C). However, it is quite clear
that no such property is required to give a perfectly general description of
the two parsing strategies that FF have proposed. The structural preferences
involved in these strategies would not have to be built in separately for each
type of phrase and each sentence context. On the contrary, it appears to be
possible to formulate scheduling principles for the ATN that completely
capture the structural preferences involved and that do so without explicit
appeal to the geometry of the phrase marker. Moreover, when these
principles are combined with an ATN grammar for FF's crucial sentences, the
residual mysteries concerning the interaction between Right Association and
Minimal Attachment are completely resolved.
To see this, recall first that a scheduling rule in an ATN, as described by
Kaplan (1975, 1972) and by Wanner and Maratsos (1978), is essentially a
specification of the order in which the ATN processor considers the arcs
leaving a state in an ATN grammar network. Recall also that the ATN
network includes at least 5 types of arcs:
- WORD arcs that analyze specific grammatical morphemes such as that
or to,
- CAT
discussion cited above.
(9) Right Association: Schedule all SEND arcs and all JUMP arcs after
every other arc type. (Since SEND arcs and JUMP arcs never leave the
same state, there is no ambiguity here with respect to the relative
ordering of these two arc types.)
(10) Minimal Attachment: Schedule all CAT arcs and WORD arcs before all
SEEK arcs.
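Principles (9) and (10) together amount to a priority ordering over the arcs that leave a state. The following minimal sketch expresses that ordering, assuming each arc is represented as an (arc type, label) pair; the numeric priority values, the function names, and the two example states are hypothetical illustrations and are not taken from the grammar of Figure 1.

```python
# Priority classes implied by principles (9) and (10): lower value = tried earlier.
# The numbers are an illustrative encoding, not part of the ATN formalism.
PRIORITY = {"CAT": 0, "WORD": 0, "SEEK": 1, "SEND": 2, "JUMP": 2}


def schedule(arcs):
    """Order the arcs leaving one state. sorted() is stable, so arcs of
    equal priority keep the order in which the grammar lists them."""
    return sorted(arcs, key=lambda arc: PRIORITY[arc[0]])


def first_applicable(arcs, can_take):
    """Return the first arc, in scheduled order, that can be taken."""
    for arc in schedule(arcs):
        if can_take(arc):
            return arc
    return None


def make_test(next_category):
    """Build a hypothetical applicability test for a given next-word category."""
    def can_take(arc):
        kind, label = arc
        if kind == "CAT":
            return label == next_category
        return True  # simplification: other arc types can always be attempted
    return can_take


# Hypothetical clause-final state with a choice between ending the clause
# (SEND) and attaching an adverb to it (CAT ADV).
clause_end = [("SEND", "S"), ("CAT", "ADV")]
with_adverb = first_applicable(clause_end, make_test("ADV"))
without_adverb = first_applicable(clause_end, make_test(None))

# Hypothetical state offering both a CAT Y arc and a SEEK ZP arc.
minimal = schedule([("SEEK", "ZP"), ("CAT", "Y")])
```

In this sketch the SEND arc is taken only when no other arc applies, so an available adverb is attached to the current (lower) clause regardless of how long that clause is, and the CAT arc is always tried before the SEEK arc.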
Note first that because CAT Y arc is ordered before SEEK ZP arc, the
constituent
XP will always be completed
by means of categorical nodes
rather than phrasal nodes, if such a completion
is possible. This ordering
guarantees Minimal Attachment.
To see this, imagine that our hypothetical
ATN also includes a network for Z phrases (ZP), one path through which
begins with a CAT Y arc, as in:
CAT Y
CAT R
Suppose that the parser is in state XP, at the moment that it encounters a
word in the input string that belongs to the syntactic category Y. At this
point, two analyses of Y are possible, roughly those corresponding
to the
following attachment possibilities:
[Phrase markers (11) and (12) and the accompanying network diagram (including a CAT Q arc) are not recoverable from the source.]
Suppose also that the SEEK XP on arc 9 has been issued and the parser has
completed the partial path through the XP network to state XPfinal by
finding a Y in the input. Now suppose the next word falls in the Q category.
Here again, two attachments are possible. Either Q can be attached directly
to the X constituent via arc 4 to yield (13) or Q can be attached to the UP
constituent via arcs 5 and 10 to yield (14).
(13)
(14)
arc 5,
Finally, notice that the JUMP on arc 3 must be ordered last since it leads
to the SEND on arc 5. If the JUMP were ordered earlier at state XP,, it
could lead the parser to violate Right Association by executing the SEND at
arc 5 before trying the CAT and SEEK on arcs 1 and 2.
Given the ATN restatement of Right Association and Minimal Attachment
provided by scheduling principles (9) and (10), we can now consider the way
in which these principles apply to FF's crucial cases. Figure 1 presents an
ATN grammar which will handle sentences (1), (4), and (7a-7f). The
grammar was constructed by restating in ATN terms all the phrase structure
rules that FF implicitly used to construct the phrase markers given in their
paper. Corresponding to every context free phrase structure rule in FF's
generative grammar, there is a level of the ATN network which expresses the
identical analysis of each phrase. For simplicity, however, we have ignored
irrelevant grammatical details pertaining to verbal auxiliaries, verb particles,
and deleted complementizers in the grammar of Figure 1. None of these
omissions has any bearing upon the interesting aspects of the ATN analysis
of FF's sentences.
To illustrate the way in which principle (9) captures Kimball's principle
of Right Association, consider first the analysis of sentence (7e), repeated
here alongside the arc sequence (15) which gives its analysis path through the
grammar of Figure 1:
(7e)
(15)
15),
In constructing arc sequence (15), the arcs in the analysis sub-path that
fulfill a SEEK have been listed in parentheses after the number of the SEEK
arc that caused the SEEK to be attempted. Thus the analysis of sentence (7e)
begins with a SEEK for a NP on arc 1, which is completed when the proper
noun Tom is analyzed on arc 17 and control returned to arc 1 by the
SEND on arc 22. This arc sequence is represented in (15) as 1(17, 22).
Following the analysis from this point, we see that said is analyzed on arc 5,
and a SEEK for the complement S is issued on arc 8. The complementizer
In adopting this principle for constructing the ATN grammar, I am following Bresnan's (1978)
proposal for relating ATN and phrase structure grammars. ATN grammars constructed according to
this principle provide a well formed labelled bracketing of the input sentence directly by means of
the sequence of transitions made in accepting the sentence. This avoids the use of LISP functions to
build phrase markers thereby reducing the expressive power of the ATN. A limited set of additional
actions is required to label grammatical functions and handle moved constituents. However, with
one exception (the HOLD action) I have deleted these actions from the Figure 1 grammar since they
play no part in the description of Minimal Attachment and Right Association.
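[Figure 1. ATN grammar network; diagram not recoverable from the source. Surviving arc labels: SEEK VP, SEEK NP, CAT PRONOUN, CAT PREP, SEEK NP; surviving arc numbers: 11, 23.]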
that is analyzed on arc 13, followed by a SEEK S on arc 14. The SEEK S
is pursued along arc 1 which analyzes the subject noun phrase Bill and arc 2
which analyzes the verb phrase died. At this point there is a choice between
analyzing the adverb yesterday
as a modifier of the complement
clause on
arc 3 or terminating the complement
clause via the SEND on arc 4. However,
22 1
since the SEND arc must be ordered after all other arcs according to
principle (9), this choice must be resolved in favor of arc 3. Hence, yesterday
is attached to the complement clause, thus insuring the low right attachment
of the adverb. Notice also that this attachment will be preferred no matter
how long or short the complement clause is. So long as the complement
clause terminates at state Snnat, the fact that arc 3 precedes arc 4 will
guarantee low right attachment. Since each complement clause in sentence
set (7) terminates at just this state, the ATN successfully captures our
intuitions that the lower attachment of yesterday is preferred throughout the
entire set of sentences in (7). (Anyone with sufficient scepticism and stamina
can prove this by tracing through the ATN analysis of the full set.)
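The effect of principle (9) on this choice can be illustrated with a small sketch. It assumes, purely for illustration, that an arc is represented as an (arc number, arc type) pair and that a hypothetical function `schedule` places SEND and JUMP arcs after all other arcs leaving a state. The type label given to arc 3 is also an assumption, since the text says only that arc 3 analyzes the adverb.

```python
# Hypothetical representation: an arc is (arc_number, arc_type).
LATE_TYPES = {"SEND", "JUMP"}  # principle (9): these arcs are tried last


def schedule(arcs):
    """Return the arcs with SEND and JUMP arcs ordered after all others.

    sorted() is stable, so the relative order of the other arcs is kept.
    """
    return sorted(arcs, key=lambda arc: arc[1] in LATE_TYPES)


# Arcs leaving the state at which the complement clause can end:
# arc 3 analyzes an adverb (type label assumed), arc 4 is the SEND.
clause_final_arcs = [(4, "SEND"), (3, "ADVERB")]
ordered = schedule(clause_final_arcs)
first_choice = ordered[0]  # the arc the parser tries first
```

Under these assumptions arc 3 is tried before arc 4 whatever order the arcs are listed in, which is the ordering that yields the low right attachment of yesterday.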
Principle (9) also insures the low right attachment of for Susan in sentence
(3), although here the principle applies at state VP,, where the JUMP on arc
11 must be ordered after the SEEK PP on arc 10. To see the effect of this
ordering, consider the following analysis path for sentence (3) by the
grammar of Figure 1:
(3) Joe bought the book that I had been trying to obtain for Susan.
(16) 1(17, 22), 2(5, 7(19(18, 20, 22), 26(13, 14(1(16, 22), 2(5, 6, 27(5, 7(28, 22), 10(23, 24(17, 22), 25), 12), 12), 4), 15), 22), 11, 12), 4.
The parser works its way to state VP, by SEEKing an S at arc 26 in order to
process the relative clause (that I had been trying...). As this SEEK is
executed, the head noun phrase (the book) is put on HOLD in accordance
with the ATN procedure for processing relative clauses (for details, see
Wanner and Maratsos (1978)). The relative clause is then processed as an
ordinary declarative clause until the parser reaches state VP,, having just
analyzed the infinitival complement to obtain. Since obtain requires an
object noun phrase, the book will be removed from HOLD at this point
and assigned as direct object on arc 7. Then, at state VP,, the parser must
attempt to find a prepositional phrase on arc 10 to complete the complement
clause. Since for Susan is available at this point in the input string, it
is automatically attached as the indirect object of the complement clause.
This is, of course, just the low right attachment that language users prefer in
this case. The only way for the ATN to make the higher attachment would
be to reverse the order of arcs 10 and 11 in violation of principle (9).
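The bracketed notation used for analysis paths can be read mechanically if one assumes that an arc number followed by a parenthesized list stands for a SEEK arc together with the arcs traversed inside the network it invokes. The following is a minimal sketch of such a reader, applied to the portion of the path that analyzes for Susan (the SEEK PP on arc 10). The helper names `parse_path` and `count_arcs` are hypothetical, and the reading of the notation is an assumption rather than something the path itself states.

```python
import re


def parse_path(text):
    """Parse a path such as '10(23, 24(17, 22), 25)' into nested lists.

    Each arc becomes [arc_number, children]; children is empty unless the
    arc number is followed by a parenthesized sub-path.
    """
    tokens = re.findall(r"\d+|[()]", text)
    pos = 0

    def parse_sequence():
        nonlocal pos
        arcs = []
        while pos < len(tokens) and tokens[pos] != ")":
            number = int(tokens[pos])  # raises ValueError on a stray "("
            pos += 1
            children = []
            if pos < len(tokens) and tokens[pos] == "(":
                pos += 1  # consume "("
                children = parse_sequence()
                if pos >= len(tokens) or tokens[pos] != ")":
                    raise ValueError("unbalanced parentheses in path")
                pos += 1  # consume ")"
            arcs.append([number, children])
        return arcs

    result = parse_sequence()
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses in path")
    return result


def count_arcs(arcs):
    """Count every arc in the path, including arcs inside SEEKs."""
    return sum(1 + count_arcs(children) for _, children in arcs)


pp_path = parse_path("10(23, 24(17, 22), 25)")
total = count_arcs(pp_path)
```

On this reading the prepositional phrase costs six arc traversals: the SEEK on arc 10, the three arcs inside it, and the two arcs inside the nested SEEK on arc 24.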
But now what about sentence (4)? Why doesn't Right Association operate
there as well and come into conflict with Minimal Attachment? This is a
natural question if one considers just the geometry of the alternative phrase
markers for (4): in phrase marker (5), for Susan is minimally attached to the
VP; in (6), for Susan is low right attached to the NP. However, the question
Eric Wanner
The subject
noun
successfully
apply to the book. At this point, the book has been minimally attached to
the verb phrase. In effect, the CAT-before-SEEK arc ordering at state NP,
has selected partial structure (18) over structure (19):
Once this structure has been selected, there is only one possible attachment
for the prepositional phrase for Susan and that is the direct attachment to
the verb phrase. The ATN accomplishes this attachment on arc 10 once
control returns to state VP3 after the successful SEEK for an NP on arc 7.
Notice that the conflict between the minimal attachment and the low right
attachment for Susan never arises in the ATN analysis because the ATN
never considers structure (19), and it is only in terms of a comparison
between structures (18) and (19) that there appears to be a conflict between
Minimal Attachment and Right Association. Therefore, it appears that the
ATN resolves the apparent conflict in (4) between Minimal Attachment and
Right Association in the psychologically appropriate way specifically
because it does not appraise the geometry of the two possible parse trees.
This is, of course, just the reverse of FF's claimed advantage for the Sausage
Machine's ability to survey the structural details of the developing phrase
marker.
To summarize then, I hope to have shown, contrary to FF's claims, first
that the ATN can provide a general statement of Minimal Attachment and
Right Association; second, that the ATN can do so without explicit appeal
to the geometry of the developing phrase marker; third, that a careful
formulation of the two parsing strategies coupled with a detailed ATN grammar
can account for the otherwise puzzling interactions between parsing strategies
noted by FF simply by appeal to left-to-right processing and without any
assumption of a limited input window.
Explanation
If the ATN analysis of the interaction of Minimal Attachment and Right
Association in sentence (4) is correct, then there is no need to explain why
Right Association sets in only at a distance in some cases (4) but not others
(7 and 8). Evidently, Right Association operates uniformly across all
sentences, although it may be preempted by another strategy if the
preemptive strategy operates at an earlier point in the sentence and eliminates
the opportunity for low right attachment. Thus the ATN analysis explains
FF's observations about the interaction of the two strategies, but it leaves
us with the problem of explaining the strategies themselves. Why does the
parser employ these strategies as opposed to others?
To answer this question within the ATN framework will require a theory
of scheduling which provides some means of selecting psychologically
appropriate scheduling principles and dismissing psychologically
inappropriate scheduling principles. Although it is difficult to specify such a
theory at present, it is possible to speculate, given the results in hand, about
the eventual character of such a theory. The basic idea is that a scheduling
theory should choose scheduling principles which minimize computation
during parsing. In different ways both the Minimal Attachment principle
(10) and the Right Association principle (9) appear to have this effect on
ATN processing. Thus, by ordering CAT arcs before SEEK arcs, the Minimal
Attachment principle guarantees that the attachment requiring the fewest
number of arcs will be tried first. This follows because a CAT arc makes an
attachment directly, while a SEEK arc requires the implementation of at
least one additional arc (within the network invoked by the SEEK) to
complete the attachment. Minimal Attachment also minimizes the number
of SEEKs per parse, and this also reduces the memory demands involved in
implementing SEEKs (for details see Wanner and Maratsos, 1978).
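The arc-counting argument for Minimal Attachment can be made concrete with a short sketch. It assumes only what the text states: a CAT arc completes an attachment in one arc, while a SEEK arc needs itself plus at least one arc inside the invoked network. The arc labels and the function name `order_by_min_cost` are hypothetical.

```python
# Lower bound on the arcs traversed to complete one attachment, by arc type.
# A CAT arc attaches directly; a SEEK arc needs itself plus at least one
# arc inside the network it invokes.
MIN_ARCS = {"CAT": 1, "SEEK": 2}


def order_by_min_cost(arcs):
    """Order (label, type) arcs so the cheapest attachment is tried first."""
    return sorted(arcs, key=lambda arc: MIN_ARCS[arc[1]])


# Hypothetical arcs leaving a single state.
arcs = [("seek_np", "SEEK"), ("cat_det", "CAT")]
ordered = order_by_min_cost(arcs)
```

Sorting by this lower bound reproduces the CAT-before-SEEK ordering, which is why that ordering can be described as trying the attachment with the fewest arcs first.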
Right Association minimizes computation in a different way.6 This
strategy guarantees that the parser will continue to include input elements
within the scope of the current constituent for as long as possible. Shifts
between constituents will be minimized. Since syntactic structure is generally
more predictable within constituents than across constituent boundaries, this
strategy should insure that the parser will minimize garden paths. In the
ATN, garden paths inevitably inflate the number of arcs which the parser
must traverse while pursuing dead-ended analysis paths.7 Therefore, Right
Association, like Minimal Attachment, will have the effect of reducing to a
minimum the average number of arcs traversed for any large and
representative set of sentences input to the parser. This suggests that a theory
of scheduling might ultimately be constructed around a metric based on ATN
performance measured in arc counts. Such a theory would rank scheduling
principles according to their effects on average arc count, with the highest
ranking scheduling principles producing the lowest average count. The
theory could be tested by determining whether the highest ranking
principles were also those employed by the language user.*
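The proposed metric can be sketched as a simple ranking procedure. The sketch assumes that arc counts have already been collected by running the parser over the same sentence set under each candidate scheduling principle. The function name `rank_principles`, the principle names, and the counts are all placeholders chosen for illustration; the numbers are not measurements.

```python
from statistics import mean


def rank_principles(arc_counts):
    """Rank scheduling principles by average arc count, lowest first.

    arc_counts maps a principle name to the list of arc counts obtained
    when parsing the same sentence set under that principle.
    """
    averages = {name: mean(counts) for name, counts in arc_counts.items()}
    return sorted(averages, key=averages.get)


# Placeholder numbers for illustration only; they are not measurements.
example_counts = {
    "send_last": [12, 15, 20],
    "send_first": [14, 19, 27],
}
ranking = rank_principles(example_counts)
```

The principle listed first in `ranking` is the one with the lowest average arc count, and therefore the one the theory would predict the language user to employ.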
These brief suggestions are obviously far from the sort of explanatory
theory of scheduling that is required to account for FF's observations
about parsing strategies. Nevertheless, they should serve to demonstrate that
such a theory is by no means unobtainable within an ATN framework.
References
Bresnan,
6I am indebted to Ronald Kaplan for pointing out this property of Right Association.
7See Wanner, Kaplan and Shiner (1975) for evidence that garden paths have a measurable
effect on comprehension time.
*This procedure for ranking scheduling principles need not be identified with the child's
procedure for learning scheduling principles. The child might reconstruct the ranking by
means of tacit statistics performed over its own parsing history. But it might also be innately
equipped with biases in favor of the highest ranking strategies. Conceivably, such biases might
evolve to permit the child to perform efficient parsing from the outset. Note, however, that if
so, the explanation of the preferred scheduling principles does not lie in the child's innate bias
but in the efficiency ranking to which that bias conforms. Explanation need not always be
rooted in constraints on acquisition as is sometimes assumed.
Reference
Kaplan,
Notes
disserpaper,