Pigeons were studied in an experiment involving two concurrently available response keys.
Conditions were such that in the first condition the predictions of melioration (Herrnstein
& Vaughan, 1980), minimization of deviation from matching, and maximization were iden-
tical: relative time on the right key should have fallen between .125 and .25, which in fact
occurred. In the second condition, melioration predicted a shift in relative time on the right
to between .75 and .875, which would involve a transient deviation from matching as well
as a substantial drop in rate of reinforcement. All three birds eventually shifted their distri-
bution of behavior to within the range predicted by melioration.
Key words: matching, concurrent schedule, maximization, key peck, pigeons
A number of experiments have demon- tion process. Shimp (1966, 1969), for example,
strated that, given a choice between two con- has argued that, for conc VI VI, matching
currently available variable-interval schedules would be approximated if an animal always
(conc VI VI), animals and people tend, within emitted that response with the highest momen-
some margin of error, to match both relative tary probability of reinforcement, which in
responses and relative time to relative ob- turn implies that overall rate of reinforcement
tained reinforcements (de Villiers, 1977; is maximized. Rachlin, Green, Kagel, and Bat-
Herrnstein, 1961, 1970). That is, given two talio (1976) have shown, by means of a com-
response alternatives, R and L, we find: puter simulation, that matching is close to
maximization of overall reinforcement rate,
PR _____
whereas Rachlin (1978) and Staddon and
PR + PL RR + RL (1) Motheral (1978) have argued that matching is
TR RR (2) analytically equivalent to maximization.
TR+ TL RR + RL At the same time, there is some evidence
against the maximization thesis. Herrnstein
where PI, and PL are responses to alternatives and Heyman (1979) have found matching, but
R and L, TR and TL are time spent at those not maximization, in a concurrent variable-
alternatives, and RR and RL are obtained rein- interval variable-ratio (conc VI VR) experi-
forcements on those alternatives. Equations 1 ment; Heyman and Luce (1979) have argued
and 2 apply to asymptotic behavior; that is to that on conc VI VI schedules, matching and
say, distributions of behavior that appear not maximization do not quite coincide, though
to be varying systematically. their argument is not entirely persuasive
This beachhead has in turn generated a vig- (Rachlin, 1979). Finally, Herrnstein and
orous controversy with regard to the mecha- Vaughan (1980) have reviewed several experi-
nism, or process, responsible for matching at ments which are inconsistent with maximiza-
asymptote. One line of thought, deriving in tion, but are consistent with a process they
part from neoclassical economic theory, holds term melioration.
that matching is the outcome of a maximiza- This latter process, which remains to be
worked out with regard to a number of param-
This study was supported by NIMH grant MH-15494 eters, applies to situations in which several al-
to Harvard Univ-ersity. I thank R. J. Herrnsteini for ternatives are being sampled. If the local rate
comments on ani earlier draft. Reprints may be ob- of reinforcement on one alternative (i.e.,
tained from William Vaughan, Jr., Department of Psy-
chology and Social Relations, Harvard University, Cam- Ri/ Tj, number of reinforcements obtained
bridge, Massachusetts 02138. from an alternative divided by time spent
141
142 WILLIAM VAUGHAN, Jr.
there) differs from the local rate on another cm wide, with 2 keys centered on the front
alternative, then melioration says that the dis- panel, their centers 22 cm from the floor and
tribution of behavior will shift, in a relatively 10.5 cm apart. A standard grain magazine was
continuous manner, from the poorer to the used. The keys were transilluminated with
richer alternative. The formula for meliora- white light, and the chamber was illuminated
tion is given by: by two white Christmas tree bulbs. The keys
required a force of about .14 N to operate, and
RD =D RR RL
TL (3) a feedback click was provided. During rein-
forcement, which was access for 2.5 sec to
in which RD is the difference in local reinforce- mixed grain, only the hopper was illuminated,
ment rates from two alternatives, R and L. and pecks were not effective. The experiment
When RD is positive, time spent at R (i.e., TR) was controlled by a PDP-8 computer.
increases; when RD is negative, TL increases.
Equilibrium is reached when RD = 0 at which Procedure
point Equation 3 implies Equation 2, the A response to either key initiated a timer
matching relationship for time spent respond- which ran for two sec, unless reinforcement
ing. was delivered or the other key was pecked, ini-
Equation 3 is an idealization, in that no ac- tiating a timer there. Thus, if all interresponse
tual organism will respond to values of RD times happened to be less than two sec, one of
sufficiently close to zero. For example, Herrn- two timers would always run, except following
stein and Loveland (1975) find a closer ap- reinforcement. Time on a key was accumu-
proach to exclusive preference for the better lated while the associated timer was running;
of two VR schedules as the schedule values dif- similarly, the VI tape (simulated by computer)
fer more; melioration, matching, and maximi- associated with a key advanced only while the
zation all logically imply exclusive preference respective timer ran.
given unequal ratio requirements, no matter The rate at which each of the two VI tapes
how small. ran was a function of the distribution of be-
Although on some schedules (such as conc havior between the two keys, which was calcu-
VR VR) melioration logically implies maxi- lated as follows. After each four-minute period
mization of overall rate of reinforcement, in of responding, relative time on the right was
theory such a shift in behavior will occur defined as the proportion of that four-minute
whether or not overall rate of reinforcement sample deriving from the right key. This pro-
increases (see Herrnstein and Vaughan, 1980). portion then set the rate at which the VI tapes
In fact, a decrease in overall rate of reinforce- would run for the next four minutes of re-
ment can be consistent with melioration, given sponding. The specific functions relating local
appropriate contingencies. The present experi- reinforcement rates on the two keys to relative
ment was designed to test the melioration hy- time on the right are shown in Figure 1, for
pothesis along these lines. both the first condition (a) and second condi-
tion (b) of the present experiment.
METHOD Consider first the range of relative time on
the right between 0 and .125. Given this dis-
Subjects tribution, whenever time accumulated on the
Three male White Carneaux pigeons with right key, its VI tape advanced at a rate equiv-
prior experimental histories were run. Just alent to a VI 20" (180 reinforcements per
prior to the first condition described here, they hour); time spent on the left key advanced the
had been exposed to special conc VI VI sched- other tape at a rate equivalent to a VI 60" (60
ules for 26 sessions in the same chamber (re- reinforcements per hour). During responding
sults given in Figure 5.9 of Herrnstein and on the right key, the left VI tape did not ad-
Vaughan, 1980). They were maintained at ap- vance, and vice versa; while responding on
proximately 80% of their free-feeding weights. neither key, neither tape advanced. When re-
inforcement set up, the VI tape stopped, and
Apparatus the next response to that key would in general
A standard two-key pigeon chamber was em- produce reinforcement (during which neither
ployed. It was 33 cm high, 30 cm long, and 30 timer ran). However, whenever a stopped
MELIORATION 143
E b RESULTS
180 , , ,__ Condition a was run for 27 sessions before
the relative times for all pigeons appeared sta-
IV
120; ble. For Pigeons 1 through 3, the averages of
the relative times on the right for the last five
sessions were .196, .160, and .148, respectively
60 ----------__
_ _ _ -
tween .75 and .875, each key paid off at 60 rein- .75
forcers per hour, while from .875 to 1 the left .50
.25
Bird2~*
* b
key paid off at 180 reinforcers per hour while
the right key paid off at 60 reinforcers per .Z_ .75
.50
hour. c
0 .25
In Condition a, the left key (Figure 1, top, .E .75 a Bird32 b
L) paid off at a higher rate than the right key .50
..
E .12
The abscissa is the overall rate of reinforce- E
ment, calculated over the complete session du- ° .08
ration (excluding only feeder durations). The 1-
o .04
C
I I
0 .16
-a O
0 180 . I
Q. .12
c
60 70 80 90 100 110 120 130
I
L
0
.25 .25 .50 Reinforcements per hour
Fig. 4. Absolute deviation from matching and overall
Relative time on right rate of reinforcement (calculated over the whole ses-
Fig. 3. Modified programmed local rates of reinforce- sion, excluding reinforcer time) for the last five sessions
ment for Bird 2 as a function of relative time on the of Condition a (a), while behavior was changing in Con-
right between 0 and .5. Left: first modification of Con- dition b (b'), and during the last five sessions of Condi-
dition b schedules. Right: second modification. tion b (b).
MELIORATION 145
Table 1
Data in the form of sums for the last five sessions of Condition a, the five days during which behavior
was changing most rapidly (b'), and the last five sessions of Condition b.
Responses Time (Seconds) Reinforcers
Condition Left Right Left Right Total Left Right CO's
a
Bird 1 4920 1946 4262.7 1017.3 6308 173 37 638
Bird 2 9362 2115 4450.0 824.4 7333 167 36 291
Bird 3 5504 1203 4321.0 719.0 5808 163 32 401
b'
Bird 1 3212 5730 2673.3 3326.0 7151 62 132 1384
Bird 2 4571 6391 2615.9 2664.1 8244 64 120 482
Bird 3 4181 3424 3374.5 2385.5 6562 77 107 382
b
Bird 1 2337 5592 1638.2 6281.8 9246 29 124 753
Bird 2 3363 10696 1688.8 5751.2 9052 33 136 603
Bird 3 2056 6589 1768.7 6391.3 9038 32 151 419
(Since Figure 4 derives from averages across mentary maximization can account for the
individual sessions, there is not an exact corre- present results, then global maximization is
spondence between Table 1 and Figure 4.) Al- not a logical consequence of that theory.
though it might be expected that the propor- As a dynamic process, global maximization
tion of time spent responding would decrease requires that an animal sample at least two
as reinforcement rate went down, there is in distributions of behavior between the two
fact a slight increase from Condition a to Con- keys. Given a single distribution, regardless of
dition b. Bird 1's overall response rate showed the outcome, there is not sufficient information
some decline, but the other two birds re- to say in what direction behavior should shift
mained approximately constant between Con- so as to increase overall rate of reinforcement.
dition a and b. The same limitation applies to minimiza-
tion of deviation from matching, as construed
DISCUSSION here. Again, given a single distribution of be-
The logic of the procedure arises in three havior, there is not sufficient information to
competing theories of behavioral allocation, say in which direction (if either) behavior
namely, melioration, maximization, and must shift in order to reduce deviation from
matching. Each theory hypothesizes a dynamic matching. The dynamic matching process was
process leading to matching (at least under discounted as a possibility in Herrnstein and
some conditions, variously delimited in differ- Loveland's study of conc VR VR (1975) and
ent theories) in the steady state. For meliora- offered speculatively by Rachlin (1973), who
tion (Herrnstein 8c Vaughan, 1980) the dy- suggested that matching may come about be-
namic process acts to shift relative time from cause animals find unequal local rates of rein-
locally poorer to locally richer situations; for forcement aversive, but does not seem to have
global maximization (Rachlin et al., 1976), it any active adherents at present.
tends to maximize total reinforcement (or util- Each of the theories can be illustrated for
ity) across alternatives; for matching, it tends the present procedure. Figure 5 shows, for
to minimize deviations from matching. Conditions a and b, overall rate of reinforce-
With regard to maximization, this experi- ment as a function of relative time on the
ment addresses theories which assume that right. For Subjects 1 and 3, 180 reinforcers per
animals behave so as to maximize rate of rein- hour (of time spent responding) would be re-
forcement as measured, say, over several ses- ceived for spending between .125 and .25 rela-
sions (cf. Rachlin, 1978; Staddon & Motheral, tive time on the right, whereas, for all subjects,
1978). The experiment does not explicitly 60 reinforcers per hour would derive from
make contact with momentary maximization spending between .75 and .875 relative time on
(Shimp, 1966, 1969), except as follows: if mo- the right. The dashed lines show the adjusted
146 WILLIAM VAUGHAN, Jr.
MO 20r
0
60 E
C
E
E .25 O .
0 o 20
z 180 -
> .15
.10
0 .5
60 - - 0
0 .25 .50 .75
O .25 .50 .75 1 Relative time on right
Relative time on right Fig. 6. Top: programmed deviation from nmatching,
Fig. 5. Programmed overall rate of reinforcement, as measured in terms of the absolute differences between
a function of relative time on the right, for Conditions local rates of reinforcement, as a function of relative
a and b. This figure may be derived from Figure 1. Re- time on the right. Bottom: programmed deviation fromi
inforcement rate and relative time are both calculated matching in terms of the absolute differences between
on the basis of time spent responding. Dashed lines relative time on the right and relative programmed re-
represent modifications for Bird 2. inforcements on the right. Dashed lines represent modi-
fications for Bird 2.
values for Subject 2 in Condition b. The im-
plication from a maximization point of view, as to minimize deviation from matching (mea-
then, is that relative time on the right should sured either way). In fact, all three pigeons
fall between .125 and .25 in both conditions concluded Condition a in the left trough, but
(slightly altered for Subject 2 in Condition b). then shifted to .the right trough in Condition
Behavior between .75 and .875 would be least b, climbing across greater deviations from
in accord with a maximization analysis for all matching (see Figure 4) as they did so. [Notice
subjects. Yet, all subjects spent between .75 that Figure 6 (bottom) employs the same ordi-
and .875 of the time on the right (averaging nate units as Figure 4.] At least for Subjects 1
.78) in Condition b. and 3, we can reject minimization of deviation
Figure 6 (top) shows deviation from match- from matching as the dynamic process.
ing measured as the absolute difference in lo- The analysis of the schedules in terms of
cal reinforcement rates, as a function of rela- melioration is shown in Figure 7. This shows
tive time on the right. The function is the local rate of reinforcement on the right minus
same for Conditions a and b; dashed lines local rate of reinforcement on the left, or net
show adjusted values for Subject 2 in Condi- reinforcement rate on the right, as a function
tion b. Figure 6 (bottom) shows deviation from of relative time on the right. Within the range
matching (for Condition b) measured as the of 0 to .125 relative time on the right, the right
absolute difference between relative time on side pays off at 120 more reinforcers per hour
the right and relative programmed reinforce- than the left, and hence melioration predicts a
ments on the right; for Condition a the func- shift to the right (see Equation 3). The oppo-
tion is approximately the same. In both Con- site contingencies hold between .875 and 1,
ditions a and b, matching occurs between .125 and so by the same logic behavior should shift
and .25 and between .75 and .875. Hence, if a left. Between .125 and .25, and .75 and .875,
pigeon found itself in either of these troughs it the two local rates are equal, implying no net
should remain there, given that it behaves so tendency to shift.
MELIORATION 147