4 JULY, 1950
the individual. To record the beginning ditioned reflexes. But in operant be-
and end of learning or a few discrete havior the relation to a stimulus is dif-
steps will not suffice, since a series of ferent. A measure of latency involves
cross-sections will not give complete other considerations, as inspection of
coverage of a continuous process. The any case will show. Most operant re-
dimensions of the change must spring sponses may be emitted in the absence
from the behavior itself; they must not of what is regarded as a relevant stimu-
be imposed by an external judgment of lus. In such a case the response is
success or failure or an external criterion likely to appear before the stimulus is
of completeness. But when we review presented. It is no solution to escape
the literature with these requirements this embarrassment by locking a lever
in mind, we find little justification for so that an organism cannot press it
the theoretical process in which we take until the stimulus is presented, since we
so much comfort. can scarcely be content with temporal
The energy level or work-output of relations that have been forced into
behavior, for example, does not change compliance with our expectations. Run-
in appropriate ways. In the sort of be- way latencies are subject to this objec-
havior adapted to the Pavlovian experi- tion. In a typical experiment the door
ment (respondent behavior) there may of a starting box is opened and the time
be a progressive increase in the magni- that elapses before a rat leaves the box
tude of response during learning. But is measured. Opening the door is not
we do not shout our responses louder only a stimulus, it is a change in the
and louder as we learn verbal material, situation that makes the response pos-
nor does a rat press a lever harder and sible for the first time. The time meas-
harder as conditioning proceeds. In ured is by no means as simple as a la-
operant behavior the energy or magni- tency and requires another formulation.
tude of response changes significantly A great deal depends upon what the
only when some arbitrary value is dif- rat is doing at the moment the stimu-
ferentially reinforced—when such a lus is presented. Some experimenters
change is what is learned. wait until the rat is facing the door,
The emergence of a right response in but to do so is to tamper with the meas-
competition with wrong responses is an- urement being taken. If, on the other
other datum frequently used in the hand, the door is opened without refer-
study of learning. The maze and the ence to what the rat is doing, the first
discrimination box yield results which major effect is the conditioning of fa-
may be reduced to these terms. But a vorable waiting behavior. The rat even-
behavior-ratio of right vs. wrong can- tually stays near and facing the door.
not yield a continuously changing meas- The resulting shorter starting-time is
ure in a single experiment on a single not due to a reduction in the latency of
organism. The point at which one re- a response, but to the conditioning of
sponse takes precedence over another favorable preliminary behavior.
cannot give us the whole history of the Latencies in a single organism do not
change in either response. Averaging follow a simple learning process. Rele-
curves for groups of trials or organisms vant data on this point were obtained as
will not solve this problem. part of an extensive study of reaction
Increasing attention has recently been time. A pigeon, enclosed in a box, is
given to latency, the relevance of which, conditioned to peck at a recessed disc
like that of energy level, is suggested by in one wall. Food is presented as rein-
the properties of conditioned and uncon- forcement by exposing a hopper through
ARE THEORIES OF LEARNING NECESSARY? 197
a hole below the disc. If responses are Some responses occur sooner and others
reinforced only after a stimulus has are delayed, but the commonest value
been presented, responses at other times remains unchanged (bottom curve in
disappear. Very short reaction times Fig. 1). The longer latencies are easily
are obtained by differentially reinforc- explained by inspection. Emotional be-
ing responses which occur very soon havior, of which examples will be men-
after the stimulus (4). But responses tioned later, is likely to be in progress
also come to be made very quickly with- when the ready-signal is presented. It
out differential reinforcement. Inspec- is often not discontinued before the
tion shows that this is due to the de- "go" signal is presented, and the result
velopment of effective waiting. The is a long starting-time. Cases also be-
bird comes to stand before the disc with gin to appear in which the bird simply
its head in good striking position. Un- does not respond at all during a speci-
der optimal conditions, without differ- fied time. If we average a large num-
ential reinforcement, the mean time be- ber of readings, either from one bird or
tween stimulus and response will be of many, we may create what looks like a
the order of 1/3 sec. This is not a true progressive lengthening of latency. But
reflex latency, since the stimulus is dis- the data for an individual organism do
criminative rather than eliciting, but not show a continuous process.
it is a fair example of the latency used Another datum to be examined is the
in the study of learning. The point is rate at which a response is emitted.
that this measure does not vary con- Fortunately the story here is different.
tinuously or in an orderly fashion. By We study this rate by designing a situa-
giving the bird more food, for example, tion in which a response may be freely
we induce a condition in which it does repeated, choosing a response (for ex-
not always respond. But the responses ample, touching or pressing a small lever
that occur show approximately the same or key) that may be easily observed
temporal relation to the stimulus (Fig. and counted. The responses may be re-
1, middle curve). In extinction, of spe- corded on a polygraph, but a more con-
cial interest here, there is a scattering venient form is a cumulative curve from
of latencies because lack of reinforce- which rate of responding is immediately
ment generates an emotional condition, read as slope. The rate at which a re-
sponse is emitted in such a situation
comes close to our preconception of
the learning process. As the organism
learns, the rate rises. As it unlearns
(for example, in extinction) the rate
falls. Various sorts of discriminative
stimuli may be brought into control
of the response with corresponding
modifications of the rate. Motivational
changes alter the rate in a sensitive
way. So do those events which we
speak of as generating emotion. The
range through which the rate varies
significantly may be as great as of the
order of 1000:1. Changes in rate are
satisfactorily smooth in the individual
case, so that it is not necessary to aver-
[FIG. 1. Three panels: standard hunger (all responses reinforced); very low hunger (all responses reinforced); standard hunger (extinction). Horizontal axis: response time in tenths of a second.]
198 B. F. SKINNER
age cases. A given value is often quite future response, but our data are in the
stable: in the pigeon a rate of four or form of frequencies of responses that
five thousand responses per hour may have already occurred. These responses
be maintained without interruption for were presumably similar to each other
as long as fifteen hours. and to the response to be predicted.
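The reading of rate from a cumulative curve can be sketched in a few lines of code. This is only an illustration under stated assumptions, not a description of the original recording apparatus: the function names and the sample data are hypothetical, and times are kept as whole tenths of a second so that the arithmetic stays exact. The curve's height at any moment is the number of responses emitted so far, and the rate over an interval is the slope between its two ends.

```python
from bisect import bisect_right

def cumulative_count(response_times, t):
    """Height of the cumulative curve at time t: responses emitted up to t.

    `response_times` must be sorted in ascending order.
    """
    return bisect_right(response_times, t)

def rate(response_times, t_start, t_end):
    """Mean slope of the cumulative curve between t_start and t_end,
    in responses per unit of time."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    emitted = (cumulative_count(response_times, t_end)
               - cumulative_count(response_times, t_start))
    return emitted / (t_end - t_start)

# Hypothetical record: one response every 8 tenths of a second for one hour.
TENTHS_PER_HOUR = 36000
times = [8 * k for k in range(1, 4501)]

responses_per_hour = rate(times, 0, TENTHS_PER_HOUR) * TENTHS_PER_HOUR
print(responses_per_hour)
```

With these invented data the slope corresponds to a rate within the range of four or five thousand responses per hour mentioned in the text; a steeper or flatter stretch of the curve would give a proportionally higher or lower value.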
Rate of responding appears to be the But this raises the troublesome problem
only datum that varies significantly and of response-instance vs. response-class.
in the expected direction under condi- Precisely what responses are we to take
tions which are relevant to the "learn- into account in predicting a future in-
ing process." We may, therefore, be stance? Certainly not the responses
tempted to accept it as our long-sought- made by a population of different or-
for measure of strength of bond, excita- ganisms, for such a statistical datum
tory potential, etc. Once in possession raises more problems than it solves. To
of an effective datum, however, we may consider the frequency of repeated re-
feel little need for any theoretical con- sponses in an individual demands some-
struct of this sort. Progress in a scien- thing like the experimental situation
tific field usually waits upon the dis- just described.
covery of a satisfactory dependent vari- This solution of the problem of a ba-
able. Until such a variable has been sic datum is based upon the view that
discovered, we resort to theory. The operant behavior is essentially an emis-
entities which have figured so promi- sive phenomenon. Latency and magni-
nently in learning theory have served tude of response fail as measures be-
mainly as substitutes for a directly ob- cause they do not take this into ac-
servable and productive datum. They count. They are concepts appropriate
have little reason to survive when such to the field of the reflex, where the all
a datum has been found. but invariable control exercised by the
It is no accident that rate of respond- eliciting stimulus makes the notion of
ing is successful as a datum, because it probability of response trivial. Con-
is particularly appropriate to the funda- sider, for example, the case of latency.
mental task of a science of behavior. Because of our acquaintance with sim-
If we are to predict behavior (and pos- ple reflexes we infer that a response that
sibly to control it), we must deal with is more likely to be emitted will be
probability of response. The business emitted more quickly. But is this true?
of a science of behavior is to evaluate What can the word "quickly" mean?
this probability and explore the condi- Probability of response, as well as pre-
tions that determine it. Strength of diction of response, is concerned with
bond, expectancy, excitatory potential, the moment of emission. This is a point
and so on, carry the notion of prob- in time, but it does not have the tem-
ability in an easily imagined form, but poral dimension of a latency. The exe-
the additional properties suggested by cution may take time after the response
these terms have hindered the search for has been initiated, but the moment of
suitable measures. Rate of responding occurrence has no duration.3 In recog-
is not a "measure" of probability but it 3
It cannot, in fact, be shortened or length-
is the only appropriate datum in a ened. Where a latency appears to be forced
formulation in these terms. toward a minimal value by differential re-
As other scientific disciplines can at- inforcement, another interpretation is called
for. Although we may differentially reinforce
test, probabilities are not easy to han- more energetic behavior or the faster execu-
dle. We wish to make statements about tion of behavior after it begins, it is meaning-
the likelihood of occurrence of a single less to speak of differentially reinforcing re-
nizing the emissive character of operant not follow that learning is not taking
behavior and the central position of place in such situations. The notion
probability of response as a datum, la- of probability is usually extrapolated to
tency is seen to be irrelevant to our cases in which a frequency analysis can-
present task. not be carried out. In the field of be-
Various objections have been made to havior we arrange a situation in which
the use of rate of responding as a basic frequencies are available as data, but
datum. For example, such a program we use the notion of probability in
may seem to bar us from dealing with analyzing and formulating instances or
many events which are unique occur- even types of behavior which are not
rences in the life of the individual. A susceptible to this analysis.
man does not decide upon a career, get Another common objection is that a
married, make a million dollars, or get rate of response is just a set of latencies
killed in an accident often enough to and hence not a new datum at all. This
make a rate of response meaningful. is easily shown to be wrong. When we
But these activities are not responses. measure the time elapsing between two
They are not simple unitary events lend- responses, we are in no doubt as to what
ing themselves to prediction as such. If the organism was doing when we started
we are to predict marriage, success, acci- our clock. We know that it was just
dents, and so on, in anything more than executing a response. This is a natural
statistical terms, we must deal with the zero—quite unlike the arbitrary point
smaller units of behavior which lead to from which latencies are measured. The
and compose these unitary episodes. If free repetition of a response yields a
the units appear in repeatable form, the rhythmic or periodic datum very differ-
present analysis may be applied. In ent from latency. Many periodic physi-
the field of learning a similar objection cal processes suggest parallels.
takes the form of asking how the pres- We do not choose rate of responding
ent analysis may be extended to experi- as a basic datum merely from an analy-
mental situations in which it is impos- sis of the fundamental task of a science
sible to observe frequencies. It does of behavior. The ultimate appeal is to
its success in an experimental science.
sponses with short or long latencies. What The material which follows is offered as
we actually reinforce differentially are (a)
favorable waiting behavior and (b) more vig- a sample of what can be done. It is
orous responses. When we ask a subject to not intended as a complete demonstra-
respond "as soon as possible" in the human tion, but it should confirm the fact
reaction-time experiment, we essentially ask that when we are in possession of a
him (a) to carry out as much of the response
as possible without actually reaching the cri- datum which varies in a significant fash-
terion of emission, (b) to do as little else as ion, we are less likely to resort to theo-
possible, and (c) to respond energetically after retical entities carrying the notion of
the stimulus has been given. This may yield probability of response.
a minimal measurable time between stimulus
and response, but this time is not necessarily
a basic datum nor have our instructions al- Why Learning Occurs
tered it as such. A parallel interpretation of
the differential reinforcement of long "laten- We may define learning as a change
cies" is required. This is easily established by in probability of response but we must
inspection. In the experiments with pigeons also specify the conditions under which
previously cited, preliminary behavior is con- it comes about. To do this we must
ditioned that postpones the response to the
key until the proper time. Behavior that survey some of the independent vari-
"marks time" is usually conspicuous. ables of which probability of response is
a function. Here we meet another kind built up which suppresses the behavior.
of learning theory. This "experimental inhibition" or "re-
An effective class-room demonstration action inhibition" must be assigned to a
of the Law of Effect may be arranged different dimensional system, since noth-
in the following way. A pigeon, re- ing at the level of behavior corresponds
duced to 80 per cent of its ad lib weight, to opposed processes of excitation and
is habituated to a small, semi-circular inhibition. Rate of responding is simply
amphitheatre and is fed there for sev- increased by one operation and de-
eral days from a food hopper, which the creased by another. Certain effects
experimenter presents by closing a hand commonly interpreted as showing re-
switch. The demonstration consists of lease from a suppressing force may be
establishing a selected response by suit- interpreted in other ways. Disinhibi-
able reinforcement with food. For ex- tion, for example, is not necessarily the
ample, by sighting across the amphi- uncovering of suppressed strength; it
theatre at a scale on the opposite wall, may be a sign of supplementary strength
it is possible to present the hopper from an extraneous variable. The proc-
whenever the top of the pigeon's head ess of spontaneous recovery, often cited
rises above a given mark. Higher and to support the notion of suppression,
higher marks are chosen until, within a has an alternative explanation, to be
few minutes, the pigeon is walking about noted in a moment.
the cage with its head held as high as Let us evaluate the question of why
possible. In another demonstration the learning takes place by turning again to
bird is conditioned to strike a marble some data. Since conditioning is usu-
placed on the floor of the amphitheatre. ally too rapid to be easily followed, the
This may be done in a few minutes by process of extinction will provide us
reinforcing successive steps. Food is with a more useful case. A number of
presented first when the bird is merely different types of curves have been con-
moving near the marble, later when it sistently obtained from rats and pigeons
looks down in the direction of the using various schedules of prior rein-
marble, later still when it moves its head forcement. By considering some of the
toward the marble, and finally when it relevant conditions we may see what
pecks it. Anyone who has seen such a room is left for theoretical processes.
demonstration knows that the Law of The mere passage of time between
Effect is no theory. It simply specifies conditioning and extinction is a vari-
a procedure for altering the probability able that has surprisingly little effect.
of a chosen response. The rat is too short-lived to make an
But when we try to say why rein- extended experiment feasible, but the
forcement has this effect, theories arise. pigeon, which may live ten or fifteen
Learning is said to take place because years, is an ideal subject. More than
the reinforcement is pleasant, satisfying, five years ago, twenty pigeons were con-
tension reducing, and so on. The con- ditioned to strike a large translucent
verse process of extinction is explained key upon which a complex visual pat-
with comparable theories. If the rate tern was projected. Reinforcement was
of responding is first raised to a high contingent upon the maintenance of a
point by reinforcement and reinforce- high and steady rate of responding and
ment then withheld, the response is ob- upon striking a particular feature of the
served to occur less and less frequently visual pattern. These birds were set
thereafter. One common theory ex- aside in order to study retention. They
plains this by asserting that a state is were transferred to the usual living
quarters, where they served as breeders. reported elsewhere (3). The response
Small groups were tested for extinction of pressing a lever was established in
at the end of six months, one year, two eight rats with a schedule of periodic
years, and four years. Before the test reinforcement. They were fed the main
each bird was transferred to a separate part of their ration on alternate days so
living cage. A controlled feeding sched- that the rates of responding on succes-
ule was used to reduce the weight to ap- sive days were alternately high and low.
proximately 80 per cent of the ad lib Two subgroups of four rats each were
weight. The bird was then fed in the matched on the basis of the rate main-
dimly lighted experimental apparatus in tained under periodic reinforcement un-
the absence of the key for several days, der these conditions. The response was
during which emotional responses to the then extinguished—in one group on al-
apparatus disappeared. On the day of ternate days when the hunger was high,
the test the bird was placed in the dark- in the other group on alternate days
ened box. The translucent key was when the hunger was low. (The same
present but not lighted. No responses amount of food was eaten on the non-
were made. When the pattern was experimental days as before.) The re-
projected upon the key, all four birds sult is shown in Fig. 3. The upper
responded quickly and extensively. Fig. graph gives the raw data. The levels of
2 shows the largest curve obtained. hunger are indicated by the points at P
This bird struck the key within two on the abscissa, the rates prevailing un-
seconds after presentation of a visual der periodic reinforcement. The subse-
pattern that it had not seen for four quent points show the decline in extinc-
years, and at the precise spot upon tion. If we multiply the lower curve
which differential reinforcement had through by a factor chosen to super-
previously been based. It continued to impose the points at P, the curves are
respond for the next hour, emitting reasonably closely superimposed, as
about 700 responses. This is of the or- shown in the lower graph. Several other
der of one-half to one-quarter of the experiments on both rats and pigeons
responses it would have emitted if ex- have confirmed this general principle.
tinction had not been delayed four If a given ratio of responding prevails
years, but otherwise, the curve is fairly under periodic reinforcement, the slopes
typical. of later extinction curves show the same
Level of motivation is another vari- ratio. Level of hunger determines the
able to be taken into account. An ex- slope of the extinction curve but not its
ample of the effect of hunger has been curvature.
[FIG. 2. Extinction 4 years after conditioning. Horizontal axis: minutes.]
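The superimposition described for Fig. 3 (multiplying the lower curve through by a factor chosen to match the points at P) can be illustrated with a short sketch. The numbers below are invented for the purpose and are not data from the experiment; the function names are likewise hypothetical. If hunger changes only the slope, one factor should bring every later point into agreement, which is what the final check measures.

```python
def superimpose(upper, lower):
    """Scale `lower` so that its first point (the rate at P) matches `upper`.

    Returns the factor used and the scaled curve.
    """
    if lower[0] == 0:
        raise ValueError("rate at P must be non-zero")
    factor = upper[0] / lower[0]
    return factor, [factor * y for y in lower]

def max_relative_gap(reference, other):
    """Largest relative difference between corresponding points."""
    return max(abs(x - y) / x for x, y in zip(reference, other))

# Hypothetical daily rates (responses per hour); the first entry is the rate at P.
high_hunger = [400.0, 200.0, 100.0, 50.0]
low_hunger = [100.0, 50.0, 25.0, 12.5]

factor, scaled = superimpose(high_hunger, low_hunger)
print(factor, scaled, max_relative_gap(high_hunger, scaled))
```

In this contrived case the gap is zero because the two series differ only by a constant ratio. Had level of hunger altered the curvature as well as the slope, no single factor would close the gap at all points.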
[FIG. 3. Curve label: High Degree. Horizontal axis: Daily Periods of One Hour Each.]
Intermediate distances produce inter-
mediate slopes. It should also be noted
that the change from one position to
another is felt immediately. If repeated
responding in a difficult position were
to build a considerable amount of re-
action inhibition, we should expect the
Another variable, difficulty of re- rate to be low for some little time after
sponse, is especially relevant because it returning to an easy response. Con-
has been used to test the theory of re- trariwise, if an easy response were to
action inhibition (1), on the assump- build little reaction inhibition, we should
tion that a response requiring consider- expect a fairly high rate of responding
able energy will build up more reaction for some time after a difficult position
inhibition than an easy response and is assumed. Nothing like this occurs.
lead, therefore, to faster extinction. The "more rapid extinction" of a diffi-
The theory requires that the curvature cult response is an ambiguous expres-
of the extinction curve be altered, not sion. The slope constant is affected and
merely its slope. Yet there is evidence with it the number of responses in ex-
that difficulty of response acts like level tinction to a criterion, but there may
of hunger simply to alter the slope. be no effect upon curvature.
Some data have been reported but not One way of considering the question
published (5). A pigeon is suspended of why extinction curves are curved is
in a jacket which confines its wings and to regard extinction as a process of ex-
haustion comparable to the loss of heat particularly useful concept, nor does the
from source to sink or the fall in the view that extinction is a process of ex-
level of a reservoir when an outlet is haustion add much to the observed fact
opened. Conditioning builds up a pre- that extinction curves are curved in a
disposition to respond—a "reserve"— certain way.
which extinction exhausts. This is per- There are, however, two variables
haps a defensible description at the level that affect the rate, both of which oper-
of behavior. The reserve is not neces- ate during extinction to alter the curva-
sarily a theory in the present sense, ture. One of these falls within the field
since it is not assigned to a different di- of emotion. When we fail to reinforce
mensional system. It could be opera- a response that has previously been re-
tionally defined as a predicted extinc- inforced, we not only initiate a process
tion curve, even though, linguistically, it of extinction, we set up an emotional
makes a statement about the momentary response—perhaps what is often meant
condition of a response. But it is not a by frustration. The pigeon coos in an
[FIG. 4. Horizontal axis: time in minutes.]
[FIG. 6. Horizontal axis: time.]
[FIG. 7. Horizontal axis: time in minutes.]
[FIG. 8. Extinction (after aperiodic reinforcement, arithmetic series, mean: one reinforcement per minute). One pigeon, one continuous period. Vertical axis ticks: 1000, 2000, 3000, 4000. Horizontal axis: time in hours.]
FIG. 9
of reinforcement repeated every hour. tervals averaging one minute, but the
In changing to this program from the curves are otherwise much alike. Fur-
arithmetic series, the rates first declined ther exposure to the geometric sched-
during the longer intervals, but the pi- ule builds up longer runs during which
geons were soon able to sustain a con- the rate does not change significantly.
stant rate of responding under it. Two Curve 2 followed Curve 1 after two and
records in the form in which they were one-half hours of further aperiodic re-
recorded are shown in Fig. 9. (The inforcement. On the day shown in
pen resets to zero after every thousand Curve 2 a few aperiodic reinforcements
responses. In order to obtain a single were first given, as marked at the be-
cumulative curve it would be necessary ginning of the curve. When reinforce-
to cut the record and to piece the sec- ment was discontinued, a fairly con-
tions together to yield a continuous line. stant rate of responding prevailed for
The raw form may be reproduced with several thousand responses. After an-
less reduction.) Each reinforcement is other experimental session of two and
represented by a horizontal dash. The one-half hours with the geometric se-
time covered is about 3 hours. Records ries, Curve 3 was recorded. This ses-
are shown for two pigeons that main- sion also began with a short series of
tained different overall rates under this aperiodic reinforcements, followed by a
program of reinforcement. sustained run of more than 6000 unrein-
Under such a schedule a constant rate forced responses with little change in
of responding is sustained for at least rate (A). There seems to be no reason
21 minutes without reinforcement, after why other series averaging perhaps more
which a reinforcement is received. Less than five minutes per interval and con-
novelty should therefore develop during taining much longer exceptional inter-
succeeding extinction. In Curve 1 of vals would not carry such a straight
Fig. 10 the pigeon had been exposed to line much further.
several sessions of several hours each In this attack upon the problem of ex-
with this geometric set of intervals. tinction we create a schedule of rein-
The number of responses emitted in ex- forcement which is so much like the
tinction is about twice that of the curve conditions that will prevail during ex-
in Fig. 8 after the arithmetic set of in- tinction that no decline in rate takes
place for a long time. In other words der regular reinforcement. We have
we generate extinction with no curva- not, of course, entirely eliminated this
ture. Eventually some kind of exhaus- correlation. Even though there is no
tion sets in, but it is not approached longer a differential reinforcement of
gradually. The last part of Curve 3 high against low rates, practically all
(unfortunately much reduced in the fig- reinforcements have occurred under a
ure) may possibly suggest exhaustion in constant rate of responding.
the slight overall curvature, but it is a Further study of reinforcing sched-
small part of the whole process. The ules may or may not answer the ques-
record is composed mainly of runs of a tion of whether the novelty appearing in
few hundred responses each, most of the extinction situation is entirely re-
them at approximately the same rate as sponsible for the curvature. It would
that maintained under periodic rein- appear to be necessary to make the
forcement. The pigeon stops abruptly; conditions prevailing during extinction
when it starts to respond again, it identical with the conditions prevailing
quickly reaches the rate of responding during conditioning. This may be im-
under which it was reinforced. This possible, but in that case the question is
recalls the spurious correlation between academic. The hypothesis, meanwhile,
rapid responding and reinforcement un- is not a theory in the present sense,
[FIG. 10. Extinction (after continued aperiodic reinforcement, geometric series). Horizontal axis: time.]
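The records of Fig. 9 were described as resetting to zero after every thousand responses, so that the sections must be cut and pieced together to give one continuous cumulative curve. That piecing-together can be sketched as follows. The function name and the sample pen heights are hypothetical, and the sketch assumes the record is sampled often enough that fewer than a thousand responses fall between successive readings.

```python
def stitch(pen_heights, reset_at=1000):
    """Rebuild one continuous cumulative curve from a record whose pen
    returns to zero each time it reaches `reset_at` responses."""
    stitched = []
    offset = 0      # responses accounted for by completed excursions of the pen
    previous = 0
    for height in pen_heights:
        if height < previous:   # the pen has dropped: one reset has occurred
            offset += reset_at
        stitched.append(offset + height)
        previous = height
    return stitched

# Hypothetical pen heights read at regular intervals.
raw = [0, 400, 800, 200, 600, 0, 400]
print(stitch(raw))  # [0, 400, 800, 1200, 1600, 2000, 2400]
```

Each drop in the pen adds one thousand responses to the running offset, so a constant rate that appears as a saw-tooth in the raw record becomes a single straight line.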
ing in position (right or left) or in some key. The need for considering the be-
property like color randomized with re- havior of changing over is clearly shown
spect to position. By occasionally rein- if we now reverse these conditions and
forcing a response on one key or the reinforce responses to the left key only.
other without favoring either key, we The ultimate result is a high rate of re-
obtain equal rates of responding on the sponding on the left key and a low rate
two keys. The behavior approaches a on the right. By reversing the condi-
simple alternation from one key to the tions again the high rate can be shifted
other. This follows the rule that tend- back to the right key. In Fig. 11 a
encies to respond eventually correspond group of eight curves have been aver-
to the probabilities of reinforcement. aged to follow this change during six
Given a system in which one key or the experimental periods of 45 minutes
other is occasionally connected with the each. Beginning on the second day in
magazine by an external clock, then if the graph responses to the right key
the right key has just been struck, the (RR) decline in extinction while re-
probability of reinforcement via the left sponses to the left key (RL) increase
key is higher than that via the right through periodic reinforcement. The
since a greater interval of time has mean rate shows no significant varia-
elapsed during which the clock may
have closed the circuit to the left key.
But the bird's behavior does not cor-
respond to this probability merely out
of respect for mathematics. The spe-
cific result of such a contingency of
reinforcement is that changing-to-the-
other-key-and-striking is more often re-
inforced than striking-the-same-key-a-
second-time. We are no longer dealing
with just two responses. In order to
analyze "choice" we must consider a
single final response, striking, without
respect to the position or color of the
key, and in addition the responses of
changing from one key or color to the
other.
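The contingency just described can be made concrete with a small numerical sketch. The text says only that an external clock occasionally connects one key or the other with the magazine; the exponential waiting-time model used here, the one-minute mean, and the elapsed times are all assumptions introduced for illustration. Under any such arrangement, the key that has gone longer without a response is the more likely to have a reinforcement waiting.

```python
import math

def p_available(elapsed, mean_interval):
    """Probability that the clock has set up a reinforcement on a key
    within `elapsed` seconds, assuming exponentially distributed set-up
    times with the given mean (an assumption made for this sketch)."""
    return 1.0 - math.exp(-elapsed / mean_interval)

mean_interval = 60.0   # hypothetical: one set-up per minute on average
since_right = 1.0      # the right key was struck one second ago
since_left = 5.0       # the left key was last struck five seconds ago

p_stay = p_available(since_right, mean_interval)     # strike the right key again
p_change = p_available(since_left, mean_interval)    # change over to the left key
print(p_stay < p_change)  # True
```

The comparison holds for any positive mean interval, since the probability rises with elapsed time. This is the sense in which changing-over-and-striking is more often reinforced than striking the same key a second time.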
Quantitative results are compatible
with this analysis. If we periodically
reinforce responses to the right key
only, the rate of responding on the
right will rise while that on the left will
fall. The response of changing-from-
right-to-left is never reinforced while
the response of changing-from-left-to-
right is occasionally so. When the bird
is striking on the right, there is no great
tendency to change keys; when it is
striking on the left, there is a strong
tendency to change. Many more re-
sponses come to be made to the right
[FIG. 11. Horizontal axis: days, 1 to 6.]
[FIG. 12. Horizontal axis: time in minutes.]
tion, since periodic reinforcement is con- also determines the scope of the re-
tinued on the same schedule. The mean sponse of changing-over, with an im-
rate shows the condition of strength of plied difference in sensory feed-back.
the response of striking a key regard- It also modifies the spread of reinforce-
less of position. The distribution of re- ment to responses supposedly not rein-
sponses between right and left depends forced, since if the keys are close to-
upon the relative strength of the re- gether, a response reinforced on one
sponses of changing over. If this were side may occur sooner after a preceding
simply a case of the extinction of one response on the other side. In Fig. 11
response and the concurrent recondi- the two keys were about one inch apart.
tioning of another, the mean curve They were therefore fairly similar with
would not remain approximately hori- respect to position in the experimental
zontal since reconditioning occurs much box. Changing from one to the other
more rapidly than extinction.6 involved a minimum of sensory feed-
The rate with which the bird changes back, and reinforcement of a response
from one key to the other depends upon to one key could follow very shortly
the distance between the keys. This dis- upon a response to the other. When
tance is a rough measure of the stimu- the keys are separated by as much as
lus-difference between the two keys. It four inches, the change in strength is
B
Two topographically independent responses, much more rapid. Fig. 12 shows two
capable of emission at the same time and hence curves recorded simultaneously from a
not requiring change-over, show separate proc- single pigeon during one experimental
esses of reconditioning and extinction, and the
combined rate of responding varies. period of about 40 minutes. A high rate
to the right key and a low rate to the left had previously been established. In the figure no responses to the right were reinforced, but those to the left were reinforced every minute as indicated by the vertical dashes above curve L. The slope of R declines in a fairly smooth fashion while that of L increases, also fairly smoothly, to a value comparable to the initial value of R. The bird has conformed to the changed contingency within a single experimental period. The mean rate of responding is shown by a dotted line, which again shows no significant curvature.

What is called "preference" enters into this formulation. At any stage of the process shown in Fig. 12 preference might be expressed in terms of the relative rates of responding to the two keys. This preference, however, is not in striking a key but in changing from one key to the other. The probability that the bird will strike a key regardless of its identifying properties behaves independently of the preferential response of changing from one key to the other. Several experiments have revealed an additional fact. A preference remains fixed if reinforcement is withheld. Fig. 13 is an example. It shows simultaneous extinction curves from two keys during seven daily experimental periods of one hour each. Prior to extinction the relative strength of the responses of changing-to-R and changing-to-L yielded a "preference" of about 3 to 1 for R. The constancy of the rate throughout the process of extinction has been shown in the figure by multiplying L through by a suitable constant and entering the points as small circles on R. If extinction altered the preference, the two curves could not be superimposed in this way.

[FIG. 13. Horizontal axis: DAYS.]

These formulations of discrimination and choosing enable us to deal with what is generally regarded as a much more complex process: matching to sample. Suppose we arrange three translucent keys, each of which may be illuminated with red or green light. The middle key functions as the sample and we color it either red or green in random order. We color the two side keys one red and one green, also in random order. The "problem" is to strike the side key which corresponds in color to the middle key. There are only four three-key patterns in such a case, and it is possible that a pigeon could learn to make an appropriate response to each pattern. This does not happen, at least within the temporal span of the experiments to date. If we simply present a series of settings of the three colors and reinforce successful responses, the pigeon will strike the side keys without respect to color or pattern and be reinforced 50 per cent of the time. This is, in effect, a schedule of "fixed ratio" reinforcement which is adequate to maintain a high rate of responding.

Nevertheless it is possible to get a pigeon to match to sample by reinforcing the discriminative responses of striking-red-after-being-stimulated-by-red and striking-green-after-being-stimulated-by-green while extinguishing the other two possibilities. The difficulty is in arranging the proper stimulation at the time of the response. The sample might be made conspicuous, for example, by having the sample color in the general illumination of the experimental box. In such a case the pigeon would learn to strike red keys in a red light and green keys in a green light (assuming a neutral illumination of the background of the keys). But a procedure which holds more closely to the notion of matching is to induce the pigeon to "look at the sample" by means of a separate reinforcement. We may do this by presenting the color on the middle key first, leaving the side keys uncolored. A response to the middle key is then reinforced (secondarily) by illuminating the side keys. The pigeon learns to make two responses in quick succession, to the middle key and then to one side key. The response to the side key follows quickly upon the visual stimulation from the middle key, which is the requisite condition for a discrimination. Successful matching was readily established in all ten pigeons tested with this technique. Choosing the opposite is also easily set up. The discriminative response of striking-red-after-being-stimulated-by-red is apparently no easier to establish than striking-red-after-being-stimulated-by-green. When the response is to a key of the same color, however, generalization may make it possible for the bird to match a new color. This is an extension of the notion of matching that has not yet been studied with this method.

Even when matching behavior has been well established, the bird will not respond correctly if all three keys are now presented at the same time. The bird does not possess strong behavior of looking at the sample. The experimenter must maintain a separate reinforcement to keep this behavior in strength. In monkeys, apes, and human subjects the ultimate success in choosing is apparently sufficient to reinforce and maintain the behavior of looking at the sample. It is possible that this species difference is simply a difference in the temporal relations required for reinforcement.

The behavior of matching survives unchanged when all reinforcement is withheld. An intermediate case has been established in which the correct matching response is only periodically reinforced. In one experiment one color appeared on the middle key for one minute; it was then changed or not changed, at random, to the other color. A response to this key illuminated the side keys, one red and one green, in random order. A response to a side key cut off the illumination to both side keys, until the middle key had again been struck. The apparatus recorded all matching responses on one graph and all non-matching on another. Pigeons which have acquired matching behavior under continuous reinforcement have maintained this behavior when reinforced no oftener than once per minute on the average. They may make thousands of matching responses per hour while being reinforced for no more than sixty of them. This schedule will not necessarily develop matching behavior in a naive bird, for the problem can be solved in three ways. The bird will receive practically as many reinforcements if it responds to (1) only one key or (2) only one color, since the programming of the experiment makes any persistent response eventually the correct one.

A sample of the data obtained in a complex experiment of this sort is given in Fig. 14. Although this pigeon had learned to match color under continuous reinforcement, it changed to the spurious solution of a color preference under periodic reinforcement. Whenever the sample was red, it struck both the sample and the red side key and received all reinforcements. When the sample was green, it did not respond and the side keys were not illuminated. The result shown at the beginning of the graph in Fig. 14 is a high rate of responding on the upper graph, which records matching responses. (The record is actually step-wise, following the presence or absence of the red sample, but this is lost in the reduction in the figure.) A color preference, however, is not a solution to the problem of opposites. By changing to this problem, it was possible to change the bird's behavior as shown between the two vertical lines in the figure. The upper curve between these lines shows the decline in matching responses which had resulted from the color preference. The lower curve between the same lines shows the development of responding to and matching the opposite color. At the second vertical line the reinforcement was again made contingent upon matching. The upper curve shows the reestablishment of matching behavior while the lower curve shows a decline in striking the opposite color. The result was a true solution: the pigeon struck the sample, no matter what its color, and then the corresponding side key. The lighter line connects the means of a series of points on the two curves. It seems to follow the same rule as in the case of choosing: changes in the distribution of responses between two keys do not involve the over-all rate of responding to a key. This mean rate will not remain constant under the spurious solution achieved with a color preference, as at the beginning of this figure.

These experiments on a few higher processes have necessarily been very briefly described. They are not offered as proving that theories of learning are not necessary, but they may suggest an alternative program in this difficult area. The data in the field of the higher mental processes transcend single responses or single stimulus-response relationships. But they appear to be susceptible to formulation in terms of the differentiation of concurrent responses, the discrimination of stimuli, the establishment of various sequences of responses, and so on. There seems to be no a priori reason why a complete account is not possible without appeal
to theoretical processes in other dimen-
sional systems.
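
As a rough illustration of the matching-to-sample arrangement described above, the following Python sketch enumerates the four three-key patterns and computes how often a response would be reinforced when only the matching side key pays off. The three strategies (`always_left`, `always_red`, `match_sample`) and all function names are assumptions introduced here for illustration; they are not taken from the experiments, and each pattern is simply weighted equally.

```python
from fractions import Fraction
from itertools import product

COLORS = ("red", "green")


def patterns():
    """Return the four (left, sample, right) patterns; side keys always differ."""
    result = []
    for sample, left in product(COLORS, COLORS):
        right = "green" if left == "red" else "red"
        result.append((left, sample, right))
    return result


# Hypothetical response strategies: each returns which side key is struck.
def always_left(left, sample, right):
    return "left"  # position preference, ignores color


def always_red(left, sample, right):
    return "left" if left == "red" else "right"  # color preference


def match_sample(left, sample, right):
    return "left" if left == sample else "right"  # true matching


def reinforced_fraction(strategy):
    """Fraction of equally weighted patterns on which the struck key matches."""
    pats = patterns()
    hits = 0
    for left, sample, right in pats:
        side = strategy(left, sample, right)
        struck_color = left if side == "left" else right
        if struck_color == sample:
            hits += 1
    return Fraction(hits, len(pats))


if __name__ == "__main__":
    for strategy in (always_left, always_red, match_sample):
        print(strategy.__name__, reinforced_fraction(strategy))
```

Under these assumptions a fixed position or color preference is reinforced on half of the patterns, which agrees with the 50 per cent figure given in the text, while true matching is reinforced on every pattern.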
Conclusion
Perhaps to do without theories alto-
gether is a tour de force that is too
much to expect as a general practice.
Theories are fun. But it is possible
that the most rapid progress toward an
understanding of learning may be made
by research that is not designed to test
theories. An adequate impetus is sup-
plied by the inclination to obtain data
showing orderly changes characteristic
of the learning process. An acceptable
scientific program is to collect data of
this sort and to relate them to ma-
nipulable variables, selected for study
through a common sense exploration of
the field.
[FIG. 14. Horizontal axis: TIME IN MINUTES, 30 to 120. (Twelve experimental periods joined.)]
This does not exclude the possibility
of theory in another sense. Beyond the
collection of uniform relationships lies
the need for a formal representation of the data reduced to a minimal number of terms. A theoretical construction may yield greater generality than any assemblage of facts. But such a construction will not refer to another dimensional system and will not, therefore, fall within our present definition. It will not stand in the way of our search for functional relations because it will arise only after relevant variables have been found and studied. Though it may be difficult to understand, it will not be easily misunderstood, and it will have none of the objectionable effects of the theories here considered.

We do not seem to be ready for theory in this sense. At the moment we make little effective use of empirical, let alone rational, equations. A few of the present curves could have been fairly closely fitted. But the most elementary preliminary research shows that there are many relevant variables, and until their importance has been experimentally determined, an equation that allows for them will have so many arbitrary constants that a good fit will be a matter of course and a cause for very little satisfaction.

REFERENCES

1. MOWRER, O. H., & JONES, H. M. Extinction and behavior variability as functions of effortfulness of task. J. exp. Psychol., 1943, 33, 369-386.
2. SKINNER, B. F. The behavior of organisms. New York: D. Appleton-Century Co., 1938.
3. SKINNER, B. F. The nature of the operant reserve. Psychol. Bull., 1940, 37, 423 (abstract).
4. SKINNER, B. F. Differential reinforcement with respect to time. Amer. Psychol., 1946, 1, 274-275 (abstract).
5. SKINNER, B. F. The effect of the difficulty of a response upon its rate of emission. Amer. Psychol., 1946, 1, 462 (abstract).

[MS. received December 5, 1949]