Learning is a process that results in relatively consistent change in behavior or behavior potential
and is based on experience. Following are the critical parts of this definition:
1. Learning connotes change: learning is a change in behavior, for better or worse. The
changes produced by learning are not always positive in nature.
2. It is a change that takes place through practice or experience. Changes that come about
through growth, drugs, illness, or fatigue are not called learning.
3. The direction of learning can be vertical and/or horizontal. "Vertical learning" refers to
adding knowledge to what is already possessed in a particular area, improving a skill in
which some dexterity has already been achieved, or strengthening developing attitudes
and thinking. "Horizontal learning" means that the learner is widening his or her learning
horizons: gaining competence in new forms of skills, gaining new interests, discovering
new approaches to problem solving, and developing different attitudes towards newly
experienced situations and conditions.
4. Learning is an active process.
5. Learning is goal directed.
Simultaneous conditioning
In this conditioning, the neutral stimulus and UCS begin and end at the same
time.
Backward conditioning
In this conditioning, the UCS is presented before the neutral stimulus is presented.
Forward conditioning
In this conditioning, the neutral stimulus is presented before the UCS.
Fig: Four variations of the neutral-UCS temporal (time) arrangements in classical conditioning
Apart from these factors, conditioning is faster when the intensity of either
the neutral or unconditioned stimulus increases during the learning trials.
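The effect of stimulus intensity on the speed of conditioning can be illustrated with a simple associative-strength model, a minimal sketch in the spirit of the Rescorla-Wagner rule. The learning-rate values below are illustrative assumptions, not experimental data; a more intense (salient) stimulus is modeled simply as a larger learning rate.

```python
# Minimal sketch of associative learning across conditioning trials.
# CR strength grows toward an asymptote; a more intense (more salient)
# stimulus is modeled as a larger learning rate, so conditioning is faster.

def run_trials(learning_rate, n_trials=10, asymptote=1.0):
    """Return CR strength after each CS-UCS pairing trial."""
    strength = 0.0
    history = []
    for _ in range(n_trials):
        strength += learning_rate * (asymptote - strength)
        history.append(round(strength, 3))
    return history

weak_cs = run_trials(learning_rate=0.2)    # low-intensity stimulus (assumed rate)
strong_cs = run_trials(learning_rate=0.5)  # high-intensity stimulus (assumed rate)

# The stronger stimulus reaches any given CR strength in fewer trials.
print(weak_cs)
print(strong_cs)
```

On every trial the high-intensity curve sits above the low-intensity one, which is the point the text makes: conditioning is faster when stimulus intensity increases.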
Fig: Extinction: the strength of the CR weakens across trials when the CS is presented alone (without the UCS)
Fig: Higher-order conditioning: a tone (CS1) elicits salivation (CR); during conditioning, a new neutral stimulus (CS2) paired with CS1 comes to elicit the CR
b. Operant conditioning
At about the same time that Pavlov was using classical conditioning to induce Russian
dogs to salivate to the sound of a bell, Edward L. Thorndike (1898) was watching
American cats trying to escape from puzzle boxes. Thorndike built a special
cage called a puzzle box that could be opened from the inside by pulling a string or
stepping on a lever. He placed a hungry cat inside the box. Food was placed outside so
that the animal had to learn how to open the box to get the food. The cat at first
scratched and pushed the bars, tried to dig through the floor, and by chance it
eventually stepped on the lever and opened the door. The same cat was placed in the
box repeatedly for several trials. Over time, the cat learned to press the lever soon
after the door was shut. From this, he concluded that with trial and error, the cat
gradually eliminated the responses that failed to open the door and became more
likely to perform the actions that worked. He called this process Instrumental Learning
because an organism’s behavior is instrumental in bringing about certain outcomes.
He also proposed the Law of Effect: in a given situation, a response followed by a
satisfying consequence will become more likely to occur, and a response followed by
an annoying consequence will become less likely to occur.
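Thorndike's trial-and-error process can be sketched as a toy simulation. This is an illustrative model, not his actual procedure: the cat picks responses at random, the response that opens the door is strengthened after each success, and unsuccessful responses are weakened, per the Law of Effect. The action names and weight-update factors are assumptions.

```python
import random

# Toy model of Thorndike's puzzle box: the weight (probability) of the
# response that produces a satisfying consequence rises across trials,
# while unsuccessful responses are gradually weakened.

actions = ["scratch_bars", "dig_floor", "press_lever"]
weights = {a: 1.0 for a in actions}  # all responses equally likely at first

def run_trial(rng):
    """One escape attempt: sample responses until the lever is pressed."""
    attempts = 0
    while True:
        action = rng.choices(actions, weights=[weights[a] for a in actions])[0]
        attempts += 1
        if action == "press_lever":      # satisfying consequence: escape and food
            weights[action] *= 1.5       # response strengthened (Law of Effect)
            return attempts
        weights[action] *= 0.9           # annoying consequence: response weakened

rng = random.Random(0)
escape_times = [run_trial(rng) for _ in range(30)]
# Early trials take many random attempts; later trials tend to escape quickly,
# because lever pressing has come to dominate the response weights.
print(escape_times[:5], escape_times[-5:])
```

After thirty trials the lever-pressing response far outweighs the others, mirroring Thorndike's observation that the effective response is gradually stamped in while the failed responses are stamped out.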
B. F. Skinner embraced Thorndike's view that environmental consequences exert a
powerful effect on behavior. He coined the term operant conditioning. Skinner
(1938, 1953) defined operant conditioning as a type of learning in which behavior is
influenced by the consequences that follow it. Literally, operant means affecting the
environment, operating on it. What Skinner implied was that organisms show different
responses, and these responses affect the environment. The kind of effect or
consequence that is produced determines whether the organism will show that
behavior again in the future or not.
To analyze behavior experimentally, Skinner designed the Skinner box, a special
chamber used to study operant conditioning. On one wall, there is a lever positioned
above a small cup. When the lever is depressed, a food pellet automatically drops into
the cup. A hungry rat is put into the chamber, and as it moves about, it accidentally
presses the lever. A food pellet falls into the cup and the rat eats it. It was found that
the rat pressed the lever more frequently over time.
There are mainly two types of consequences:
a. Reinforcement
With reinforcement, a response is strengthened by an outcome that follows it.
Typically, the term strengthened is operationally defined as an increase in
frequency of a response. The outcome that increases the frequency of a response
is called a reinforcer.
Types of reinforcement
a. Positive reinforcement
Positive reinforcement occurs when a response is strengthened by the
subsequent presentation of a stimulus. The stimulus that follows and
strengthens the response is called a positive reinforcer, like food, attention,
praise, money, etc. There are two broad types of positive reinforcers:
i. Primary reinforcer
Primary reinforcers are stimuli such as food, water, light,
comfortable air temperature, that an organism naturally finds
reinforcing because they satisfy biological needs.
ii. Secondary reinforcer
Secondary reinforcers are stimuli that acquire reinforcing
properties through their association with primary reinforcers.
Some examples of secondary reinforcers are money, praise, good
grades, awards, gold stars, etc.
b. Negative reinforcement
Negative reinforcement occurs when a response is strengthened by the
subsequent removal or avoidance of an aversive stimulus. The aversive
stimulus that is removed is called a negative reinforcer. For example, we
have a severe headache and we take aspirin to lessen it. In the future,
whenever we have a headache, we take aspirin. In this example, our act
of taking aspirin is strengthened because it removes the aversive stimulus,
i.e. the headache.
Other examples: we put on sunglasses when the sunlight is very bright, and
we turn off the alarm when we hear it sound in the morning.
b. Punishment
Punishment occurs when a response is weakened by outcomes that follow it. The
outcome that decreases the frequency of a response is called a punisher.
There are two types of punishments:
i. Positive punishment
A response is weakened by the subsequent presentation of a stimulus, like
scolding a child for misbehaving. The stimulus that follows and weakens
the response is called a positive punisher. There are two types of positive
punishers:
a. Primary punisher
Pain, extreme heat or cold are inherently punishing and are therefore
known as primary punishers.
b. Secondary punisher
Punishers that are manmade and created according to the situational
demand are called secondary punishers like criticism, demerits, bad
grades, fines, etc.
ii. Negative punishment
A response is weakened by the subsequent removal of a stimulus. For
example, a boy does not do his homework, so his parents cut his allowance,
and then the boy starts to do his homework. Similarly, a girl hits her
younger sister, so her mother stops giving her attention, which leads her to
stop hitting her sister.
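The four consequence types above differ along two dimensions: whether a stimulus is presented or removed, and whether the response is strengthened or weakened. A small helper function (a hypothetical name, for illustration only) makes this 2x2 explicit:

```python
# Classify an operant consequence along the two dimensions described above:
# stimulus presented vs. removed, and response strengthened vs. weakened.

def classify_consequence(stimulus_presented: bool, response_strengthened: bool) -> str:
    if response_strengthened:
        return "positive reinforcement" if stimulus_presented else "negative reinforcement"
    return "positive punishment" if stimulus_presented else "negative punishment"

# Examples from the text:
print(classify_consequence(True, True))    # praise given -> positive reinforcement
print(classify_consequence(False, True))   # headache removed by aspirin -> negative reinforcement
print(classify_consequence(True, False))   # scolding a child -> positive punishment
print(classify_consequence(False, False))  # allowance cut -> negative punishment
```

Note that "positive" and "negative" here refer only to whether a stimulus is added or taken away, not to whether the consequence is pleasant.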
Principles of operant conditioning
a. Extinction
In operant conditioning, extinction takes place when the reinforcer that maintained
the response is removed or is no longer available.
b. Stimulus generalization and stimulus discrimination
In operant conditioning, generalization means that an animal or person emits the
same response to similar stimuli, while discrimination means that a response is
emitted in the presence of a stimulus that is reinforced and not in the presence of
an unreinforced stimulus.
c. Shaping
The organism undergoing shaping receives a reward for each small step towards a
final goal, i.e. the target response, rather than only for the final response. At first,
actions even remotely resembling the target behavior, termed successive
approximations, are followed by reward. Gradually, closer and closer
approximations of the final target behavior are required before the reward is
given.
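Shaping by successive approximations can be sketched as a loop in which the criterion for reward moves step by step toward the target behavior. This is an illustrative toy model; the numeric scale and step size are assumptions.

```python
# Toy model of shaping: the criterion for reward advances in successive
# approximations toward the target behavior, and the learner's behavior
# settles at whatever level was last rewarded.

def shape(target=1.0, start=0.0, step=0.25):
    behavior = start
    criterion = start + step          # first (remote) approximation to reward
    rewarded = []
    while criterion <= target:
        behavior = criterion          # reward given once behavior reaches the criterion
        rewarded.append(criterion)
        criterion += step             # then a closer approximation is required
    return behavior, rewarded

final, steps = shape()
print(steps)   # the successive approximations rewarded on the way to the target
print(final)   # behavior ends at the target response
```

Each entry in `steps` corresponds to one rewarded approximation; only at the end is the full target behavior required.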
d. Learning on schedule- Schedules of reinforcement
It refers to a rule stating which behavior will be reinforced. There are two basic
types of reinforcement schedules:
i. Continuous reinforcement schedule
In this schedule, we reinforce every desired response/behavior that an
organism shows or makes.
ii. Intermittent/partial reinforcement schedule
In this schedule, we do not reinforce every time the desired response is
made. Partial reinforcement leads to learning that exhibits greater
resistance to extinction than learning resulting from continuous
reinforcement. There are two types of intermittent/partial reinforcement
schedules:
Interval schedule
Here, reinforcement is based on passage of time. There are two types
of interval schedule:
Fixed interval schedule (FI)
This type of schedule requires the passage of a specific amount of
time before reinforcement will be delivered to a response. No
response during the interval is reinforced. For example, getting a
pay cheque at the end of the week or after a month. This schedule
produces a scallop, i.e. a gradual increase in the rate of responding,
with responding occurring at a high rate just before reinforcement
is available.
Variable interval schedule (VI)
In this schedule, the interval after which we reinforce varies. For
example, when calling a friend and getting a busy tone on the
phone, we retry after a gap of variable length. There is a constant
response rate because the organism never knows when the
reinforcer is scheduled, and it must emit a response or it will lose
the opportunity to be reinforced.
Ratio schedule
Fixed ratio schedule (FR)
Reinforcement is delivered after a fixed number of responses are
given. After a response is reinforced, no responding occurs for a
period of time, termed the post-reinforcement pause (PRP), and
then responding occurs at a high and steady rate until the next
reinforcement is delivered. The length of such a pause is directly
proportional to the size of the ratio. For example, factory workers
being paid on a per-piece rate.
The organism under the control of FR eventually learns when it
will be reinforced. It works steadily until it is reinforced. It knows it
will have to emit a certain number of responses before the next
reinforcer is delivered, so it is as though it takes a little break after
each reinforcement before going back to work. The PRP appears
flat on the record because time passes with no responses occurring.
Variable ratio schedule (VR)
In this type of intermittent schedule, the organism is reinforced
after a variable number of responses. For example, a reward on a
slot machine, or at bingo. In this type of schedule, there is a steady
response rate.
            Ratio    Interval
Fixed       FR       FI
Variable    VR       VI
Table: 2x2 contingency table to illustrate different types of reinforcement schedules
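The four intermittent schedules above can be sketched as small decision rules, each answering whether a given response earns reinforcement. This is an illustrative sketch; the parameter values (ratio of 5, interval of 60 seconds) are assumptions.

```python
import random

# Sketch of the four intermittent reinforcement schedules: each function
# decides whether a response is reinforced.

def fixed_ratio(n_responses, ratio=5):
    """FR: reinforce every `ratio`-th response."""
    return n_responses % ratio == 0

def variable_ratio(rng, mean_ratio=5):
    """VR: each response pays off with probability 1/mean_ratio."""
    return rng.random() < 1 / mean_ratio

def fixed_interval(seconds_since_last, interval=60):
    """FI: the first response after `interval` seconds is reinforced."""
    return seconds_since_last >= interval

def variable_interval(rng, seconds_since_last, mean_interval=60):
    """VI: the required waiting time varies around the mean interval."""
    return seconds_since_last >= rng.uniform(0.5, 1.5) * mean_interval

rng = random.Random(42)
print(fixed_ratio(10))              # 10th response on FR-5 -> True
print(fixed_interval(75))           # response 75 s after last reward -> True
print(variable_ratio(rng))          # may or may not pay off
print(variable_interval(rng, 30))   # may or may not be reinforced yet
```

The variable schedules are the ones that sustain steady responding, because (as the text notes) the organism cannot predict which response or moment will be reinforced.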
Now here, the question arises as to how the solution could have come so suddenly. The
answer could be that the organism forms a mental representation of the problem,
mentally reorganizes and manipulates the components of the problem, and thereby
comes up with new relationships among the components that lead to the solution,
i.e. mental trial and error.
b. Observation learning
In this learning, we acquire new behavior by imitating behaviors we observe in others.
The person whose behavior is observed is called a Model. Observation learning is also
called social learning theory because we acquire much of our behavior by observing
and imitating others within a social context. Simply by watching the behavior of others,
we can learn many behaviors without going through the tedious trial-and-error
process of gradually eliminating wrong responses and acquiring the right ones.
In 1963, Bandura and his colleagues did a study on observation learning with nursery-
school children. The children were divided into two groups. In a video, one group of
children saw an adult hitting, kicking, and verbally assaulting a Bobo doll, while the
other group saw an adult lovingly caring for the Bobo doll. Later, when the children
were made to play with the doll, the children who saw the adult kick and hit the Bobo
doll did the same, while the other group, who saw the adult loving the doll, treated it
nicely.
According to Bandura (1986), there are four elements or steps of observational
learning:
i. Attention
In order to learn through observation, we have to pay attention to the
behavior being shown.
ii. Retention
In order to imitate the behavior, we have to remember what the model said
and did. Retention can be improved by mental rehearsal or by actual practice.
iii. Motor reproduction
We need to be able to convert these memories into appropriate actions, which
is called production. It depends on our physical abilities and our capacity to
monitor our own performance and adjust it until it matches that of the model.
Apart from that, practice makes the behavior smoother and more expert.
iv. Motivation and reinforcement
We may acquire a new skill or behavior through observation, but we may not
perform that behavior until there is some motivation or incentive to do so. If
we are not in need of the skill/behavior being shown by another person, we
will not pay attention to that behavior. In addition, reinforcement does play
a role in observation learning. If we anticipate being reinforced for imitating
the actions of a model, we may be more motivated to pay attention, remember,
and reproduce the behavior. Bandura identified the following forms of
reinforcement that can encourage observational learning:
The observer may reproduce the behavior of the model if he/she receives
direct reinforcement for imitating it.
The observer may simply see others reinforced for a particular behavior
and then increase his/her own production of that behavior.
How do we learn?
We learn by association; our mind naturally connects events that occur in sequence. Suppose we
see and smell a cake in a bakery shop, eat one, and find it tasty. The next time we see and smell
such cakes, that experience will lead us to expect that eating one will once again be tasty.
In associative learning, we learn that certain events occur together. Conditioning is a process of
learning associations. In classical conditioning, we learn to associate two stimuli such that one
stimulus comes to elicit a response that originally was elicited only by the other stimulus. In
operant conditioning, we learn to associate a response, i.e. our behavior, with its consequences,
and thus to repeat acts followed by good results and avoid those followed by bad results.
But for socio-cognitive learning theorists, learning includes not only changes in behavior but also
changes in our thoughts, expectations, and knowledge. According to these theorists, cognition
plays an important role in learning. Bandura's observation learning and Kohler's insight learning
come under socio-cognitive learning theory.
Behavior modification
Behavior modification involves analyzing and modifying human behavior.
Analyzing means identifying the functional relationship between environmental events and a
particular behavior, to understand the reasons for the behavior or to determine why a person
behaves as he/she does.
Modifying means developing and implementing procedures to help people change their behavior.
It involves altering environmental events so as to influence behavior.
Behavior modification is used in many areas like health, school, home, organizations. The
application of behavior modification concepts in work setting is called organizational behavior
modification. It has been successfully used to improve productivity, attendance, punctuality, safe
work practices, customer service, and other important behaviors in a wide variety of
organizations such as banks, department stores, factories, hospitals, and construction sites. It can
be used to encourage learning of desired organizational behaviors as well as to discourage
undesired behaviors.
A typical organizational behavior modification (OB Mod) program follows five steps:
1. Identifying behaviors
Identify the behavior to be modified. The behavior should be observable, measurable,
relevant to job and organizational performance, and critical. Critical behaviors
are those behaviors that make a significant impact on an employee’s job performance. These
are those 5 to 10 percent of behaviors that may account for up to 70 to 80 percent of each
employee’s performance.
2. Measuring the frequency of behavior
Count how often the identified behavior occurs. This provides a baseline of the behavior.
We may measure the behavior by direct observation, by questioning managers, supervisors,
team leaders, or any other related personnel, or we may get the data from archival records.
3. Analyzing
Analysis is done in A-B-C terms, where A stands for antecedents, i.e. the preceding events or
circumstances that act as the prompt to B, the behaviors or responses made by the organism,
and C, the consequences: a response followed by a satisfying consequence will become more
likely to occur, and a response followed by an annoying consequence will become less likely
to occur (according to the Law of Effect proposed by Edward Thorndike).
4. Developing and implementing intervention strategies
Intervention is done to modify the identified behavior. The goal of development and
implementation of interventions is to strengthen desirable behaviors and weaken undesirable
behaviors. Mainly, positive and negative reinforcement are used, but circumstances
arise where punishment has to be used as well.
5. Evaluating performance improvement
We evaluate the effectiveness of the intervention strategy that we have used. We measure
the frequency of behavior to determine the effectiveness of the intervention strategy. If the
behavior has been successfully modified, then all that needs to be done at this step is to
maintain the intervention. If the behavior has not been modified, then we need to reconsider
the intervention methods and modify them accordingly and/or reconsider the behavior we
originally identified.
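Steps 2 and 5 of the OB Mod program, measuring baseline frequency and evaluating improvement, can be sketched as a simple before/after comparison. The function names and the daily counts below are hypothetical, for illustration only.

```python
# Sketch of OB Mod steps 2 and 5: compare the frequency of a critical
# behavior before the intervention (baseline) and after it.

def behavior_frequency(observations):
    """Step 2: frequency = average occurrences per observation period."""
    return sum(observations) / len(observations)

def evaluate_intervention(baseline, post_intervention):
    """Step 5: percent change in behavior frequency after the intervention."""
    before = behavior_frequency(baseline)
    after = behavior_frequency(post_intervention)
    return 100 * (after - before) / before

# Hypothetical daily counts of safe-lifting behaviors on a construction site:
baseline_days = [4, 5, 3, 4, 4]   # before positive reinforcement
after_days = [6, 7, 6, 5, 6]      # after supervisory feedback and praise

change = evaluate_intervention(baseline_days, after_days)
print(f"Behavior frequency changed by {change:.0f}%")  # prints 50%
```

If the computed change were near zero or negative, step 5 would send us back to reconsider the intervention or the originally identified behavior.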
Research has suggested that OB Mod, if appropriately used, can be highly effective when it
comes to prompting desirable organizational behavior. For instance, research on OB Mod showed
that it improved employee performance by 17 percent on average. In a field experiment conducted
by Alexander Stajkovic and Fred Luthans in a division of a large organization that processes credit
card bills, OB Mod resulted in a 37 percent increase in performance when the reinforced behavior
was tied to financial incentives. When performance was positively reinforced by simple
supervisory feedback, employee performance increased by 20 percent. When social recognition
and praise were used, performance increased by 24 percent.
Fig: Steps of an OB Mod program (identify critical behavior, measure, analyze, intervene, evaluate performance improvements); the behavior must be observable, measurable, task related, and critical to the task
Differences between classical conditioning and operant conditioning
Transfer of learning