Classical Conditioning
Pavlov's
experiment
One of Pavlov’s dogs with a
surgically implanted cannula to measure salivation,
Pavlov Museum, 2005
The original and most famous
example of classical conditioning involved the salivary
conditioning of Pavlov's dogs. During his research on the physiology of
digestion in dogs, Pavlov noticed that, rather than simply salivating in the
presence of meat powder (an innate response to food that he called the unconditioned
response), the dogs began to salivate in the presence of the lab technician
who normally fed them. Pavlov called these psychic secretions. From this
observation he predicted that, if a particular stimulus in the dog’s
surroundings was present when the dog was presented with meat powder, then this
stimulus would become associated with food and cause salivation on its
own. In his initial experiment, Pavlov used bells to call the dogs to their
food and, after a few repetitions, the dogs started to salivate in response to
the bell. Thus, a neutral stimulus (bell) became a conditioned stimulus
(CS) as a result of consistent pairing with the unconditioned stimulus
(US - meat powder in this example). Pavlov referred to this learned
relationship as a conditional reflex (now called a conditioned response).
Types of Classical Conditioning
Types and variations of classical
conditioning are all derived from the same source. [1]
Forward Conditioning
The onset of the CS precedes the
onset of the US. Three common forms of Forward Conditioning are: Short-delay,
Long-delay, and Trace.
Short-delay Conditioning
The onset of the US is delayed
relative to the onset of the CS. In this procedure, the CS may completely
overlap with the US, or the CS may terminate at some point before the US
offset. The term "short" refers to the Interstimulus interval (ISI), and is
determined by the type of classical conditioning. For example, in some forms of
classical conditioning, such as Eyeblink conditioning, ISIs in the range
of 100 to 750 msec are typically considered short. In other forms of classical
conditioning, such as in Taste aversion, ISIs in the range of minutes
to 1 or 2 hours are considered short.
Long-delay Conditioning
In this procedure, the onset of
the US is still delayed relative to the onset of the CS, but ISIs are longer
than in the Short-delay Procedure. While the difference between Short and Long
may appear trivial, the distinction is important because some forms of
conditioning are best learned with a long delay, while others are best learned
with a short delay.
Trace Conditioning
The CS and US do not overlap.
Instead, the CS is presented, a period of time is allowed to elapse during which
no stimuli are presented, and then the US is presented. The stimulus-free
period is called the trace interval.
Simultaneous Conditioning
The CS and US are presented at
the same time.
Backward Conditioning
The onset of the US precedes the
onset of the CS. Rather than being a reliable predictor of an impending US
(such as in Forward Conditioning), the CS actually serves as a signal that the
US has ended. As a result, the CR is said to be inhibitory.
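The timing relations above can be sketched as a small classifier. This is only an illustration: the function names and the convention of expressing times in seconds are assumptions, and the short-delay versus long-delay distinction is left out because, as noted above, it depends on the type of conditioning being studied.

```python
def classify_procedure(cs_onset, cs_offset, us_onset):
    """Classify a CS-US pairing by stimulus timing (all times in seconds).

    Hypothetical helper: only the onset/offset relations described in the
    text are used.
    """
    if cs_onset == us_onset:
        return "simultaneous"
    if us_onset < cs_onset:
        return "backward"
    # From here on the CS starts first, so this is forward conditioning.
    if cs_offset < us_onset:
        # A stimulus-free gap separates CS offset from US onset.
        return "forward (trace)"
    # The CS is still on (or ends exactly) when the US starts.
    return "forward (delay)"


def trace_interval(cs_offset, us_onset):
    """Length of the stimulus-free gap; zero when there is no gap."""
    return max(0.0, us_onset - cs_offset)


print(classify_procedure(0.0, 0.5, 0.25))   # CS still on when the US starts
print(classify_procedure(0.0, 0.25, 0.75))  # gap between CS offset and US onset
print(classify_procedure(0.5, 1.0, 0.0))    # US starts before the CS
```

Treating a CS that ends exactly at US onset as a delay procedure is a choice made for this sketch, since no stimulus-free period exists in that case.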
Temporal Conditioning
The US is presented at regularly
timed intervals, and CR acquisition is dependent upon correct timing of the
interval between US presentations. The background, or context, can serve as the
CS in this example.
Unpaired Conditioning
The CS and US are not presented
together. Usually they are presented as independent trials that are separated
by a variable, or pseudo-random, interval. This procedure is used to study
non-associative behavioral responses, such as Sensitization.
Extinction
The CS is presented in the
absence of the US. This procedure is usually done after the CR has been
acquired through Forward Conditioning training. Eventually, the CR frequency is
reduced to pre-training levels.
Variations of Classical Conditioning Procedures
In addition to the simple
procedures described above, some classical conditioning studies are designed to
tap into more complex learning processes. Some common variations are discussed
below.
Classical Discrimination/Reversal Conditioning
In this procedure, two CSs and
one US are typically used. The CSs may be the same modality (such as lights of
different intensity), or they may be different modalities (such as auditory CS
and visual CS). In this procedure, one of the CSs is designated CS+ and its
presentation is always followed by the US. The other CS is designated CS- and
its presentation is never followed by the US. After a number of trials, the
organism learns to discriminate CS+ trials from CS- trials such that CRs
are only observed on CS+ trials.
During Reversal Training,
the CS+ and CS- are reversed and subjects learn to suppress responding to the
previous CS+ and show CRs to the previous CS-.
Classical ISI Discrimination Conditioning
This is a discrimination
procedure in which two different CSs are used to signal two different Interstimulus intervals. For example,
a dim light may be presented 30 seconds before a US, while a very bright light
is presented 2 minutes before the US. Using this technique, organisms can learn
to perform CRs that are appropriately timed for the two distinct CSs.
Latent Inhibition Conditioning
In this procedure, a CS is
presented several times before paired CS-US training commences. The
pre-exposure of the subject to the CS before paired training slows the rate of
CR acquisition relative to organisms that are not CS pre-exposed. Also see Latent
inhibition for applications.
Conditioned Inhibition Conditioning
Three phases of conditioning are
typically used:
Phase 1:
A CS (CS+) is paired with a US
until asymptotic CR levels are reached.
Phase 2:
CS+/US trials are continued, but
interspersed with trials on which the CS+ is presented in compound with a second
CS (the CS-) but not with the US (i.e., CS+/CS- trials). Typically, organisms
show CRs on CS+/US trials, but suppress responding on CS+/CS- trials.
Phase 3:
In this retention test, the
previous CS- is paired with the US. If conditioned inhibition has occurred, the
rate of acquisition to the previous CS- should be impaired relative to organisms
that did not experience Phase 2.
Blocking
This form of classical
conditioning also involves three phases.
Phase 1:
A CS (CS1) is paired with a US.
Phase 2:
CS1 is presented in compound with
a new CS (CS2), and the compound is paired with the US.
Phase 3:
CS2 is paired with the US.
Blocking is measured as impairment in the rate of learning to CS2 relative to
organisms that did not experience Phase 1. Essentially, acquisition to CS2 is
blocked during compound training because CRs had already formed to CS1.
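One way to see why prior training with CS1 could leave little to be learned about CS2 is a short simulation. The sketch below uses the Rescorla-Wagner update rule, which is not named in the text and is swapped in here purely as an illustrative model; the learning rate, trial counts, and the comparison group that receives only the compound trials are all assumptions.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Run Rescorla-Wagner updates over a list of trials.

    Each trial is a tuple (set_of_cs_names, us_present).
    Returns a dict mapping each CS name to its associative strength.
    alpha (learning rate) and lam (maximum strength) are illustrative values.
    """
    strength = {}
    for cues, us_present in trials:
        total = sum(strength.get(c, 0.0) for c in cues)
        target = lam if us_present else 0.0
        error = target - total  # prediction error shared by all cues present
        for c in cues:
            strength[c] = strength.get(c, 0.0) + alpha * error
    return strength


phase1 = [({"CS1"}, True)] * 50          # CS1 paired with the US
phase2 = [({"CS1", "CS2"}, True)] * 50   # compound paired with the US

blocked = rescorla_wagner(phase1 + phase2)
control = rescorla_wagner(phase2)        # hypothetical group with no CS1 pretraining

print(round(blocked["CS2"], 3), round(control["CS2"], 3))
```

In this model CS2 gains almost no strength in the pretrained group because CS1 already predicts the US, so the prediction error during compound training is close to zero.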
Classical
Conditioning Applied
John B. Watson's Little Albert
Behavioral Therapies Based on Classical Conditioning Behaviour
therapy
Aversion therapy
Systematic desensitization
Theories
of classical conditioning
There are two competing theories
of how classical conditioning works. The first, stimulus-response theory,
suggests that an association to the unconditioned stimulus is made with the
conditioned stimulus within the brain, but without involving conscious thought.
The second, stimulus-stimulus theory, involves cognitive activity, in
which the conditioned stimulus is associated with the concept of the
unconditioned stimulus, a subtle but important distinction.
Stimulus-response theory, referred to as S-R theory, is a
theoretical model of behavioral psychology that suggests humans and other
animals can learn to associate a new stimulus, the conditioned stimulus (CS),
with a pre-existing stimulus, the unconditioned stimulus (UCS), and can think,
feel or respond to the CS as if it were actually the UCS.
The opposing theory,
put forward by cognitive behaviorists, is stimulus-stimulus
theory (S-S theory), a theoretical model of classical conditioning that
suggests a cognitive component is required to understand classical
conditioning and that stimulus-response theory is an inadequate model.
S-R theory suggests that an animal can learn to associate
a conditioned stimulus (CS), such as a bell, with the impending arrival of food
(the unconditioned stimulus), resulting in an observable behavior such as
salivation. Stimulus-stimulus theory suggests that instead the animal salivates
to the bell because it is associated with the concept of food, which is a very
fine but important distinction.
To test this theory, psychologist
Robert Rescorla undertook the following experiment [2]. Rats learned to associate a light (the
conditioned stimulus) with a loud noise (the unconditioned stimulus). The
response of the rats was to freeze and cease movement. What would happen then
if the rats were habituated to the UCS? S-R theory would suggest
that the rats would continue to respond to the CS, but if S-S theory is
correct, they would be habituated to the concept of a loud sound (danger), and
so would not freeze to the CS. The experimental results suggest that S-S theory was
correct, as the rats no longer froze when exposed to the signal light. [3]
Instrumental/Operant Conditioning
Operant conditioning is the use of consequences to modify the
occurrence and form of behavior. Operant conditioning is
distinguished from Pavlovian conditioning in that
operant conditioning deals with the modification of voluntary behavior
through the use of consequences, while Pavlovian conditioning deals with the
conditioning of behavior so that it occurs under new antecedent
conditions.[1]
Reinforcement, punishment, and extinction
Reinforcement
and punishment,
the core tools of operant conditioning, are either positive (delivered
following a response), or negative (withdrawn following a response). This
creates a total of four basic consequences, with the addition of a fifth
procedure known as extinction (i.e. no change in
consequences following a response).
It's important to note that
organisms are not spoken of as being reinforced, punished, or extinguished; it
is the response that is reinforced, punished, or extinguished. Additionally,
reinforcement, punishment, and extinction are not terms whose use is restricted
to the laboratory. Naturally occurring consequences can also be said to
reinforce, punish, or extinguish behavior and are not always delivered by
people.
Reinforcement is a consequence that causes a behavior
to occur with greater frequency.
Punishment is a consequence that causes a behavior to occur
with less frequency.
Extinction is the lack of any consequence following a
response. When a response is inconsequential, producing neither favorable nor unfavorable
consequences, it will occur with less frequency.
Four contexts of operant
conditioning: Here the
terms "positive" and "negative" are not used
in their popular sense, but rather: "positive" refers to
addition, and "negative" refers to subtraction. What is added
or subtracted is a stimulus, and the result may be either reinforcement or
punishment. Hence positive
punishment is sometimes a confusing term, as it denotes the addition of
an aversive stimulus (such as spanking or an electric shock), a context that may seem
very negative in the lay sense. The four procedures are:
Positive reinforcement occurs when a behavior (response) is
followed by a favorable stimulus (commonly seen as pleasant) that increases the
frequency of that behavior. In the Skinner box experiment, a stimulus such as
food or sugar solution can be delivered when the rat engages in a target
behavior, such as pressing a lever.
Negative reinforcement occurs when a behavior (response) is
followed by the removal of an aversive stimulus (commonly seen as unpleasant)
thereby increasing that behavior's frequency. In the Skinner box experiment,
negative reinforcement can be a loud noise continuously sounding inside the
rat's cage until it engages in the target behavior, such as pressing a lever,
upon which the loud noise is removed.
Positive punishment (also called "Punishment by
contingent stimulation") occurs when a behavior (response) is followed by
an aversive stimulus, such as introducing a shock or loud noise, resulting in a
decrease in that behavior.
Negative punishment (also called "Punishment by
contingent withdrawal") occurs when a behavior (response) is followed by
the removal of a favorable stimulus, such as taking away a child's toy
following an undesired behavior, resulting in a decrease in that behavior.
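The four procedures above amount to a two-by-two lookup: whether a stimulus is added or removed after the response, and whether the behavior then becomes more or less frequent. A minimal sketch follows; the function name and the string labels are hypothetical and chosen only for illustration.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant procedure from two observations.

    stimulus_change: "added" or "removed" (what happens after the response)
    behavior_change: "increases" or "decreases" (effect on response frequency)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# Food delivered after a lever press, pressing becomes more frequent.
print(classify_consequence("added", "increases"))
# Loud noise switched off after a lever press, pressing becomes more frequent.
print(classify_consequence("removed", "increases"))
# Shock introduced after a response, responding becomes less frequent.
print(classify_consequence("added", "decreases"))
# Toy taken away after a response, responding becomes less frequent.
print(classify_consequence("removed", "decreases"))
```

Note that the classification depends on the observed effect on behavior, not on whether the stimulus seems pleasant or unpleasant to an observer.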
Also:
Avoidance learning is a type of learning in which a certain
behavior results in the cessation of an aversive stimulus. For example,
performing the behavior of shielding one's eyes when in the sunlight (or going
indoors) will help avoid the punishment of having light in one's eyes.
Extinction occurs when a behavior (response) that
had previously been reinforced is no longer effective. In the Skinner box
experiment, this is the rat pushing the lever and being rewarded with a food
pellet several times, and then pushing the lever again and never receiving a
food pellet again. Eventually the rat would cease pushing the lever.
Non-contingent Reinforcement is a procedure that decreases the
frequency of a behavior by both reinforcing alternative behaviors and extinguishing
the undesired behavior. Since the alternative behaviors are reinforced, they
increase in frequency and therefore compete for time with the undesired
behavior.
Thorndike's Law of Effect
Operant conditioning, sometimes
called instrumental conditioning or instrumental learning, was
first extensively studied by Edward L. Thorndike (1874-1949), who
observed the behavior of cats trying to escape from home-made puzzle boxes.[2] When first constrained in the boxes, the
cats took a long time to escape. With experience, ineffective responses
occurred less frequently and successful responses occurred more frequently,
enabling the cats to escape in less time over successive trials. In his Law
of Effect, Thorndike theorized that successful responses, those
producing satisfying consequences, were "stamped in" by the
experience and thus occurred more frequently. Unsuccessful responses, those
producing annoying consequences, were stamped out and
subsequently occurred less frequently. In short, some consequences strengthened
behavior and some consequences weakened behavior. B.F.
Skinner (1904-1990) formulated a more detailed analysis of operant
conditioning based on reinforcement, punishment, and extinction. Following the
ideas of Ernst
Mach, Skinner rejected Thorndike's mediating structures required by
"satisfaction" and constructed a new conceptualization of behavior
without any such references. Moreover, Thorndike's work with puzzle boxes
produced no meaningful data to be studied other than a measure of escape times.
So while experimenting with some homemade feeding mechanisms Skinner invented
the operant conditioning chamber
which allowed him to measure rate of response as a key dependent variable using
a cumulative record of lever presses or key pecks.[3]
Operant Conditioning vs Fixed Action Patterns
Skinner's construct of
instrumental learning is contrasted with what Nobel Prize winning biologist Konrad
Lorenz termed "fixed action patterns," or reflexive,
impulsive, or instinctive behaviors. These behaviors were said by Skinner and
others to exist outside the parameters of operant conditioning but were
considered essential to a comprehensive analysis of behavior.
In dog training that uses the prey drive,
particularly in training working dogs, detection dogs, etc., the stimulation of
these fixed action patterns, relative to the dog's predatory instincts, is the
key to producing very difficult yet consistent behaviors, and in most cases does
not involve operant, classical, or any other kind of
conditioning.[citation needed] While
evolutionary processes shaped these fixed action patterns, the patterns
themselves remained stable long enough to be shaped by the long time span
necessary for evolution because of their survival function (i.e., operant
conditioning).
According to the laws of operant
conditioning, any behavior that is consistently rewarded, every single time,
will extinguish at a faster rate once reinforcement stops, while intermittently
reinforcing behavior
leads to more stable rates of behavior that are relatively more resistant to
extinction. Thus, in detection dogs, any correct behavior of indicating a
"find" must always be rewarded with a tug toy or a ball throw early
on for initial acquisition of the behavior. Thereafter, fading procedures, in
which the rate of reinforcement is "thinned" (not every response is
reinforced), are introduced, switching the dog to an intermittent schedule of
reinforcement, which is more resistant to instances of non-reinforcement.
Nevertheless, some trainers are
now using the prey drive to train pet dogs and find that they get far better
results in the dogs' responses to training than when they only use the
principles of operant conditioning[citation needed] which,
according to Skinner and his students Keller
and Marian
Breland (who invented clicker training), break down when strong
instincts are at play.[4]
Criticisms
Thorndike's law of effect
specifically requires that a behavior be followed by satisfying consequences
for learning to occur. There are, however, cases in which learning can be shown
to occur without good or bad effects following the behavior. For instance, a
number of experiments examining the phenomenon of latent
learning[5][6][7][8] showed that a rat needn't receive a
satisfying reward (food, if hungry; water, if thirsty) in order to learn a
maze; learning that becomes apparent immediately after the desired reward is
introduced.
A different experiment, in
humans, showed that punishing the correct behavior may actually cause it to be
more frequently taken (i.e. stamp it in)[9]. Subjects are given a number of pairs of
holes on a large board and required to learn which hole to poke a stylus
through for each pair. If the subjects receive an electric shock for punching
the correct hole, they learn which hole is correct more quickly than subjects
who receive an electric shock for punching the incorrect hole.
Biological correlates of operant conditioning
The first scientific studies
identifying neurons that responded in ways that suggested they encode for
conditioned stimuli came from work by Rusty Richardson and Mahlon deLong.[10][11] They showed that nucleus basalis neurons,
which release acetylcholine broadly throughout the cerebral
cortex, are activated shortly after a conditioned stimulus, or after a
primary reward if no conditioned stimulus exists. These neurons are equally
active for positive and negative reinforcers, and have been demonstrated to
cause plasticity in many cortical regions.[12]
Evidence also exists that dopamine is activated at similar times. The dopamine
pathways encode positive reward only, not aversive reinforcement, and they
project much more densely onto frontal cortex regions. Cholinergic projections,
in contrast, are dense even in the posterior cortical regions like the primary
visual cortex. A study of patients with Parkinson's disease, a condition
attributed to the insufficient action of dopamine, further illustrates the role
of dopamine in positive reinforcement.[13] It showed that while off their
medication, patients learned more readily with aversive consequences than with
positive reinforcement. Patients who were on their medication showed the
opposite to be the case, positive reinforcement proving to be the more
effective form of learning when the action of dopamine is high.
Factors that alter the effectiveness of consequences
How effective a consequence can
be at modifying a response will tend to increase or decrease according to
various factors. These factors can apply to both reinforcing and punishing
consequences.
Satiation: The effectiveness of a consequence will
be reduced if the individual's "appetite" for that source of
stimulation has been satisfied. Inversely, the effectiveness of a consequence
will increase as the individual becomes deprived of that stimulus. If someone
is not hungry, food will not be an effective reinforcer for behavior.
Immediacy: After a response, how immediately a
consequence is then felt determines the effectiveness of the consequence. More
immediate feedback will be more effective than less immediate feedback. If
someone's license plate is caught by a traffic camera for speeding and they
receive a speeding ticket in the mail a week later, this consequence will not
be very effective against speeding. But if someone is speeding and is caught in
the act by an officer who pulls them over, then their speeding behavior is more
likely to be affected.
Contingency: If a consequence does not contingently
(reliably, or consistently) follow the target response, its effectiveness upon
the response is reduced. But if a consequence follows the response reliably
after successive instances, its ability to modify the response is increased. If
someone has a habit of getting to work late, but is only occasionally reprimanded
for their lateness, the reprimand will not be a very effective punishment.
Size: This is a "cost-benefit" determinant of
whether a consequence will be effective. If the size, or amount, of the
consequence is large enough to be worth the effort, the consequence will be
more effective upon the behavior. An unusually large lottery jackpot, for
example, might be enough to get someone to buy a one-dollar lottery ticket (or
even multiple tickets). But if a lottery jackpot is small, the same
person might not feel it to be worth the effort of driving out and finding a
place to buy a ticket. In this example, it's also useful to note that
"effort" is a punishing consequence. How these opposing expected
consequences (reinforcing and punishing) balance out will determine whether the
behavior is performed or not.
Most of these factors exist for
biological reasons. The biological purpose of the Principle of Satiation is to
maintain the organism's homeostasis. When an organism has been deprived of
sugar, for example, the effectiveness of the taste of sugar as a reinforcer is
high. However, as the organism reaches or exceeds its optimum blood-sugar
levels, the taste of sugar becomes less effective, perhaps even aversive.
The principles of Immediacy and
Contingency exist for neurochemical reasons. When an organism experiences a
reinforcing stimulus, dopamine pathways in the brain are activated. This
network of pathways "releases a short pulse of dopamine onto many dendrites,
thus broadcasting a rather global reinforcement signal to postsynaptic
neurons."[14] This makes recently activated synapses
able to increase their sensitivity to efferent signals, hence increasing the probability
of occurrence for the recent responses preceding the reinforcement. These
responses are, statistically, the most likely to have been the behavior
responsible for successfully achieving reinforcement. But when the application
of reinforcement is either less immediate or less contingent (less consistent),
the ability of dopamine to act upon the appropriate synapses is reduced.
Operant variability
Operant variability is what
allows a response to adapt to new situations. Operant behavior is distinguished
from reflexes in that its response topography (the form of the response)
is subject to slight variations from one performance to another. These slight
variations can include small differences in the specific motions involved,
differences in the amount of force applied, and small changes in the timing of
the response. If a subject's history of reinforcement is consistent, such
variations will remain stable because the same successful variations are more
likely to be reinforced than less successful variations. However, behavioral
variability can also be altered when subjected to certain controlling
variables.[15]
An extinction burst will
often occur when an extinction procedure has just begun. This consists of a
sudden and temporary increase in the response's frequency, followed by
the eventual decline and extinction of the behavior targeted for elimination.
Take, as an example, a pigeon that has been reinforced to peck an electronic
button. During its training history, every time the pigeon pecked the button,
it received a small amount of bird seed as a reinforcer. So, whenever
the bird is hungry, it will peck the button to receive food. However, if the
button were turned off, the hungry pigeon would first try pecking the
button just as it has in the past. When no food is forthcoming, the bird will
likely try again... and again, and again. After a period of frantic activity,
in which its pecking behavior yields no result, the pigeon's pecking will
decrease in frequency.
The evolutionary advantage of
this extinction burst is clear. In a natural environment, an animal that
persists in a learned behavior, even when it does not produce immediate
reinforcement, might still have a chance of producing reinforcing consequences
if it tries again. This animal would be at an advantage over another animal
that gives up too easily.
Extinction-induced variability serves a similar adaptive role. When extinction
begins, and if the environment allows for it, an initial increase in the
response rate is not the only thing that can happen. Imagine a bell curve.
The horizontal axis would represent the different variations possible for a
given behavior. The vertical axis would represent the response's probability in
a given situation. Response variants in the middle of the bell curve, at its
highest point, are the most likely because those responses, according to the
organism's experience, have been the most effective at producing reinforcement.
The more extreme forms of the behavior would lie at the lower ends of the
curve, to the left and to the right of the peak, where their probability for expression
is low.
A simple example would be a
person inside a room opening a door to exit. The response would be the opening
of the door, and the reinforcer would be the freedom to exit. For each time
that same person opens that same door, they do not open the door in the exact
same way every time. Rather, each time they open the door a little differently:
sometimes with less force, sometimes with more force; sometimes with one hand,
sometimes with the other hand; sometimes more quickly, sometimes more slowly.
Because of the physical properties of the door and its handle, there is a
certain range of successful responses which are reinforced.
Now imagine in our example that
the subject tries to open the door and it won't budge. This is when extinction-induced
variability occurs. The bell curve of probable responses will begin to
broaden, with more extreme forms of behavior becoming more likely. The person
might now try opening the door with extra force, repeatedly twist the knob, try
to hit the door with their shoulder, maybe even call for help or climb out a
window. This is how extinction causes variability in behavior, in the hope that
these new variations might be successful. For this reason, extinction-induced
variability is an important part of the operant procedure of shaping.
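The bell-curve picture can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that response variants follow a normal distribution and that extinction doubles its standard deviation; neither the distribution nor the numbers come from the text.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_extreme(threshold, sd, mean=0.0):
    """Probability that a response variant lies more than `threshold` from the mean."""
    inside = normal_cdf(mean + threshold, mean, sd) - normal_cdf(mean - threshold, mean, sd)
    return 1.0 - inside


# "Extreme" variants are those more than 2 units away from the typical response.
before = prob_extreme(threshold=2.0, sd=1.0)   # stable reinforcement history
during = prob_extreme(threshold=2.0, sd=2.0)   # hypothetical broadened curve

print(f"before extinction: {before:.3f}")
print(f"during extinction: {during:.3f}")
```

Under these assumed values the extreme variants go from rare (roughly one response in twenty) to common (roughly one in three), which is the broadening the door example describes.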
Avoidance learning
Avoidance training belongs to
negative reinforcement schedules. The subject learns that a certain response
will result in the termination or prevention of an aversive stimulus. There are
two kinds of commonly used experimental settings: discriminated and
free-operant avoidance learning.
Discriminated avoidance
learning
In discriminated avoidance
learning, a novel stimulus such as a light or a tone is followed by an aversive
stimulus such as a shock (CS-US, similar to classical conditioning). Whenever
the animal performs the operant response, the CS (conditioned stimulus)
or the US (unconditioned stimulus), respectively, is removed. During the first trials
(called escape trials) the animal usually experiences both the CS and the US,
showing the operant response to terminate the aversive US. Over time, the
animal learns to perform the response during the presentation of
the CS, thus preventing the aversive US from occurring. Such trials are called
avoidance trials.
Free-operant avoidance
learning
In this experimental setting, no
discrete stimulus is used to signal the occurrence of the aversive stimulus.
Rather, the aversive stimuli (mostly shocks) are presented without explicit
warning stimuli.
There are two crucial time
intervals determining the rate of avoidance learning. The first one is called
the S-S interval (shock-shock interval). This is the amount of time that
passes between successive presentations of the shock (unless the operant
response is performed). The other one is called the R-S interval
(response-shock interval), which specifies the length of the time interval
following an operant response during which no shocks will be delivered. Note
that each time the organism performs the operant response, the shock-free
R-S interval starts anew.
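How the two intervals interact can be sketched as a small scheduling function. This is a minimal illustration under stated assumptions: the function name, the use of seconds, and the convention that the first shock is due one S-S interval after the session starts are not taken from the text.

```python
def shock_times(response_times, ss_interval, rs_interval, session_length):
    """Return the times at which shocks are delivered in one session.

    response_times: sorted times of operant responses
    ss_interval:    time between shocks when no response occurs
    rs_interval:    shock-free time that follows each response
    """
    shocks = []
    next_shock = ss_interval  # assumed: first shock is due one S-S interval in
    responses = iter(response_times)
    response = next(responses, None)
    while True:
        if response is not None and response < next_shock:
            # A response before the scheduled shock restarts the R-S timer.
            next_shock = response + rs_interval
            response = next(responses, None)
        elif next_shock <= session_length:
            # No response arrived in time, so the shock is delivered.
            shocks.append(next_shock)
            next_shock += ss_interval
        else:
            break
    return shocks


# No responding: shocks arrive every S-S interval.
print(shock_times([], ss_interval=5, rs_interval=20, session_length=20))
# Responding at least once per R-S interval avoids every shock.
print(shock_times([4, 20, 36, 52], ss_interval=5, rs_interval=20, session_length=60))
# Responding stops after t=28, so shocks resume once the R-S interval runs out.
print(shock_times([3, 10, 28], ss_interval=5, rs_interval=20, session_length=60))
```

The second call shows the contingency the subject has to learn: a response made before each R-S interval expires keeps postponing the next shock.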
Two-process theory of avoidance
This theory was originally
established to explain learning in discriminated avoidance learning. It assumes
that two processes take place.
a) Classical conditioning of fear. During
the first trials of the training, the organism experiences both the CS and the aversive
US (escape trials). The theory assumes that during those trials classical
conditioning takes place by pairing the CS with the US. Because of the
aversive nature of the US, the CS is supposed to elicit a conditioned emotional reaction
(CER): fear. In classical conditioning, presenting a CS conditioned with an
aversive US disrupts the organism's ongoing behavior.
b) Reinforcement of
the operant response by fear reduction. Because during the first process
the CS signaling the aversive US has itself become aversive by eliciting fear
in the organism, reducing this unpleasant emotional reaction serves to motivate
the operant response. The organism learns to make the response during the CS,
thus terminating the aversive internal reaction elicited by the CS.
An
important aspect of this theory is that the term "Avoidance" does not
really describe what the organism is doing. It does not "avoid" the
aversive US in the sense of anticipating it. Rather the organism escapes an
aversive internal state, caused by the CS.
One of the practical aspects of
operant conditioning with relation to animal
training is the use of shaping (reinforcing successive approximations and
not reinforcing behavior past approximating), as well as chaining.
Verbal Behavior
In 1957 Skinner published Verbal Behavior, a theoretical
extension of the work he had pioneered since 1938. This work extended the
theory of operant conditioning to human behavior previously assigned to the
areas of language, linguistics and other areas. Verbal Behavior is the logical
extension of Skinner's ideas, in which he introduced new functional
relationship categories such as intraverbals, autoclitics, mands, tacts and the
controlling relationship of the audience. All of these relationships were based
on operant conditioning and relied on no new mechanisms despite the
introduction of new functional categories.
Four term contingency
Modern behavior analysis, which
is the name of the discipline directly descended from Skinner's work, holds
that behavior is explained in
four terms: an establishing operation (EO), a discriminative stimulus (Sd),
a response (R), and a reinforcing stimulus (Srein or Sr for reinforcers,
sometimes Save for aversive stimuli).[16]
Operant Hoarding
Operant hoarding is a term referring to the choice made by
a rat, on a compound schedule called a multiple schedule, that maximizes its
rate of reinforcement in an operant conditioning
context. More specifically, rats were shown to have allowed food pellets to
accumulate in a food tray by continuing to press a lever on a continuous reinforcement schedule
instead of retrieving those pellets. Retrieval of the pellets always instituted
a one-minute period of extinction during which no additional food pellets
were available but those that had been accumulated earlier could be consumed.
This finding appears to contradict the usual finding that rats behave
impulsively in situations in which there is a choice between a smaller food
object right away and a larger food object after some delay. See schedules of reinforcement. [17]
Schedules of Reinforcement
Fixed ratio (FR) schedules deliver reinforcement
after every nth response.
Continuous reinforcement (CRF) schedules are a special form of a
fixed ratio (n = 1). In a continuous reinforcement schedule, reinforcement
follows each and every response.
Fixed interval (FI) schedules deliver reinforcement for
the first response after a fixed length of time since the last reinforcement,
while premature responses are not reinforced.
Variable ratio (VR) schedules deliver reinforcement
after a random number of responses (based upon a predetermined average).
Variable interval (VI) schedules deliver reinforcement for
the first response after a random length of time (based upon a predetermined
average) has passed since the last reinforcement.
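The basic schedules listed above can be sketched as small decision rules. The following is only a minimal illustration, assuming responses arrive at known times in seconds; the class names, the `respond(t)` method, the `run` helper and the uniform distributions used for the variable schedules are assumptions made for this example, not part of any standard software or of the original description.

```python
import random


class FixedRatio:
    """FR n: reinforce every n-th response. FR 1 behaves as CRF."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: reinforce after a random number of responses averaging `mean`."""

    def __init__(self, mean, rng):
        self.mean = mean
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean - 1, whose expected value is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI: reinforce the first response at least `interval` after the last reinforcer."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforcer = 0.0

    def respond(self, t):
        if t - self.last_reinforcer >= self.interval:
            self.last_reinforcer = t
            return True
        return False  # premature response: not reinforced


class VariableInterval:
    """VI: like FI, but the required wait is random with average `mean`."""

    def __init__(self, mean, rng):
        self.mean = mean
        self.rng = rng
        self.last_reinforcer = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, whose expected value is `mean`.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t - self.last_reinforcer >= self.wait:
            self.last_reinforcer = t
            self.wait = self._draw()
            return True
        return False


def run(schedule, response_times):
    """Return one True/False per response: was that response reinforced?"""
    return [schedule.respond(t) for t in response_times]


if __name__ == "__main__":
    times = range(1, 10)  # one response per second
    rng = random.Random(0)
    print("CRF ", run(FixedRatio(1), times))
    print("FR 3", run(FixedRatio(3), times))
    print("FI 4", run(FixedInterval(4), times))
    print("VR 3", run(VariableRatio(3, rng), times))
    print("VI 4", run(VariableInterval(4, rng), times))
```

With one response per second, FR 3 reinforces the third, sixth and ninth responses, while FI 4 reinforces only the responses at 4 s and 8 s, however many premature responses occur in between. The variable schedules depend on the random seed.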
Other schedules
Differential reinforcement of incompatible behavior (DRI)
Differential reinforcement of other behavior (DRO)
Differential reinforcement of low response rate (DRL)
is used to encourage low rates of
responding. It is like an interval schedule, except that premature responses
reset the time required between responses.
Differential reinforcement of high rate (DRH)
is used to encourage high rates of
responding. It is like an interval schedule, except that a minimum number of
responses is required in the interval in order to receive reinforcement.
Fixed Time (FT)
Variable Time (VT)
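The two differential-rate schedules described above (DRL and DRH) can be sketched in the same spirit. This is a rough illustration under stated assumptions: response times are in seconds, the session clock starts at 0, and the DRH rule shown (count responses inside a fixed window and reinforce once a minimum count is reached) is only one plausible reading of the description. The class names and the `run` helper are hypothetical.

```python
class DRL:
    """DRL: reinforce a response only if at least `min_gap` seconds have
    passed since the previous response; a premature response resets the timer."""

    def __init__(self, min_gap):
        self.min_gap = min_gap
        self.last_response = 0.0  # assumption: timing starts at session start

    def respond(self, t):
        waited_long_enough = (t - self.last_response) >= self.min_gap
        # Every response, reinforced or not, restarts the required wait.
        self.last_response = t
        return waited_long_enough


class DRH:
    """DRH: reinforce once `min_responses` responses occur within `window` seconds."""

    def __init__(self, window, min_responses):
        self.window = window
        self.min_responses = min_responses
        self.window_start = 0.0
        self.count = 0

    def respond(self, t):
        # If the current window has expired, start a new one at this response.
        if t - self.window_start >= self.window:
            self.window_start = t
            self.count = 0
        self.count += 1
        if self.count >= self.min_responses:
            # Requirement met: reinforce and begin a fresh window.
            self.window_start = t
            self.count = 0
            return True
        return False


def run(schedule, response_times):
    """Return one True/False per response: was that response reinforced?"""
    return [schedule.respond(t) for t in response_times]


if __name__ == "__main__":
    print("DRL 5s  ", run(DRL(5), [2, 8, 10, 16, 21]))
    print("DRH 3/10", run(DRH(10, 3), [1, 2, 3, 20, 35, 36, 37]))
```

Under DRL, the responses at 2 s and 10 s come too soon after the preceding event and go unreinforced, and each also restarts the wait. Under DRH, only a burst of three responses inside one window earns reinforcement.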
Compound Schedules
Multiple
Mixed
Concurrent
Tandem
Chained
Higher order
Conditioned taste aversion is an example of classical conditioning, also called
Pavlovian conditioning. Conditioned taste aversion occurs when a subject
associates the taste of a certain food with symptoms caused by a toxic,
spoiled, or poisonous substance. Generally, taste aversion develops after
ingestion of the food is followed by nausea, sickness,
or vomiting. The ability to develop a taste aversion is considered an adaptive
trait or survival mechanism that trains the body to avoid poisonous substances
(e.g., poisonous berries) before they can cause harm. This association is meant
to prevent the consumption of the same substance (or something that tastes
similar) in the future, thus avoiding further poisoning. However, conditioned
taste aversion sometimes occurs in subjects when sickness was merely
coincidental and not related to the food (for example, a subject who gets a
cold or the flu shortly after eating bananas might develop an aversion to the
taste of bananas).
Garcia's study
While studying the effects of
radiation on various behaviours during the 1950s, Dr. John Garcia noticed that rats
developed an aversion to substances consumed prior to being irradiated. To
examine this, Garcia put together a study in which three groups of rats were
given sweetened water followed by either no radiation, mild radiation, or
strong radiation. When the rats were subsequently given a choice between
sweetened water and regular tap water, rats who had been exposed to radiation
drank much less sweetened water than those who hadn't. Specifically, the total
consumption of sweetened water for the no-radiation, mild radiation and strong
radiation rats was 80%, 40% and 10% respectively.
This finding ran contrary to much
of the learning literature of the time in that the aversion could occur after
just a single trial and over a long delay. Garcia proposed that the sweetened
water became dispreferred because of the nausea-inducing effects of the
radiation, and so began the study of conditioned taste aversion.
Interesting notes concerning taste aversion
Taste aversion does not
require cognitive awareness to develop--that is, the subject does not have to think, "Wow, this
tastes like the stuff that got me sick." In fact, the subject may hope
to enjoy the substance, but the body handles it reflexively. Conditioned taste
aversion illustrates the argument that in classical conditioning, a response is
elicited.
Also, taste aversion generally
only requires one trial. The experiments of Ivan
Pavlov required several pairings of the neutral stimulus (e.g., a
tuning fork) with the unconditioned stimulus (i.e., meat powder) before the
neutral stimulus elicited a response. With taste aversion, after one
association between sickness and a certain food, the food may thereafter elicit
the response. In addition, lab experiments generally require very brief (less
than a second) intervals between a neutral stimulus and an unconditioned
stimulus. With taste aversion, however, the bratwurst a person eats at lunch
may be associated with the vomiting that person has in the evening.
If the flavor has been
encountered before the subject becomes ill, the effect will not be as strong or
will not be present. This quality is called latent
inhibition. Conditioned taste aversion is often used in laboratories to
study gustation and learning in rats.
Aversions can be developed
to odors as well as to tastes.
Taste aversion in humans
Taste aversion is fairly common
in humans. When humans eat bad food (e.g., spoiled meat) and get sick, they may
find that food aversive until extinction occurs, if ever. Also, as in nature, a
food does not have to cause the sickness for it to become aversive. A
human who eats sushi for the first time and who happens to come down with an
unrelated stomach virus or influenza may still develop a taste aversion to
sushi.
Taste aversion is a common problem among chemotherapy patients, who become
nauseated because of the drug therapy but associate the nausea with
consumption of food.
Applications of taste aversion
Taste aversion has been
demonstrated in a wide variety of both captive and free-ranging predators. In
these studies, animals that consume a bait laced with an undetectable dose of
an aversion agent avoid both baits and live prey with the same taste and scent
as the baits. When predators detect the aversion agent in the baits, they
quickly form aversions to the baits, but discriminate between these and
different-tasting live prey. The use of conditioned taste aversion in wildlife
management has so far been resisted by governmental wildlife managers, mainly
because of a lack of understanding of the process.
Stimulus generalization
Stimulus generalization is
another learning phenomenon that can be illustrated by conditioned taste
aversion (CTA). This phenomenon demonstrates that we tend to develop aversions
even to types of food that resemble the foods
that made us ill. For example, if one eats an orange and gets sick, one
might also avoid eating tangerines and clementines because they look similar to
oranges, and might lead one to think that they are also
Comparison between Instrumental and Classical Conditioning
Learned Helplessness lh
Observational Learning
Observational learning (also known as: vicarious learning
or social learning) is learning that occurs as a function of observing,
retaining and replicating behavior observed in others.
Although observational learning
can take place at any stage in life, it is thought to be especially important
during childhood,
particularly as authority becomes important.
Observational learning allows for
learning without any change in behavior and has therefore been used as an
argument against strict behaviorism, which argued that behavior change must
occur for new behaviors to be acquired.
The four required conditions
Bandura called the process of
social learning modeling and gave four conditions required for a person
to successfully model the behavior of someone else:
Attention to the model
A person must first pay attention
to a person engaging in a certain behavior (the model).
Retention of details
Once attending to the observed
behavior, the observer must be able to effectively remember what the model has
done.
Motor reproduction
The observer must be able to
replicate the behavior being observed. For example, juggling cannot be
effectively learned by observing a model juggler if the observer does not
already have the ability to perform the component actions (throwing and
catching a ball).
Motivation and Opportunity
The observer must be motivated to
carry out the action they have observed and remembered, and must have the
opportunity to do so. For example, a suitably skilled person must want to
replicate the behavior of a model juggler, and needs to have an appropriate
number of items to juggle at hand.
Effect on behavior
Social learning may affect
behavior in the following ways:
Teaches new behaviors
Increases or decreases the
frequency with which previously learned behaviors are carried out
Can encourage previously
forbidden behaviors
Can increase or decrease similar
behaviors. For example, observing a model excelling in piano playing may
encourage an observer to excel in playing the saxophone.