Chapter 7: Schedules and Theories of Reinforcement

Schedules of Reinforcement

Continuous versus Intermittent Schedules
Continuous Reinforcement Schedule: one in which each specified response is reinforced
Very useful when a behavior is first being shaped or strengthened
Example: each time a rat presses the lever, it obtains a food pellet; each time the dog rolls over on command, it gets a treat; and each time Karen turns the ignition in her car, the motor starts
Intermittent (or partial) Reinforcement Schedule: one in which only some responses are reinforced
Example: perhaps only some of the rat’s lever presses result in a food pellet, and perhaps your mother only occasionally gave you a cookie when you asked for one
Four Basic Intermittent Schedules

Fixed Ratio Schedules
Reinforcement is contingent upon a fixed, predictable number of responses
FR schedules generally produce a high rate of response along with a short pause following the attainment of each reinforcer = postreinforcement pause
“Stretching the ratio” = moving from a low ratio requirement (a dense schedule) to a high ratio requirement (a lean schedule); this should be done gradually
Example: on a fixed ratio 5 (FR 5) schedule, a rat has to press the lever 5 times to obtain a food pellet
Example: once lever pressing is well established on a CRF schedule, the requirement can be gradually increased from 2 to 5 to 10 and onward – if the jump is too sudden, the rat’s behavior becomes erratic and may even die out altogether
Ratio Strain: a disruption in responding due to an overly demanding response requirement = burnout
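As a rough sketch of the FR contingency (the `FixedRatio` class and its method names are hypothetical, not from the chapter), the schedule is just a response counter that resets at each reinforcer:

```python
class FixedRatio:
    """FR schedule: deliver a reinforcer after every `ratio`-th response (e.g., FR 5)."""
    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def record_response(self):
        """Return True if this response completes the ratio and earns the reinforcer."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0          # counter resets; a postreinforcement pause often follows
            return True
        return False

schedule = FixedRatio(5)                                 # FR 5
print([schedule.record_response() for _ in range(10)])   # True on presses 5 and 10
```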
Variable Ratio Schedules
Reinforcement is contingent upon a varying, unpredictable number of responses
Example: on a variable ratio 5 (VR 5) schedule, a rat has to emit an average of 5 lever presses for each food pellet, with the number of responses required on any particular trial varying between, say, 1 and 10
VR schedules produce a high and steady rate of response, often with little or no postreinforcement pause
Pause is unlikely to occur when the minimum response requirement in the schedule is very low
Variable ratio schedules help to account for the persistence with which some people display certain maladaptive behaviors
Example: gambling; the unpredictable nature of these activities results in a very high rate of behavior
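A VR schedule can be sketched the same way, except the requirement is redrawn at random after every reinforcer; the uniform draw between 1 and twice the mean is an assumption made for illustration:

```python
import random

class VariableRatio:
    """VR schedule: the number of responses required varies unpredictably
    around a mean (e.g., VR 5)."""
    def __init__(self, mean_ratio):
        self.mean_ratio = mean_ratio
        self.count = 0
        self._new_requirement()

    def _new_requirement(self):
        self.requirement = random.randint(1, 2 * self.mean_ratio)  # mean is roughly mean_ratio

    def record_response(self):
        self.count += 1
        if self.count >= self.requirement:
            self.count = 0
            self._new_requirement()    # next reinforcer arrives unpredictably
            return True
        return False

schedule = VariableRatio(5)   # VR 5: unpredictability is why there is little or no pause
```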
Fixed Interval Schedules
Reinforcement is contingent upon the first response after a fixed, predictable period of time
Example: on an FI 30-sec schedule, the first lever press after a 30-second interval has elapsed results in a food pellet; following that, another 30 seconds must elapse before a lever press will again produce a food pellet
FI schedules often produce a “scalloped” pattern of responding, consisting of a postreinforcement pause followed by a gradually increasing rate of response as the interval draws to a close
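A minimal sketch of the FI rule, assuming a simple wall-clock timer (class and method names are illustrative):

```python
import time

class FixedInterval:
    """FI schedule: reinforce the first response made after `interval_sec`
    has elapsed since the last reinforcer (e.g., FI 30-sec)."""
    def __init__(self, interval_sec):
        self.interval_sec = interval_sec
        self.last_reinforcer = time.monotonic()

    def record_response(self):
        now = time.monotonic()
        if now - self.last_reinforcer >= self.interval_sec:
            self.last_reinforcer = now   # the clock restarts at each reinforcer
            return True
        return False                     # early presses earn nothing

schedule = FixedInterval(30)  # early responding is wasted effort, hence the "scallop"
```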
Variable Interval Schedules
Reinforcement is contingent upon the first response after a varying, unpredictable time period
Example: for a rat on a VI 30-sec schedule, the first lever press after an average interval of 30 seconds will result in a food pellet, with the actual interval on any particular trial varying between, say, 1 and 60 seconds
Thus the number of seconds that must pass before a lever press will produce a food pellet might be 8 seconds on one trial and some other duration on the next
Usually produce a moderate, steady rate of response, often with little or no postreinforcement pause
Example: you call a professor before an exam, knowing she will be in her office sometime between 8:00 and 8:30; you call every few minutes until she arrives
Because VI schedules produce predictable response rates, as well as predictable rates of reinforcement, they are often used to investigate other aspects of operant conditioning
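A VI schedule can be sketched like the FI one, with the interval redrawn at random after each reinforcer; as with the VR sketch, the uniform draw between 1 and twice the mean is an assumption:

```python
import random
import time

class VariableInterval:
    """VI schedule: like FI, but the interval is redrawn at random
    after each reinforcer (e.g., VI 30-sec)."""
    def __init__(self, mean_sec):
        self.mean_sec = mean_sec
        self.last_reinforcer = time.monotonic()
        self._new_interval()

    def _new_interval(self):
        self.interval = random.uniform(1, 2 * self.mean_sec)  # mean is roughly mean_sec

    def record_response(self):
        now = time.monotonic()
        if now - self.last_reinforcer >= self.interval:
            self.last_reinforcer = now
            self._new_interval()   # next interval is unpredictable
            return True
        return False

schedule = VariableInterval(30)  # steady checking pays off: a moderate, steady rate
```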
Comparing the Four Basic Schedules
Ratio schedules (FR and VR) produce higher rates of response than do interval schedules (FI and VI)
The reinforcer in such schedules is entirely “response contingent”: how quickly it is obtained depends entirely on how quickly the organism responds
On interval schedules, responding instead tends to track the availability of reinforcement
On an FI schedule, this means responding at a gradually increasing rate as the interval draws to a close
On VI schedules, this means responding at a moderate, steady pace throughout the interval
Fixed schedules tend to produce postreinforcement pauses, whereas variable schedules do not
Other Simple Schedules of Reinforcement
Duration Schedules: reinforcement is contingent on performing a behavior continuously throughout a period of time
Fixed Duration Schedule – behavior must be performed continuously for a fixed, predictable period of time
Variable Duration Schedule – behavior must be performed continuously for a varying, unpredictable period of time
Example: a rat must run in the wheel for 60 seconds to earn one pellet of food (FD 60-sec schedule)
Example: a rat must run in a wheel for an average of 60 seconds to earn one pellet of food, with the required time varying between 1 second and 120 seconds on any particular trial (VD 60-sec schedule)
What constitutes “continuous performance of behavior” can vary – for example, the rat need not be running very fast
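A sketch of the FD contingency, assuming the session is observed in discrete ticks and that any break in the behavior resets the timer (an interpretation, since the chapter notes that “continuous performance” itself can be defined loosely):

```python
class FixedDuration:
    """FD schedule: reinforce `duration_sec` of continuous behavior (e.g., FD 60-sec)."""
    def __init__(self, duration_sec):
        self.duration_sec = duration_sec
        self.elapsed = 0.0

    def tick(self, behaving, dt=1.0):
        """Advance the observation clock by dt seconds."""
        if not behaving:
            self.elapsed = 0.0           # any break in running restarts the timer
            return False
        self.elapsed += dt
        if self.elapsed >= self.duration_sec:
            self.elapsed = 0.0
            return True                  # one pellet earned
        return False

schedule = FixedDuration(60)             # FD 60-sec: 60 uninterrupted seconds of running
# For a VD 60-sec schedule, redraw duration_sec at random after each reinforcer.
```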
Response-Rate Schedules: reinforcement is contingent upon the organism’s rate of response
Differential Reinforcement of High Rates (DRH) – reinforcement is contingent upon emitting at least a certain number of responses in a certain period of time; more generally, reinforcement is provided for responding at a fast rate
Example: a rat might receive a food pellet only if it emits at least 30 lever presses within a period of one minute, thus ensuring a high rate of responding
Differential Reinforcement of Low Rates (DRL) – a minimum amount of time must pass between each response before the reinforcer will be delivered; more generally, reinforcement is provided for responding at a slow rate
Example: a rat might receive a food pellet only if it waits at least 10 seconds between lever presses
Differential Reinforcement of Paced Responding (DRP) – reinforcement is contingent upon emitting a series of responses at a set rate; more generally, reinforcement is provided for responding neither too fast nor too slow
Example: a rat might receive a food pellet if it emits 10 consecutive responses, with each response separated by an interval of no less than 1.5 and no more than 2.5 seconds
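As an illustration of the DRL rule (class and parameter names are hypothetical, and the handling of the very first response is an arbitrary choice):

```python
import time

class DRL:
    """DRL schedule: a response is reinforced only if at least `min_gap_sec`
    has elapsed since the previous response (e.g., DRL 10-sec)."""
    def __init__(self, min_gap_sec):
        self.min_gap_sec = min_gap_sec
        self.last_response = time.monotonic()   # arbitrary baseline for the first response

    def record_response(self):
        now = time.monotonic()
        earned = now - self.last_response >= self.min_gap_sec
        self.last_response = now                # responding too soon restarts the wait
        return earned

schedule = DRL(10)   # pressing early resets the clock, so only slow responding pays
# DRH flips the comparison (many responses within a window); DRP bounds the gap on both sides.
```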
Noncontingent Schedules: the reinforcer is delivered independently of any response; a response is not required for the reinforcer to be obtained
Fixed Time Schedule – reinforcement is delivered following a fixed, predictable period of time, regardless of the organism’s behavior
Example: a pigeon receives access to food every 30 seconds regardless of its behavior (FT 30-sec)
Variable Time Schedule – reinforcement is delivered following a varying, unpredictable period of time, regardless of the organism’s behavior
Example: a pigeon receives access to food after an average interval of 30 seconds, with the actual interval on any particular trial ranging from 1 to 60 seconds (VT 30-sec)
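A sketch of noncontingent delivery; note that the reinforcer fires purely on the clock, so no response-recording method is needed at all (function and argument names are illustrative):

```python
import time

def fixed_time(deliver, period_sec, n_deliveries):
    """FT schedule: the reinforcer arrives on the clock alone;
    the organism's behavior is never consulted."""
    for _ in range(n_deliveries):
        time.sleep(period_sec)
        deliver()

# fixed_time(lambda: print("food"), 30, 5)   # FT 30-sec, five deliveries
# VT variant: sleep for random.uniform(1, 60) seconds instead (mean of about 30 s).
```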
Complex Schedules of Reinforcement

Complex Schedules: consist of a combination of two or more simple schedules

Conjunctive Schedules

A type of complex schedule in which the requirements of two or more simple schedules must be met before a reinforcer is delivered
Example: the wages you earn at a job are contingent upon working a certain number of hours each week and doing a sufficient amount of work so that you will not be fired
Example: on a conjunctive FI 2-minute FR 100 schedule, reinforcement is contingent upon completing 100 lever presses and emitting at least one lever press after a 2-minute interval has elapsed
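A sketch of that conjunctive FI 2-minute FR 100 example, assuming both requirements reset together once the reinforcer is delivered (class name is hypothetical):

```python
import time

class ConjunctiveFIFR:
    """Conjunctive FI + FR: both the interval and the ratio
    requirement must be satisfied before the reinforcer is delivered."""
    def __init__(self, interval_sec, ratio):
        self.interval_sec = interval_sec
        self.ratio = ratio
        self._reset()

    def _reset(self):
        self.start = time.monotonic()
        self.count = 0

    def record_response(self):
        self.count += 1
        interval_met = time.monotonic() - self.start >= self.interval_sec
        ratio_met = self.count >= self.ratio
        if interval_met and ratio_met:   # both simple schedules satisfied
            self._reset()
            return True
        return False

schedule = ConjunctiveFIFR(120, 100)     # conjunctive FI 2-min FR 100
```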
Adjusting Schedules
Response requirement changes as a function of the organism’s performance while responding for the previous reinforcer
Criterion for reinforcement is raised (or lowered) depending on the animal’s performance
Example: on an FR 100 schedule, if the rat completes all 100 responses within a 5-minute interval, we may then increase the requirement to 110 responses (FR 110)
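A sketch of this adjusting rule; the 5-minute criterion and the step of 10 responses come from the example above, while everything else (names, the decision to adjust only upward) is an assumption:

```python
import time

class AdjustingFR:
    """Adjusting FR: the ratio requirement is raised whenever the previous
    ratio run was completed quickly enough."""
    def __init__(self, ratio, fast_sec=300, step=10):
        self.ratio = ratio
        self.fast_sec = fast_sec    # "fast" = finished within 5 minutes
        self.step = step
        self.count = 0
        self.run_start = time.monotonic()

    def record_response(self):
        self.count += 1
        if self.count >= self.ratio:
            if time.monotonic() - self.run_start <= self.fast_sec:
                self.ratio += self.step   # good performance: raise the bar
            self.count = 0
            self.run_start = time.monotonic()
            return True
        return False

schedule = AdjustingFR(100)   # starts at FR 100, may step up to FR 110, FR 120, ...
```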
Chained Schedules
Consists of a sequence of two or more simple schedules, each of which has its own SD and the last of which results in a terminal reinforcer
The person or animal must work through a series of component schedules to obtain the sought-after reinforcer
Differs from a conjunctive schedule in that the component schedules must be completed in a particular order
Example: a pigeon in a standard operant conditioning chamber is presented with a VR 20 schedule on a green key, followed by an FI 10-sec schedule on a red key, which then leads to the terminal reinforcer of food – thus an average of 20 responses on the green key will result in a change in key color to red, following which the first response on the red key after a 10-second interval will be reinforced (VR 20 FI 10-sec)
Green key (SD): Peck (R) → Red key (SR/SD): Peck (R) → Food (SR)
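A sketch of that VR 20 FI 10-sec chain, treating the key color as the state that signals which component schedule is in effect (class and method names are hypothetical):

```python
import random
import time

class ChainedVRFI:
    """Chained VR -> FI: the VR component on the green key must be completed
    first; its completion switches the key to red (the SD for the FI link)."""
    def __init__(self, mean_ratio, interval_sec):
        self.mean_ratio = mean_ratio
        self.interval_sec = interval_sec
        self._start_vr_link()

    def _start_vr_link(self):
        self.key = "green"
        self.requirement = random.randint(1, 2 * self.mean_ratio)
        self.count = 0

    def peck(self):
        if self.key == "green":                      # VR 20 link
            self.count += 1
            if self.count >= self.requirement:
                self.key = "red"                     # key-color change acts as SR/SD
                self.link_start = time.monotonic()
            return False                             # no food yet
        # red key: FI 10-sec link ends in the terminal reinforcer
        if time.monotonic() - self.link_start >= self.interval_sec:
            self._start_vr_link()                    # chain restarts
            return True                              # food
        return False

schedule = ChainedVRFI(20, 10)   # VR 20 FI 10-sec chain from the example
```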
Goal Gradient Effect: increase in the strength and/or efficiency of responding as one draws near to the goal
Example: rats running in a maze to obtain food tend to run faster and make fewer wrong turns as they near the goal box
Backward Chaining: train the final link first and the initial link last
Example: the pigeon is first trained to respond on the red key to obtain food, then on the green key, and then on the white key
Theories of Reinforcement

The Premack Principle
Provides a more objective way to determine whether something can be used as a reinforcer
Based on the notion that reinforcers can often be viewed as behaviors rather than stimuli
Example: rather than saying that lever pressing was reinforced by food (a stimulus), we could say that lever pressing was reinforced by the act of eating food (a behavior)
By comparing the frequency of various behaviors, we can determine whether one can be used as a reinforcer for another
Conceptualize the sequence of behaviors:
1. The behavior that is being reinforced
2. The behavior that is the reinforcer
States that a high-probability behavior can be used to reinforce a low-probability behavior
Example: if a rat is hungry, eating food has a higher likelihood of occurrence than running in a wheel; this means that eating food (HPB) can be used to reinforce the target behavior of running in a wheel (LPB)
Target behavior: Running in a wheel (LPB; R) → Consequence: Eating food (HPB; SR)
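A sketch of how baseline (free-access) frequencies could be compared to pick a reinforcer under the Premack principle; the function name, behaviors, and rates below are invented for illustration:

```python
def premack_reinforcers(baseline_rates, target):
    """Premack principle sketch: any behavior with a higher baseline
    probability than the target can serve as its reinforcer.
    `baseline_rates` maps behavior name -> observed free-access frequency."""
    target_rate = baseline_rates[target]
    return [behavior for behavior, rate in baseline_rates.items()
            if rate > target_rate]

# Hypothetical free-access observations for a hungry rat:
rates = {"eating food": 0.60, "running in wheel": 0.25, "grooming": 0.15}
print(premack_reinforcers(rates, "running in wheel"))   # ['eating food']
```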