Ch7.docx
Operant Conditioning (Ch7)

Does each lever press by the rat result in a food pellet, or are several lever presses required? Did your mom give you a cookie each time you asked for one, or only some of the time?

A continuous reinforcement schedule (CRF) is one in which each specified response is reinforced. An intermittent (or partial) reinforcement schedule (PRF) is one in which only some responses are reinforced.

Each time you flick the light switch, the light comes on. The behavior of flicking the light switch is on a continuous reinforcement schedule. When the weather is very cold, you are sometimes unable to start your car. The behavior of starting your car in very cold weather is on an intermittent reinforcement schedule: the car will not start every time, because the cold affects its ability to start.

4 types of intermittent (partial) schedules
Fixed Ratio = FR
Variable Ratio = VR
Fixed Interval = FI
Variable Interval = VI

Fixed Ratio = FR
Reinforcement is contingent upon a fixed, predictable number of responses. On a fixed ratio 5 schedule (abbreviated FR 5), a rat has to press the lever 5 times to obtain a food pellet. On an FR 50 schedule, it has to press the lever 50 times. Note that an FR 1 schedule is the same as a CRF schedule, in which each response is reinforced.

FR schedules generally produce a high rate of response along with a short pause following the attainment of each reinforcer. This short pause is known as a postreinforcement pause. E.g. a rat will take a short break following each reinforcer. Higher ratio requirements produce longer postreinforcement pauses. An FR 200 schedule of reinforcement will result in a longer pause than an FR 50 schedule, because more responses are required for each reinforcer on FR 200 than on FR 50.
Schedules in which the reinforcer is easily obtained are said to be very dense or rich, while schedules in which the reinforcer is difficult to obtain are said to be very lean. An FR 5 schedule is a very dense schedule of reinforcement compared to an FR 100, and an FR 12 schedule is likewise denser than an FR 100 schedule, because fewer responses are required for each reinforcer.

Variable Ratio = VR
Reinforcement is contingent upon a varying, unpredictable number of responses. On a variable ratio 5 (VR 5) schedule, a rat has to emit an average of 5 lever presses for each food pellet, with the number of lever presses on any particular trial varying between, say, 1 and 10 (e.g. 3, 7, 5). E.g. gambling, lottery.

Think of the development of an abusive relationship. At the start of a relationship, the individuals involved typically provide each other with an enormous amount of positive reinforcement (a very dense schedule). And then what happens? As the relationship progresses, such reinforcement naturally becomes somewhat more intermittent. In an abusive relationship, the result can be one person (the victimizer) providing reinforcement on an extremely intermittent basis, and the other person (the victim) working incredibly hard to obtain that reinforcement.

VR schedules generally produce a high and steady rate of response with little or no postreinforcement pause.

Fixed Interval = FI
Reinforcement is contingent upon the first response after a fixed, predictable period of time. E.g. receiving your salary after a 1-month period. For a rat on a fixed interval 30-second (FI 30-sec) schedule, the first lever press after a 30-second interval has elapsed results in a food pellet. Following that, another 30 seconds must elapse before a lever press will again produce a food pellet.
FI schedules often produce a “scalloped” (upwardly curved) pattern of responding, consisting of a postreinforcement pause followed by a gradually increasing rate of response as the interval draws to a close.

Variable Interval = VI
Reinforcement is contingent upon the first response after a varying, unpredictable period of time. For a rat on a variable interval 30-second (VI 30-sec) schedule, the first lever press after an average interval of 30 seconds will result in a food pellet, with the actual interval on any particular trial varying between, say, 1 and 60 seconds.

VI schedules usually produce a moderate, steady rate of response with little or no postreinforcement pause.

On interval schedules, the reinforcer is largely time contingent, meaning that the rapidity with which responses are emitted has little effect on how quickly the reinforcer is obtained.

In general, fixed schedules produce postreinforcement pauses because obtaining one reinforcer means that the next reinforcer is necessarily quite distant. In general, variable schedules produce little or no postreinforcement pausing because such schedules provide the possibility of relatively immediate reinforcement, even if one has just obtained a reinforcer.

A schedule in which 15 responses are required for each reinforcer is abbreviated FR 15.

A mother finds that she always has to make the same request three times before her child complies. The mother’s behavior of making requests is on an FR 3 schedule of reinforcement.

If I have just missed the bus when I get to the bus stop, I know that I have to wait 15 minutes for the next one to come along. Given that it is absolutely freezing out, I snuggle into my parka as best I can and wait out the interval. Every once in a while, though, I emerge from my cocoon to take a quick glance down the street to see if the bus is coming. My behavior of looking for the bus is on an FI (Fixed Interval) schedule of reinforcement.
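The four simple schedules defined above differ only in what has to happen before a response is reinforced: a count of responses (ratio) or an elapsed time (interval), either fixed or varying around an average. The following is a minimal Python sketch of that idea, not a standard implementation. The class names, the `respond` method, and the choice of a uniform spread around the average for the variable schedules are all assumptions made for illustration.

```python
import random


class RatioSchedule:
    """FR n if variable=False, VR n if variable=True (hypothetical helper)."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if self.variable:
            # Assumption: requirement is uniform on 1..2n-1, which averages n.
            return self.rng.randint(1, 2 * self.mean - 1)
        return self.mean

    def respond(self, now=None):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._next_requirement()
            return True
        return False


class IntervalSchedule:
    """FI t if variable=False, VI t if variable=True (hypothetical helper)."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.last_reinforcer = 0.0
        self.wait = self._next_wait()

    def _next_wait(self):
        if self.variable:
            # Assumption: interval is uniform on 0..2t, which averages t.
            return self.rng.uniform(0, 2 * self.mean)
        return self.mean

    def respond(self, now):
        """Reinforce the first response made after the interval has elapsed."""
        if now - self.last_reinforcer >= self.wait:
            self.last_reinforcer = now
            self.wait = self._next_wait()
            return True
        return False


fr5 = RatioSchedule(5)
print([fr5.respond() for _ in range(10)])
# [False, False, False, False, True, False, False, False, False, True]

fi30 = IntervalSchedule(30)
print([fi30.respond(t) for t in (10, 29, 30, 45, 61)])
# [False, False, True, False, True]
```

On the interval schedule, the responses at 10 and 29 seconds earn nothing, which is the sense in which the reinforcer is largely time contingent: responding faster does not bring it sooner.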
In the bus-stop example, I will probably engage in few glances at the start of the interval, followed by a gradually increasing rate of glancing as time passes. This is because the likelihood of the bus arriving increases as the time for the next bus approaches, prompting more frequent checks.

You find that by frequently switching stations on your radio, you are able to hear your favorite song an average of once every 20 minutes. Your behavior of switching stations is thus being reinforced on a Variable Interval (VI) schedule, because the reinforcer (hearing your favorite song) becomes available after unpredictable time intervals that average 20 minutes.

On a Fixed Interval (FI) 20-sec schedule, a response cannot be reinforced until 20 seconds have elapsed since the last reinforcer, because the reinforcer becomes available only at fixed time intervals (every 20 seconds in this case).

Ayşe accepts Ahmet’s invitation for a date only when he has just been paid his monthly salary. Of the four simple schedules, the contingency governing Ahmet’s behavior of asking Ayşe for a date seems most similar to a Fixed Interval (FI) schedule of reinforcement, because the reinforcer (Ayşe accepting the date) is available only at fixed time intervals (once a month, when Ahmet gets his salary).

Theories of Reinforcement
Clark Hull (Drive Reduction Theory)
Sheffield (Drive Induction Theory)
D. Premack (Premack Principle)
Timberlake & Allison (Response Deprivation Hypothesis)

According to drive reduction theory, food is a reinforcer because the hunger drive is reduced when you obtain it. Food deprivation produces a “hunger drive,” which then propels the animal to seek out food. When food is obtained, the hunger drive is reduced. When a stimulus is associated with a reduction in some type of physiological drive, we can call this stimulus reinforcing, and the behavior that the organism performs just before the drive reduction is strengthened. For example,
if a hungry rat in a maze turns left just before it finds food in the goal box, the act of turning left is strengthened.

The problem with this theory is that some reinforcers do not seem to be associated with any type of drive reduction. E.g. a rat will press a lever to obtain access to a running wheel. So, as opposed to an internal drive state, incentive motivation could be the key: the motivation may come from the reinforcing stimulus itself, not from some type of internal state. E.g. playing a video game for the fun of it, or attending a concert because you enjoy the music.

Going to a restaurant for a meal might be largely driven by hunger; however, the fact that you prefer a restaurant that serves hot, spicy food is an example of incentive motivation.

The fact that several small bites of food are a more effective reinforcer than one large bite is consistent with the notion of incentive motivation. Incentive motivation is about the attractiveness of the reward, and multiple small rewards can be perceived as more attractive than a single large one, which can motivate the subject to perform the desired behavior more effectively. Drive reduction, on the other hand, is about satisfying a biological need, which does not by itself explain why small bites would be more effective than one large bite.

According to Sheffield, Hull explained only half of the story. The other half is that it is not drive reduction but drive induction that makes a stimulus a reinforcer. E.g. rabbit and carrot: the animal learns to react to me and the carrot, even though it never eats the carrot. Where is the drive reduction here? Sheffield says you can support learning by allowing induction of a drive without allowing reduction of it.

Sexual behavior is similar. In a barber shop, the owner puts some Playboy magazines on the desk. While customers are waiting, they read them, and they go back to the same barber ‘unconsciously’ again and again. This is induction of the sexual drive.
Inducing the drive is SR!
- Male and female rats (no reduction, but drive induction)
- Male & male version

In a factory, you have a standard payment for your workers. As a manager, you say that if they produce more, you will give them an increase in their salary. When you make this promise, you are not reducing anything. Just the opposite! You are inducing something new.

If you want a very strong SR, you should combine Hull and Sheffield. First induce the drive and then give the opportunity to reduce it. Drive induction followed by drive reduction! In order to sell a product, first create a need state, and then offer the product that reduces the desire. Induce a fear of perspiration and then offer the deodorant. Induce the drive and reduce it. This is the general strategy of commercials. SR has 2 legs: induction of drive and reduction of drive.

Reinforcers can often be viewed as behaviors rather than stimuli. For example, rather than saying that lever pressing was reinforced by food (a stimulus), we could say that lever pressing was reinforced by the act of eating food (a behavior). Then the process of reinforcement can be conceptualized as a sequence of two behaviors: (1) the behavior that is being reinforced, followed by (2) the behavior that is the reinforcer.

Moreover, by comparing the frequency of various behaviors, we can determine whether one can be used as a reinforcer for the other. We should first determine the free-choice preference rate of the behaviors. A high-probability behavior can be used to reinforce a low-probability behavior. E.g. for a hungry rat, eating food (the high-probability behavior [HPB]) can reinforce running in a wheel (the low-probability behavior [LPB]). On the other hand, when the rat is not hungry…

More probable behaviors will reinforce less probable behaviors. This principle is like Grandma’s rule: First you work (a low-probability behavior), then you play (a high-probability behavior). First, eat your spinach, and then you can get your ice cream.
If you drink 5 cups of coffee each day and only 1 glass of orange juice, then the opportunity to drink coffee can likely be used as a reinforcer for drinking orange juice.

The Premack principle in applied settings
For example, a person with autism who spends many hours each day rocking back and forth might be very unresponsive to consequences that are normally reinforcing for others, such as receiving praise. The Premack principle suggests that the opportunity to rock back and forth can be used as an effective reinforcer for another behavior, such as interacting with others. Thus, the Premack principle is a handy principle to keep in mind when confronted by a situation in which normal reinforcers seem to have little effect.

One problem is that the probabilities of behaviors might fluctuate, so they might be difficult to measure. It does not fit well in the lab. There may be an error in determining the free-choice preference rate. Another problem arises when two behaviors have the same probability.

Initial probabilities are not stable over time. Every SR loses its reinforcement power. E.g. eating spinach 10%, eating ice cream 90%. Every time you reinforce another response, you change the probability of this response. We have variability in the reinforcement event. The erosion of SR: after extensive usage of the same SR, it begins to show a decline. This is called the erosion effect.

The Premack principle holds that reinforcers can often be viewed as behaviors rather than stimuli. E.g., rather than saying that the rat’s lever pressing was reinforced with food, we could say that it was reinforced with eating food.

The Premack principle states that a higher probability behavior can be used as a reinforcer for a lower probability behavior. If “chewing bubble gum → playing video games” is a diagram of a reinforcement procedure based on the Premack principle, then chewing bubble gum must be a lower probability behavior than playing video games.
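The Premack rule stated above reduces to a comparison of free-choice baseline rates. The short Python sketch below illustrates that comparison using the spinach and ice cream figures from these notes. The function name `can_reinforce` and the dictionary layout are hypothetical, and the sketch assumes the baseline rates have already been measured and are stable, which the notes point out is not always the case.

```python
def can_reinforce(baseline, reinforcer, target):
    """Premack check: the reinforcing behavior must have a strictly higher
    free-choice baseline rate than the behavior being reinforced."""
    return baseline[reinforcer] > baseline[target]


# Free-choice preference rates taken from the example in the notes.
baseline = {"eating ice cream": 0.90, "eating spinach": 0.10}

print(can_reinforce(baseline, "eating ice cream", "eating spinach"))  # True
print(can_reinforce(baseline, "eating spinach", "eating ice cream"))  # False
```

Because the comparison is strict, two behaviors with the same baseline rate cannot reinforce each other under this check, which mirrors the problem noted above for behaviors of equal probability.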
According to the response deprivation hypothesis, a behavior can be used as a reinforcer if access to the behavior is restricted so that its frequency falls below its baseline rate (preferred level) of occurrence. We do not need to know the relative probabilities of the two behaviors beforehand. The frequency of one behavior relative to its baseline is the important aspect.

Example: A man normally studies 60 min and exercises 30 min a day.
Schedule: Every 20 min of study earns 5 min of exercise.
Exercise is the deprived (restricted) behavior.
Prediction: Study time should increase.

Example: A rat typically runs for 1 hour a day whenever it has free access to a running wheel (the rat’s preferred level of running). If the rat is then allowed free access to the wheel for only 15 minutes per day, it will be unable to reach this preferred level (deprivation). So, the rat will now be willing to press a lever to obtain additional time on the wheel.

Do homework (R) -> Read comic books (SR)
According to Timberlake and Allison, it is not important whether the probability of reading comic books is higher or lower than the probability of doing homework. The important thing is whether comic book reading is now in danger of falling below its preferred level.

For Premack, the frequency of one behavior relative to another is important. For Allison & Timberlake, the frequency of one behavior relative to its baseline is important.

If a child normally watches 4 hours of television per night, we can make television watching a reinforcer if we restrict free access to the television to less than 4 hours per night.

Gina often goes for a walk through the woods, and even more often she does yardwork. According to the response deprivation hypothesis, walking through the woods could still be used as a reinforcer for yardwork given that one restricts the frequency of walking to below its baseline level.

Kaily typically watches television for 4 hours per day and reads comic books for 1 hour per day.
You then set up a contingency whereby Kaily must watch 4.5 hours of television each day in order to have access to her comic books. According to the Premack principle, this will likely be an ineffective contingency. This is because reading comic books is a lower probability behavior for Kaily than watching television, so access to comic books is unlikely to reinforce the higher probability behavior of watching television.

Yasmin often goes for a walk through the woods, but she rarely does yardwork. According to the Premack principle, walking through the woods could be used as a reinforcer for yardwork. This is because walking through the woods is a higher probability behavior for Yasmin than doing yardwork, so it can be used to reinforce the lower probability behavior of doing yardwork.

Drinking a soda to quench your thirst is an example of drive reduction; drinking a soda because you love its sweetness is an example of incentive motivation. Drive reduction is about satisfying a biological need (like thirst), while incentive motivation is about the attractiveness of the reward (the sweetness of the soda).
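The study and exercise example given earlier for the response deprivation hypothesis can be checked with simple arithmetic: if the person kept studying at the baseline level, would the schedule give less exercise than the baseline amount? The Python sketch below works through that check. The function name and parameters are hypothetical, and it assumes that access to the restricted behavior is earned in direct proportion to the amount of the required behavior.

```python
def is_response_deprived(baseline_required, baseline_restricted,
                         required_per_unit, earned_per_unit):
    """Return True if performing the required behavior at its baseline level
    would earn less of the restricted behavior than its baseline level."""
    earned_at_baseline = baseline_required / required_per_unit * earned_per_unit
    return earned_at_baseline < baseline_restricted


# Baseline from the notes: 60 min study, 30 min exercise per day.
# Schedule from the notes: every 20 min of study earns 5 min of exercise.
# Studying 60 min would earn only 15 min of exercise, below the 30 min baseline.
print(is_response_deprived(60, 30, 20, 5))   # True

# Hypothetical comparison schedule (not from the notes): 20 min of study
# earns 15 min of exercise, so baseline study already earns 45 min.
print(is_response_deprived(60, 30, 20, 15))  # False
```

When the check returns True, exercise is deprived and the hypothesis predicts that study time should increase. When it returns False, the schedule imposes no restriction, so no reinforcement effect is predicted.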