How Do We Learn?
Summary
This document discusses various learning theories, focusing on classical and operant conditioning, along with habituation. It explains how we learn through associations and repeated behaviors, drawing on examples from animals and humans. It also introduces behaviorism as a perspective in psychology.
Full Transcript
How Do We Learn?

26-1 How do we define learning, and what are some basic forms of learning?

By learning, we humans adapt to our environments. We learn to expect and prepare for significant events such as food or pain (classical conditioning). We learn to repeat acts that bring rewards and avoid acts that bring unwanted results (operant conditioning). We learn new behaviors by observing events and people, and through language, we learn things we have neither experienced nor observed (cognitive learning). But how do we learn?

learning: the process of acquiring through experience new and relatively enduring information or behaviors.

More than 200 years ago, philosophers John Locke and David Hume echoed Aristotle's conclusion from 2000 years earlier: We learn, first, by association. Our minds naturally connect events that occur in sequence. Suppose you see and smell freshly baked bread, eat some, and find it satisfying. The next time you see and smell fresh bread, you will expect that eating it will again be satisfying. So, too, with sounds. If you associate a sound with a frightening consequence, hearing the sound alone may trigger your fear. As one 4-year-old exclaimed after watching a TV character get mugged, "If I had heard that music, I wouldn't have gone around the corner!" (Wells, 1981).

Learned associations often operate subtly: Give people a red pen (associated with error marking) rather than a black pen and, when correcting essays, they will spot more errors and give lower grades (Rutchick et al., 2010). When voting, people are more likely to support taxes to aid education if their assigned voting place is in a school (Berger et al., 2008).

Learned associations also feed our habitual behaviors (Wood et al., 2014). Habits can form when we repeat behaviors in a given context—sleeping in a certain posture in bed, biting our nails in class, eating popcorn in a movie theater. As behavior becomes linked with the context, our next experience of that context will evoke our habitual response. Especially when we're mentally fatigued, we tend to fall back on our habits (Neal et al., 2013). That's true of both good habits (eating fruit) and bad (overindulging in candy) (Graybiel & Smith, 2014). To increase our self-control, and to connect our resolutions with positive outcomes, the key is forming "beneficial habits" (Galla & Duckworth, 2015).

How long does it take to form a beneficial habit? To find out, one British research team asked 96 university students to choose some healthy behavior (such as running before dinner or eating fruit with lunch), to do it daily for 84 days, and to record whether the behavior felt automatic (something they did without thinking and would find it hard not to do). On average, behaviors became habitual after about 66 days (Lally et al., 2010). Is there something you'd like to make a routine or essential part of your life? Just do it every day for two months, or a bit longer for exercise, and you likely will find yourself with a new habit. This happened for both of us—with a midday workout [DM] or late afternoon run [ND] having long ago become an automatic daily routine.

TRY THIS Most of us would be unable to name the order of the songs on our favorite album or playlist. Yet, hearing the end of one piece cues (by association) an anticipation of the next. Likewise, when singing your national anthem, you associate the end of each line with the beginning of the next. (Pick a line out of the middle and notice how much harder it is to recall the previous line.)
Other animals also learn by association. Disturbed by a squirt of water, the sea slug Aplysia protectively withdraws its gill. If the squirts continue, as happens naturally in choppy water, the withdrawal response diminishes. We say the slug habituates. (Habituation is what happens when repeated stimulation produces waning responsiveness.) But if the sea slug repeatedly receives an electric shock just after being squirted, its protective response to the squirt instead grows stronger. The animal has associated the squirt with the impending shock.

habituation: decreasing responsiveness with repeated exposure to a stimulus.

AP® EXAM TIP The AP® exam could have you differentiate similar concepts from various units. For example, it's easy to confuse habituation with sensory adaptation, a concept from Unit IV. Recall that sensory adaptation occurs when one of your sensory systems stops registering the presence of an unchanging stimulus—when you go swimming in a cool pool, for example, the water soon stops feeling cool. Habituation, like sensory adaptation, involves a diminished response, but in this case it's a form of learning. If you're exposed to the same stimulus over and over, your response decreases. A friend might sneak up and startle you by yelling "Boo!" You will probably startle less when he tries it again two minutes later. That's habituation.

Complex animals can learn to associate their own behavior with its outcomes. An aquarium seal will repeat behaviors, such as slapping and barking, that prompt people to toss it a herring. By linking two events that occur close together, both sea slugs and seals are exhibiting associative learning. The sea slug associates the squirt with an impending shock; the seal associates slapping and barking with a herring treat. Each animal has learned something important to its survival: anticipating the immediate future.

associative learning: learning that certain events occur together. The events may be two stimuli (as in classical conditioning) or a response and its consequence (as in operant conditioning).

This process of learning associations is conditioning. It takes two main forms:

In classical conditioning, we learn to associate two stimuli and thus to anticipate events. (A stimulus is any event or situation that evokes a response.) We learn that a flash of lightning signals an impending crack of thunder; when lightning flashes nearby, we start to brace ourselves (Figure 26.1). We associate stimuli that we do not control, and we respond automatically (exhibiting respondent behavior).

In operant conditioning, we learn to associate a response (our behavior) and its consequence. Thus we (and other animals) learn to repeat acts followed by good results (Figure 26.2) and avoid acts followed by bad results. These associations produce operant behaviors (which operate on the environment to produce a consequence).

stimulus: any event or situation that evokes a response.

respondent behavior: behavior that occurs as an automatic response to some stimulus.

operant behavior: behavior that operates on the environment, producing consequences.

Figure 26.1 Classical conditioning

Figure 26.2 Operant conditioning

To simplify, we will explore these two types of associative learning separately. Often, though, they occur together. Consider the Japanese cattle ranch where the clever rancher outfitted his herd with electronic pagers, which he called from his cell phone.
After a week of training, the animals learned to associate two stimuli—the beep of their pager and the arrival of food (classical conditioning). But they also learned to associate their hustling to the food trough with the pleasure of eating (operant conditioning), which simplified the rancher's work. Classical conditioning plus operant conditioning did the trick.

Conditioning is not the only form of learning. Through cognitive learning, we acquire mental information that guides our behavior. Observational learning, one form of cognitive learning, lets us learn from others' experiences. Chimpanzees, for example, sometimes learn behaviors merely by watching others perform them. If one animal sees another solve a puzzle and gain a food reward, the observer may perform the trick more quickly. So, too, in humans: We look and we learn.

cognitive learning: the acquisition of mental information, whether by observing events, by watching others, or through language.

"Watch your thoughts, they become words; watch your words, they become actions; watch your actions, they become habits; watch your habits, they become character; watch your character, for it becomes your destiny." Attributed to Frank Outlaw, 1977

Let's look more closely now at classical conditioning.

Check Your Understanding

Ask Yourself Can you remember some example from your childhood of learning through classical conditioning—perhaps salivating at the sound or smell of some delicious food cooking in your family kitchen? Can you remember an example of operant conditioning, when you repeated (or decided not to repeat) a behavior because you liked (or hated) its consequences? Can you recall watching someone else perform some act and later repeating or avoiding that act?

Test Yourself Why are habits, such as having something sweet with that cup of coffee, so hard to break? (Hint: Think about learned associations.) As we develop, we learn cues that lead us to expect and prepare for good and bad events. We learn to repeat behaviors that bring rewards. And we watch others and learn. What do psychologists call these three types of learning?

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Classical Conditioning

26-2 What is behaviorism's view of learning?

For many people, the name Ivan Pavlov (1849–1936) rings a bell. His early twentieth-century experiments—now psychology's most famous research—are classics, and the phenomenon he explored we justly call classical conditioning.

classical conditioning: a type of learning in which we link two or more stimuli; as a result, to illustrate with Pavlov's classic experiment, the first stimulus (a tone) comes to elicit behavior (drooling) in anticipation of the second stimulus (food).

Ivan Pavlov "Experimental investigation... should lay a solid foundation for a future true science of psychology" (1927).

Pavlov's work laid the foundation for many of psychologist John B. Watson's ideas. In searching for laws underlying learning, Watson (1913) urged his colleagues to discard reference to inner thoughts, feelings, and motives. The science of psychology should instead study how organisms respond to stimuli in their environments, said Watson: "Its theoretical goal is the prediction and control of behavior. Introspection forms no essential part of its methods." Simply said, psychology should be an objective science based on observable behavior.
This view, which Watson called behaviorism, influenced North American psychology during the first half of the twentieth century. Pavlov and Watson both came to share a disdain for "mentalistic" concepts (such as consciousness) and a belief that the basic laws of learning were the same for all animals—whether sea slugs or dogs or humans. Few researchers today agree that psychology should ignore mental processes, but most do agree that classical conditioning is a basic form of learning by which all organisms adapt to their environment.

behaviorism: the view that psychology (1) should be an objective science that (2) studies behavior without reference to mental processes. Most research psychologists today agree with (1) but not with (2).

Pavlov's Experiments

26-3 Who was Pavlov, and what are the basic components of classical conditioning?

Pavlov was driven by a lifelong passion for research. After setting aside his initial plan to follow his father into the Russian Orthodox priesthood, Pavlov earned a medical degree at age 33 and spent the next two decades studying dogs' digestive system. This work earned him Russia's first Nobel Prize. But it was his novel experiments on learning, which consumed the last three decades of his life, that earned this feisty, intense scientist his place in history (Todes, 2014).

Pavlov's new direction came when his creative mind seized on an incidental observation. Without fail, putting food in a dog's mouth caused the animal to salivate. Moreover, the dog began salivating not only to the taste of the food, but also to the mere sight of the food, or the food dish, or the person delivering the food, or even at the sound of that person's approaching footsteps. At first, Pavlov considered these "psychic secretions" an annoyance—until he realized they pointed to a simple but fundamental form of learning.

Pavlov and his assistants tried to imagine what the dog was thinking and feeling as it drooled in anticipation of the food. This only led them into fruitless debates. So, to explore the phenomenon more objectively, they experimented. To eliminate other possible influences, they isolated the dog in a small room, secured it in a harness, and attached a device to divert its saliva to a measuring instrument (Figure 26.3). From the next room, they presented food—first by sliding in a food bowl, later by blowing meat powder into the dog's mouth at a precise moment. They then paired various neutral stimuli (NS)—events the dog could see or hear but didn't associate with food—with food in the dog's mouth. If a sight or sound regularly signaled the arrival of food, would the dog learn the link? If so, would it begin salivating in anticipation of the food?

Figure 26.3 Pavlov's device for recording salivation A tube in the dog's cheek collects saliva, which is measured in a cylinder outside the chamber.

neutral stimulus (NS): in classical conditioning, a stimulus that elicits no response before conditioning.

The answers proved to be Yes and Yes. Just before placing food in the dog's mouth to produce salivation, Pavlov sounded a tone. After several pairings of tone and food, the dog, now anticipating the meat powder, began salivating to the tone alone. In later experiments, a buzzer, a light, a touch on the leg, even the sight of a circle set off the drooling. (This procedure works with people, too.
When hungry young Londoners viewed abstract figures before smelling peanut butter or vanilla, their brains soon responded in anticipation to the abstract images alone [Gottfried et al., 2003].)

A dog does not learn to salivate in response to food in its mouth. Rather, food in the mouth automatically, unconditionally, triggers a dog's salivary reflex (Figure 26.4). Thus, Pavlov called this drooling an unconditioned response (UR). And he called the food an unconditioned stimulus (US).

Figure 26.4 Pavlov's classic experiment Pavlov presented a neutral stimulus (a tone) just before an unconditioned stimulus (food in mouth). The neutral stimulus then became a conditioned stimulus, producing a conditioned response.

unconditioned response (UR): in classical conditioning, an unlearned, naturally occurring response (such as salivation) to an unconditioned stimulus (US) (such as food in the mouth).

unconditioned stimulus (US): in classical conditioning, a stimulus that unconditionally—naturally and automatically—triggers an unconditioned response (UR).

Salivation in response to a tone, however, is learned. It is conditional upon the dog's associating the tone with the food. Thus, we call this response the conditioned response (CR). The stimulus that used to be neutral (in this case, a previously meaningless tone that now triggers salivation) is the conditioned stimulus (CS). Distinguishing these two kinds of stimuli and responses is easy: Conditioned = learned; unconditioned = unlearned.

conditioned response (CR): in classical conditioning, a learned response to a previously neutral (but now conditioned) stimulus (CS).

conditioned stimulus (CS): in classical conditioning, an originally neutral stimulus that, after association with an unconditioned stimulus (US), comes to trigger a conditioned response (CR).

If Pavlov's demonstration of associative learning was so simple, what did he do for the next three decades? What discoveries did his research factory publish in his 532 papers on salivary conditioning (Windholz, 1997)? He and his associates explored five major conditioning processes: acquisition, extinction, spontaneous recovery, generalization, and discrimination.

Check Your Understanding

Ask Yourself Do Pavlov's experiments, showing that dogs learned to anticipate meat powder, surprise you? Why or why not?

Test Yourself An experimenter sounds a tone just before delivering an air puff that causes your eye to blink. After several repetitions, you blink to the tone alone. What is the NS? The US? The UR? The CS? The CR?

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Acquisition

26-4 In classical conditioning, what are the processes of acquisition, extinction, spontaneous recovery, generalization, and discrimination?

To understand the acquisition, or initial learning, of the stimulus-response relationship, Pavlov and his associates wondered: How much time should elapse between presenting the NS (the tone, the light, the touch) and the US (the food)? In most cases, not much—half a second usually works well.

acquisition: in classical conditioning, the initial stage, when one links a neutral stimulus and an unconditioned stimulus so that the neutral stimulus begins triggering the conditioned response. In operant conditioning, the strengthening of a reinforced response.

What do you suppose would happen if the food (US) appeared before the tone (NS) rather than after? Would conditioning occur? Not likely.
Conditioning usually won't occur when the NS follows the US. Remember, classical conditioning is biologically adaptive because it helps humans and other animals prepare for good or bad events. To Pavlov's dogs, the originally neutral tone became a CS after signaling an important biological event—the arrival of food (US). To deer in the forest, the snapping of a twig (CS) may signal a predator's approach (US).

Research on male Japanese quail shows how a CS can signal another important biological event (Domjan, 1992, 1994, 2005). Just before presenting an approachable female quail, the researchers turned on a red light. Over time, as the red light continued to herald the female's arrival, the light alone caused the male quail to become excited. The males developed a preference for their cage's red-light district, and when a female appeared, they mated with her more quickly and released more semen and sperm (Matthews et al., 2007). This capacity for classical conditioning supports reproduction.

Sexual conditioning also occurs in rats. When their early sexual experiences are with partners scented with a peculiar odor, rats later display a preference for similarly scented partners (Pfaus et al., 2012). In humans, too, objects, smells, and sights associated with sexual pleasure—even a geometric figure, in one experiment—can become conditioned stimuli for sexual arousal (Byrne, 1982; Hoffman, 2012). Onion breath does not usually produce sexual arousal. But when repeatedly paired with a passionate kiss, it can become a CS and do just that (Figure 26.5). The larger lesson: Conditioning helps an animal survive and reproduce—by responding to cues that help it gain food, avoid dangers, locate mates, and produce offspring (Hollis, 1997). Learning makes for yearning.

Figure 26.5 An unexpected CS Psychologist Michael Tirrell (1990) recalled: "My first girlfriend loved onions, so I came to associate onion breath with kissing. Before long, onion breath sent tingles up and down my spine. Oh what a feeling!"

FYI Remember: NS = Neutral Stimulus; US = Unconditioned Stimulus; UR = Unconditioned Response; CS = Conditioned Stimulus; CR = Conditioned Response

Through higher-order conditioning, a new NS can become a new CS without the presence of a US. All that's required is for it to become associated with a previously conditioned stimulus. If a tone regularly signals food and produces salivation, then a light that becomes associated with the tone (light → tone → food) may also begin to trigger salivation. Although this higher-order conditioning (also called second-order conditioning) tends to be weaker than first-order conditioning, it influences our everyday lives. Imagine that something makes us very afraid (perhaps a guard dog associated with a previous dog bite). If something else, such as the sound of a barking dog, brings that guard dog to mind, the bark alone may make us feel a little afraid.

higher-order conditioning: a procedure in which the conditioned stimulus in one conditioning experience is paired with a new neutral stimulus, creating a second (often weaker) conditioned stimulus. For example, an animal that has learned that a tone predicts food might then learn that a light predicts the tone and begin responding to the light alone. (Also called second-order conditioning.)
Extinction and Spontaneous Recovery

What would happen, Pavlov wondered, if, after conditioning, the CS occurred repeatedly without the US? If the tone sounded again and again, but no food appeared, would the tone still trigger salivation? The answer was mixed. The dogs salivated less and less, a reaction known as extinction. Extinction is the diminished response that occurs when the CS (tone) no longer signals an impending US (food). But a different picture emerged when Pavlov allowed several hours to elapse before sounding the tone again. After the delay, the dogs would again begin salivating to the tone (Figure 26.6). This spontaneous recovery—the reappearance of a (weakened) CR after a pause—suggested to Pavlov that extinction was suppressing the CR rather than eliminating it.

extinction: the diminishing of a conditioned response; occurs in classical conditioning when an unconditioned stimulus (US) does not follow a conditioned stimulus (CS); occurs in operant conditioning when a response is no longer reinforced.

spontaneous recovery: the reappearance, after a pause, of an extinguished conditioned response.

AP® EXAM TIP You may find it helpful to note that spontaneous recovery is, in fact, spontaneous (Figure 26.6). Notice that the extinguished conditioned response returns without any additional pairing with the unconditioned stimulus. It is not a form of acquisition.

Figure 26.6 Idealized curve of acquisition, extinction, and spontaneous recovery The rising curve shows the CR rapidly growing stronger as the NS becomes a CS due to repeated pairing with the US (acquisition). The CR then weakens rapidly as the CS is presented alone (extinction). After a pause, the (weakened) CR reappears (spontaneous recovery).

Check Your Understanding

Ask Yourself Psychologist Michael Tirrell recalled coming to associate his girlfriend's onion breath with arousal. Can you remember ever experiencing something that would normally be neutral (or even unpleasant) that came to mean something special?

Test Yourself If the aroma of a baking cake sets your mouth to watering, what is the US? The CS? The CR? The first step of classical conditioning, when an NS becomes a CS, is called ____________. When a US no longer follows the CS, and the CR becomes weakened, this is called ______________.

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.
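The idealized curve in Figure 26.6 can also be read as a tiny algorithm. The sketch below is purely illustrative and not from the text or from Pavlov: the error-correction update rule, the learning rate, and the recovery fraction are all assumed values chosen only to mimic the curve's rise, fall, and partial rebound.

```python
# Illustrative model of acquisition, extinction, and spontaneous recovery.
# The update rule and every numeric value are assumptions for demonstration,
# not quantities measured by Pavlov.

LEARNING_RATE = 0.3      # how fast associative strength changes (assumed)
RECOVERY_FRACTION = 0.4  # share of the CR that returns after a rest (assumed)

def run_trials(strength, us_follows_cs, n_trials):
    """Update CS-US associative strength across trials.

    Each trial nudges the strength a fixed fraction of the way toward
    its target: 1.0 when the US follows the CS (acquisition), 0.0 when
    the CS is presented alone (extinction).
    """
    target = 1.0 if us_follows_cs else 0.0
    history = []
    for _ in range(n_trials):
        strength += LEARNING_RATE * (target - strength)
        history.append(round(strength, 2))
    return strength, history

strength = 0.0  # the tone starts out as a neutral stimulus (NS)

# Acquisition: tone repeatedly paired with food; the NS becomes a CS.
strength, acquisition = run_trials(strength, us_follows_cs=True, n_trials=10)

# Extinction: tone alone; the CR weakens rapidly.
strength, extinction = run_trials(strength, us_follows_cs=False, n_trials=10)

# Spontaneous recovery: after a pause, part of the weakened CR returns
# with no new CS-US pairings; extinction suppressed it, not erased it.
strength = RECOVERY_FRACTION * max(acquisition)

print("acquisition:", acquisition)
print("extinction: ", extinction)
print("after the pause (spontaneous recovery):", round(strength, 2))
```

Printed out, the three phases trace the same rise, rapid fall, and partial rebound as the idealized curve in Figure 26.6.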
Generalization

Pavlov and his students noticed that a dog conditioned to the sound of one tone also responded somewhat to the sound of a new and different tone. Likewise, a dog conditioned to salivate when rubbed would also drool a bit when scratched (Windholz, 1989) or when touched on a different body part (Figure 26.7). This tendency to respond to stimuli similar to the CS is called generalization (or stimulus generalization).

Figure 26.7 Generalization Pavlov demonstrated generalization by attaching miniature vibrators to various parts of a dog's body. After conditioning salivation to stimulation of the thigh, he stimulated other areas. The closer a stimulated spot was to the dog's thigh, the stronger the conditioned response.

generalization: the tendency, once a response has been conditioned, for stimuli similar to the conditioned stimulus to elicit similar responses. (In operant conditioning, generalization occurs when responses learned in one situation occur in other, similar situations.)

Generalization can be adaptive, as when toddlers who learn to fear moving cars also become afraid of moving trucks and motorcycles. And generalized fears can linger. Years after being tortured, one Argentine writer reported still flinching with fear at the sight of black shoes—his first glimpse of his torturers as they approached his cell. Generalized anxiety reactions have been demonstrated in laboratory studies comparing abused with nonabused children (Figure 26.8). When an angry face appears on a computer screen, abused children's brain-wave responses are dramatically stronger and longer lasting (Pollak et al., 1998). And when a face that we've been conditioned to like (or dislike) is morphed into another face, we also have some tendency to like (or dislike) the vaguely similar morphed face (Gawronski & Quinn, 2013).

Figure 26.8 Child abuse leaves tracks in the brain Abused children's sensitized brains react more strongly to angry faces (Pollak et al., 1998). This generalized anxiety response may help explain their greater risk of psychological disorder.

Stimuli similar to naturally disgusting objects will, by association, also evoke some disgust. Shape otherwise desirable fudge to resemble dog feces, and people will be repulsed (Rozin et al., 1986). Ditto when sewage gets recycled as pure drinking water (Rozin et al., 2015). Toilet → tap → yuck. In each of these human examples, people's emotional reactions to one stimulus have generalized to similar stimuli.

AP® EXAM TIP Generalization and discrimination are introduced in this module, but they don't just apply to classical conditioning. These two concepts will show up in other types of learning as well. It is important that you understand this for the AP® exam.

Discrimination

Pavlov's dogs also learned to respond to the sound of a particular tone and not to other tones. One stimulus (tone) predicted the US, and the others did not. This learned ability to distinguish between a conditioned stimulus (which predicts the US) and other, irrelevant stimuli is called discrimination. Being able to recognize differences is adaptive. Slightly different stimuli can be followed by vastly different consequences. Facing a guard dog, your heart may race; facing a guide dog, it probably will not.

discrimination: in classical conditioning, the learned ability to distinguish between a conditioned stimulus and similar stimuli that do not signal an unconditioned stimulus. (In operant conditioning, the ability to distinguish responses that are reinforced from similar responses that are not reinforced.)

Check Your Understanding

Ask Yourself How have your emotions or behaviors been classically conditioned?

Test Yourself In horror movies, sexually arousing images of women are sometimes paired with violence against women. Based on classical conditioning principles, what might be an effect of this pairing?

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.
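The gradient in Figure 26.7 (the closer the stimulated spot to the conditioned thigh, the stronger the response) and the narrowing effect of discrimination training can both be captured in a few lines. This is a hypothetical sketch: the bell-shaped falloff and the two width values are assumptions for illustration, not data from Pavlov's dogs.

```python
import math

# Illustrative generalization gradient: the conditioned response (CR)
# weakens as a test stimulus becomes less similar to the original CS.
# The Gaussian falloff and both width values are assumed for demonstration.

def conditioned_response(distance_from_cs, gradient_width):
    """Fraction of the full CR evoked by a stimulus at a given
    'distance' (dissimilarity) from the original CS."""
    return math.exp(-(distance_from_cs ** 2) / (2 * gradient_width ** 2))

# Generalization: a broad gradient, so even fairly different stimuli
# (another tone, a nearby body part) evoke some CR.
# Discrimination training (CS reinforced, similar stimuli never reinforced)
# narrows the gradient: only stimuli close to the CS still respond.
for label, width in [("broad gradient (generalization)", 2.0),
                     ("narrow gradient (after discrimination)", 0.5)]:
    responses = [round(conditioned_response(d, width), 2) for d in range(5)]
    print(label, responses)
```

Run it and the first gradient falls off gently with distance while the second collapses to near zero for anything but the CS itself: generalization versus discrimination, in numbers.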
Pavlov's Legacy

26-5 Why does Pavlov's work remain so important?

What remains today of Pavlov's ideas? A great deal. Most psychologists now agree that classical conditioning is a basic form of learning. Judged with today's knowledge of the interplay of our biology, psychology, and social-cultural environment, Pavlov's ideas were incomplete. But if we see further than Pavlov did, it is because we stand on his shoulders.

Why does Pavlov's work remain so important? If he had merely taught us that old dogs can learn new tricks, his experiments would long ago have been forgotten. Why should we care that dogs can be conditioned to salivate to the sound of a tone? The importance lies, first, in the finding that many other responses to many other stimuli can be classically conditioned in many other organisms—in fact, in every species tested, from earthworms to fish to dogs to monkeys to people (Schwartz, 1984). Thus, classical conditioning is one way that virtually all organisms learn to adapt to their environment.

"I don't care if she's a tape dispenser. I love her." What conditioning principle is influencing the snail's affections?

Second, Pavlov showed us how a process such as learning can be studied objectively. He was proud that his methods involved virtually no subjective judgments or guesses about what went on in a dog's mind. The salivary response is a behavior measurable in cubic centimeters of saliva. Pavlov's success therefore suggested a scientific model for how the young discipline of psychology might proceed—by isolating the basic building blocks of complex behaviors and studying them with objective laboratory procedures.

Applications of Classical Conditioning

26-6 What have been some applications of Pavlov's work to human health and well-being? How did Watson apply Pavlov's principles to learned fears?

Other units in this text show how Pavlov's principles can influence human health and well-being. Here are three examples:

Drug cravings. Former drug users often feel a craving when they are again in the drug-using context—with people or in places they associate with previous highs. Thus, drug counselors advise their clients to steer clear of people and settings that may trigger these cravings (Siegel, 2005).

Food cravings. Classical conditioning makes dieting difficult. We readily associate sugary substances with an enjoyable sweet sensation. Researchers have conditioned healthy volunteers to experience cravings after only one instance of eating a sweet food (Blechert et al., 2016). Eating one cookie can create hunger for another. People who struggle with their weight often have eaten unhealthy foods thousands of times, leaving them with strongly conditioned responses to eat the very foods that will keep them in poor health (Hill, 2007).

Immune responses. Classical conditioning even works on the body's disease-fighting immune system. When a particular taste accompanies a drug that influences immune responses, the taste by itself may come to produce an immune response (Ader & Cohen, 1985).

Pavlov's work also provided a basis for Watson's (1913) idea that human emotions and behaviors, though biologically influenced, are mainly a bundle of conditioned responses. Working with an 11-month-old, Watson and his graduate student Rosalie Rayner (1920; Harris, 1979) showed how specific fears might be conditioned. Like most infants, "Little Albert" feared loud noises but not white rats. Watson and Rayner presented a white rat and, as Little Albert reached to touch it, struck a hammer against a steel bar just behind his head. After seven repeats of seeing the rat and hearing the frightening noise, Albert burst into tears at the mere sight of the rat. Five days later, he had generalized this startled fear reaction to the sight of a rabbit, a dog, and a sealskin coat, but not to dissimilar objects, such as toys.
John B. Watson Watson (1924) admitted to "going beyond my facts" when offering his famous boast: "Give me a dozen healthy infants, well-formed, and my own specified world to bring them up in and I'll guarantee to take any one at random and train him to become any type of specialist I might select—doctor, lawyer, artist, merchant-chief, and, yes, even beggarman and thief, regardless of his talents, penchants, tendencies, abilities, vocations, and race of his ancestors."

Little Albert In Watson and Rayner's experiments, "Little Albert" learned to fear a white rat after repeatedly experiencing a loud noise as the rat was presented. In these experiments, what was the US? The UR? The NS? The CS? The CR?

For years, people wondered what became of Little Albert. Sleuthing by Russell Powell and his colleagues (2014) found a well-matched child of one of the hospital's wet nurses. The child, William Albert Barger, went by Albert B.—precisely the name used by Watson and Rayner. This Albert, who died in 2007, was an easygoing person, though, perhaps coincidentally, he had an aversion to dogs. He died without ever knowing of his early life in a hospital residence or his role in psychology's history.

People also wondered what became of Watson. After losing his Johns Hopkins professorship over an affair with Rayner (whom he later married), he joined an advertising agency as the company's resident psychologist. There, he used his knowledge of associative learning to conceive many successful advertising campaigns, including one for Maxwell House that helped make the "coffee break" an American custom (Hunt, 1993).

The treatment of Little Albert would be unethical by today's standards. Also, some psychologists had difficulty repeating Watson and Rayner's findings with other children. Nevertheless, Little Albert's learned fears led many psychologists to wonder whether each of us might be a walking warehouse of conditioned emotions. If so, might extinction procedures or new conditioning help us change our unwanted responses to emotion-arousing stimuli? One patient, who for 30 years had feared entering an elevator alone, did just that. Following his therapist's advice, he forced himself to enter 20 elevators a day. Within 10 days, his fear had nearly vanished (Ellis & Becker, 1982). With support from the airline AirTran, comedian-writer Mark Malkoff likewise extinguished his fear of flying. He lived on an airplane for 30 days, taking 135 flights that had him in the air 14 hours a day (NPR, 2009). After a week and a half, his fears had faded and he began playing games with fellow passengers. (His favorite antic was the "toilet paper experiment": He'd put one end of a roll in the toilet, unroll the rest down the aisle, and flush. The entire roll would be sucked down in three seconds.) In Unit XIII we will see more examples of how psychologists use behavioral techniques such as counterconditioning to treat emotional disorders and promote personal growth.

Module 26 REVIEW

26-1 How do we define learning, and what are some basic forms of learning?

Learning is the process of acquiring through experience new and relatively enduring information or behaviors. In associative learning, we learn that certain events occur together. In classical conditioning, we learn to associate two or more stimuli (a stimulus is any event or situation that evokes a response). Automatically responding to stimuli we do not control is called respondent behavior. In operant conditioning, we learn to associate a response and its consequences.
These associations produce operant behaviors. Through cognitive learning, we acquire mental information that guides our behavior. For example, in observational learning, we learn new behaviors by observing events and watching others.

26-2 What is behaviorism's view of learning?

Ivan Pavlov's work on classical conditioning laid the foundation for behaviorism, the view that psychology should be an objective science that studies behavior without reference to mental processes. The behaviorists believed that the basic laws of learning are the same for all species, including humans.

26-3 Who was Pavlov, and what are the basic components of classical conditioning?

Ivan Pavlov, a Russian physiologist, devised novel experiments on learning. His research in the early twentieth century, conducted over the last three decades of his life, demonstrated that classical conditioning is a basic form of learning.

Classical conditioning is a type of learning in which an organism comes to associate stimuli and anticipate events. The process involves stimuli and responses:

A UR is an event that occurs naturally (such as salivation), in response to some stimulus.

A US is something that naturally and automatically (without learning) triggers the unlearned response (as food in the mouth triggers salivation).

A CS is originally an NS (neutral stimulus, such as a tone) that, after association with a US (such as food), comes to trigger a CR.

A CR is the learned response (salivating) to the originally neutral (but now conditioned) stimulus.

26-4 In classical conditioning, what are the processes of acquisition, extinction, spontaneous recovery, generalization, and discrimination?

In classical conditioning, the first stage is acquisition, or associating an NS with the US so that the NS begins triggering the CR. Acquisition occurs most readily when the NS is presented just before (ideally, about a half-second before) a US, preparing the organism for the upcoming event. This finding supports the view that classical conditioning is biologically adaptive. Through higher-order conditioning, a new NS can become a new CS.

Extinction is diminished responding, which occurs if the CS appears repeatedly by itself (without the US).

Spontaneous recovery is the reappearance of a formerly extinguished response, following a rest period.

Generalization is the tendency to respond to stimuli that are similar to a CS.

Discrimination is the learned ability to distinguish between a CS and other irrelevant stimuli.

26-5 Why does Pavlov's work remain so important?

Pavlov taught us that significant psychological phenomena can be studied objectively, and that classical conditioning is a basic form of learning that applies to all species.

26-6 What have been some applications of Pavlov's work to human health and well-being? How did Watson apply Pavlov's principles to learned fears?

Classical conditioning techniques are used to improve human health and well-being in many areas, including therapy for those recovering from drug addiction and for some types of psychological disorders. The body's immune system may also respond to classical conditioning. Pavlov's work also provided a basis for Watson's idea that human emotions and behaviors, though biologically influenced, are mainly a bundle of conditioned responses. Watson applied classical conditioning principles in his studies of "Little Albert" to demonstrate how specific fears might be conditioned.

Multiple-Choice Questions
1. Which of the following is the best example of learning?
a. A dog salivates when food is placed in its mouth.
b. A honeybee stings when the hive is threatened.
c. A child cries when his brother hits him.
d. A child feels ill after drinking sour milk.
e. A child flinches when he sees lightning because he is afraid of thunder.

2. A family uses the microwave to prepare their cat's food. The cat comes running into the room when the microwave timer sounds, but not when it hears the oven timer. The cat is demonstrating the concept of
a. generalization.
b. discrimination.
c. spontaneous recovery.
d. extinction.
e. habituation.

3. In classical conditioning, a person learns to anticipate events by
a. associating a response with its consequence.
b. avoiding spontaneous recovery.
c. using operant behaviors.
d. associating two stimuli.
e. employing cognitive learning.

4. In classical conditioning, the conditioned stimulus
a. naturally triggers a response.
b. is a naturally occurring response.
c. is initially neutral, and then comes to trigger a response.
d. prompts spontaneous recovery.
e. is a reward offered for completing a behavior.

5. Students are accustomed to a bell ringing to indicate the end of a class period. The principal decides to substitute popular music for the bell to indicate the end of each class period. Students quickly respond to the music in the same way they did to the bell. In this example, the music is a(n)
a. conditioned response.
b. conditioned stimulus.
c. unconditioned response.
d. unconditioned stimulus.
e. habituated response.

6. Students in a school are accustomed to moving to the next class when music plays. After a period of time, the principal replaces the music with a bell to signal the end of class. If one day he plays the music by mistake and the students leave class, which of the following is being shown?
a. Acquisition
b. Generalization
c. Habituation
d. Spontaneous recovery
e. Operant conditioning

Practice FRQs

1. Carter's romantic friend has worn plaid shirts on all their special dates. Now, when seeing a plaid shirt, Carter automatically feels happy and a little excited. Identify what each of the following terms would be in this example.
Conditioned response (CR)
Conditioned stimulus (CS)
Unconditioned stimulus (US)

Answer
1 point: The happy, excited response to the plaid shirt is the CR (thanks to its learned association with the plaid-wearing friend in happy times). Page 274
1 point: The plaid shirt was initially a neutral stimulus (NS). But, thanks to its association with the friend in the dating situation, it has become a CS. Page 274
1 point: The friend in a romantic context is the US. Page 274

2. A researcher paired the sound of a whistle with a puff of air to the eye to classically condition Ashley to blink when the whistle alone was sounded. Explain how the researcher could demonstrate the following.
Generalization
Extinction
Spontaneous recovery
(3 points)

Module 27 Operant Conditioning

LEARNING TARGETS
27-1 Describe operant conditioning.
27-2 Identify Skinner, and describe how operant behavior is reinforced and shaped.
27-3 Differentiate positive reinforcement from negative reinforcement, and identify the basic types of reinforcers.
27-4 Explain how different reinforcement schedules affect behavior.
27-5 Differentiate punishment from negative reinforcement, and explain how punishment affects behavior.
27-6 Describe why Skinner's ideas provoked controversy.

27-1 What is operant conditioning?
It’s one thing to classically condition a dog to salivate to the sound of a tone, or a child to fear moving cars. But to teach an elephant to walk on its hind legs or a child to say please, we turn to operant conditioning. Classical conditioning and operant conditioning are both forms of associative learning, yet their differences are straightforward: Classical conditioning forms associations between stimuli (a CS and the US it signals). It also involves respondent behavior—automatic responses to a stimulus (such as salivating in response to meat powder 773 and later in response to a tone). In operant conditioning, organisms associate their own actions with consequences. Actions followed by reinforcers increase; those followed by punishments often decrease. Behavior that operates on the environment to produce rewarding or punishing stimuli is called operant behavior. operant conditioning a type of learning in which a behavior becomes more likely to recur if followed by a reinforcer or less likely to recur if followed by a punisher. Check Your Understanding Ask Yourself Consider what you have learned so far about these two types of associative learning. Why does operant conditioning help teach a child good manners? Test Yourself With classical conditioning, we learn associations between events we __________ (do/do not) control. With operant conditioning, we learn associations between our behavior and __________ (resulting/random) events. Answers to the Test Yourself questions can be found in Appendix E at the end of the book. 774 Skinner’s Experiments 27-2 Who was Skinner, and how is operant behavior reinforced and shaped? B. F. Skinner (1904–1990) was a college English major and aspiring writer who, seeking a new direction, enrolled as a graduate student in psychology. He went on to become modern behaviorism’s most influential and controversial figure. Skinner’s work elaborated on what psychologist Edward L. Thorndike (1874–1949) called the law of effect: Rewarded behavior tends to recur (Figure 27.1), and punished behavior is less likely to recur. Using Thorndike’s law of effect as a starting point, Skinner developed a behavioral technology that revealed principles of behavior control. By shaping pigeons’ natural walking and pecking behaviors, for example, Skinner was able to teach them such unpigeon-like behaviors as walking in a figure 8, playing Ping-Pong, and keeping a missile on course by pecking at a screen target. law of effect Thorndike’s principle that behaviors followed by favorable consequences become more likely, and that behaviors followed by unfavorable consequences become less likely. Figure 27.1 Cat in a puzzle box 775 Thorndike used a fish reward to entice cats to find their way out of a puzzle box through a series of maneuvers. The cats’ performance tended to improve with successive trials, illustrating Thorndike’s law of effect. For his pioneering studies, Skinner designed an operant chamber, popularly known as a Skinner box (Figure 27.2). The box has a bar (a lever) that an animal presses—or a key (a disc) the animal pecks—to release a reward of food or water. It also has a device that records these responses. This creates a stage on which rats and other animals act out Skinner’s concept of reinforcement: any event that strengthens (increases the frequency of) a preceding response. What is reinforcing depends on the animal and the conditions. For people, it may be praise, attention, or a paycheck. For hungry and thirsty rats, food and water work well. 
Skinner’s experiments have done far more than teach us how to pull habits out of a rat. They have explored the precise conditions that foster efficient and enduring learning. operant chamber in operant conditioning research, a chamber (also known as a Skinner box) containing a bar or key that an animal can manipulate to obtain a food or water reinforcer; attached devices record the animal’s rate of bar pressing or key pecking. reinforcement in operant conditioning, any event that strengthens the behavior it follows. 776 Figure 27.2 A Skinner box Inside the box, the rat presses a bar for a food reward. Outside, measuring devices (not shown here) record the animal’s accumulated responses. Shaping Behavior Imagine that you wanted to condition a hungry rat to press a bar. Like Skinner, you could tease out this action with shaping, gradually guiding the rat’s actions toward the desired behavior. First, you would watch how the animal naturally behaves, so that you could build on its existing behaviors. You might give the rat a bit of food each time it approaches the bar. Once the rat is approaching regularly, you would give the food only when it moves close to the bar, then closer still. Finally, you would require it to touch the bar to get food. By rewarding successive approximations (as Sutherland did with her husband), you reinforce responses that are ever closer to the final desired behavior, and you ignore all other responses. By making rewards contingent on desired behaviors, researchers and animal trainers gradually shape complex behaviors. shaping an operant conditioning procedure in which reinforcers guide behavior toward closer and closer approximations of the desired behavior. 777 Reinforcers vary with circumstances What is reinforcing (a heat lamp) to one animal (a cold meerkat) may not be to another (an overheated child). What is reinforcing in one situation (a cold snap at the Taronga Zoo in Sydney, Australia) may not be in another (a sweltering summer day). Shaping can also help us understand what nonverbal organisms can perceive. Can a dog distinguish red and green? Can a baby hear the difference between lower- and higher-pitched tones? If we can shape them to respond to one stimulus and not to another, then we know they can perceive the difference. Such experiments have even shown that some nonhuman animals can form concepts. When experimenters reinforced pigeons for pecking after seeing a human face, but not after seeing other images, the pigeon’s behavior showed that it could recognize human faces (Herrnstein & Loveland, 1964). In this experiment, the human face was a discriminative stimulus. Like a green traffic light, discriminative stimuli signal that a response will be reinforced (Figure 27.3). After being trained to discriminate among classes of events or objects—flowers, people, cars, chairs—pigeons can usually identify the category in which a new pictured object belongs (Bhatt et al., 1988; Wasserman, 1993). They have even been trained to discriminate between the music of Bach and Stravinsky (Porter & Neuringer, 1984). 778 discriminative stimulus in operant conditioning, a stimulus that elicits a response after association with reinforcement (in contrast to related stimuli not associated with reinforcement). Figure 27.3 Bird brains spot tumors After being rewarded with food when correctly spotting breast tumors, pigeons became as skilled as humans at discriminating cancerous from healthy tissue (Levenson et al., 2015). 
Shaping can also help us understand what nonverbal organisms can perceive. Can a dog distinguish red and green? Can a baby hear the difference between lower- and higher-pitched tones? If we can shape them to respond to one stimulus and not to another, then we know they can perceive the difference. Such experiments have even shown that some nonhuman animals can form concepts. When experimenters reinforced pigeons for pecking after seeing a human face, but not after seeing other images, the pigeons' behavior showed that they could recognize human faces (Herrnstein & Loveland, 1964). In this experiment, the human face was a discriminative stimulus. Like a green traffic light, discriminative stimuli signal that a response will be reinforced (Figure 27.3). After being trained to discriminate among classes of events or objects—flowers, people, cars, chairs—pigeons can usually identify the category in which a new pictured object belongs (Bhatt et al., 1988; Wasserman, 1993). They have even been trained to discriminate between the music of Bach and Stravinsky (Porter & Neuringer, 1984).

discriminative stimulus: in operant conditioning, a stimulus that elicits a response after association with reinforcement (in contrast to related stimuli not associated with reinforcement).

Figure 27.3 Bird brains spot tumors After being rewarded with food when correctly spotting breast tumors, pigeons became as skilled as humans at discriminating cancerous from healthy tissue (Levenson et al., 2015). Other animals have been shaped to sniff out land mines or locate people amid rubble (La Londe et al., 2015).

Skinner noted that we continually reinforce and shape others' everyday behaviors, though we may not mean to do so. Isaac's whining annoys his dad, for example, but consider how his dad typically responds:

Isaac: Could you take me to the mall?
Dad: (Continues reading paper.)
Isaac: Dad, I need to go to the mall.
Dad: Uh, yeah, in a few minutes.
Isaac: DAAAAD! The mall!!
Dad: Show me some manners! Okay, where are my keys...

Isaac's whining is reinforced, because he gets something desirable—a trip to the mall. Dad's response is reinforced, because it gets rid of something aversive—Isaac's whining.

Or consider a teacher who sticks gold stars on a wall chart beside the names of children scoring 100 percent on spelling tests. As everyone can then see, some children consistently do perfect work. The others, who may have worked harder than the academic all-stars, get no rewards. The teacher would be better advised to apply the principles of operant conditioning—to reinforce all spellers for gradual improvements (successive approximations toward perfect spelling of words they find challenging).

Types of Reinforcers

27-3 How do positive and negative reinforcement differ, and what are the basic types of reinforcers?

Until now, we've mainly been discussing positive reinforcement, which strengthens responding by presenting a typically pleasurable stimulus immediately after a response. But, as the whining Isaac story illustrates, there are two basic kinds of reinforcement (Table 27.1). Negative reinforcement strengthens a response by reducing or removing something negative. Isaac's whining was positively reinforced, because Isaac got something desirable—a trip to the mall. His dad's response (doing what Isaac wanted) was negatively reinforced, because it ended an aversive event—Isaac's whining. Similarly, taking aspirin may relieve your headache, and hitting snooze will silence your irritating alarm. These welcome results provide negative reinforcement and increase the odds that you will repeat these behaviors. For those with drug addiction, the negative reinforcement of ending withdrawal pangs can be a compelling reason to resume using (Baker et al., 2004). Note that negative reinforcement is not punishment. (Some friendly advice: Repeat those italicized words in your mind.) Rather, negative reinforcement—psychology's most misunderstood concept—removes a punishing (aversive) event. Think of negative reinforcement as something that provides relief—from that whining teenager, bad headache, or annoying seat belt alarm.

positive reinforcement: increasing behaviors by presenting positive reinforcers. A positive reinforcer is any stimulus that, when presented after a response, strengthens the response.

negative reinforcement: increasing behaviors by stopping or reducing aversive stimuli. A negative reinforcer is any stimulus that, when removed after a response, strengthens the response. (Note: Negative reinforcement is not punishment.)

TABLE 27.1 Ways to Increase Behavior
Positive reinforcement: Add a desirable stimulus. (Examples: Pet a dog that comes when you call it; pay someone for work done.)
Negative reinforcement: Remove an aversive stimulus. (Examples: Take painkillers to end pain; fasten seat belt to end loud beeping.)
Crying works How is operant conditioning at work in this cartoon?

AP® EXAM TIP Prepare to identify specific examples of positive and negative reinforcements. Pay particular attention to Table 27.1 for guidance.

Sometimes negative and positive reinforcement coincide. Imagine a worried student who, after goofing off and getting a bad test grade, studies harder for the next test. This increased effort may be negatively reinforced by reduced anxiety, and positively reinforced by a better grade. We reap the rewards of escaping the aversive stimulus, which increases the chances that we will repeat our behavior. The point to remember: Whether it works by reducing something aversive or by providing something desirable, reinforcement is any consequence that strengthens behavior.

Primary and Conditioned Reinforcers

Getting food when hungry or having a painful headache go away is innately satisfying. These primary reinforcers are unlearned. Conditioned reinforcers, also called secondary reinforcers, get their power through learned association with primary reinforcers. If a rat in a Skinner box learns that a light reliably signals a food delivery, the rat will work to turn on the light (see Figure 27.2). The light has become a conditioned reinforcer. Our lives are filled with conditioned reinforcers—money, good grades, a pleasant tone of voice—each of which has been linked with more basic rewards. If money is a conditioned reinforcer—if people's desire for money is derived from their desire for food—then hunger should also make people more money hungry, reasoned one European research team (Briers et al., 2006). Indeed, in their experiments, people were less likely to donate to charity when food deprived, and less likely to share money with fellow participants when in a room with hunger-arousing aromas.

primary reinforcer: an innately reinforcing stimulus, such as one that satisfies a biological need.

conditioned reinforcer: a stimulus that gains its reinforcing power through its association with a primary reinforcer; also known as a secondary reinforcer.

Immediate and Delayed Reinforcers

Let's return to the imaginary shaping experiment in which you were conditioning a rat to press a bar. Before performing this "wanted" behavior, the hungry rat will engage in a sequence of "unwanted" behaviors—scratching, sniffing, and moving around. If you present food immediately after any one of these behaviors, the rat will likely repeat that rewarded behavior. But what if the rat presses the bar while you are distracted, and you delay giving the reinforcer? If the delay lasts longer than about 30 seconds, the rat will not learn to press the bar. It will have moved on to other incidental behaviors, such as scratching, sniffing, and moving, and one of these behaviors will instead get reinforced.

Unlike rats, humans do respond to delayed reinforcers: the paycheck at the end of the week, the good grade at the end of the term, the trophy at the end of the sports season. Indeed, to function effectively we must learn to delay gratification. In one of psychology's most famous studies, some 4-year-olds showed this ability. In choosing a piece of candy or a marshmallow, these impulse-controlled children preferred having a big one tomorrow to munching on a small one right away. Learning to control our impulses in order to achieve more valued rewards is a big step toward maturity and can later protect us from committing an impulsive crime (Åkerlund et al., 2016; Logue, 1998a,b).
Children who delay gratification have tended to become socially competent and high-achieving adults (Mischel, 2014).

"Oh, not bad. The light comes on, I press the bar, they write me a check. How about you?"

To our detriment, small but immediate pleasures (the enjoyment of watching late-night TV, for example) are sometimes more alluring than big but delayed rewards (feeling rested for a big test tomorrow). For many teens, the immediate gratification of risky, unprotected sex in passionate moments prevails over the delayed gratifications of safe sex or saved sex. And for many people, the immediate rewards of gas-guzzling vehicles, air travel, and air conditioning prevail over the bigger future consequences of global climate change, rising seas, and extreme weather.

Reinforcement Schedules

27-4 How do different reinforcement schedules affect behavior?

In most of our examples, the desired response has been reinforced every time it occurs. But reinforcement schedules vary. With continuous reinforcement, learning occurs rapidly, which makes it the best choice for mastering a behavior. But extinction also occurs rapidly. When reinforcement stops—when we stop delivering food after the rat presses the bar—the behavior soon stops (is extinguished). If a normally dependable candy machine fails to deliver a chocolate bar twice in a row, we stop putting money into it (although a week later we may exhibit spontaneous recovery by trying again).

reinforcement schedule: a pattern that defines how often a desired response will be reinforced.

continuous reinforcement schedule: reinforcing the desired response every time it occurs.

Real life rarely provides continuous reinforcement. Salespeople do not make a sale with every pitch. But they persist because their efforts are occasionally rewarded. This persistence is typical with partial (intermittent) reinforcement schedules, in which responses are sometimes reinforced, sometimes not. Learning is slower to appear, but resistance to extinction is greater than with continuous reinforcement. Imagine a pigeon that has learned to peck a key to obtain food. If you gradually phase out the food delivery until it occurs only rarely, in no predictable pattern, the pigeon may peck 150,000 times without a reward (Skinner, 1953). Slot machines reward gamblers in much the same way—occasionally and unpredictably. And like pigeons, slot players keep trying, time and time again. With intermittent reinforcement, hope springs eternal.

partial (intermittent) reinforcement schedule: reinforcing a response only part of the time; results in slower acquisition of a response but much greater resistance to extinction than does continuous reinforcement.

Lesson for parents and babysitters: Partial reinforcement also works with children. Occasionally giving in to children's tantrums for the sake of peace and quiet intermittently reinforces the tantrums. This is the very best procedure for making a behavior persist.

Skinner (1961) and his collaborators compared four schedules of partial reinforcement. Some are rigidly fixed, some unpredictably variable.

Fixed-ratio schedules reinforce behavior after a set number of responses. Coffee shops may reward us with a free drink after every 10 purchased. Rats may be reinforced on a fixed ratio of, say, one food pellet for every 30 responses. Once conditioned, animals will pause only briefly after a reinforcer before returning to a high rate of responding (Figure 27.4).
fixed-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified number of responses.

Figure 27.4 Intermittent reinforcement schedules Skinner's (1961) laboratory pigeons produced these response patterns to each of four reinforcement schedules. (Reinforcers are indicated by diagonal marks.) For people, as for pigeons, reinforcement linked to number of responses (a ratio schedule) produces a higher response rate than reinforcement linked to amount of time elapsed (an interval schedule). But the predictability of the reward also matters. An unpredictable (variable) schedule produces more consistent responding than does a predictable (fixed) schedule.

Variable-ratio schedules provide reinforcers after a seemingly unpredictable number of responses. This unpredictable reinforcement is what slot-machine players and fly fishers experience, and it's what makes gambling and fly fishing so hard to extinguish even when they don't produce the desired results. Because reinforcers increase as the number of responses increases, variable-ratio schedules produce high rates of responding.

variable-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response after an unpredictable number of responses.

Fixed-interval schedules reinforce the first response after a fixed time period. Animals on this type of schedule tend to respond more frequently as the anticipated time for reward draws near. People check more frequently for the mail as the delivery time approaches. Pigeons peck keys more rapidly as the time for reinforcement draws nearer. This produces a choppy stop-start pattern rather than a steady rate of response (see Figure 27.4).

fixed-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified time has elapsed.

Variable-interval schedules reinforce the first response after varying time intervals. At unpredictable times, a food pellet rewarded Skinner's pigeons for persistence in pecking a key. Like the longed-for message that finally rewards persistence in checking our phone, variable-interval schedules tend to produce slow, steady responding. This makes sense, because there is no knowing when the waiting will be over (Table 27.2).

variable-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response at unpredictable time intervals.

TABLE 27.2 Schedules of Partial Reinforcement
Fixed ratio (every so many): reinforcement after every nth behavior, such as buy 10 coffees, get 1 free, or pay workers per product unit produced.
Variable ratio (after an unpredictable number): reinforcement after a random number of behaviors, as when playing slot machines or fly fishing.
Fixed interval (every so often): reinforcement for behavior after a fixed time, such as Tuesday discount prices.
Variable interval (unpredictably often): reinforcement for behavior after a random amount of time, as when studying for an unpredictable pop quiz.

AP® EXAM TIP
Students sometimes have difficulty with the schedules of reinforcement. The word interval in schedules of reinforcement means that an interval of time must pass before reinforcement. There is nothing the learner can do to shorten the interval. The word ratio refers to the ratio of responses to reinforcements. If the learner responds with greater frequency, there will be more reinforcements.
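To make the four schedules concrete, here is a minimal sketch in Python (not from the text): each schedule is written as a rule that decides, response by response, whether the "food dispenser" fires. The parameter values and the one-response-per-second pacing are illustrative assumptions.

```python
import random

# Each function returns a respond(t) rule: call it once per response,
# passing the current time t in simulated seconds; it returns True when
# that response is reinforced. All specific numbers are illustrative.

def fixed_ratio(n):
    count = 0
    def respond(t):
        nonlocal count
        count += 1
        if count >= n:            # every nth response pays off
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n):
    count, target = 0, random.randint(1, 2 * mean_n)
    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:       # an unpredictable number of responses
            count, target = 0, random.randint(1, 2 * mean_n)
            return True
        return False
    return respond

def fixed_interval(period):
    next_time = period
    def respond(t):
        nonlocal next_time
        if t >= next_time:        # first response after a fixed time
            next_time = t + period
            return True
        return False
    return respond

def variable_interval(mean_period):
    next_time = random.uniform(0, 2 * mean_period)
    def respond(t):
        nonlocal next_time
        if t >= next_time:        # first response after a random time
            next_time = t + random.uniform(0, 2 * mean_period)
            return True
        return False
    return respond

# One response per simulated second for 60 seconds on each schedule:
random.seed(0)
for name, sched in [("FR-10", fixed_ratio(10)), ("VR-10", variable_ratio(10)),
                    ("FI-10", fixed_interval(10)), ("VI-10", variable_interval(10))]:
    rewards = sum(sched(t) for t in range(60))
    print(name, rewards)
```

Notice that on the two ratio rules, responding faster earns more rewards per minute, while on the two interval rules, responding faster changes nothing—exactly the distinction the tip above draws.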
In general, response rates are higher when reinforcement is linked to the number of responses (a ratio schedule) rather than to time (an interval schedule). But responding is more consistent when reinforcement is unpredictable (a variable schedule) than when it is predictable (a fixed schedule). Animal behaviors differ, yet Skinner (1956) contended that the reinforcement principles of operant conditioning are universal. It matters little, he said, what response, what reinforcer, or what species you use. The effect of a given reinforcement schedule is pretty much the same: "Pigeon, rat, monkey, which is which? It doesn't matter.... Behavior shows astonishingly similar properties."

"The charm of fishing is that it is the pursuit of what is elusive but attainable, a perpetual series of occasions for hope." Scottish author John Buchan (1875–1940)

Check Your Understanding

Ask Yourself
In your everyday life, what type of reinforcement schedule do you respond to most strongly?

Test Yourself
People who send spam e-mail are reinforced by which schedule?
Home bakers checking the oven to see if the cookies are done are on which schedule?
Sandwich shops that offer a free sandwich after every 10 sandwiches purchased are using which reinforcement schedule?

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Punishment

27-5 How does punishment differ from negative reinforcement, and how does punishment affect behavior?

Reinforcement increases a behavior; punishment does the opposite. So, while negative reinforcement increases the frequency of a preceding behavior (by withdrawing something negative), a punisher is any consequence that decreases the frequency of a preceding behavior (Table 27.3). Swift and sure punishers can powerfully restrain unwanted behavior. The rat that is shocked after touching a forbidden object and the child who is burned by touching a hot stove will learn not to repeat those behaviors.

punishment an event that tends to decrease the behavior that it follows.

TABLE 27.3 Ways to Decrease Behavior
Positive punishment: administer an aversive stimulus. Examples: spray water on a barking dog; give a traffic ticket for speeding.
Negative punishment: withdraw a rewarding stimulus. Examples: take away a misbehaving teen's driving privileges; revoke a rude person's chat room access.

Criminal behavior, much of it impulsive, is also influenced more by swift and sure punishers than by the threat of severe sentences (Darley & Alter, 2013). Thus, when Arizona introduced an exceptionally harsh sentence for first-time drunk drivers, the drunk-driving rate changed very little. But when Kansas City police started patrolling a high crime area to increase the swiftness and sureness of punishment, that city's crime rate dropped dramatically.

AP® EXAM TIP
You must be able to differentiate between reinforcement and punishment. Remember that any kind of reinforcement (positive, negative, primary, conditioned, immediate, delayed, continuous, or partial) encourages the behavior. Any kind of punishment discourages the behavior. Positive and negative do not refer to values—it's not that positive reinforcement (or punishment) is the good kind and negative is the bad. Think of positive and negative mathematically; a stimulus is added with positive reinforcement (or punishment) and a stimulus is subtracted with negative reinforcement (or punishment).

What do punishment studies imply for parenting?
One analysis of over 160,000 children found that physical punishment rarely corrects unwanted behavior (Gershoff & Grogan-Kaylor, 2016). Many psychologists note four major drawbacks of physical punishment (Finkenauer et al., 2015; Gershoff, 2002; Marshall, 2002).

1. Punished behavior is suppressed, not forgotten. This temporary state may (negatively) reinforce parents' punishing behavior. The child swears, the parent swats, the parent hears no more swearing and feels the punishment successfully stopped the behavior. No wonder spanking is a hit with so many parents—with 60 percent of children around the world spanked or otherwise physically punished (UNICEF, 2014).

2. Punishment teaches discrimination among situations. In operant conditioning, discrimination occurs when an organism learns that certain responses, but not others, will be reinforced. Did the punishment effectively end the child's swearing? Or did the child simply learn that while it's not okay to swear around the house, it's okay elsewhere?

3. Punishment can teach fear. In operant conditioning, generalization occurs when an organism's response to similar stimuli is also reinforced. A punished child may associate fear not only with the undesirable behavior but also with the person who delivered the punishment or where it occurred. Thus, children may learn to fear a punishing teacher and try to avoid school, or may become more anxious (Gershoff et al., 2010). For such reasons, most European countries and 31 U.S. states now ban hitting children in schools and child-care institutions (EndCorporalPunishment.org). As of 2017, 51 countries outlaw hitting by parents. A large survey in Finland, the second country to pass such a law, revealed that children born after the law passed were, indeed, less often slapped and beaten (Österman et al., 2014).

4. Physical punishment may increase aggression by modeling violence as a way to cope with problems. Studies find that spanked children are at increased risk for aggression (MacKenzie et al., 2013). We know, for example, that many aggressive delinquents and abusive parents come from abusive families (Straus & Gelles, 1980; Straus et al., 1997).

Some researchers question this logic. Physically punished children may be more aggressive, they say, for the same reason that people who have undergone psychotherapy are more likely to suffer depression—because they had preexisting problems that triggered the treatments (Ferguson, 2013; Larzelere, 2000; Larzelere et al., 2004). So, does spanking cause misbehavior, or does misbehavior trigger spanking? Correlations don't hand us an answer.

The debate continues. Some researchers note that frequent spankings predict future aggression—even when studies control for preexisting bad behavior (Taylor et al., 2010). Other researchers believe that lighter spankings pose less of a problem (Baumrind et al., 2002; Larzelere & Kuhn, 2005). That is especially so if physical punishment is used only as a backup for milder disciplinary tactics, and if it is combined with a generous dose of reasoning and reinforcing.

Parents of delinquent youths are often unaware of how to achieve desirable behaviors without screaming at, hitting, or threatening their children with punishment (Patterson et al., 1982). Training programs can help transform dire threats ("Apologize right now or I'm taking that cell phone away!") into positive incentives ("You're welcome to have your phone back when you apologize."). Stop and think about it.
Aren't many threats of punishment just as forceful, and perhaps more effective, when rephrased positively? Thus, "If you don't get your homework done, I'm not giving you money for a movie!" could be phrased more positively as.... In classrooms, too, teachers can give feedback by saying, "No, but try this..." and "Yes, that's it!" Such responses reduce unwanted behavior while reinforcing more desirable alternatives.

Remember: Punishment tells you what not to do; reinforcement tells you what to do. Thus, punishment trains a particular sort of morality—one focused on prohibition (what not to do) rather than positive obligations (Sheikh & Janoff-Bulman, 2013).

"A pat on the back, though only a few vertebrae removed from a kick in the pants, is miles ahead in results." Attributed to publisher Bennett Cerf (1898–1971)

What punishment often teaches, said Skinner, is how to avoid it. Most psychologists now favor an emphasis on reinforcement: Notice people doing something right and affirm them for it.

Check Your Understanding

Ask Yourself
Have you witnessed punishment that was effective? That was ineffective? Do you understand why there was a difference?

Test Yourself
Fill in the three blanks below with one of the following terms: positive reinforcement (PR), negative reinforcement (NR), positive punishment (PP), and negative punishment (NP). We have provided the first answer (PR) for you.
Desired stimulus (for example, a teen's use of the car): give it, 1. PR; take it away, 2. ___
Undesired/aversive stimulus (for example, an insult): give it, 3. ___; take it away, 4. ___

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Skinner's Legacy

27-6 Why did Skinner's ideas provoke controversy?

B. F. Skinner stirred a hornet's nest with his outspoken beliefs. He repeatedly insisted that external influences, not internal thoughts and feelings, shape behavior. He argued that brain science isn't needed for psychological science, saying that "a science of behavior is independent of neurology" (Skinner, 1938/1966, pp. 423–424). And he urged people to use operant conditioning principles to influence others' behavior at school, work, and home. Knowing that behavior is shaped by its results, he argued that we should use rewards to evoke more desirable behavior.

B. F. Skinner: "I am sometimes asked, 'Do you think of yourself as you think of the organisms you study?' The answer is yes. So far as I know, my behavior at any given moment has been nothing more than the product of my genetic endowment, my personal history, and the current setting" (1983).

Skinner's critics objected, saying that he dehumanized people by neglecting their personal freedom and by seeking to control their actions. Skinner's reply: External consequences already haphazardly control people's behavior. Why not administer those consequences toward human betterment? Wouldn't reinforcers be more humane than the punishments used in homes, schools, and prisons? And if it is humbling to think that our history has shaped us, doesn't this very idea also give us hope that we can shape our future?

Module 27 REVIEW

27-1 What is operant conditioning?
Operant conditioning is a type of learning in which behavior is strengthened if followed by a reinforcer or diminished if followed by a punisher.

27-2 Who was Skinner, and how is operant behavior reinforced and shaped?
B. F. Skinner was a college English major and aspiring writer who later entered psychology graduate school.
He became modern behaviorism's most influential and controversial figure. Expanding on Edward Thorndike's law of effect, B. F. Skinner and others found that the behavior of rats or pigeons placed in an operant chamber (Skinner box) can be shaped by using reinforcers to guide closer and closer approximations of the desired behavior.

27-3 How do positive and negative reinforcement differ, and what are the basic types of reinforcers?
Reinforcement is any consequence that strengthens behavior. Positive reinforcement adds a desirable stimulus to increase the frequency of a behavior. Negative reinforcement reduces or removes an aversive stimulus to increase the frequency of a behavior.
Primary reinforcers (such as receiving food when hungry or having nausea end during an illness) are innately satisfying—no learning is required. Conditioned (or secondary) reinforcers (such as cash) are satisfying because we have learned to associate them with more basic rewards (such as the food or medicine we buy with them).
Immediate reinforcers (such as a purchased treat) offer immediate payback; delayed reinforcers (such as a paycheck) require the ability to delay gratification.

27-4 How do different reinforcement schedules affect behavior?
A reinforcement schedule defines how often a response will be reinforced. In continuous reinforcement (reinforcing desired responses every time they occur), learning is rapid, but so is extinction if rewards cease. In partial (intermittent) reinforcement (reinforcing responses only sometimes), initial learning is slower, but the behavior is much more resistant to extinction.
Fixed-ratio schedules reinforce behaviors after a set number of responses; variable-ratio schedules, after an unpredictable number. Fixed-interval schedules reinforce behaviors after set time periods; variable-interval schedules, after unpredictable time periods.

27-5 How does punishment differ from negative reinforcement, and how does punishment affect behavior?
Punishment administers an undesirable consequence (such as spanking) or withdraws something desirable (such as taking away a favorite toy) in an attempt to decrease the frequency of a behavior (a child's disobedience). Negative reinforcement (taking an aspirin) removes an aversive stimulus (a headache). This desired consequence (freedom from pain) increases the likelihood that the behavior (taking aspirin to end pain) will be repeated.
Punishment can have undesirable side effects, such as suppressing rather than changing unwanted behaviors, encouraging discrimination (so that the undesirable behavior appears when the punisher is not present), creating fear, teaching aggression, and fostering depression and low self-esteem.

27-6 Why did Skinner's ideas provoke controversy?
Critics of Skinner's principles believed the approach dehumanized people by neglecting their personal freedom and seeking to control their actions. Skinner replied that people's actions are already controlled by external consequences, and that reinforcement is more humane than punishment as a means for controlling behavior.

Multiple-Choice Questions

1. The purpose of reinforcement is to
a. cause a behavior to stop.
b. cause a behavior to diminish.
c. cause a behavior to continue.
d. strengthen the spontaneous recovery process.
e. cause a behavior to occur for only a limited amount of time.

2. Which of the following best describes negative reinforcement?
a. John stops shooting bad free-throws because his coach benches him when he does.
b. Brian studies hard because it earns him "A" grades in math.
c. Lillian used to walk to school but does not do so anymore because she was attacked by a dog last month.
d. Charles smokes because his anxiety is reduced when he does so.
e. Osel wears his seat belt because his driving teacher cited accident statistics in class.

3. Thorndike's principle that behaviors followed by favorable consequences become more likely to be repeated is known as what?
a. Law of effect
b. Operant conditioning
c. Shaping
d. Respondent behavior
e. Discrimination

4. All of the following are examples of primary reinforcers except a
a. rat's food reward in a Skinner box.
b. cold drink on a hot day.
c. high score on an exam for which a student studied diligently.
d. hug from a loved one.
e. large meal following an extended time without food.

5. Shea bought 10 tickets for the raffle for free homecoming entry, but she did not win. Months later she also buys 10 tickets for the senior prom raffle, hoping this will be the time she wins. Which schedule of reinforcement is best used to explain this scenario?
a. Fixed-ratio
b. Variable-ratio
c. Fixed-interval
d. Variable-interval
e. Continuous

Practice FRQs

1. Mom is frustrated because 3-year-old Maya has started to spit frequently. She has decided to temporarily put away one of Maya's toys every time she spits. Mom is going to continue to do this until Maya has stopped spitting.
Explain whether Mom's plan uses reinforcement or punishment.
Explain whether Mom's plan is a positive or negative form of reinforcement or punishment.

Answer
1 point: The plan uses punishment, because it is designed to reduce the frequency of spitting. (Both the identification of punishment and the explanation are required to earn the point.) Page 289
1 point: This is negative punishment because toys are being taken away from Maya. (Both the identification of negative punishment and the explanation are required to earn the point.) Page 289

2. Your calculus teacher wants her students to be more diligent in completing their homework and, since you are taking AP® psychology, she has asked for your help. Give an example of how she could use each of the following to help her increase homework completion.
Shaping
Negative reinforcement
Fixed-interval schedule of reinforcement
(3 points)

Module 28
Operant Conditioning's Applications, and Comparison to Classical Conditioning

LEARNING TARGETS
28-1 Discuss ways to apply operant conditioning principles at school, in sports, at work, at home, for self-improvement, and to manage stress.
28-2 Identify the characteristics that distinguish operant conditioning from classical conditioning.

Applications of Operant Conditioning

28-1 How might operant conditioning principles be applied at school, in sports, at work, at home, for self-improvement, and to manage stress?

Would you like to apply operant conditioning principles to your own life—to be a healthier person, a more successful student, or a high-achieving athlete? Reinforcement techniques are at work in schools, sports, workplaces, and homes (Flora, 2004). These principles can support our self-improvement and help us to manage stress as well.

AT SCHOOL

More than 50 years ago, Skinner and others worked toward a day when "machines and textbooks" would shape learning in small steps by immediately reinforcing correct responses. Such machines and textbooks, they said, would revolutionize education and free teachers to focus on each student's special needs.
"Good instruction demands two things," said Skinner (1989). "Students must be told immediately whether what they do is right or wrong and, when right, they must be directed to the step to be taken next."

Computer-assisted learning: Computers have helped realize Skinner's goal of individually paced instruction with immediate feedback.

Skinner might be pleased to know that many of his ideals for education are now possible. Teachers used to find it difficult to pace material to each student's rate of learning and to provide prompt feedback. Online adaptive quizzing does both. Students move through quizzes at their own pace, according to their own level of understanding. And they get immediate feedback on their efforts.

FYI
In later units we will learn more about how psychologists apply operant conditioning principles to help people moderate high blood pressure or gain social skills.

IN SPORTS

The key to shaping behavior in athletic performance, as elsewhere, is first reinforcing small successes and then gradually increasing the challenge. Golf students can learn putting by starting with very short putts, and then, as they build mastery, stepping back farther and farther. Novice batters can begin with half swings at an oversized ball pitched from 10 feet away, giving them the immediate pleasure of smacking the ball. As the hitters' confidence builds with their success and they achieve mastery at each level, the pitcher gradually moves back and eventually introduces a standard baseball. Compared with children taught by conventional methods, those trained by this behavioral method have shown faster skill improvement (Simek & O'Brien, 1981, 1988).

In sports as in the laboratory, the accidental timing of rewards can produce superstitious behaviors. If a Skinner box food dispenser gives a pellet of food every 15 minutes, whatever the animal happened to be doing just before the food arrives (perhaps scratching itself) is more likely to be repeated and reinforced, which occasionally can produce a persistent superstitious behavior. Likewise, if a baseball or softball player gets a hit after tapping the plate with the bat, he or she may be more likely to do so again. Over time the player may experience partial reinforcement for what becomes a superstitious behavior.

AT WORK

Knowing that reinforcers influence productivity, many organizations have invited employees to share the risks and rewards of company ownership. Others focus on reinforcing a job well done. Rewards are most likely to increase productivity if the desired performance is both well-defined and achievable. The message for managers? Reward specific, achievable behaviors, not vaguely defined "merit." Operant conditioning also reminds us that reinforcement should be immediate. IBM legend Thomas Watson understood this. When he observed an achievement, he wrote the employee a check on the spot (Peters & Waterman, 1982). But rewards need not be material, or lavish. An effective manager may simply walk the floor and sincerely affirm people for good work, or write notes of appreciation for a completed project. As Skinner said, "How much richer would the whole world be if the reinforcers in daily life were more effectively contingent on productive work?"

IN PARENTING

As we have seen, parents can learn from operant conditioning practices. Parent-training researchers remind us that by saying, "Get ready for bed" and then caving in to protests or defiance, parents reinforce such whining and arguing (Wierson & Forehand, 1994).
Exasperated, they may then yell or gesture menacingly. When the child, now frightened, obeys, that reinforces the parents' angry behavior. Over time, a destructive parent-child relationship develops.

FYI
Notice how useful operant conditioning is. People with an understanding of the principles of operant conditioning possess a tool for changing behavior. If you don't like the way your friends, teachers, coaches, or parents behave, pay attention to the uses of operant conditioning!

To disrupt this cycle, parents should remember that basic rule of shaping: Notice people doing something right and affirm them for it. Give children attention and other reinforcers when they are behaving well. Target a specific behavior, reward it, and watch it increase. When children misbehave or are defiant, don't yell at them or hit them. Simply explain the misbehavior and take away the iPad, remove a misused toy, or give a brief time-out.

FOR SELF-IMPROVEMENT

We can use operant conditioning in our own lives. To build up your self-control, you need to reinforce your own desired behaviors (perhaps to improve your study habits) and extinguish the undesired ones (to stop texting while studying, for example). Psychologists suggest taking these steps, which the brief sketch after this list turns into a concrete routine:

1. State a realistic goal in measurable terms and announce it. You might, for example, aim to boost your study time by an hour a day. To increase your commitment and odds of success, share that goal with friends.

2. Decide how, when, and where you will work toward your goal. Take time to plan. Those who specify how they will implement goals become more focused on those goals and more often fulfill them (Gollwitzer & Oettingen, 2012).

3. Monitor how often you engage in your desired behavior. You might log your current study time, noting under what conditions you do and don't study. (When we began writing textbooks, we each logged our time and were amazed to discover how much time we were wasting.)

4. Reinforce the desired behavior. People's persistence toward long-term goals, such as New Year's resolutions to study or exercise more, is powered mostly by immediate rewards (Wooley & Fishbach, 2017). So, to increase your study time, give yourself a reward (a snack or some activity you enjoy) only after you finish your extra hour of study. Agree with your friends that you will join them for weekend activities only if you have met your realistic weekly studying goal.

5. Reduce the rewards gradually. As your new behaviors become more habitual, give yourself a mental pat on the back instead of a cookie.

"I wrote another five hundred words. Can I have another cookie?"
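For readers who like to see the steps operationalized, here is a minimal sketch in Python (not from the text) of a self-monitoring study log. The goal size, reward rule, and three-week tapering point are illustrative assumptions; the point is simply that steps 3 through 5 are concrete enough to write down as rules.

```python
from datetime import date

# A minimal sketch (not from the text) of steps 3-5: log the behavior,
# reinforce it immediately when the daily goal is met, and taper the
# reward as the habit takes hold. All thresholds are illustrative.

GOAL_MINUTES = 60          # step 1: a realistic, measurable daily goal
TAPER_AFTER_DAYS = 21      # assumption: begin thinning rewards after 3 weeks

log = {}                   # step 3: monitor the behavior day by day

def record_study(minutes, day=None):
    day = day or date.today().isoformat()
    log[day] = log.get(day, 0) + minutes
    if log[day] >= GOAL_MINUTES:               # step 4: immediate reward
        if len(log) <= TAPER_AFTER_DAYS:
            return "Goal met today. Take your snack or fun break now."
        return "Goal met. Habit forming: a mental pat on the back."  # step 5
    return f"{GOAL_MINUTES - log[day]} minutes to go."

print(record_study(30, "2024-01-01"))
print(record_study(35, "2024-01-01"))
```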
TO MANAGE STRESS

In addition, we can literally learn from ourselves. There is some evidence that when we have feedback about our bodily responses, we can sometimes change those responses. When a few psychologists started experimenting with the idea that we could train people to counteract stress, many of their colleagues thought them foolish. After all, these functions are controlled by the autonomic ("involuntary") nervous system. Then, in the late 1960s, experiments by respected psychologists made the skeptics wonder. Neal Miller, for one, found that rats could modify their heartbeat if given pleasurable brain stimulation when their heartbeat increased or decreased. Later research revealed that some paralyzed humans could also learn to control their blood pressure (Miller & Brucker, 1979).

Miller was experimenting with biofeedback, a system of recording, amplifying, and feeding back information about subtle physiological responses. Biofeedback instruments mirror the results of a person's own efforts, thereby allowing the person to learn techniques for controlling a particular physiological response (Figure 28.1). After a decade of study, however, researchers decided the initial claims for biofeedback were overblown and oversold (Miller, 1985). A 1995 National Institutes of Health panel declared that biofeedback works best on tension headaches. And simple methods of physical activity and relaxation, such as exercise and meditation (see Module 44), can help us keep our stress levels under control.

biofeedback a system for electronically recording, amplifying, and feeding back information regarding a subtle physiological state, such as blood pressure or muscle tension.

Figure 28.1 Biofeedback systems Biofeedback systems—such as this one, which records tension in the forehead muscle of a headache sufferer—allow people to monitor their subtle physiological responses. As this man relaxes his forehead muscle, the pointer on the display screen (or a tone) may go lower.

Check Your Understanding

Ask Yourself
Have you ever found yourself engaging in any superstitious behaviors before a big sporting event or performance? How would you explain this phenomenon?

Test Yourself
Ethan constantly misbehaves at preschool even though his teacher scolds him repeatedly. Why does Ethan's misbehavior continue, and what can his teacher do to stop it?

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Contrasting Classical and Operant Conditioning

28-2 How does operant conditioning differ from classical conditioning?

Flip It Video: Associative Learning Principles

Both classical and operant conditioning are forms of associative learning. Both involve acquisition, extinction, spontaneous recovery, generalization, and discrimination. But these two forms of learning also differ. Through classical (Pavlovian) conditioning, we associate different stimuli we do not control, and we respond automatically (respondent behaviors) (Table 28.1). Through operant conditioning, we associate our own behaviors—which act on our environment to produce rewarding or punishing stimuli (operant behaviors)—with their consequences.

AP® EXAM TIP
You may have to differentiate classical conditioning and operant conditioning on the AP® exam. Classical conditioning is involuntary (respondent) behavior, while operant conditioning is voluntary (operant) behavior.

TABLE 28.1 Comparison of Classical and Operant Conditioning
Basic idea. Classical: learning associations between events we do not control. Operant: learning associations between our behavior and its consequences.
Response. Classical: involuntary, automatic. Operant: voluntary, operates on environment.
Acquisition. Classical: associating events; NS is paired with US and becomes a CS. Operant: associating a response with a consequence (reinforcer or punisher).
Extinction. Classical: CR decreases when CS is repeatedly presented alone. Operant: responding decreases when reinforcement stops.
Spontaneous recovery. Classical: the reappearance, after a rest period, of an extinguished CR. Operant: the reappearance, after a rest period, of an extinguished response.
Generalization. Classical: the tendency to respond to stimuli similar to the CS. Operant: responses learned in one situation occurring in other, similar situations.
Discrimination. Classical: learning to distinguish between a CS and other stimuli that do not signal a US. Operant: learning that some responses, but not others, will be reinforced.

As we shall see next, our biology and cognitive processes influence both classical and operant conditioning.

Check Your Understanding

Ask Yourself
Can you recall a time when a teacher, coach, family member, or employer helped you learn something by shaping your behavior in little steps until you achieved your goal?

Test Yourself
Salivating in response to a tone paired with food is a(n) _____________ behavior; pressing a bar to obtain food is a(n) ______________ behavior.

Answers to the Test Yourself questions can be found in Appendix E at the end of the book.

Module 28 REVIEW

28-1 How might operant conditioning principles be applied at school, in sports, at work, at home, for self-improvement, and to manage stress?
Teachers can use shaping techniques to guide students' behaviors, and they can use interactive media to provide immediate feedback.
In sports, where the accidental timing of rewards can produce superstitious behavior, coaches can nevertheless build players' skills and self-confidence by rewarding small improvements.
Managers can boost productivity and morale by rewarding well-defined and achievable behaviors.
Parents can reward desired behaviors but not undesirable ones.
We can shape our own behaviors by stating our goals, monitoring the frequency of desired behaviors, reinforcing desired behaviors, and gradually reducing rewards as behaviors become habitual. We can learn from our bodily responses to manage stress; biofeedback is one studied method.

28-2 How does operant conditioning differ from classical conditioning?
In operant conditioning, an organism learns associations between its own behavior and resulting