Literature Research Methods (Chapter 4 & 5) PDF
Document Details
University of Amsterdam
Summary
This document is a chapter from a textbook, focusing on observational research and different research methods. The chapter details observational research, case studies, and surveys, alongside a discussion on strengths and weaknesses of case studies, providing examples in neuropsychology.
Full Transcript
CHAPTER 4
OBSERVATION AND DESCRIPTION

You can observe a lot by watching.
- YOGI BERRA

IN THIS CHAPTER, WE'LL DISCUSS:
* How observational research differs from experimental research
* Some methods used in observational research:
  Case studies
  Surveys and interviews
  Participant observation
  Direct observation of behavior
* How observational studies can:
  Test hypotheses
  Provide an overview of a problem area
  Answer further questions raised by the original findings
  Show us how we can replace opinion with data

In the previous chapter, we saw some examples of what data look like and what can be done with them. Now we begin to address the nuts-and-bolts questions, such as: How shall we proceed to gather good data - solid facts on which to base our beliefs?

We can divide research methods into two broad classes. In one - experimental research - the investigator intervenes to make something happen. He deliberately causes something to vary. In a word, he manipulates some variable. Then he observes the effect of that manipulation. That is what an experiment is.

In the second kind - nonexperimental research - the investigator does not deliberately make something happen, but instead sets out to describe some aspect of nature as he finds it. He observes and describes it carefully, without attempting to influence events. We will call this very broad class of methods observational research. This chapter and the next one will be devoted to it.

In observational research, it's true that the researcher may intervene in the situation in order to gather data - as, for example, when a pollster asks someone to answer questions. However, this doesn't make the procedure experimental. The difference is that the pollster is not investigating the effect of his intervention per se.
Indeed, he typically assumes that the intervention will have no effect at all, but only allow him to find out something about the subject - her opinion on some issue, or whom she intends to vote for - that would be true whether the pollster found it out or not. Of course this assumption may not always be justified (see "observer effects" later).

But now let us zero in on some examples of observational research. We will see what sorts of things can be learned by careful observation.

CASE STUDIES

In the simplest case, it may be important that some event occurs at all, even if only rarely. The mere fact that it can occur may tell us something interesting. In such instances, we may devote a great deal of study to just a few cases or even only one. Hence the term case study.

AN EXAMPLE FROM NEUROPSYCHOLOGY: BROCA'S SPEECH AREA

In 1861, the French physician Paul Broca (1824-1880), a distinguished pathologist and neuroanatomist, encountered a patient called "Tan." The patient was given that name because that was the only word he could say! Except for that one word, he could not speak. He suffered from what we now call expressive aphasia - the inability to produce speech.

"Tan" was placed in Broca's care, but he died only six days later. An autopsy revealed damage to the brain at a certain area (now called Broca's Area) on the surface of the left frontal lobe (Figure 4.1).

In the little time he had, Broca studied "Tan" very intensively, trying to pinpoint just what the difficulty was. This patient had trouble producing speech, though he could understand it perfectly well. Why? It was not paralysis; the speech muscles were working just fine. "Tan"'s comprehension of language was intact, and his general intelligence seemed normal. Later case studies have supported this conclusion, showing that patients with Broca's aphasia can understand speech, many can write, and intelligence may be unimpaired.
Rather, Broca concluded, such patients suffer from the failure of a specialized memory mechanism. It is as if they cannot remember how to use their speech apparatus to form words.

FIGURE 4.1 Broca's area, on the frontal lobe of the left hemisphere of the brain.

Six months after the death of "Tan," Broca had the opportunity to study a second patient. The symptoms were very similar to "Tan"'s and, it developed, so was the damage to the brain. Broca concluded that he had located an area of the brain that was specialized for speech production.

That conclusion fed into a longstanding controversy. Could one assign specific functions to different areas of the brain at all? Many students of brain function doubted it, and there were logical - and theological - reasons to doubt it. Grant that the brain is the organ of mind. Then, if a mind is single and indivisible - you speak of your mind, not of your minds! - it would seem that the brain also must work as a unit, or so the argument went.

Now, here were Broca's two cases and the others that followed: damage just here in the brain, associated with just this defect - difficulty in producing speech. Encouraged by Broca's convincing demonstrations, other scientists, in the decades following, sought - and found - other instances of such "localization of function" in the brain.

These findings and the surrounding controversy remind us again that small findings can bear on big issues. Does a person have one mind, or many? Does the mind have parts? If it does, does the soul have parts? (See Churchland, 1986, for discussion.) Questions about the very nature of humanity lie behind these limited, "localized" clinical case studies. Broca himself was well aware of this:

[T]he days were over when one could say without hesitation... that the soul being single, the brain, in spite of neuroanatomy, must be single also.
Everything concerning the connection of mind with matter had been called into question, and in the midst of the uncertainties that surrounded the solution to this great problem, anatomy and physiology, until then reduced to silence, finally had to raise its voice.

STRENGTHS AND LIMITATIONS OF CASE STUDIES

Case studies can be useful in telling us what can happen; for example, that a specific failure can be linked to a specific area of brain damage. The very fact that such things can happen may be important, for they bear on more general ideas, such as whether there is localization of function in the brain. They also may suggest new ideas or new lines of study, and they may add to the "weight of evidence" favoring or challenging a theoretical principle (Chapter 2), as Broca's studies did.

But case studies have two important limitations. First, they tell us that this or that sort of event can happen, but they do not tell us anything about what typically does happen. Sigmund Freud's work, for example, depended on case studies (though we present none here, for they are extremely long and involved). And his theory has been criticized on just those grounds - among others. From his conclusion that neurotic symptoms can express certain unconscious urges, he went on to claim that all behavior has such sources. Many writers feel that Freud's theory of the human mind simply went too far beyond his data, which consisted of a few case studies of very atypical people: affluent Viennese persons with symptoms.

Second, case studies are especially susceptible to two sources of error. These can damage any research project, but are especially dangerous here because of the close, repeated contact between the observer-scientist and his subject matter. First, there is the problem of observer bias.
The observer (e.g., a therapist) may see what she expects to see; or she may select, from all that she sees, those events that she expects or that fit her theory. Thus a psychoanalyst may, consciously or not, attach importance to those events that fit the theory and less importance to those that do not.

Then there is the problem of observer effects. Even if an observer sees what is going on in an unbiased way, she may have unintended effects on what goes on. An observer or a therapist, by listening attentively to some remarks but not to others, or asking for elaboration of some remarks but not others, may steer the conversation in the direction in which she, the observer or therapist, thinks it should go.

For all these reasons, scientists tend to treat case studies with caution.

SURVEYS: INTERVIEWS AND QUESTIONNAIRES

Whereas case studies are likely to focus on one or a few instructive "cases," survey research is more likely to ask: What is true of large groups of cases - many people, or many groups, on which we obtain data? A survey may be small-scale or large; but either way, individual cases will not be the focus of attention. Rather they will be treated as parts of a whole sample of cases. And the sample in turn may be considered a part of a larger population, to which we want our conclusion to apply.

Survey research as used in political or opinion polling, or in market research, is familiar to the reader, and we will say relatively little about it here. It does face technical problems. The sampling of cases can be tricky, and the phrasing of survey questions is an art in itself (Figure 4.2) - a highly specialized one, and we cannot get into it here (see Babbie, 1989). Let us look at just one example of how such surveys can bear on questions of psychological interest.

"Next question: I believe that life is a constant striving for balance, requiring frequent tradeoffs between morality and necessity, within a cyclic pattern of joy and sadness, forging a trail of bittersweet memories until one slips, inevitably, into the jaws of death. Agree or disagree?"
FIGURE 4.2 The construction of survey questions is an art in itself. This survey was not constructed by an artist.

A DEMOGRAPHIC SURVEY: COHABITATION AND DIVORCE

Remember the decrease in the proportion of young American adults who were married, and the increase in those who were living together unmarried, from the 1940s to the 1960s (Chapter 1)? Here is one objective indicator of the marked change in American society's attitudes toward sexuality - a psychological phenomenon in its own right - over that period. It suggests some further questions.

We might ask, for instance: Is it a good idea for couples to live together before marrying? Maybe such a "trial marriage," in which the partners learned all about each other first, would serve to prevent many unsuitable marriages, thus reducing the risk of divorce later. (In a survey of nearly 300,000 American college students, 51% agreed that "a couple should live together before marriage"; Astin, Korn, & Riggs, 1993). Or we could just as well argue the opposite: Maybe those who cohabit first are more likely to regard marriage itself as an experiment rather than a commitment, and so be more likely to divorce if they become dissatisfied.

We could argue it either way - or we could look and see. The data tell us that couples who cohabited before marriage were more likely to divorce than those who did not, by as much as two to one. This has been found in the United States (Greeley, 1991), in Canada (Balakrishan, Rao, Lapierre-Adamcyk, & Krotki, 1987), and in Sweden (Bennett, Blanc, & Bloom, 1988). The evidence is pretty weighty!

Now, a note of caution about these studies. They do not necessarily mean that the prior cohabitation causes a greater risk of divorce later on.
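To see how a two-to-one difference could arise with no causal link at all, here is a minimal sketch in Python. Every number in it is invented for illustration and none comes from the studies cited; the two "attitude" groups, their names, and the function divorce_rate are likewise assumptions. In this toy model, divorce risk depends only on a couple's attitudes, never on whether they cohabited.

```python
from fractions import Fraction

# All numbers are hypothetical, chosen only to illustrate the logic.
# Share of couples holding each (invented) attitude.
P_ATTITUDE = {"traditional": Fraction(1, 2), "nontraditional": Fraction(1, 2)}
# Probability of cohabiting before marriage, given attitude.
P_COHABIT = {"traditional": Fraction(1, 5), "nontraditional": Fraction(4, 5)}
# Probability of divorce, given attitude. Cohabitation plays no role here.
P_DIVORCE = {"traditional": Fraction(1, 10), "nontraditional": Fraction(1, 2)}


def divorce_rate(cohabited: bool) -> Fraction:
    """Divorce rate among couples who did (or did not) cohabit first."""
    divorced = Fraction(0)
    total = Fraction(0)
    for attitude, share in P_ATTITUDE.items():
        p_group = P_COHABIT[attitude] if cohabited else 1 - P_COHABIT[attitude]
        weight = share * p_group  # fraction of all couples in this cell
        total += weight
        divorced += weight * P_DIVORCE[attitude]
    return divorced / total


rate_cohabit = divorce_rate(True)
rate_no_cohabit = divorce_rate(False)
ratio = rate_cohabit / rate_no_cohabit

print(f"cohabited first: {float(rate_cohabit):.2f}")
print(f"did not cohabit: {float(rate_no_cohabit):.2f}")
print(f"ratio:           {float(ratio):.2f}")
```

Under these made-up numbers the divorce rate is 0.42 for couples who cohabited and 0.18 for those who did not, a ratio of about 2.3 to 1, even though cohabitation has no effect on divorce anywhere in the model. The whole difference comes from who chooses to cohabit.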
Correlation does not imply causality (something we all know, and routinely forget). Maybe cohabitation in itself makes no difference either way, but perhaps people with certain attitudes and values are more likely than others (1) to cohabit before marriage and (2) to dissolve the marriage if it becomes unsatisfying. What we can say is that cohabitation does not make divorce less likely, though it might have. Remember? We don't prove theories; we rule out alternatives (Chapter 2)!

AN EXAMPLE FROM SOCIAL PSYCHOLOGY: THE BENNINGTON STUDIES

Surveys need not be big, nationwide projects. They are often conducted on a much smaller scale in psychological research. Let us turn to some examples of what can be learned that way.

In the late 1930s, Bennington College in Vermont was almost brand new (Figure 4.3). It had been established in 1932 as an experimental college for women. It was a close-knit community, in which most faculty lived on campus, usually as masters of student dorms. The college was small - in 1936 it consisted of 300 people, students and faculty. Finally, it was imbued with enough of the experimental spirit that a large majority of its members was willing to participate in a research project, headed by Theodore Newcomb, concerned with the attitudes, and attitude changes, that seemed to occur in students as they studied there (Newcomb, 1943).

FIGURE 4.3 Bennington College in 1933.

Bennington had a reputation for social awareness and political liberalism among its faculty, and among the students who had been there a while. Incoming students, on the other hand, tended to be from affluent families that had, on average, quite conservative attitudes. Newcomb and his co-workers wondered: What happens when people with one set of attitudes are placed in close, long-term contact with a "society" whose attitudes are different?
The researchers conducted intensive interviews with students who had been at Bennington for varying periods of time; and, for a number of students, they were able to repeat the interviews as the students advanced from freshman to senior. The researchers found a consistent drift toward more liberal attitudes on political and social issues. These attitudes were operationalized in many ways - pencil-and-paper "attitude scales," which candidates for political office the students favored, and their opinions on various social issues of the day.

As just one example, when questioned before the 1936 election, about two-thirds of the incoming class preferred the Republican (and more conservative) Alfred Landon to the Democrat, Franklin D. Roosevelt. Juniors and seniors preferred Roosevelt 3 to 1. Consistently on all such measures, students tended to move away from their initial conservative attitudes and values, toward more liberal ones. Not all did, but, on average, the trend was clear.

There was substantial social pressure toward such a change. The faculty, Newcomb tells us, made a conscious effort not to push their views on students, but older students were less restrained. Indeed, those students whose attitudes did not change tended to be, or to become, rather isolated; they were named less often by other students as friends or as "admired," or as suitable to be leaders, than were those students whose attitudes did change.

That raises questions: Did the students' attitudes really change as a result of the Bennington experience? Or were they only paying lip service to the attitudes that prevailed at Bennington, while keeping their private and more conservative attitudes to themselves?

Well, one way of addressing that question is to ask: Were there lasting changes in attitudes?
Once away from the social pressures they encountered at Bennington, did these women maintain their more liberal views, or did they move back to the more conservative attitudes they had entered college with?

Twenty-five years later, Newcomb and his co-workers were able to locate 94% of the women they had interviewed originally - a remarkable feat in its own right - and study them again (Newcomb, Koenig, Flacks, & Warwick, 1967). Many were reinterviewed; where this was not possible, they responded to mailed questionnaires. New questions had to be devised, of course, because the issues that separated liberals from conservatives in the 1960s were different from those that had distinguished them in the 1930s. The women were questioned (among many other things) about their attitudes toward a variety of social issues, and toward certain public figures identified as conservative (e.g., Dwight Eisenhower, Joseph McCarthy) or as liberal (e.g., Adlai Stevenson) in the early 1960s.

Would these Bennington alumnae remain more liberal years later? But wait - more liberal than whom? We need a comparison group here (Chapter 3). From the Survey Research Center at the University of Michigan, the investigators were able to obtain survey data from women roughly comparable to the Bennington group in age, income, and geographical location; and, on average, the Bennington alumnae expressed more liberal views than these "controls" did.

But Newcomb saw a tighter comparison: What if we interview the sisters of women who had gone to Bennington, where the sisters had gone to college elsewhere or not at all? That gives us a comparison group closely matched to the Bennington graduates for family ethnicity, affluence, and parental attitudes. In addition, the sisters-in-law of Bennington graduates were questioned.
This increased the size of the sample while still providing a partial control for ethnicity and social class, for people tend to marry people who are similar to themselves in these respects. As it turned out, Bennington alumnae in the 1960s did, on average, express more liberal attitudes and values than their non-Bennington sisters and sisters-in-law did.

Apparently, then, the change in attitude was genuine - and lasting. Further questioning suggested that if it was social pressure that produced the attitude change, it was social support that maintained it. Bennington women tended to select, or be selected by, husbands with more liberal attitudes than non-Bennington women did. They also tended to describe their close friends as having such attitudes and values.

There is much more, but this is enough to show us what interview and questionnaire data can do. Finally, the conclusions fit in with many other data, observational and experimental, on what happens to those who deviate from a group's majority attitude. The deviant is likely to be pressured to change his attitude toward the majority view; and, if the pressure doesn't work, he is likely to be rejected and isolated by the group (Schachter, 1951). This literature has some practical implications, too: If you want to change someone's attitude, the best way may be to immerse him in a group that already holds the attitude you wish to bring about in him.

AN EXAMPLE FROM CULTURAL PSYCHOLOGY: EXPLANATIONS OF ACTION

Here is another example of what we can learn from small-scale surveys. In the United States, when we are asked to explain someone else's actions - "Why did she do that?" - we tend to do it in a particular way, by referring the action to underlying personality traits: "She did it because she's that kind of person." We take it for granted that what determines a person's actions is the "kind of person" he or she is.
It may come as a surprise to learn that this "habit of thought" is by no means universal.

Miller (1984) interviewed people of two cultures - Hindus living in India, and Americans living in the United States. She asked them to think of various examples of actions performed by their friends, and to explain why these actions were performed. Sure enough, the Americans explained their friends' behavior in terms of personality traits: "He or she is that kind of person." But the Indian respondents favored a different kind of explanation. They more often referred to outside situational factors: "He did it because the situation was such and such."

Perhaps the two cultures differ, then, as to the kinds of explanations for someone's actions that come most readily to mind. Perhaps Americans think first of personality variables, whereas Indians look to the situation. However, there is a loophole here. All these "situations" were ones that the interviewees thought up. It might be that Americans are more likely to think of situations where the causes really are personal, whereas in India, actions may come to mind that really are determined by the situation.

So, to check this possibility, Miller took some of the scenarios generated by Indians, and gave them to Americans to comment on. Sure enough: Where the Indians had given situational explanations for the actions, Americans explained the same actions, in the same situations, by reference to the personality of the actor.* Thus Americans prefer the one kind of explanation, Indians the other, even for the same events.

If we are Americans, that tells us that the way we tend to explain things in this society is not universal. Not everyone jumps to the same conclusions as we!

PARTICIPANT OBSERVATION

Let us now turn to another way of observing events. Rather than asking questions, as in an interview or questionnaire, scientists may study the behavior of a group of individuals "from the inside."
They may actually become part of the group, and observe it from the perspective of a member. Because the scientists participate in the group's activities while also studying them, this method is known as participant observation.

*For example, one scenario concerned a lawyer on his way to work on his motorcycle, with a friend as passenger. There was an accident and the friend was injured. After taking his friend to the hospital, the lawyer went on to work rather than staying at the hospital with his friend. Why? Most Americans assumed that it was something about him: "The lawyer is obviously irresponsible." But most Indians assumed that it was something about the situation rather than the person: "It was his responsibility to be in court to represent his client."

AN EXAMPLE FROM SOCIAL PSYCHOLOGY: THE "SEEKERS"

The following episode took place in the 1950s. Mrs. Marian Keech was a middle-aged woman who believed she was receiving messages from outer space. One evening in September, such a message informed her that, in December, most of the world would be destroyed by a great flood. She, and those close to her, would be rescued by flying saucers coming in from outer space. Mrs. Keech attracted a small but loyal group of followers who believed her messages and prepared with her to meet the catastrophe. The commitment of some members of the group was quite striking: They quit their jobs and gave away their possessions; some even sold their homes. Why not? They would not need them on another planet!

A team of social psychologists (Festinger, Riecken, & Schachter, 1956) heard about the group, and decided to watch from close at hand what would happen when the flood failed to arrive. They and their student observers penetrated the group, claiming to be believers themselves. They were able to sit in on the group's discussions and keep careful records of what was said and done.
It is important to realize that the believers - the Seekers, they called themselves - were perfectly normal people. There was a physician among them, and several college students. They were quiet and reclusive, making no attempt to publicize their beliefs and refusing to grant interviews to reporters.

In the end, there was no flood and no spacepeople came. What did the group do? We might have expected a reaction such as, "Well, we were wrong" (Figure 4.4). But we'd be wrong. After a predicted "rescue" had failed to occur not once but twice, the group showed every indication that their beliefs had not weakened, but instead had strengthened. The change was quite dramatic. Group members began making speeches and handing out leaflets, and Mrs. Keech made tapes for the media. These were things that they had never done before.

FIGURE 4.4 After prophecy fails, we might expect a reaction like this. What really happened was quite different.

Was this because only the most devout and activist members remained with the group after the predicted Doomsday had come and gone? Fortunately, because Festinger et al. had kept detailed notes, they were able to go back through these, check that possibility, and rule it out. Those members who were now proselytizing and publicizing had, before the crisis of Doomsday, expressed no more commitment and activism than other group members had.

It would appear that even in perfectly sane people, a strong belief can survive even the clearest evidence that it is wrong. Indeed, a belief can not only survive such disconfirmation, but also show every sign of intensifying following it. How could that be?

In explaining it, Festinger et al. appeal to the theory of cognitive dissonance. The idea is this: We are uncomfortable if our actions conflict with our beliefs and values. We don't like to be inconsistent.
Suppose then that we are confronted with such an inconsistency: We believe one thing, but our actions suggest the opposite belief. Then, if the actions have already occurred and cannot be changed, we may modify our beliefs instead, to bring them into line with our actions.

The Seekers (or many of them) had devoted a great deal of time and energy to promoting their liaison with the extraterrestrial beings and, once an actual date was set for the rescue, some had quit their jobs and sold their homes. Clearly such actions would be inconsistent with a change in their beliefs: "Oh well, it was all a mistake." One member (the physician) said: "I've had to go a long way. I've given up just about everything. I've cut every tie; I've burned every bridge. I've turned my back on the world. I can't afford to doubt. I have to believe" (Festinger et al., 1956, p. 168).

So the group members who stayed on felt strong pressures toward sticking to their guns, and convincing themselves that they had done the right things; they had just got the details wrong, that was all. How could they support their belief that they had done the right thing? By convincing others! Hence, perhaps, the new willingness to publicize their beliefs and argue for them publicly.

One other observation sheds light on this episode. Those who did begin to publicize the group's belief were those who remained with the group after the disconfirmations. Those who drifted away, though they may have been just as committed earlier, did not do these things. It appears that just as social pressure can promote attitude change (as at Bennington), so too can social support promote the constancy of an opinion, even in the face of the clearest possible evidence that it is wrong.

The method of participant observation has given us some real adventure stories.
One graduate student in sociology joined a Satanic cult, and described its (rather innocent) activities "from inside" (see Babbie, 1989). Eventually, he grew uncomfortable at the deception, which does raise ethical concerns. He went to the cult's leader and confessed that he had penetrated the group for research purposes. The leader was quite untroubled by this; indeed, he told the student that it was a properly Satanic thing to do!

DIRECT OBSERVATION OF BEHAVIOR

Rather than asking subjects questions, or participating in their activities, a researcher may observe their behavior directly, from outside the situation. "Just watching," in a careful and systematic way, can tell us much.

AN EXAMPLE FROM ETHOLOGY: SEX AND THE STICKLEBACK

A research tradition that has affected many areas of psychology is known as ethology: the study of animal and human behavior in its natural setting (see Tinbergen, 1951).

A classic in ethology is the set of observations on the reproductive behavior of the three-spined stickleback, a freshwater fish. This seemingly simple, instinctive sequence of events turned out to be marvelously complex.

In the spring, a male stickleback fish comes into reproductive condition, as shown by certain changes in his color caused by reproductive hormones. He will build an underwater nest (an ingenious tubular structure made of bubbles), and he will then patrol back and forth around the entrance to the nest. He is now prepared to react in either of two ways to the approach of another stickleback. If the approacher is another male, our male fish is likely to make a characteristic "threat display," standing on his head with fins outspread (Figure 4.5). If the approacher is female, he will do a characteristic "zigzag" dance; this is the way he courts a female. (How does the fish distinguish a male from a female? Hang on.) To an experienced fish watcher, these "displays" are clearly recognizable - but it does take some practice.

FIGURE 4.5 A male stickleback fish threatens an intruder with his head-down posture (from ter Pelkwijk & Tinbergen, 1937).

If courtship is successful, there follows a meticulously choreographed mating sequence, with each member of the pair responding to the actions of the other. The male leads the female to his nest, pointing to its entrance with his nose. She swims through the nest and deposits the eggs; he swims through after her and deposits sperm to fertilize them. He then chases her callously away, but he does take over the job of caring for the eggs until they hatch, "fanning" them by moving his fins so as to direct a stream of oxygen-rich water over them.

Once we identify these action patterns, they can serve as jumping-off points for further investigation - and that is a great strength of "just watching" as a first step in research. Having seen what courtship and mating look like in this species, we can ask such questions as: What is it about an approaching fish that triggers threat on the one hand or courtship on the other? Or, whatever the cues are, does a stickleback have to learn to use them? Answering these questions requires us to manipulate certain aspects of the situation while holding others constant - in a word, to do experiments.

By moving from observing to experiments, ethologists have shown that each reaction is triggered by a characteristic releasing stimulus. An intruder male is identified by its red underbelly, and even crude models (introduced into the aquarium tank by an experimenter) will be threatened if they have that underside, whereas even very detailed model fish will not be threatened if they lack it. Similarly, a swollen underbelly identifies a female; crude models with it will be courted, whereas detailed models without it will not.
Experiments have also shown that these reactions do not require training: Even if a male has been brought up in isolation, never seeing another fish, he will react in the characteristic way the first time he does see one, or an appropriate model of one.

What's interesting about that? It revolutionized the way we look at behavior, that's all. By the 1930s or 1940s, most psychologists had thrown the concept of "instinct" out the window. They argued forcefully that all behavior, above the level of simple reflexes, was learned. But these ethological observations showed them, and us, that very complex sequences of actions could occur in the absence of any experiences that could have "taught" them. The concept of instinctive behavior made a strong comeback in light of such findings.

HUMAN ETHOLOGY I: FACIAL EXPRESSIONS

"Just watching" can tell us much about human behavior, too. The next time you see two friends greet each other, look carefully at their eyebrows. You will see the characteristic "eyebrow flash" - a quick lifting and lowering of the eyebrows. Observations show that this gesture also occurs in cultures very different from our own, ones that have been separated from each other for many thousands of years (Figure 4.6).

FIGURE 4.6 The "eyebrow flash," used in greeting by (Panel A) a Papuan and (Panel B) a Waika Indian of the Orinoco in South America (from Eibl-Eibesfeldt, 1970).

Again, observation leads the way to further exploration with other methods. As just one example: Can the members of one culture recognize the facial expressions of members of a very different one? They can (Figure 4.7).

FIGURE 4.7 Two people from very different cultures, a man of the Fore of New Guinea and an American woman, smile happily. Members of each society have no trouble seeing what this expression means in members of the other (from Ekman, 1973, 1975).

Case study data are also relevant here.
In Figure 4.8, we see a little girl laughing in the normal way: open mouth, with head thrown back and narrowed eyes. This unfortunate girl was born both blind and deaf. It is not obvious how she could have learned to make that expression. Perhaps, then, certain facial expressions in humans begin as instinctive action patterns, analogous to courtship or attack in the stickleback.

FIGURE 4.8 A child born both blind and deaf shows the characteristic facial expression that goes with laughter (from Eibl-Eibesfeldt, 1970).

Of course humans are more complicated than stickleback fish, and even if certain facial expressions arise without having to be learned, learning can certainly modify them later. We are taught certain "display rules" by the society we grow up in, and these rules tell us what expressions are appropriate, when. Some cross-cultural observations are relevant here. In Japan, one is taught to mask any negative emotion with a polite smile, and the "eyebrow flash" is considered rude and is deliberately suppressed. Notice that in this case an entire culture becomes a kind of "case study," rather than an individual. And the fact of cultural variations in "display rules" tells us something important: Even if facial expressions can develop without the need for learning, they still may be greatly modified by the learning experiences we encounter as we grow up within a society.

HUMAN ETHOLOGY II: BOOK CARRYING

As a final example, here is an observation you can easily check for yourself. As you wander around the campus or the streets of your town, notice how men and women carry their books. Women are likely to carry their books with arm bent, cradling them against their chests. Men are likely to carry them with arms straight, books supported by the fingers (Figure 4.9). This difference has been observed in a number of colleges in the United States and Canada and, in addition, in two Central American colleges (Jenni & Jenni, 1976); and I have confirmed it in my own workplace (see later discussion of this topic). Observations at public libraries show that it extends to the elderly as well as to college-age people.

FIGURE 4.9 Males and females carry loads of books in different ways. The man is from Chile, the woman from Togo.

For a thought problem (or perhaps a class project!), see if you can think of some reasons why this might be so. And then - the real test - think of some observations you might take to check your ideas. The consistency of the difference is quite remarkable; something interesting must be going on. Perhaps you will help us understand what it is.

TESTING HYPOTHESES WITH OBSERVATIONS

Thus far, most of our examples have dealt with exploratory research, asking questions in an open-ended way. What happens when people greet each other? When one stickleback meets another? When a strongly held opinion is disconfirmed? How do males and females carry books? And so on.

But observational research can also be used to test specific hypotheses, just as our experiments in Chapter 2 did. Rather than asking "What happens when... ?" we can say, "This theory predicts that such and such should happen. Does it?" Then the data will confirm or disconfirm the prediction, and add to the weight of evidence for the theory or against it. Here are some examples.

AN EXAMPLE FROM CLINICAL PSYCHOLOGY: SMOKING, OBESITY, AND SELF-HELP

Certain behavioral disorders are notoriously difficult to correct. It is fairly easy to lose weight, but it is very difficult to keep it off. It is easy to stop smoking - but to stay stopped is another matter. Indeed, follow-up studies of clients in treatment programs report that a discouragingly high percentage of smokers or overeaters relapse within the first year - around 80% or more. But then, all the people who were studied in those investigations were ones who had come to clinics for help.
Perhaps the picture will be brighter if we look at those who have not done that, but have fought their problems on their own.

To test this idea, Schachter (1982) took a sample of people in the psychology department where he taught, and also in a small New England town where he spent his summers and knew many residents. He asked his subjects whether they had, or had ever had, a weight problem or a smoking problem or both; what they had done about it; and what had happened. So this was another small-scale survey and a questionnaire/interview study, like the Bennington one.

The resulting data looked much more optimistic than the clinical literature. About 40% of the people who had once smoked had quit, and many had remained smoke-free for years. A roughly comparable percentage of those who had once had weight problems said, when interviewed, that they had lost their excess weight and kept it off - again, often for many years.

Why, then, are the data from clinics so much less encouraging? Maybe the people who go to clinics are those with the most severe problems, who have not been able to deal with their problems themselves. Well, that too can be tested. Schachter asked his subjects whether they had sought professional help for their problems. Sure enough: Those in his sample who had sought help tended to be the most overweight, or the heaviest smokers or ex-smokers.

Thus, the very discouraging statistical outlook for people with these problems may be misleading. It may be a matter of sampling bias (Chapter 5): Perhaps the cases who seek professional help, and so appear in our statistics, are the ones whose problems are especially difficult to overcome. For the rest of us, the difficulty may be less - perhaps much less. A comforting conclusion!
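To see how this kind of sampling bias could arise, here is a small simulation sketch. Every number in it is a made-up assumption for illustration (the `severity` scale, the recovery probabilities, and the help-seeking rule are all hypothetical); none of it is fitted to Schachter's data. It assumes only that more severe problems are both harder to overcome and more likely to send a person to a clinic.

```python
import random

def simulate(n_people=10_000, seed=0):
    """Compare recovery rates in a whole population and in its clinic-goers.

    All probabilities below are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    everyone, clinic = [], []
    for _ in range(n_people):
        severity = rng.random()  # 0 = mild problem, 1 = severe problem
        # Assumption: the more severe the problem, the less likely recovery.
        recovered = rng.random() < (0.7 - 0.6 * severity)
        # Assumption: the more severe the problem, the more likely the person seeks help.
        seeks_help = rng.random() < severity ** 2
        everyone.append(recovered)
        if seeks_help:
            clinic.append(recovered)
    return sum(everyone) / len(everyone), sum(clinic) / len(clinic)

overall_rate, clinic_rate = simulate()
print(f"recovery rate, whole population: {overall_rate:.2f}")
print(f"recovery rate, clinic sample:    {clinic_rate:.2f}")
```

Under these assumptions the clinic sample recovers at a noticeably lower rate than the population it was drawn from, even though nothing about the clinic itself affects recovery in the simulation. The gap comes entirely from who ends up in the sample.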
AN EXAMPLE FROM SOCIAL DEVELOPMENT: MATERNAL RESPONSIVENESS AND INFANT CRYING

In this second example, we see direct observation of behavior used to test a theory - or rather a pair of them. Before describing it, let us set the stage a bit.

WHAT'S A PARENT TO DO?

Suppose that a baby has been fed, changed, and put in his crib - and he cries. All his needs have been met. Presumably he would just like some company and some "contact comfort." What is the parent to do - respond to the crying, or ignore it?

We could argue the case either way! And either way we would find substantial support from both common sense and psychological theory. Suppose that:

1. The attention a caretaker can bring - the warmth, the jiggling, the contact comfort - acts as a reward or, as we say, a reinforcer. Then, if the baby's crying brings about that comfort, crying will be reinforced - and a persistently crying baby should result.
2. There is the similar view of Dr. Benjamin Spock, the world-famous pediatrician. Dr. Spock is often called an advocate of permissiveness, but he took a hard line on this one: If you come running to the baby every time she cries, she will soon learn that she can manipulate you! You don't want to raise a manipulative baby, do you?
3. There is the view of one grandparent, who warns you: "You are going to spoil that baby rotten if you go running to answer her cries!"

All very convincing, and offered with great authority. The problem is that we can muster just as much common sense, and just as much authority, on the other side. Thus:

1. If the baby learns that he cannot manipulate you, then he is also learning that he cannot depend on your responding to the signals he gives. This disrupts an important component of the attachment process (John Bowlby).
2. The baby may fail to form "basic trust" in you and in people generally (Erik Erikson).
3. True, the baby will not be reinforced for crying, but if reinforcement is not dependent on what the baby does, he may form "learned helplessness" (Martin Seligman).
4. Then there is the other grandparent, who will say in effect: Go with your impulses! You know you want to go and comfort the baby, so do it!

So, from the first point of view - call it "reinforcement theory" for short - the advice to the parent is clear: Let the baby cry. Crying should drop out if it is unrewarded, and after a while you'll have a baby who does not cry unless he is in real distress. But by the second account - call it "attachment theory" for short - a baby who is only tended when she needs feeding or changing is like a baby monkey with a wire mother: She will become insecure and fretful - a fussy, crying baby. So the attachment theorist's advice to the parent is: Go ahead. Comfort the baby.

SOME PREDICTIONS

What should the parent do? The two theories make exactly opposite predictions about the outcomes of the parents' actions.

If reinforcement theory is right, parents who respond promptly to babies' cries should reinforce crying, and have babies who cry a lot. Parents who don't respond as often or as promptly are not rewarding their babies' crying, and should have babies who don't cry much.* If we look over a sample of families, the amount of crying that occurs in different homes should be positively correlated with parents' responsiveness (Figure 4.10A). Responsive parents should tend to have babies who cry a lot (closed circles). Unresponsive parents should tend to have babies who cry less (open circles).

*Reinforcement theorists quarrel with this rather naive analysis. They point out that unresponsive parents, who wait until they can't stand it any more before responding, may be reinforcing loud, persistent crying. This objection has been raised (Gewirtz & Boyd, 1977) and replied to (Ainsworth & Bell, 1977).

If attachment theory is right, the prediction is reversed (Figure 4.10B). Responsive parents should tend to have secure, nonfussy babies (closed circles). Unresponsive parents should tend to have insecure, fussy babies (open circles). So, amount of crying should be negatively correlated with parents' responsiveness. The more responsive the parent, the less the baby should cry (Figure 4.10B).

FIGURE 4.10 Pitting two theories against each other. The "reinforcement" view would predict a positive correlation between mothers' responsiveness and their babies' fussiness (Panel A). The "attachment" view predicts a negative correlation (Panel B).

Thus we make predictions - that is, specify in general what the data should look like if a certain theory holds - and we can do this before we gather any data at all. In this case, we can pit two theories against each other, just as Harlow did (Chapter 2), by finding a situation in which they make opposite predictions. Then we gather some actual data, and look and see what actually happens.

THE BELL AND AINSWORTH STUDY

A team of investigators (Bell & Ainsworth, 1972) repeatedly spent whole afternoons with a sample of mother-infant pairs during the child's first year. Bell and Ainsworth noted (among other things) how often the babies cried, and how often and how promptly the mothers went to attend to them when they did. This was an observational study, not an experiment, because the researchers did not introduce any changes to see what effect they would have. They simply observed carefully what went on in each of the homes that they visited (direct observation of behavior).

What happened? The correlation was negative. More responsive mothers tended to have less fussy babies than did mothers who often ignored their babies' cries or were slow to respond to them. The data looked like Panel B, not Panel A, in Figure 4.10. As Bell and Ainsworth (1972, p.
1187) sum it up, parents should not be afraid of spoiling their babies by responding to them. The data "suggest the contrary - that those infants who are conspicuous for fussing and crying... and who fit the stereotype of the 'spoiled child,' are those whose mothers have ignored their cries or have delayed long in responding to them." The concerns of Dr. Spock, of reinforcement theorists, and of hardline grandparents simply are not borne out by the data.

A LOOK BACK

This study is a good example of how correlational data can be used to test predictions. Of course, this one study does not demolish what we have called the "reinforcement" theory of mother-infant attachment, but it takes its place in what is now a substantial literature, emphasizing the role of communication, and of responsiveness to that communication, in attachment formation.

Notice too that we speak of these data as posing problems for a theory, not as "proving" that another theory is true. The "attachment theory" could have been disconfirmed by the data; it wasn't, and so we could say that it has survived one test. Still, there are other ways of interpreting the data. Maybe it is true that an unresponsive mother is likely to make the baby fussy. Or it could be the other way around: Maybe a baby who is already fussy, for whatever reason, exhausts the mother and makes her unresponsive! Or maybe neither of these is true - there could be some third variable that affects both the mother and the baby. For instance, in homes with a lot of noise going on, both mother and baby might be stressed, tense, and therefore fussy (the baby) and unresponsive (the mother). Homes with less noise could leave the baby placid and the mother rested and responsive. The data are compatible with that idea, too.

This principle - correlational data do not establish what causes what - is so important, and so often forgotten, that it rates a discussion all to itself.
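The third-variable possibility can be made concrete with a short simulation sketch. The variables here (`noise`, `responsiveness`, `crying`) and their scales are hypothetical; the only assumption built in is that household noise lowers the mother's responsiveness and raises the baby's crying, while responsiveness and crying have no effect on each other at all.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rng = random.Random(1)
responsiveness, crying = [], []
for _ in range(2000):                 # 2,000 hypothetical homes
    noise = rng.gauss(0, 1)           # hypothetical household noise level
    # Noise makes the mother less responsive; crying plays no part here.
    responsiveness.append(-noise + rng.gauss(0, 1))
    # Noise makes the baby cry more; responsiveness plays no part here.
    crying.append(noise + rng.gauss(0, 1))

r = pearson(responsiveness, crying)
print(f"correlation between responsiveness and crying: {r:.2f}")
```

The correlation comes out clearly negative, just as in Panel B of Figure 4.10, even though in this made-up world neither variable influences the other. That is the sense in which correlational data do not establish what causes what.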
We will come back to this principle.

Returning to the Bell and Ainsworth study, let's see how it incorporates some familiar concepts. The mother's responsiveness was operationalized in several ways: For example, the number of cries, and the number of times the mother responded to them, were counted. Another measure was time - a physical variable (Chapter 3). When a cry was heard, the observer would quietly start a stopwatch, and stop it when the mother left what she was doing and went to deal with the baby. The reliability of these measures was spot-checked by sessions in which more than one observer kept records, without communicating with each other. The interobserver reliability of the measures (Chapter 3) was acceptably high.

Perhaps most important of all, these data show that the original question - Will we make the baby more fussy if we respond to its cries, or if we don't? - need not be argued in a vacuum. It is possible to direct research to questions of this sort, and so let our conclusions be grounded in solid facts.

THE VALUE OF OBSERVATIONAL RESEARCH

Back in Chapter 2, we talked about the theory/data cycle - the back-and-forth interplay between our ideas and the facts (pp. 29-32). We also pointed out that a researcher can enter the cycle from either "place." He may begin with a theory and seek to test it; the examples in Chapter 2 were of that kind.

But he may also enter the cycle by collecting data first - by exploring something that he observes happening. The guiding idea here is not so much "Here is what this theory predicts should happen," but rather, "Something interesting is happening here; let's find out more about it." Pavlov's work is an example, and most of the research summarized in this chapter (with exceptions) is also in that spirit.

Thus many questions are "open-ended" ones: What happens when students from a conservative background find themselves in a liberal social environment?
What do stickleback fish do when they encounter one another? What regularities can we see in how the facial muscles move when people greet each other? This patient cannot speak; why? And so on.

Simple as they are, these questions (What happens? What is it like? Under what conditions does it happen, and what are its consequences?) give us an overview that can move us a long way toward understanding. A bird's wing would be a most puzzling structure if we didn't know that birds can fly.

As we explore, more specific questions come to mind that we can zero in on. How does a male stickleback tell male from female? Must he learn to do so? Do Bennington students only conform to the immediate situation, or are there lasting changes in attitude? Do people in different cultures use similar facial expressions in similar circumstances? Thus exploratory observation sets up questions to be addressed, perhaps by more systematic methods. The "transition to experiment," which we have seen in the stickleback case, is an instance of this.

Then too, observation without interference can test specific hypotheses. Schachter's study is one example. Are our clinical outcome statistics for smoking and obesity misleadingly pessimistic, because of a sampling bias in favor of those with more severe problems? Schachter's data (together with others) suggest that the answer is yes. Does being responsive to a baby's cries make for a spoiled, fussy baby? Bell and Ainsworth's observations (along with many others) suggest not. Thus, whereas correlation does not establish causality, correlational studies can sometimes disconfirm the predictions of a causal theory. We see it again: We don't prove theories, we rule out alternatives (Chapter 2)!

A final comment. Looking backward over even just these few examples, I hope the reader is struck (as the writer is) by the sheer number and diversity of questions that can be asked and answered - not with opinions or impressions, but with data!
Do women who were relatively conservative when they graduated from Bennington tend to describe themselves that way 25 years later, as compared to ones more liberal? Yes, as Table 4.1 shows.* Are they more likely to favor conservative political figures? Yes (Table 4.2). Were they more likely to vote for the more conservative candidate (Nixon) in the 1960 Kennedy/Nixon election? Yes (Table 4.3). And so on - page after page of it.

Table 4.1 Self-Description of Political Attitude in 1960

Final Conservatism Score in College    Conservative    Middle of Road    Liberal
Above median                           19              12                30
Below median                           4               9                 61

Table 4.2 Favor Conservative Political Figures?

Final Conservatism Score in College    Above Median    Below Median
Above median                           44 (73%)        16 (27%)
Below median                           22 (33%)        44 (67%)

Table 4.3 Candidate Voted for in 1960

Final Conservatism Score in College    Nixon           Kennedy
Above median                           40 (61%)        24 (37%)
Below median                           13 (18%)        59 (81%)

*In looking over the tables, do not be confused by the fact that about as many Bennington women were "high" as "low" in conservatism. This doesn't contradict what we've said about the drift toward liberal views. These women scored above or below the median of graduating Bennington students. Thus, a person might score "above the median" for students even if she were pretty liberal relative to, say, her parents and home community.

Again: Schachter tells us how many among his sample of ex-smokers went for professional help, and, among those who did and those who didn't, how much they had smoked. The relation between these variables, in this sample, is not a matter of opinion. There it was in the data: Those who sought help smoked more on average than those who didn't.

Will parents spoil a child by responding to her cries? The Bell and Ainsworth data "suggest the contrary - that those infants who... fit the stereotype of the 'spoiled child,' are those whose mothers have ignored their cries or have delayed long in responding to them" (1972, p. 1187).

As we continue our journey, notice how often this happens. Questions that could be debated endlessly as matters of opinion can become matters of looking and seeing - if we pose the questions that way, and if we are willing to put our opinions to the test.

SUMMARY

In many research projects, something is manipulated, to see what effect it has. This is an experiment. But much has been learned by observing and describing what happens naturally, without manipulating anything. This chapter focuses on the latter case.

This kind of research can take many forms. These include case studies, as in Broca's careful observation of a patient with speech problems. Then there are surveys, as in studies of cohabitation; Newcomb's studies of the drift toward liberal attitudes in students at Bennington College; Miller's interviews with people from two different cultures, showing how they tended to explain the actions of others; and Schachter's interviews with people who had smoked or who had had weight problems. There is participant observation, in which investigators become part of a group in order to study its workings "from the inside," as in the study of the Seekers. Finally, there is the direct observation of behavior. Examples of this range from the description of reproductive behavior in sticklebacks, to the details of the "eyebrow flash" and of book-carrying in humans, to the description of mother-infant interactions in Baltimore families.
In each case, the observations bear on much broader issues: for example, the implications of a "speech center" for the unity of mind (Broca); the role of social pressure in attitude formation and the resistance of attitudes to change (the Bennington studies, the Seekers); the differences between cultures in the "preferred" explanations for another person's actions (the comparison of American and Indian respondents); the complexities of instinctive behavior even in simple creatures; and the universality across cultures of certain nonverbal signals and ways of carrying objects such as books.

Observation can be used to test specific hypotheses (predictions), as the Bell and Ainsworth observations supported an "attachment" as opposed to a "reinforcement" theory of mothers' effects on their infants, and as Schachter's interview data on those who had recovered on their own from smoking and weight problems identified a bias in the clinical data for recovery rates.

CHAPTER 5 OBSERVATION AND DESCRIPTION II: SOME TECHNICAL PROBLEMS

[A] scientific theory may please you and may seem in accord with your own feelings and beliefs, but your own pleasure is no proof of its possible validity. -ISAAC ASIMOV

IN THIS CHAPTER, WE'LL DISCUSS:
* Some problems of method that a researcher may face, including:
  * Ways of selecting an unbiased sample
  * The concept of random sampling
  * When, and when not, to be concerned about sampling bias
  * Effects of the observer on what the subjects do or say
  * Bias in the observations themselves
  * Inferential bias, or drawing illogical conclusions from the data
* Some ways of dealing with these problems when they arise

Science bases its conclusions on data (Chapter 3), and we can obtain data by observing nature as we find it, without interfering with it (Chapter 4). But researchers want good data - that is, data that really will permit valid conclusions.
Data that mislead us are worse than no data at all, for they may make us think we have answered a question when we have not. This chapter is about some ways in which we might gather misleading data, and what we can do to prevent this from happening.

To look ahead: What are some ways we might gather misleading data? First of all, we might observe organisms or events that are different, in some consistent way, from those organisms or events that we want to draw conclusions about. This is the problem of sampling bias.

Second, we might distort what we observe by the very act of observing it. If we observe a group of people (or chimpanzees, or whomever), and if they know they are being observed, they may behave differently because they are being observed. In such cases, the observer is having an unintended effect on what is going on, so we will call this the problem of observer effects.

Third, even if an observer does not distort what happens, she may distort how she sees it happening. We can show (with data!) that observers may see what they expect to see, or what they hope to see - as Asimov puts it, what pleases them - rather than what is "out there" to be seen. Such observer bias could seriously distort the data we record.

There is a fourth danger. Even if the data themselves are accurate, we may draw conclusions from those data that the data just do not support. Here the problem is with our logic, not our procedures; we make inferences from our data in a biased way. So, to have a name for this problem, we will call it inferential bias. And we will look later at an instance of it that crops up again and again: the correlation and causality fallacy.

Here then are some things we want not to happen: sampling bias, observer effects, observer bias, and inferential bias. We want to take steps to ensure that these do not distort our conclusions. Such steps are an integral and important part of the procedures we adopt in gathering our data.
One final comment before we begin: We will discuss these technical problems in the context of observational research; but the problems also arise in experimental research, and there too they can play havoc if they are not dealt with. So many of the ideas that we discuss in this chapter will arise in later ones as well, especially when we talk about experimental control (Chapters 7 and 8).

SAMPLING

In all our examples, we have seen conclusions drawn from observations of what certain people, groups, or animals did or said. A question we have not yet addressed is: Which animals? What people? Whom, or what, are we going to observe? Or, as we say: Which particular people, or groups, or animals, are going to be the subjects of our investigation?*

That question is closely related to another one: What people, groups, or animals do we want to draw conclusions about? To what population do we want to generalize the results from this sample of cases?

As we saw in Chapter 2, science is the search for general principles. We have little interest in just these particular monkeys, college students, or stickleback fish that we happened to observe. We want more general conclusions that will tell us about monkeys, sticklebacks, and students (or human beings) in general. So this is our problem: How do we go from particular observations to general conclusions?

This is a topic that many beginning students find confusing. On the one hand, they may have been told that an investigation should always begin with a representative sample of some population. This is especially likely if they have studied such matters as survey methodology, in which representativeness often is a concern.
But when they come to research in psychology, students find that most investigations draw their subjects from rather limited populations: Not "people in general" but, for example, college students - and not even college students in general, but students in one college who elect psychology courses and sign up for research participation! Such people are clearly not representative of all the people even in one society, much less of all the human beings there are.

Is something wrong here? Not necessarily, and in the next few pages I will try to clear up the confusion. Briefly, the difference arises because a survey researcher wishes to draw conclusions about a specific population, based on the findings of just this survey. Most psychological research is different. Its general conclusions depend on the agreement among different investigations. We don't try to let one study do it all. We'll return to that idea shortly. But first, let's get clear on the ideas of sample and population.

*The term subjects is a traditional way of referring to the organisms - human or animal - that we observe in a research investigation. Some journals recommend that we refer to them as participants instead. That is a useful reminder that if a person comes to our lab, or stops what he is doing to answer our questions, he is doing us a favor by cooperating with us. But the journals are not consistent about this, and it can be misleading. If I collect data on book carrying by watching people from my window, it is not clear that they are "participating" in my project. Still less is it clear that stickleback fish are doing so! Rather than have different words for different cases, I shall use the older term subjects throughout this book.

SAMPLE AND POPULATION

Let us talk about soup for a minute. Suppose a chef has prepared a large vat of soup. She wonders whether there is enough salt in it, so she tastes a spoonful of the soup.
If it tastes salty enough, she will conclude that the whole vat of soup probably is adequately salted.

Some research can be considered directly analogous to this. The big vat of soup is like a population. The spoonful that the chef tastes is like the sample of subjects that is actually observed. Then, from what is observed in the sample, one draws inferences about the population: If the spoonful tastes all right, chances are the whole vat of soup does.

In other words: There will be some population of people (or groups or fish or whatever) about which we wish to draw conclusions. We want to know something, not about the behavior of just the sample (the spoonful), but about the population (the vat).

Now, today's vegetable soup will be different from tomorrow's chicken soup. From this we get to an important idea: We must specify what population we are talking about. There is no such thing as "the population" pure and simple. Our population is whatever we say it is. I can, if I wish, look over the students in a classroom and say, "You people here in this room are my population." I could then draw a sample from that population. And I could draw a biased sample or an unbiased one. What do these terms mean?

THE PROBLEM OF SAMPLING BIAS

Returning to the soup analogy, notice that a chef will stir the soup before he tastes it. Why? Because if he didn't, and if (say) the heavier ingredients tended to sink to the bottom of the vat, then his spoonful might not taste at all the way the whole vat tastes. He has, we would say, an unrepresentative or biased sample of the soup. He would then draw very wrong conclusions if he assumed that all the soup tasted the way his sample did. In short, a sample (like the spoonful) and the population that is of interest (like the vat of soup) may differ in some consistent way (like heavy ingredients in the soup but not in the spoonful).
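The soup analogy can be sketched in a few lines of code. This is only an illustration under made-up assumptions: the "vat" is a hypothetical list of 1,000 layers whose saltiness rises steadily with depth, an unstirred spoonful takes only the top 50 layers, and "stirring" is modeled as drawing 50 layers at random.

```python
import random

rng = random.Random(2)

# Hypothetical unstirred vat: saltiness rises from 0.0 at the top to about 1.0 at the bottom.
vat = [depth / 1000 for depth in range(1000)]
true_mean = sum(vat) / len(vat)

# Unstirred spoonful: only the top 50 layers can end up in the spoon.
top_spoonful = vat[:50]
top_mean = sum(top_spoonful) / len(top_spoonful)

# Stirred spoonful: every layer has the same chance of ending up in the spoon.
stirred_spoonful = rng.sample(vat, 50)
stirred_mean = sum(stirred_spoonful) / len(stirred_spoonful)

print(f"saltiness of the whole vat:     {true_mean:.3f}")
print(f"saltiness, unstirred spoonful:  {top_mean:.3f}")
print(f"saltiness, stirred spoonful:    {stirred_mean:.3f}")
```

Both spoonfuls are the same size. The unstirred one misses badly because it differs from the vat in a consistent way, whereas the stirred one lands close to the vat's true saltiness, off only by chance.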
When that happens, we speak of a biased sample of that population.*

*The term is unfortunate, perhaps, because the word bias may connote prejudice, and perhaps suggest that someone is being malicious. As used here, the term means neither of these. Think of a car, and suppose that the alignment is not quite right, so that the car has a consistent tendency to pull to the right. An engineer or mechanic might speak of a "right-turning bias" in such a case. Of course, the car is neither prejudiced nor malicious. Even so, an engineer will try to design cars so that no such bias occurs! Now, just as a car can be biased toward a rightward drift, so a sampling procedure can be biased toward certain subgroups within a population. This means that members of that subgroup are likely to be overrepresented in the sample - there will be more of them than there should be. When this happens, our inferences about the population can be thrown badly off.

Some "research" permits sampling biases that are almost laughably obvious. An example is the notorious "Hite report," in which a "researcher" reported a survey of American women's attitudes toward men. The survey distributed some 100,000 questionnaires through various organizations, of which around 4,500 were returned. American males came out looking, shall we say, pretty bad.

A moment's thought tells us that such data are useless if we want to know how American women in general - the intended population - feel about men. This is not because 4,500 is too small a number - it is quite large, as surveys go - but because with a return rate of only 4.5%, the sample we end up with is almost certain to be biased toward people with strong attitudes on the topic. Chances are that it is mostly these people who took the trouble to fill out and return the questionnaires.

Similar limitations attend most of the "survey results" we encounter in magazines and on the Internet, based on readers who bothered to volunteer their responses to published survey questions. Such volunteer or "self-selected" samples are almost certain to be biased toward those who (1) have strong opinions and (2) are eager to express them. Such "findings" are published for entertainment, not for enlightenment.

However, biases can creep in more subtly than that. Consider a political pollster who samples haphazardly, "as he feels like it," within a community - which is not what we mean by random sampling (see later)! If I were that pollster, I would certainly "feel like" sampling from the lower, rather than the upper, floors of walk-up apartment buildings. But what if those who live on lower floors pay higher rent, and so are likely to have higher incomes? A pollster's bias against stair-climbing could become a sampling bias in favor of the higher-income members of a community.

Or consider this: Earlier (p. 118), I mentioned that I had replicated earlier observations about the way men and women carry their books. I did the study at the college where I teach - a "convenience" initial sampling stage (see later). Suppose I had sampled "as I felt like it," tallying the men and the women I happened to notice as I walked around campus. Mightn't I be more likely to notice the cases that fit my expectations - men, arms straight; women, arms bent - than those that did not fit them? If so, then the sample I ended up with - those men and women whose book-carrying techniques were actually recorded - could have been biased in favor of my expectations.*

Here is another example - a classic cautionary tale, or actually a pair of them. In 1936, when polling was in its infancy, the Literary Digest conducted a poll before the presidential election. They confidently predicted victory for the Republican candidate, Alfred Landon, over the Democratic candidate, Franklin D. Roosevelt.
Came election day, and there was indeed an overwhelming victory - for Roosevelt! Why had the Digest been so wrong? The story that has been handed down goes like this: They had selected their sample from telephone directories. But in 1936, far fewer voters owned phones than do now. Those who did tended to be more affluent than those who didn't. So low-income voters were underrepresented in the sample, and these voters were overwhelmingly in favor of Roosevelt. The Digest's sample was biased toward high-income voters, and the Digest had egg on its face as a result.

*I avoided that particular source of bias as follows: I watched the men and the women who passed by my office window (an accidental sample) that morning, over a predetermined time period, at a time when classes were in session. That way, there were not too many students to observe, and I was able to tally every student who walked past a particular tree I had picked out. Male or female? Books held with arm straight or bent? All cases, whether they fit my expectations or not, were included in my sample - that is, were tallied in the appropriate cell of the data sheet. This is a variant of systematic sampling (p. 146). It remains true that students who happened to be in that area at that time may not be representative of all students even at that college. But remember that our sample is to be "like" the population in ways that matter. It is unlikely that the book-carrying habits of my sample differ consistently from those of students on other parts of the campus.

But in writing this book I learned, to my surprise, that that story is a myth! What happened was not at all like that (Bryson, 1976; Fowler, 1988). The Digest survey was not a phone survey but a mail survey.
Like the Hite report, it had a low rate of return, and those who did return the questionnaire tended to favor Landon, perhaps (this is an interpretation) because those who are rooting for the underdog are more likely to take the trouble to express their views.

Thus this little story gives us two examples of how a sample can be biased. In the one story, which didn't happen but could have, the sample would have been biased in favor of the affluent. In actuality, it was biased in favor of the vocal. Either way, we have a sample that differed consistently from the population of interest. It tended to favor Landon, whereas the population as a whole favored Roosevelt - and showed it on election day.

OBTAINING REPRESENTATIVE SAMPLES

To avoid sampling bias, we first define our population, and then seek to obtain a representative sample from that population. This is the research parallel to the chef's stirring the soup. We want the sample (the spoonful) to be like the population (the vat of soup) in all the ways that matter. How can we encourage that to happen?

RANDOM SAMPLING

The best way of trying for a representative sample from a population is to draw the sample at random from that population.* We need to get clear on what that means. It definitely does not mean "hit or miss," or "any old way," or "as we feel like it." No, the phrase random sample has a technical meaning. A random sample is one selected in such a way that every member of the population has an equal chance of being selected.

An easy way to visualize this is to imagine that each member of the population has a name or a number written on a slip of paper, the papers are tossed into a hat and stirred like the soup, and someone reaches in blindly and selects a number of slips equal to the size of the sample we want. The names or numbers on the slips so selected identify the population members who will make up our sample.
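The hat procedure can also be sketched in a few lines of code. The example below is only an illustration and is not part of the original text: the function name `draw_from_hat`, the population of 30 numbered students, and the fixed seed are all assumptions made for the demonstration. It relies on Python's standard `random.sample`, which selects without replacement, so that every member of the list has the same chance of ending up in the sample.

```python
import random


def draw_from_hat(population, sample_size, seed=None):
    """Draw `sample_size` members without replacement, each equally likely.

    `seed` is optional and only makes the draw repeatable for checking.
    """
    rng = random.Random(seed)          # private generator, like one well-stirred hat
    slips = list(population)           # one "slip of paper" per population member
    return rng.sample(slips, sample_size)


# Hypothetical population: 30 students, each assigned a number beforehand.
students = list(range(1, 31))

# Reach into the hat and pull out five slips.
sample = draw_from_hat(students, 5, seed=1)
print(sample)
```

Passing a seed merely makes the draw repeatable; leaving it out gives a fresh draw each time. Note that the sketch presupposes exactly what the hat does: a complete list of the population, with a number assigned to each member in advance.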
Or we can replace the hat with a table of random numbers, generated so that at each place in the table all digits (0 through 9) have an equal chance to appear. Appendix A (pp. 504-510) presents a table of random numbers, and shows how to use them for selecting a sample at random from a population.

*Pollsters often speak of a "probability sample," which means the same thing.

Alternatively, this manual labor can be replaced by the operation of a computer, which can be instructed to generate a series of numbers that is random or nearly so. Then the numbers so generated will identify subjects for our investigation - provided of course that each population member was assigned a number beforehand.

Such a method assumes a great deal. To use it, we have to (1) list all members of the population, (2) assign each member a number (or name), (3) draw our random sample of numbers, and then (4) find the person corresponding to each of the numbers, so as to interview or observe him or her or it. In practice, grave difficulties can arise. Do we want a random sample of all the monkeys in the world, or even in some one country? Just how do we assign a number to every monkey and keep track of who has which number? If our population is a community, we can assign a number to each human being in that community's phone book and select a sample based on those.* But then what about those who have unlisted numbers? Or no phones? If our project involves interviews, what about those who refuse to be interviewed when contacted? In such cases, there may be nothing we can do except to select more subjects (again at random!) to replace the missing ones, and hope that the resulting bias in our sample (from the loss of unlisteds, no-phones, and refusers) is not so serious as to affect our results importantly.

Finally, it's important to remember that our population is whatever we say it is; there is no such thing as "the population" pure and simple.
And so a sample representative of one population may not be representative of another. Thus, I could draw a random sample from the population "students in this classroom now." But that same sample would not be a random sample of the population "students enrolled in this class." Why not? Because the students who cut class or slept through it would not have a chance to be selected as subjects, and students who miss class might have different characteristics from those who attend.

VARIANTS OF RANDOM SAMPLING

As we see, strict random sampling from a population can be a real chore, especially if the population to be listed is large. There are shortcut methods to reduce the labor involved.

*Today, selecting a random sample from phone owners is less burdensome than it used to be; we can program a computer to dial phone numbers at random. Even so, the unlisteds, the phoneless, and the refusers will not be sampled.

One of these is multistage sampling. Suppose we want a random sample of college students in the United States. It would be inhumanly difficult to list all members of that population. Instead we could list all colleges in the United States (a much more manageable task) and take a random sample of the colleges (Stage 1). Then, from every college selected, we could take a random sample of its students (Stage 2). Or, in political polling, rather than list all the voters in the country, we might list all the counties in (say) the United States (Stage 1); within each county, take a random sample of streets (Stage 2); and, on each street, take a random sample of the voters who live there (Stage 3).

Another variation has been called systematic sampling. This method is useful when one cannot keep track of individuals (as in the monkey case), or where the population cannot be listed in advance (e.g., we might want a random sample of transactions in a marketplace).
The idea is to observe or interview every mth case, starting with the nth one, where both m and n are randomly chosen. Thus, suppose I want a random sample from a large classroom. Rather than listing everybody, I might go to my random number table, and point a pencil anywhere on it. Doing that now, I get the number 7. Doing it again, I get 4. So, counting round the room, I would select the fourth student to be in my sample. Then I select every seventh student after that, until I have gone all around the room. In survey research, the parallel case would be to start with the fourth house on a randomly selected street and choose that house, and every seventh house thereafter, as the sample of houses to visit. Or, in the above examples, we could observe every seventh market transaction, or every seventh person (or monkey) that passes some landmark, starting from the fourth one that does so after we begin our observation session.

Random sampling, then, takes many forms. Some variant of the procedure is our best bet if we truly need a representative sample of some population. Random sampling does not guarantee a representative sample, but it does make it unlikely that the sample will be very different from the population from which it is drawn. And the larger the sample, the more unlikely any large differences become.

Notice something else, too, that is common to all these methods. The researcher sets up a rule, in advance, that dictates whom or what she will observe. After that, she has no choice at all in the matter (Fowler, 1988). That way, her own biases (toward lower floors in apartment buildings, or toward more noticeable cases, or toward cases that fit her theory) cannot affect the data.

OTHER APPROACHES TO SAMPLING

For certain purposes - like political or opinion polling - representative samples are very important, as we've seen.
Yet we noted earlier that much of psychological research pays little attention to representativeness, and here arises the confusion that I promised to address. Not all researchers insist upon a representative sample from a specific population. Nor, for some purposes, would it be sensible to do so.

"PURPOSIVE SAMPLING"

First, we may "purposely" go looking for subjects with certain characteristics (hence the name) where those characteristics make them relevant to the question we are asking. Schachter "purposely" looked for people who had had smoking or weight problems. Such people will not be representative of people in general. Again: To have a strongly held belief sharply and clearly disconfirmed may not be a typical event. But the question was: What happens when such a rare event does occur? Hence the selection of the Seekers for study (Chapter 4). Newcomb chose to study attitude change at Bennington College partly because it was not a typical college, but instead had a reputation for unusually liberal views among its students. What happens when students from conservative backgrounds find themselves there? In all these cases, the cases we observe are selected because of the question we ask, not because they are representative of any particular population - though it is true that we can always define a population of "subjects like these" (see below).

"CONVENIENCE SAMPLING"

This kind of sampling is by far the most common in behavioral research. We select our subjects for accessibility and convenience. Suppose a researcher teaches at the University of Washington and uses college students as subjects. He will likely use students enrolled at that university. A researcher at the University of Virginia will use students enrolled there. And so on. The initial studies on "bystander intervention," for instance, were conducted with students at just one university.
The study of mother-infant interactions was conducted not with college students but with families; these families all lived in Baltimore, where the researchers worked. And even then, they were mothers who were willing to be observed by the researchers - which probably made them unrepresentative even of Baltimore mothers (or those of any other city)!

Why use such unrepresentative samples? For one thing, research is the art of the possible. A study could draw a random sample of, say, college students in the United States, as by multistage sampling. But then what? Do we fly to Seattle to observe one subject, to Baltimore to observe the second, to Iowa City for the third...? Something of the sort could be done for mailed questionnaires or telephone interviews, but for direct observation of behavior, few studies would get off the ground (so to speak) if all good research really required anything of the kind.*

Fortunately, it does not. Let us see why.

The Defined Population: Subjects "Like These." The number-of-bystanders findings would indeed be of little interest if they applied only to students at one particular university. On the other hand, it would also be very surprising if that were so. Even if we use a convenience sample, we can reasonably expect the results to apply to persons who are similar to those subjects in all the ways that matter. We can generalize, in other words, to "subjects like these."

Much of what we do is based on the assumption that people are pretty much alike in the way their minds work, especially in the basic processes like visual perception, or generation of language by the brain. Thus, if we find out something about how vision works in students at one college, it seems safe to assume that it will apply to students at others, and probably to other age and occupational groups as well.
(Except perhaps for people with visual defects, and these may be excluded from our study - purposive sampling again!)

*This is all the more apparent if we ask ourselves: Just what population is of interest in psychological research? It has been said that we seek the principles of the behavior of all species, from humans to flatworms. Yes, but no individual research project attempts this level of generality. There are no data on bystander intervention in flatworms, and a random sample of the Earth's creatures has never, to my knowledge, been attempted in any investigation.

†Statisticians are familiar with this approach. No less a figure than Sir Ronald Fisher, one of the founders of modern statistics, pointed out that we can always define our population in terms of the sample. It can be that population, whatever it is, from which this sample can be considered a random one.

Now this is an assumption, and it will not always apply. Other human beings may be "like" college students in some ways (e.g., how their eyes work), but not in others. Take the study of memory, for instance. College students have had extensive experience at learning things for the purpose of remembering them later. Might they have developed certain strategies for remembering things, strategies that might be quite different from the way an older adult, or an adult from a preliterate society, goes about the matter? Quite possibly; and if so, research on memory that depends heavily on student subjects might turn out to apply less generally than we think.

In short, convenience sampling has its dangers. If we were to restrict our observations to (say) students in college, we would miss the differences there may be (in social relations, use of facial expression, or even strategies for remembering things) in other age or occupation groups, or in other cultures.

Generality from Diversity. In fact, however, our observations are not so restricted as all that.
Remember that no general conclusion rests on a single study, and that what one researcher doesn't study, another will. Researchers are in fact studying (for example) memory in children, and in the elderly, and in members of different cultures. These studies will have been conducted in different places and perhaps even with different methods, as well as with subjects drawn from different populations. The generality of the conclusions comes from the consistency among these different findings, not from the "representativeness" of the subjects in any one of them.

So, for example: Does the number-of-bystanders effect occur in other societies? The way to find out is to look and see. If the results can be replicated in other cultures, that supports the original finding and adds generality to it. And if the results do not hold up in other cultures or age groups, we can go on to ask why they don't. By checking out possible explanations for this, we may expand our understanding. We'll discuss this further in Chapter 12 (see also Stanovich, 1998).

Now, all this does not mean that we can be complacent. It remains true that a very large proportion of experimental findings comes from a very limited database: college students in the United States. We need to check our conclusions in other age and income groups, and in other cultures, much more carefully than we have done in the past. There are indications that the gap is narrowing (Markus & Kitayama, 1991; Shweder, 1991), but there is much yet to be done.

WHEN DO WE NEED REPRESENTATIVE SAMPLES?

As we've seen, any study has a certain generality built in - we can always generalize to a population of "subjects like these." But then what about the specific procedures - like random sampling - by which we try for representative samples of populations? When do we use these methods and when do we not? When do we need them?
I think the answer is: We need them when there is a real-world population out there (a specific existing one, not just one that we define) to which we want to generalize our findings.

Political polling is an obvious example. We want, let us say, to predict the outcome of an election. It would be foolish to select a sample of very affluent community members, determine whom they will vote for, and say, "Well, people like these favor Smith over Jones, so Smith will probably win." Here we are not interested in "people like these" but in what the whole population of voters is likely to do. If we don't have a representative sample of that population, then we have no business making predictions about it. Or: If we are interested in how American women in general (the population of interest) feel about men, we will not find out if our sample is a self-selected one, consisting only of those energetic or angry enough to return questionnaires.

This, clearly, is something we have to decide case by case. Is there a specific population out there to which we want our conclusion to apply? If the answer is yes, we do need a representative sample of that population. But often the answer will be no - we may let our general conclusions rest on the convergence among many findings, not on the representativeness of this one.

OBSERVER EFFECTS

As we observe and describe behavior, we must consider that what our subjects do may be affected by our presence as observers, or by something that we do while observing. Then they may not behave as they normally would. That is the problem of observer effects.

EXAMPLES OF OBSERVER EFFECTS

Imagine this: You and I are visiting a first-grade classroom, to observe. We walk quietly to the back of the room and sit down, to see what the kids do. What will we see? Of course - we will see a roomful of little heads swiveled around to observe us!
If we concluded that such kids usually spend most of their time staring at the back of the room, we'd be pretty foolish.

Obvious enough. So are some other instances of observer effects. A child may be on her best behavior, rather than her typical behavior, if an adult is watching (Figure 5.1). By the same token, a person may give socially correct rather than accurate answers to an interviewer. Then again, the problem can be much more subtle. The following is a classic cautionary tale.

[FIGURE 5.1 Cartoon with the caption "If you're not a good boy, Santa will bring you only educational toys." We may be very, very good when we know we're being watched.]

AN EXAMPLE FROM ANIMAL BEHAVIOR: THE CASE OF CLEVER HANS

This happened in Germany some years ago. A schoolteacher, a certain Mr. von Osten, had discovered a true genius among horses. This horse, whose name was Hans, could do arithmetic! Ask him to add 7 and 4, and he would tap with his hoof 11 times and stop. He was just as gifted at subtraction, and even multiplication and division. He was at least as mathematically skilled as your average fifth- or sixth-grader.

Herr von Osten was not a faker. To his many doubters, he would say in true scientific spirit: See for yourself. Scientists checked out Clever