Chapter 2 Research Methods in Social Psychology
Antony S. R. Manstead and Andrew G. Livingstone

Source: © René Hendriks (One of the laboratory rooms of the social psychology laboratory at Utrecht University.)

KEY TERMS
confederate; construct; construct validity; control group; convergent validity; cover story; debriefing; demand characteristics; dependent variable; discourse analysis; experiment; experimental confound; experimental group; experimental scenario; experimenter expectancy effect; external validity; factorial experiment; field experiment; hypothesis; implicit measures; independent variable; interaction effect; internal validity; Internet experiment; main effect; manipulation check; mediating variable; meta-analysis; one-shot case study; operationalization; participant; participant observation; post-experimental enquiry; post-test only control group design; quasi-experiment; quota sample; random allocation; reactivity; reliability; sampling; simple random sample; social desirability; social neuroscience; survey research; theory; triangulation; true randomized experiment; unobtrusive measures; validity; variable

CHAPTER OUTLINE
INTRODUCTION
  Summary
RESEARCH STRATEGIES
  Experiments and quasi-experiments
  Survey research
  Qualitative approaches
  Summary
A CLOSER LOOK AT EXPERIMENTATION IN SOCIAL PSYCHOLOGY
  Features of the social psychological experiment
  Experimental designs
  Threats to validity in experimental research
  Social psychological experiments on the Internet
  Problems with experimentation
  Summary
DATA COLLECTION TECHNIQUES
  Observational measures
  Self-report measures
  Implicit measures
  Choosing a measure
  Social neuroscience
  Summary

ROUTE MAP OF THE CHAPTER
This chapter provides an overview of research methods in social psychology, from the development of theory to the collection of data. After describing three quantitative research strategies (experiments, quasi-experiments, and survey research), the chapter briefly discusses qualitative approaches, focusing on discourse analysis.
There follows a description of the key elements of experimentation, as this is the most popular research method in social psychology. We also consider threats to validity in experimental research, and discuss problems with experimental research in social psychology. The final section of the chapter contains a description of three methods of data collection (observation, self-report and implicit measures).

INTRODUCTION

How do social psychologists go about testing their theories? Why should a chapter about the technical aspects of research methods come so early in this textbook? Why do we need to study research methods, rather than go directly to the real substance of social psychological phenomena and explanations for those phenomena? To answer these questions, we need to consider an even more fundamental question: Why do psychologists conduct research in the first place?

As social psychologists, we are of course interested in ‘big’ phenomena. What causes intergroup conflict? Why do people stereotype members of other groups? How do we form impressions of other people? Why do people behave differently when they are in a group? What leads people to change their attitudes? What factors influence whether close relationships succeed or fail? To answer these questions, we develop theories. For example, we might want to develop a theory about the causes of intergroup conflict. In the first instance, this involves identifying constructs (abstract concepts, such as ‘threat’ or ‘prejudice’) or variables (a measurable representation of a construct, such as scores on questionnaire measures of threat perceptions or intergroup hostility) that we think are relevant to the question, and speculating about how these relate to one another. Crucially, our theories typically consist of propositions about causal relationships between constructs. In this way, we are not content with simply describing these ‘big’ phenomena; rather, we seek to explain them by identifying their antecedents.
In developing a theory of intergroup conflict (see Chapter 14), we are not only interested in what conflict is like, but also in what causes it and how it might be reduced. For example, we might theorize that intergroup conflict is caused by feelings that the interests or well-being of one’s own group are threatened by another group (see Branscombe, Ellemers, Spears, & Doosje, 1999). ---------- theory a set of abstract concepts (i.e. constructs) together with propositions about how those constructs are related to one another. ---------- ---------- construct an abstract theoretical concept (such as social influence). ---------- ---------- variable the term used to refer to the measurable representation of a construct. ---------- We might initially base our theories on observation of real-life events, on intuition or on existing theories. But coming up with theories is only part of the story. Many other disciplines – such as philosophy, sociology, and anthropology – are concerned with the same issues and phenomena that interest social psychologists. What helps to distinguish social psychology – and psychology as a whole – from these other disciplines is not simply the type of explanation we provide, but also a commitment to the scientific method, in which we test our theories against evidence. This introduces an essential characteristic of a theory: it must be testable. This means that we should be able to derive specific predictions (or hypotheses) from the theory concerning the relationship between two or more constructs, and to gather evidence that could support or contradict those predictions. ---------- hypothesis a prediction derived from a theory concerning the relationship between variables. ---------- Consider Janis’s (1982) theory about the poor quality of decision-making that is apparent even in groups of competent and experienced persons – a phenomenon he termed ‘groupthink’ (see Chapter 8). 
Janis’s (1982) theory consists of one set of concepts representing the antecedent conditions of poor group decision-making, another set representing the symptoms of groupthink, a third set representing symptoms of poor decision-making and a final set representing the process linking antecedent conditions to the symptoms of groupthink and poor decision-making (see Theory Box 2.1). One of the antecedent conditions is a ‘cohesive group’, a group whose members are psychologically dependent on the group. Because they are dependent on their group membership, they are more likely to conform to what they believe to be the consensual position in the group. An example symptom of groupthink is the presence of ‘mind guards’, a term Janis used to describe group members who take it on themselves to protect the group from information that questions the correctness or morality of an emerging decision. An example symptom of defective decision-making is failure to examine the risks of the preferred decision. Janis also specified how groupthink is brought about (i.e. the mediating process). In this case, the mediating process is a premature ‘concurrence-seeking tendency’, a powerful preference for agreement with fellow group members, before all these issues have been properly discussed. Thus antecedent conditions are linked to symptoms via a mediating process; we discuss the concept of mediation in more detail later in this chapter. A prediction that we can derive logically from Janis’s theory is that groups that are more cohesive should be more prone to making poor-quality decisions than groups that are less cohesive (see Research Close-Up 2.1). To the extent that the evidence is consistent with the prediction, we can be more confident in the theory from which we derived the prediction. Correspondingly, if the evidence is inconsistent with the prediction, we should be less confident in the underlying theory. 
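The logic of confronting such a prediction with data can be sketched in a few lines of code. The numbers below are invented purely for illustration (they are not from any groupthink study), and the test statistic is computed by hand using Welch's formula for comparing two group means:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two group means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se

# Hypothetical decision-quality ratings (higher = better decisions)
# for highly cohesive versus less cohesive groups.
cohesive     = [4, 3, 5, 2, 4, 3, 3, 2, 4, 3]
non_cohesive = [6, 5, 7, 5, 6, 4, 6, 5, 7, 6]

# The prediction derived from Janis's theory is that cohesive groups
# make poorer decisions, i.e. a negative difference (cohesive minus
# non-cohesive) on decision quality.
t = welch_t(cohesive, non_cohesive)
print(round(t, 2))  # → -5.66
```

A large negative t statistic like this would count as evidence consistent with the prediction; a value near zero, or a positive one, would reduce our confidence in the theory.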
Evidence might also reveal limits or boundary conditions to a predicted effect, showing that it only occurs under specific circumstances. An example of boundary conditions in relation to Janis’s theory comes from research conducted by Postmes, Spears, and Cihangir (2001), who found that the effect of group cohesiveness on the quality of a group’s decisions depended upon the norm of the group. Specifically, the quality of decisions was improved when the group had a norm of critical thinking compared to when it had a norm of maintaining consensus. In the light of this sort of evidence, the original theory may need to be modified, or even rejected entirely in favour of an alternative.

THEORY BOX 2.1 ANTECEDENT CONDITIONS, MEDIATING PROCESS AND SYMPTOMS OF GROUPTHINK IN JANIS’S (1982) THEORETICAL MODEL
Source: Adapted from Janis (1972, 1977, 1982).

Now we can see why research methods are so important. They are the essential tools of our trade, providing a way of translating our ideas into actions, and of testing, challenging and improving our theories. The quality of our research depends not only on the quality of our theories, but also on the quality of the research methods we use to test those theories.

Summary
Methods are the tools researchers use to test their theoretical ideas. These ideas can come from a variety of sources, but two that are quite common in social psychology are observations of real-life events and inconsistencies between previous research findings. A theory consists of a set of constructs linked together in a system, and specifies when particular phenomena should occur.

RESEARCH CLOSE-UP 2.1 ARCHIVAL ANALYSES OF ‘GROUPTHINK’
Janis, I. L. (1972). Victims of groupthink: A psychological study of foreign-policy decisions and fiascoes. Boston: Houghton Mifflin.

Introduction
Janis’s research on groupthink provides an excellent example of ‘archival research’, a research strategy that is not described or discussed elsewhere in the present chapter.
In archival research the data come from archives – that is, from stored records of facts. ‘Archival data may include such items as personal documents (letters or diaries), creative products (poems, paintings, essays), biographies or autobiographies, and histories or governmental records’ (Simonton, 1981, p. 218). Janis (1972) decided to study in detail archival material relating to four major US foreign policy fiascoes: the Bay of Pigs invasion of Cuba in 1961; the decision to escalate the Korean War in 1950; the failure to be prepared for the attack on Pearl Harbor in 1941; and the decision to escalate the Vietnam War in 1964. Janis argues that in the case of each of these disastrous decisions, information was potentially or actually available to the policy-making groups that should have led them to different decisions.

Method
Janis’s research took the form of careful scouring of all the documentary sources of information on the circumstances in which these faulty decisions were made. In his 1972 book Victims of Groupthink, Janis attempted to show how the archival data on each of these decisions can be regarded as forming a consistent social psychological pattern, the essence of which is shown in Theory Box 2.1. Janis (1982) published a second edition of his book in which he applied the notion of groupthink to the Watergate incident that ultimately led to US President Richard Nixon’s resignation in 1974.

Later research
Tetlock (1979) conducted a more quantitative analysis of archival materials. He applied standardized procedures for analysing the content of public statements made by key decision-makers involved in the ‘groupthink’ and ‘non-groupthink’ decisions examined by Janis (1972). Tetlock was particularly interested in assessing the extent to which public statements made by key decision-makers reflected ‘a tendency to process policy-relevant information in simplistic and biased ways’ (p.
1317), and the extent to which these statements reflected ‘a tendency to evaluate one’s own group highly positively and to evaluate one’s... opponents highly negatively’ (p. 1317). To assess these two aspects of groupthink, Tetlock identified six key decision-makers who were intimately involved in five different foreign policy decisions, two of which were classified by Janis as ‘non-groupthink’, while he classified the other three as ‘groupthink’ decisions. He then randomly selected and analysed 12 paragraph-sized passages from the public statements made by each decision-maker at the time of each crisis. He found that the public statements of decision-makers in groupthink crises were significantly less complex than were the public statements of decision-makers in non-groupthink crises. He also found evidence that decision-makers in the groupthink crises gave more positive evaluations of their own political groups than did decision-makers in crises not characterized by groupthink. However, contrary to predictions, there was no difference between groupthink and non-groupthink decision-makers in terms of the intensity of negative evaluations of their political opponents. With the exception of this last finding, the results of Tetlock’s study are consistent with Janis’s conclusions, which were based on a more qualitative analysis of historical documents.

Discussion
A key advantage of the archival research strategy is that the evidence gleaned from archives is not distorted by participants’ knowledge that researchers are investigating their behaviour. The behaviour took place in natural settings at an earlier time than that at which the behaviour was studied. There is, therefore, little or no chance that the behaviour could have been ‘contaminated’ by the research process.
As Simonton (1981) put it, ‘Because archival research exploits data already collected by others for purposes often very different from the intentions of the researcher, this methodology constitutes a class of “unobtrusive measures”’ (p. 218). Offsetting this advantage are some disadvantages. The most obvious of these are (1) that the researcher is dependent on the quality of the archival information, which may not contain a good basis for assessing key variables, and (2) that even when associations between variables (such as the quality of a decision and the complexity of statements made by decision-makers) are found, it is unclear whether or how they are causally related.

----------
participant a person who takes part in a psychological study.
----------

RESEARCH STRATEGIES

What are the strengths and weaknesses of the principal research strategies available to the social psychologist?

Researchers who want to test their ideas and predictions have a range of different research strategies available to them. In this section we will consider experimental and quasi-experimental research, survey research and qualitative approaches.

Experiments and quasi-experiments
Experimental research is designed to yield causal information. The goal of an experiment is to see what happens to a phenomenon when the researcher deliberately modifies some feature of the environment in which the phenomenon occurs (‘If I change variable B, will there be resulting changes in variable A?’). By controlling the variation in B, the researcher who finds that there are changes in A can draw causal conclusions. Instead of just knowing that more of variable A is associated with more of variable B, the experimental researcher discovers whether A increases when B is increased, decreases when B is reduced, remains stable when B is left unchanged and so on. Such a pattern of results would suggest that changes in B cause the changes in A.
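This causal logic can be illustrated with a small simulation. The code below is a hypothetical sketch, not an example from the chapter: simulated participants are randomly allocated to two conditions, the independent variable B is manipulated for one group only, and the dependent variable A is then compared across groups.

```python
import random
import statistics

def run_experiment(n_per_group=200, true_effect=1.0, seed=42):
    """Simulate a true randomized experiment.

    Each simulated participant has a baseline score on the dependent
    variable A (representing individual differences); manipulating the
    independent variable B adds `true_effect` for the experimental
    group only.
    """
    rng = random.Random(seed)
    baselines = [rng.gauss(5.0, 1.0) for _ in range(2 * n_per_group)]

    rng.shuffle(baselines)                     # random allocation
    experimental = baselines[:n_per_group]     # B manipulated
    control = baselines[n_per_group:]          # B left unchanged

    a_experimental = [score + true_effect for score in experimental]
    a_control = list(control)

    # Because allocation was random, any sizeable difference in mean A
    # reflects the manipulation of B, not pre-existing group differences.
    return statistics.mean(a_experimental) - statistics.mean(a_control)

print(round(run_experiment(), 2))
```

Running this with `true_effect=0.0` yields a mean difference close to zero, which is exactly why random allocation licenses a causal interpretation: absent the manipulation, the randomly formed groups do not systematically differ.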
---------- experiment a method in which the researcher deliberately introduces some change into a setting to examine the consequences of that change. ---------- The experimental method is a theme with many variations. Two common variations are the quasi-experiment and the true randomized experiment. They differ with respect to the realism of the setting in which the data are collected and the degree of control that the researcher has over that setting. A quasi-experiment is typically conducted in a natural, everyday setting, one over which the researcher does not have complete control. The true randomized experiment, by contrast, is one in which the researcher has complete control over key features of the setting; however, this often involves a loss of realism. ---------- quasi-experiment an experiment in which participants are not randomly allocated to the different experimental conditions (typically because of factors beyond the control of the researcher). ---------- ---------- true randomized experiment an experiment in which participants are allocated to the different conditions of the experiment on a random basis. ---------- To grasp the key difference between a quasi-experiment and a true experiment, we need to consider further what is meant by the term experiment. Experiments are studies in which the researcher examines the effects of one class of variables (independent, or manipulated, variables) on another class of variables (dependent, or measured, variables). In a true randomized experiment the researcher has control over the independent variable and over who is exposed to this variable. Most importantly, the researcher is able to allocate research participants randomly to different conditions of the experiment (random allocation). In a quasi-experiment the researcher usually cannot control who is exposed to the independent variable. 
In a typical quasi-experiment, pre-existing groups of people are either exposed or not exposed to the independent variable. ---------- random allocation (sometimes called random assignment) the process of allocating participants to groups (or conditions) in such a way that each participant has an equal chance of being assigned to each group. ---------- Examples of each method may help to bring out the points of difference. Social psychologists interested in aggression have studied whether exposure to violent film and television material has an impact on the subsequent behaviour of the viewer (see Chapter 9 and Figure 2.1). This can be done using true randomized experiments or quasi-experiments. An example of a true experiment on this issue is the study reported by Liebert and Baron (1972). Male and female children in each of two age groups were randomly allocated to one of two experimental conditions, one in which they viewed an excerpt from a violent television programme and another in which they viewed an exciting athletics race. Later both groups of children were ostensibly given the opportunity to hurt another child. Those who had seen the violent material were more likely to use this opportunity than were those who had seen the non-violent material. Because children had been randomly allocated to the violent and non-violent conditions, the observed difference can be attributed to the difference in type of material seen, rather than any difference in the type of children who saw the material. FIGURE 2.1 What research method might be used to study the impact of viewing violent television on subsequent behaviour? Source: © Edouard Berne. Used under licence from Getty Images. An example of a quasi-experimental study of the same issue is the study reported by Black and Bevan (1992). 
They asked people to complete a questionnaire measure of tendency to engage in aggressive behaviour under one of four conditions: while waiting in line outside a cinema to see a violent movie; while waiting in line to see a non-violent movie; having just seen a violent movie; and having just seen a non-violent movie. As can be seen in Figure 2.2, the researchers found that those waiting to see the violent film had higher aggression scores than those waiting to see the non-violent film; and also that those who had just seen the violent film scored higher than those waiting to see the violent film (although there was no difference in aggression scores between those who had just seen a non-violent movie and those waiting to see a non-violent movie). These findings are consistent with the notion that viewing a violent movie increases the tendency to aggress (Figure 2.3), but the fact that participants were not allocated at random to the different conditions makes it impossible to rule out alternative explanations. For example, it may be that violent movies only increase aggressive tendencies among those who are attracted to view such movies in the first place. FIGURE 2.2 Self-reported tendency to aggress, as a function of type of movie, and whether or not the respondent was waiting to see the movie or had just seen the movie. Source: Based on data reported by Black and Bevan, 1992. Reproduced with permission from John Wiley & Sons, Inc. FIGURE 2.3 Does viewing a violent movie increase the tendency to aggressive behaviour in us all, or simply increase aggressive tendencies only in those who are attracted to viewing such movies in the first place? Source: HOSTEL 2 © 2007 Screen Gems, Inc. and Lions Gate Films Inc. All Rights Reserved. COURTESY OF SCREEN GEMS. Often the only way in which to conduct an experimental study of a social phenomenon is via a quasi-experiment. 
Ethical and practical considerations frequently make it impossible to allocate people randomly to different experimental conditions. If, like Stroebe, Stroebe, and Domittner (1988), you wish to study the effects of bereavement, for example, you obviously cannot randomly allocate research participants to a ‘bereaved’ and a ‘non-bereaved’ condition. The same applies in many other fields of research. Thus, the choice of research strategy is often a compromise between what is optimal and what is practicable. Fortunately, the sophistication of some quasi-experimental designs is such that it is possible to draw conclusions about causality with some confidence (Judd & Kenny, 1981a, b; West, Biesanz, & Pitts, 2000; see Leader in the Field, Charles M. Judd).

LEADER IN THE FIELD
Charles M. Judd (b. 1946) is a leading scholar in the field of social cognition and stereotyping, but also an outstanding methodologist. He took his BA at Yale University, majoring in French, then studied theology at Union Theological Seminary, New York, before discovering a greater calling and pursuing graduate studies in psychology at Columbia University, where he obtained his PhD under Morton Deutsch. He began his teaching career at Harvard University, then moved to the University of Colorado, Boulder, where he has remained since 1986 (with the exception of two years at the University of California, Berkeley), and is now Professor of Distinction, College of Arts and Sciences. He is an expert on experimental design and analysis, and evaluation and quasi-experimental designs. He has contributed numerous highly influential articles on statistics (notably mediation and moderation analysis), has served as editor of many of the leading journals in the field, and received the 1999 Thomas M. Ostrom Award for Lifetime Contributions to Social Cognition Theory and Research.
He was awarded the degree of Doctor Honoris Causa, Faculté de Psychologie et des Sciences de l’Education, Université Catholique de Louvain, Louvain-la-Neuve, Belgium, in 2006.

It is also possible to conduct true experiments in field settings, in which case they are referred to as field experiments (see Chapter 1), which attempt to combine the control of a laboratory experiment with the realism of a quasi-experiment. An example of such a field experiment is given in Research Close-Up 2.2.

Survey research
The practical and ethical considerations that can make quasi-experiments a useful strategy also apply more generally to survey research (Oppenheim, 1992; Schwarz, Groves, & Schuman, 1998; Visser, Krosnick, & Lavrakas, 2000). Surveys differ from experiments and quasi-experiments in that they focus on measuring existing levels of relevant variables, rather than manipulating them (see Social Psychology Beyond the Lab 2.1). Like experiments, survey designs are typically concerned with associations and/or cause-and-effect relationships between variables. However, the lack of control over independent – or predictor – variables means that it is virtually impossible to be certain about their causal role. For this reason, surveys often also measure other variables that can be taken into account in statistical analyses – that is, controlled for – in order to rule out possible alternative explanations for a relationship between a predictor variable and an outcome variable.

----------
survey research a research strategy that involves interviewing (or administering a questionnaire to) a sample of respondents who are selected so as to be representative of the population from which they are drawn.
----------

An example of this strategy is provided by Pratto, Sidanius, Stallworth, and Malle (1994). They proposed that many different types of intergroup prejudice are predicted by a personality variable which they call social dominance orientation, or SDO (see Chapter 14).
This variable reflects an individual’s preference for unequal status relations between social categories. So, someone who scores high in SDO should be more sexist, and more likely to oppose equal rights for racial minorities. In support of this prediction, Pratto et al. (1994) found that SDO was indeed positively correlated with sexism, and negatively correlated with support for racial equality. However, they were also concerned that the relation between these variables and SDO could actually be due to the influence of another variable, such as political conservatism. If so, the argument that SDO is a unique predictor of prejudice would be greatly undermined. To test this possibility, Pratto et al. re-examined the correlations between prejudice and SDO while controlling for (or partialling out) the effect of political conservatism. The correlations remained significant, increasing their confidence that the role of SDO is not simply due to political conservatism. Another strategy that can increase certainty regarding causal relationships in survey research involves taking measures of relevant variables at several points in time – a longitudinal survey design. The logic here is that if variable A at time 1 predicts variable B at time 2 (especially when controlling for differences in variable B at time 1), then we can be more certain that variable A has a causal effect on variable B.

Qualitative approaches
Traditionally, social psychological research – including the overwhelming majority of the research discussed in this book – has involved quantitative data analysis. That is, the data we analyse are represented as numbers. This makes our data amenable to statistical analyses, allowing researchers to say something about the average score on a variable (e.g. the mean, median, and/or modal score in a sample); the range of scores on a variable (e.g. the variance or standard deviation); and the strength and reliability of relations between two or more variables (e.g.
through inferential tests such as t-tests, analysis of variance or regression). These analyses usually provide clear and interpretable outcomes, and moreover there is broad consensus among researchers about the meaning of these outcomes.

RESEARCH CLOSE-UP 2.2 A FIELD EXPERIMENT TO STUDY HELPING BEHAVIOUR
Darley, J. M., & Batson, C. D. (1973). ‘From Jerusalem to Jericho’: A study of situational and dispositional variables in helping behavior. Journal of Personality and Social Psychology, 27, 100–108.

Introduction
The researchers were interested in testing the idea that one reason why bystanders do not come to the assistance of others, even when these others clearly need help, is that helping is costly. The particular ‘cost’ they studied in their research was time. To come to a stranger’s assistance often involves a departure from your original plan. Such a departure can throw you off your schedule. The researchers also wanted to examine whether reminding people of the parable of the Good Samaritan, in which a passer-by does come to the assistance of a stranger in need of help, would influence willingness to give help. They tested these notions in a field experiment (see also Chapter 10). They also measured individual differences in religiosity, to see whether these would influence helping.

Method
The participants in the study were male seminary students (i.e. trainee priests) who believed that they were taking part in a study on ‘religious education and vocations’. Each participant began the study in one building and was then asked to proceed to a second building to complete the study. Before leaving the first building, the participant was led to believe one of three things about the speed with which he should go to the other building: that there was no special hurry, that there was an intermediate degree of hurry, or that he was late for the second part of the study and should hurry up.
This was the manipulation of the first variable, time pressure (no versus medium versus high degree of hurry). In the second part of the study, the participant expected to do one of two things: either talk about the parable of the Good Samaritan, or talk about job prospects for seminary students. This constituted the second manipulation: either having or not having the parable of the Good Samaritan made psychologically salient. The design of the study is shown in Figure 2.4.

FIGURE 2.4 Design of the Darley and Batson (1973) field experiment.
Source: Copyright © 1973 by the American Psychological Association. Adapted with permission. Journal of Personality and Social Psychology, 27, 100–108. The use of APA information does not imply endorsement by APA.

On his way to the other building, the participant passed through an alley in which a person (the ‘victim’, but actually an accomplice of the experimenters) was sitting slumped in a doorway, head down, eyes closed. As the participant passed the victim, the latter coughed twice and groaned. The dependent variable in this field experiment was the extent to which the participant did anything to help this person apparently in distress. The extent of the participant’s helping behaviour was observed and coded.

Results
Helping was significantly influenced by the time pressure manipulation. The results are summarized in Figure 2.5. Those in the ‘no hurry’ condition were more helpful than those in the ‘medium hurry’ condition, who in turn were more helpful than those in the ‘high hurry’ condition. There was also a tendency for being reminded about the parable to have an influence; those who were reminded were more helpful than those who were not. Individual differences in religiosity did not predict whether or not participants stopped to help, although they were related to the nature of the help given.
FIGURE 2.5 Mean helping scores as a function of degree of hurry (no, medium, or high) and anticipated topic of talk (parable of Good Samaritan or job prospects for trainee priests).
Source: Based on data reported by Darley & Batson (1973). Copyright © 1973 by the American Psychological Association. Adapted with permission. Journal of Personality and Social Psychology, 27, 100–108. The use of APA information does not imply endorsement by APA.

Discussion
Even those who have chosen to be trained in a vocation in which helping others is supposed to play a central role were affected by the time pressure variable. When they were in a hurry, even those trainee priests who thought that they were on their way to a discussion of the parable of the Good Samaritan were less likely to offer help to a stranger in need than were their counterparts who were in less of a hurry (Figure 2.6). From a methodological perspective, the neat thing about this field experiment is that it was conducted in a natural, everyday setting. Participants were randomly allocated to one of the six conditions of the experiment, so any differences found between these six conditions resulted, in principle, from the experimental manipulations. Thus internal validity was high (i.e. the researchers could be confident that changes in the independent variable caused changes in the dependent variable). But the fact that the setting of the experiment was such an everyday one means that this study also scores quite highly on realism (i.e. it has high external validity too). It is a good example of a field experiment.

FIGURE 2.6 Would you be more likely to help someone in need after hearing a sermon on the parable of the Good Samaritan?
Source: © Digital Vision. Used under licence from Getty Images.

However, the quantitative approach is far from universal in social psychology.
An alternative is qualitative analysis, in which data are typically textual rather than numerical, focusing on the content and meaning of the words and language used by participants. Qualitative approaches encompass a wide range of analytic techniques. Some of these are perfectly consistent with the philosophical assumptions that underpin quantitative research and the scientific method more generally. Among these assumptions is the belief that the phenomena in which we are interested represent an objective set of ‘facts’ that exist independently of the researcher’s (or anyone else’s) perspective on them. According to this view, the fact that many of the phenomena we study – such as an attitude – cannot be directly observed or measured simply means that we require sensitive and sophisticated research methods in order to study them. However, other qualitative approaches are more radical and explicitly reject these assumptions.

What unifies qualitative approaches is a concern with the limitations of quantitative research on social phenomena, and the belief that qualitative techniques can provide additional, or even radically different, insights (see Henwood, 1996). The supposed limitations of quantitative techniques relate to the ways in which they potentially misrepresent and/or over-simplify phenomena and participants’ perspectives on them. In particular, quantitative measures – especially self-report measures, which we discuss later – require the researcher to make assumptions regarding the range and content of possible responses, not to mention their meaning to participants. This is important, because a participant’s understanding of a task or a questionnaire item may be quite different from that of the researcher. Moreover, the range of possible responses may not allow participants to communicate responses or perspectives that the researcher did not anticipate.
SOCIAL PSYCHOLOGY BEYOND THE LAB 2.1

SURVEY RESEARCH

Descriptive survey designs focus on describing overall levels of relevant variables, such as the characteristics of one or more groups of people. Examples of descriptive survey designs include political opinion polls, market research and a census (Figure 2.7). Such descriptions can range from the simple (e.g. describing the percentage of people eligible to vote in a particular constituency who say that they intend to vote for a particular political candidate) to the more complex (e.g. describing the personal and social characteristics associated with use of recreational drugs among school-age children and teenagers).

FIGURE 2.7 One strategy for gathering research evidence is to survey public opinion by interview. Source: © pixsooz. Used under licence from Shutterstock.

Descriptive survey research is often concerned with large populations, such as all adults living in a particular community, region or country. To ensure that responses are representative, one could interview or collect completed questionnaires from the entire population in question (as is done in a census). In most cases, however, collecting data from all members of a population is simply not possible; even where it is possible, it is typically not cost-effective. The result is that the researcher has to choose which members of that population to survey. The process of selecting a subset of members is known as sampling.

----------
sampling the process of selecting a subset of members of a population with a view to describing the population from which they are taken.
----------

Two main types of sampling are used in survey research: probabilistic and non-probabilistic. The most basic form of probabilistic sampling is the simple random sample.
A simple random sample is one which satisfies two conditions: first, each member of the population has an equal chance of being selected; second, the selection of every possible combination of the desired number of members is equally likely. To explain the second condition, imagine that the population size is 10 (consisting of persons labelled A to J) and the sample size is two. There are 45 possible combinations of two members of the population (A + B, A + C, A + D, and so on, to I + J). In simple random sampling, each of these 45 possible combinations of two members has to be equally likely. In practice, of course, the sample size of a random sample (e.g. of the whole population of a country) is much larger than two and the process is much more complex. Researchers achieve random sampling by allocating numbers to each member of the population and using computer-generated random numbers – readily available in statistical software and web-based tools – to select a sample of the required size.

----------
simple random sample a sample in which each member of the population has an equal chance of being selected and in which the selection of every possible combination of the desired number of members is equally likely.
----------

Because probability sampling is expensive and time-consuming, non-probability sampling is frequently used. The most common form of non-probability sample is the quota sample. Here the objective is to select a sample that reflects basic attributes of the population. Such attributes might be age and sex. If you know the age and sex composition of the population concerned, you then ensure that the age and sex composition of the sample reflects that of the population. The term ‘quota’ refers to the number of people of a given type (e.g. females between the ages of 55 and 60) who have to be interviewed.
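The two conditions for a simple random sample are easy to illustrate in code. The short Python sketch below (purely illustrative; it simply follows the ten-person example above) counts the 45 possible two-person samples and then draws one with `random.sample`, which gives every member, and every combination of members, an equal chance of selection.

```python
import random
from itertools import combinations

# The ten-person population (A to J) from the example above
population = list("ABCDEFGHIJ")

# Condition 2: there are C(10, 2) = 45 possible two-person samples,
# each of which must be equally likely to be the one selected
all_pairs = list(combinations(population, 2))
print(len(all_pairs))  # 45

# random.sample satisfies both conditions: each member has an equal
# chance of selection, and every pair of members is equally likely
sample = random.sample(population, k=2)
print(sorted(sample))
```

In practice the same logic scales up: assign each member of the population a number and draw the required sample size with a random number generator.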
The major advantage of quota sampling is that the interviewer can approach potential respondents until the quotas are filled, without needing to recruit a specifically identified respondent. Some disadvantages of quota sampling are (1) that it is usually left to the interviewer to decide whom to approach in order to fill a quota, with the result that bias can enter into the selection process, and (2) that it is impossible to provide an accurate estimate of sampling error.

----------
quota sample a sample that fills certain prespecified quotas and thereby reflects certain attributes of the population (such as age and sex) that are thought to be important to the issue being researched.
----------

Several qualitative approaches help to address these concerns. At an early, exploratory stage of a research project, content analysis or thematic analysis of open-ended oral or written responses can shed light on potentially relevant factors. This approach can also be used to follow up quantitative analyses by exploring unexpected or ambiguous findings (see Livingstone & Haslam, 2008, for an example of the latter). Indeed, content analysis often ultimately produces quantitative outcomes, because it allows the researcher to count occurrences of particular words, phrases or themes and conduct statistical analyses on the results. Alternatively, grounded theory focuses on systematically generating theory about a specific phenomenon in an inductive or ‘bottom-up’ manner, for example on the basis of exploratory interview data. Other techniques (e.g. interpretative phenomenological analysis) focus on revealing and interpreting the subjective meaning that participants attach to particular issues or events. All of these techniques can complement or extend quantitative research. By contrast, other qualitative approaches, such as forms of discourse analysis, involve a more fundamental rejection of the assumptions that underlie quantitative and much qualitative research.
Rather than searching for an objective, knowable set of ‘facts’ about social psychological phenomena, advocates of these approaches instead assume that there is no unique, valid interpretation of the world. Consequently, the focus of their research is on fine-grained features of everyday talk and interaction to explore how people actively construct particular interpretations of events. They might seek to show how, for example, racist or sexist attitudes arise not because of the beliefs or biases of the individual who expresses them, but rather as evaluations that emerge in the context of particular social interactions. Rather than being relatively fixed products of individual cognitive systems, such evaluations arise in the context of conversations and vary according to the particular cultural setting.

----------
discourse analysis a family of methods for analysing talk and texts, with the goal of revealing how people make sense of their everyday worlds.
----------

An example of the use of discourse analysis is the study reported by Wetherell, Stiven, and Potter (1987). These researchers were interested in male and female university students’ views about employment opportunities for women. They reasoned that analysing how a group of 17 students talk about these issues would reveal the practical ideologies that are used to reproduce gender inequalities. The students were interviewed in a semi-structured way and their responses were transcribed and analysed. A benefit of this approach is that it enabled the researchers to identify contradictions in the way ordinary people talk about issues like gender inequality. Rather than having a single attitude, the students tended to endorse different positions at different points during the interview. Some of these positions were inconsistent with each other, but served specific ideological and strategic purposes at the particular point at which they were adopted.
This sort of qualitative approach is not represented in the present volume, where the emphasis is on the strengths of approaches that assume the objective existence of the phenomena we study. The role played by qualitative research methods in social psychology largely reflects differences in philosophical belief about the causation of social behaviour. From the standpoint of the research represented in the present volume, qualitative research seems to be more focused on description than explanation, and more concerned with how behaviour is constructed than with how it is caused.

One of the most important decisions to be made during the research process is which of the strategies outlined above to adopt. It is worth pointing out that although some research strategies will be better suited than others to studying a given phenomenon, each and every strategy, however sophisticated its implementation, has its limitations. It is for this reason that one of the great pioneers of research methodology in the social sciences, Donald Campbell (see Leader in the Field, Donald T. Campbell), argued for triangulation. By this he meant that using multiple methods to study a given issue provides a better basis for drawing conclusions than does any single method. Because each method has its own strengths and weaknesses, using different methods allows the strengths of one method to compensate for the weaknesses of another (e.g. Fine & Elsbach, 2000).

----------
triangulation the use of multiple methods and measures to research a given issue.
----------

LEADER IN THE FIELD

Donald T. Campbell (1917–1996) is regarded as having been a master research methodologist. Campbell completed his undergraduate education at the University of California, Berkeley. After serving in the US Naval Reserve during World War II, he earned his doctorate from Berkeley and subsequently served on the faculties at Ohio State University, the University of Chicago, Northwestern, and Lehigh.
He made lasting contributions in a wide range of disciplines, including psychology, sociology, anthropology, biology and philosophy. In social psychology he is best known for co-authoring two of the most influential research methodology texts ever published, Experimental and Quasi-Experimental Designs for Research (1966, with Julian C. Stanley) and Quasi-Experimentation: Design and Analysis Issues for Field Settings (1979, with Thomas D. Cook). Campbell argued that the sophisticated use of many approaches, each with its own distinct but measurable flaws, was required to design reliable research projects. The paper he wrote with Donald W. Fiske to present this thesis, ‘Convergent and discriminant validation by the multitrait–multimethod matrix’ (1959), is one of the most frequently cited papers in the social science literature.

Summary

Research strategies are broad categories of research methods that are available to study social psychological phenomena. We began by noting that it often makes sense to study a phenomenon using more than one strategy. We identified three quantitative strategies (experiments and quasi-experiments, and survey research) before discussing qualitative research strategies.

A CLOSER LOOK AT EXPERIMENTATION IN SOCIAL PSYCHOLOGY

What are the main elements of a social psychological experiment?

Experimentation has been the dominant research method in social psychology, mainly because it is unrivalled as a method for testing theories that predict causal relationships between variables. Standard guides to research in social psychology (e.g. Aronson, Ellsworth, Carlsmith, & Gonzales, 1990; Aronson, Wilson, & Brewer, 1998) treat experimentation as the preferred research method. In fact there are some grounds for questioning the extent to which experimental studies provide unambiguous evidence about causation, as we shall see later. We will first describe the principal features of the experimental approach to social psychological research.
To assist this process of description, we will use Milgram’s (1965; see Chapter 8) well-known study of obedience as an illustrative example.

Features of the social psychological experiment

The experimental scenario is the context in which the study is presented. In laboratory settings it is important to devise a scenario for which there is a convincing and well-integrated rationale, because the situation should strike participants as realistic and involving, and the experimental manipulations and the measurement process should not ‘leap out’ at the participant. In Milgram’s study, participants were told that the study was an investigation of the effects of punishment on learning. The participant was given, apparently at random, the role of ‘teacher’, while an accomplice of the experimenter posing as another participant (known as a confederate) took the role of ‘learner’. The learner’s task was to memorize a list of word pairs. The teacher’s task was to read out the first word of each pair, to see whether the learner could correctly remember the second word, and to administer a graded series of punishments, in the form of electric shocks of increasing severity, if the learner failed to recall the correct word (something the learner had in fact been instructed to do from time to time). This scenario was devised with a view to convincing the participant that the shocks were genuine (which they were not), and that the learner was actually receiving the shocks.

----------
experimental scenario the ‘package’ within which an experiment is presented to participants.
----------

----------
confederate an accomplice or assistant of the experimenter who is ostensibly another participant but who in fact plays a prescribed role in the experiment.
----------

The independent variable is the one that is deliberately manipulated by the experimenter. All other aspects of the scenario are held constant, and the independent variable is changed systematically.
---------- independent variable the variable that an experimenter manipulates or modifies in order to examine the effect on one or more dependent variables. ---------- Operationalization refers to the way in which the variable is measured or manipulated in practice. In Milgram’s research a key independent variable was the proximity of the ‘learner’ to the ‘teacher’. In one condition, learner and teacher were in separate rooms; in a second condition, the teacher could hear the learner but could not see him; in a third condition, the teacher could both see and hear the learner’s reactions; in a fourth condition, the teacher had to hold the learner’s hand down on a metal plate in order for the shock to be delivered. All other aspects of the experimental setting were held constant, so that any variations in the teacher’s behaviour in these four conditions should be attributable to the change in proximity between teacher and learner. ---------- operationalization the way in which a theoretical construct is turned into a measurable dependent variable or a manipulable independent variable in a particular study. ---------- The success of an experiment often hinges on the effectiveness of manipulations of the independent variable. By effectiveness we mean (1) the extent to which changes in the independent variable capture the essential qualities of the construct that is theoretically expected to have a causal influence on behaviour, and (2) the size of the changes that are introduced. For example, in Milgram’s study, we should consider how well the four proximity conditions capture the construct of proximity. What is being manipulated, clearly, is physical proximity. Then there is the question of whether the changes between the four conditions are sufficiently large to produce an effect. 
In this case it is hard to see how the proximity variable could have been manipulated more powerfully; an investigator who adopts weaker manipulations runs the risk of failing to find the predicted effects simply because the variations across levels of the independent variable are too subtle to have an impact. It has become standard practice in social psychological experiments to include among the measured variables one or more measures of the effectiveness of the manipulation; these are known as manipulation checks.

----------
manipulation check a measure of the effectiveness of the independent variable.
----------

Assessing whether an independent variable has had an effect requires the measurement of the participant’s behaviour or internal state. This measured variable is known as the dependent variable, so called because systematic changes in this measured variable depend upon the impact of the independent variable. In Milgram’s study, the dependent variable was the intensity of shocks in a 30-step sequence that the teacher was prepared to deliver. The results of Milgram’s experiments are often expressed in terms of the percentage of participants who gave the maximum shock level (corresponding to 450 volts). The results of the Milgram (1965) study are shown in these terms in Figure 2.8. A key question to ask of any dependent variable is the extent to which it is a good measure of the underlying theoretical construct. In addition to this question of the ‘fit’ between a theoretical construct and the measured or dependent variable, the most important issue involved in designing dependent variables is what type of measure to use. We will discuss this in more detail below.

FIGURE 2.8 Percentage of participants who administered the maximum shock level, and who were therefore deemed to be fully obedient. Source: Based on data reported by Milgram, 1965. Reproduction with permission from SAGE Publications.
---------- dependent variable the variable that is expected to change as a function of changes in the independent variable. Measured changes in the dependent variable are seen as ‘dependent on’ manipulated changes in the independent variable. ---------- Laboratory experiments often involve deception, in the sense that the participant is misled about some aspect of the research. The extent of this deception can range from withholding information about the purpose of the research to misleading participants into thinking that the research is concerned with something other than its real purpose. The main reason for using deception is that participants would act differently if they were aware of the true objective of the study. If Milgram’s participants had known that his was a study of obedience, we can be sure that the rate of disobedience would have been higher: the participants would have wanted to demonstrate their ability to resist orders to harm a fellow human. Attitudes to the use of deception in social psychological research have changed during the past 45 years: misleading participants about the nature of an experiment is now viewed more negatively. The reason for this change is partly moral (i.e. where possible one should avoid deceiving someone else, whether or not in the context of an experiment) and partly practical (if participants are routinely misled about research, they will enter any future research in the expectation that they are going to be misled, which may influence their behaviour). Striking an appropriate balance between being completely honest with participants and wanting to study them free of the influence of their knowledge of the nature of the experiment is difficult. 
Psychological research conducted in universities in Europe, North America and Australasia is typically subject to prior approval by an ethics committee that evaluates and monitors research involving human participants, and national bodies such as the American Psychological Association (APA) and the British Psychological Society (BPS) have published guidelines concerning research using human participants that should be followed by researchers (see Table 2.1).

Table 2.1 Summary of ethical principles governing psychological research. Source: Copyright © 2010 by the American Psychological Association. (2010a). Ethical principles of psychologists and code of conduct (2002, amended June 1, 2010). Reproduced with permission. No further reproduction or distribution is permitted without written permission from the American Psychological Association.

8.01 Institutional approval

When institutional approval is required, psychologists provide accurate information about their research proposals and obtain approval prior to conducting the research. They conduct the research in accordance with the approved research protocol.

8.02 Informed consent to research

(a) When obtaining informed consent, psychologists inform participants about (1) the purpose of the research, expected duration and procedures; (2) their right to decline to participate and to withdraw from the research once participation has begun; (3) the foreseeable consequences of declining or withdrawing; (4) reasonably foreseeable factors that may be expected to influence their willingness to participate such as potential risks, discomfort, or adverse effects; (5) any prospective research benefits; (6) limits of confidentiality; (7) incentives for participation; and (8) whom to contact for questions about the research and research participants’ rights. They provide opportunity for the prospective participants to ask questions and receive answers.
8.03 Informed consent for recording voices and images in research

Psychologists obtain informed consent from research participants prior to recording their voices or images for data collection unless (1) the research consists solely of naturalistic observations in public places, and it is not anticipated that the recording will be used in a manner that could cause personal identification or harm, or (2) the research design includes deception, and consent for the use of the recording is obtained during debriefing.

8.04 Client/patient, student and subordinate research participants

(a) When psychologists conduct research with clients/patients, students, or subordinates as participants, psychologists take steps to protect the prospective participants from adverse consequences of declining or withdrawing from participation.

(b) When research participation is a course requirement or an opportunity for extra credit, the prospective participant is given the choice of equitable alternative activities.

8.05 Dispensing with informed consent for research

Psychologists may dispense with informed consent only (1) where research would not reasonably be assumed to create distress or harm and involves (a) the study of normal educational practices, curricula, or classroom management methods conducted in educational settings; (b) only anonymous questionnaires, naturalistic observations, or archival research for which disclosure of responses would not place participants at risk of criminal or civil liability or damage their financial standing, employability, or reputation, and confidentiality is protected; or (c) the study of factors related to job or organization effectiveness conducted in organizational settings for which there is no risk to participants’ employability, and confidentiality is protected or (2) where otherwise permitted by law or federal or institutional regulations.
8.06 Offering inducements for research participation

(a) Psychologists make reasonable efforts to avoid offering excessive or inappropriate financial or other inducements for research participation when such inducements are likely to coerce participation.

(b) When offering professional services as an inducement for research participation, psychologists clarify the nature of the services, as well as the risks, obligations, and limitations.

8.07 Deception in research

(a) Psychologists do not conduct a study involving deception unless they have determined that the use of deceptive techniques is justified by the study’s significant prospective scientific, educational, or applied value and that effective non-deceptive alternative procedures are not feasible.

(b) Psychologists do not deceive prospective participants about research that is reasonably expected to cause physical pain or severe emotional distress.

(c) Psychologists explain any deception that is an integral feature of the design and conduct of an experiment to participants as early as is feasible, preferably at the conclusion of their participation, but no later than at the conclusion of the data collection, and permit participants to withdraw their data.

8.08 Debriefing

(a) Psychologists provide a prompt opportunity for participants to obtain appropriate information about the nature, results, and conclusions of the research, and they take reasonable steps to correct any misconceptions that participants may have of which the psychologists are aware.

(b) If scientific or humane values justify delaying or withholding this information, psychologists take reasonable measures to reduce the risk of harm.

(c) When psychologists become aware that research procedures have harmed a participant, they take reasonable steps to minimize the harm.

8.10 Reporting research results

(a) Psychologists do not fabricate data.
(b) If psychologists discover significant errors in their published data, they take reasonable steps to correct such errors in a correction, retraction, erratum, or other appropriate publication means.

One way to address the ethical issues that arise from the use of deception is by carefully debriefing participants. This is done at the end of the experimental session and involves informing the participant as fully as possible about the nature and purpose of the experiment, and the reason for any deception. In Milgram’s study, for example, care was taken to assure participants that the ‘shocks’ they had administered were in fact bogus, and that the learner had not been harmed in any way; the reason for the deception was also carefully explained. The debriefing process should leave participants understanding the purpose of the research, satisfied with their role in the experiment, and with as much self-respect as they had before participating in the study.

----------
debriefing the practice of explaining to participants the purpose of the experiment in which they have just participated and answering any questions the participant may have.
----------

Experimental designs

When and why is it important to have a control condition in an experiment?

As we have seen, it is important that participants are allocated randomly to the different conditions of an experiment. Failure to achieve this goal prevents the researcher from concluding that observed differences between conditions in the dependent variable result from changes in the independent variable. We shall now examine more closely the issue of designing experiments in order to rule out alternative inferences as far as possible.

First consider a study that may appear to be an experiment but cannot properly be described as experimental. This is the one-shot case study (Cook & Campbell, 1979). To take a concrete example, imagine that a researcher wanted to know the effect of a new teaching method on learning.
The researcher takes a class of students, introduces the new method, and measures the students’ comprehension of the taught material. What conclusions can be drawn from such a design? Strictly speaking, none, for there is nothing with which the students’ comprehension can be compared, so the researcher cannot infer whether the observed comprehension is good, poor or indifferent.

----------
one-shot case study a research design in which observations are made on a group after some event has occurred or some manipulation has been introduced.
----------

A simple extension of the one-shot design provides the minimal requirements for a true experimental study and is known as the post-test only control group design. Here there are two conditions. In the experimental condition participants are exposed to the manipulation (participants in this condition are known as the experimental group), and possible effects of the manipulation are measured. In the control condition there is no manipulation (here the participants are known as the control group), but these participants are also assessed on the same dependent variable and at the same time point as the experimental group. Now the observation made in the experimental condition can be compared with something: the observation made in the control condition. So the researcher might compare one group of students who have been exposed to the new teaching method with another group who continued to receive the normal method, with respect to their comprehension of the course material. An important point is that participants are randomly allocated to the two conditions, ruling out the possibility that differences between the conditions are due to differences between the two groups of participants that were present before the new teaching method was implemented. So if the measure of students’ comprehension differs markedly between the two conditions, it is reasonable to infer that the new teaching method caused this difference.
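The logic of the post-test only control group design can be sketched in a few lines of Python. This is a hypothetical illustration with simulated comprehension scores, not data from any real study; the key step it demonstrates is random allocation, which ensures that pre-existing differences between students cannot explain a difference between the group means.

```python
import random
import statistics

# Hypothetical pool of 20 students, identified by number
students = list(range(20))

# Random allocation: shuffle the pool, then split it in half
random.shuffle(students)
experimental_group = students[:10]  # taught with the new method
control_group = students[10:]       # taught with the normal method

# Post-test only: comprehension is measured once, after the manipulation.
# Scores here are simulated; the new method is given a hypothetical boost.
def comprehension_score(student, new_method):
    base = random.gauss(60, 10)             # individual differences
    return base + (8 if new_method else 0)  # assumed effect of new method

exp_scores = [comprehension_score(s, new_method=True) for s in experimental_group]
ctrl_scores = [comprehension_score(s, new_method=False) for s in control_group]

# The comparison that the one-shot case study cannot make:
print(statistics.mean(exp_scores) - statistics.mean(ctrl_scores))
```

In a one-shot case study only `exp_scores` would exist, leaving nothing to compare the observed comprehension against.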
----------
post-test only control group design an experimental design in which participants are randomly allocated to one of two groups; one group is exposed to the independent variable, another (the control group) is not.
----------

----------
experimental group a group of participants allocated to the ‘experimental’ condition of the experiment.
----------

----------
control group a group of participants who are typically not exposed to the independent variable(s) used in experimental research.
----------

There are several other more sophisticated and complex designs, each representing a more complete attempt to rule out the possibility that observed differences between conditions result from something other than the manipulation of the independent variable (see Cook & Campbell, 1979). A very common design in social psychological experiments is the factorial experiment, in which two or more independent variables are manipulated within the same study. The simplest case that can be represented is that in which there are two independent variables, each with two levels. Combining these, you have the design shown in Figure 2.9. A factorial design contains all possible combinations of the independent variables. In the design shown in Figure 2.9, each independent variable has two levels, resulting in four conditions. The main benefit of a factorial design is that it allows the researcher to examine the separate and combined effects of two or more independent variables. The separate effects of each independent variable are known as main effects. If the combined effect of two independent variables differs from the sum of their two main effects, this is known as an interaction effect.

FIGURE 2.9 Factorial experimental design involving two factors, each with two levels.

----------
factorial experiment an experiment in which two or more independent variables are manipulated within the same design.
---------- ---------- main effect a term used to refer to the separate effects of each independent variable in a factorial experiment. ---------- ---------- interaction effect a term used when the combination of two (or more) independent variables in a factorial experiment yields an effect that differs from the sum of the main effects. ---------- To illustrate an interaction effect, let us consider Petty, Cacioppo, and Goldman’s (1981) study of the effects of persuasive communications on attitude change. To test Petty and Cacioppo’s (1986a) ‘elaboration likelihood model’, a theory of persuasion (see Chapter 7), these researchers manipulated two variables. The first was argument quality, i.e. whether the persuasive communication the participants read consisted of strong or weak arguments in favour of making the university examination system tougher. The second variable was involvement, i.e. whether the participants, who were students, thought that the university would introduce the tougher exam system next year, such that it would affect them personally (high involvement), or in the next decade, such that it would not affect them personally (low involvement). According to the elaboration likelihood model, argument quality should have a stronger impact on attitudes when participants are involved with the message topic than when they are not. Figure 2.10 shows some of the key findings from the study by Petty et al. (1981). It can be seen that the effect of argument quality on attitudes was indeed much greater when involvement was high than when it was low. Because the predicted effect is an interaction, testing this prediction requires a factorial design. FIGURE 2.10 Interaction between argument quality and involvement, showing that argument quality had a much stronger effect on attitudes when involvement was high. Source: Based on data reported by Petty, Cacioppo & Goldman, 1981. 
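The arithmetic of main effects and an interaction in a 2 × 2 design can be sketched in a few lines of Python. The cell means below are invented for illustration (they mimic the pattern, not the values, of the Petty et al. findings):

```python
# Hypothetical cell means for a 2 x 2 factorial design: attitude scores by
# argument quality (strong/weak) and involvement (high/low). These values
# are invented for illustration; they are not the Petty et al. (1981) data.
cell_means = {
    ("strong", "high"): 0.6,
    ("weak", "high"): -0.5,
    ("strong", "low"): 0.2,
    ("weak", "low"): 0.1,
}

def effects_2x2(m):
    """Main effects and interaction from the cell means of a 2 x 2 design."""
    # Main effect of argument quality: strong vs weak, averaged over involvement.
    main_quality = ((m[("strong", "high")] + m[("strong", "low")]) / 2
                    - (m[("weak", "high")] + m[("weak", "low")]) / 2)
    # Main effect of involvement: high vs low, averaged over argument quality.
    main_involvement = ((m[("strong", "high")] + m[("weak", "high")]) / 2
                        - (m[("strong", "low")] + m[("weak", "low")]) / 2)
    # Interaction: does the quality effect differ across involvement levels?
    quality_effect_high = m[("strong", "high")] - m[("weak", "high")]
    quality_effect_low = m[("strong", "low")] - m[("weak", "low")]
    interaction = quality_effect_high - quality_effect_low
    return main_quality, main_involvement, interaction
```

With these invented means, argument quality shifts attitudes by 1.1 points under high involvement but only 0.1 points under low involvement, so the interaction (1.0) dwarfs the main effect of involvement. This mirrors the pattern in Figure 2.10: the effect of one factor depends on the level of the other.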
An interaction effect is an instance of what is known more generally as moderation, in which the effect of one variable on another varies in strength depending on the level of a third variable. In other words, the role of the third variable is in determining when (under what conditions) the independent variable influences the dependent variable. In Figure 2.11, for example, the effect of variable X on variable O is moderated by variable Z. Specifically, X has an effect on O, but only when Z = 2 (Figure 2.11b). When Z = 1, X does not affect O (Figure 2.11a). It is important to distinguish moderation from another type of statistical relationship between three or more variables: namely, mediation. FIGURE 2.11(a) and (b) Diagram to illustrate the moderating influence of variable Z on the relationship between variables X and O. X only has an effect on O when Z has the value 2 (b). When Z has the value 1 (a), there is no effect of X on O. To understand the concept of mediation, consider that the phenomena of interest to social psychologists often entail chains of events. If we strip this issue down to its bare essentials, we can ask whether variable X influences variable O directly, or whether the relation between X and O is mediated by another variable, Z. In other words, X may have an effect on O, but this is not a direct effect. Instead, X affects Z – the mediating variable – which in turn affects O (so, X affects O via Z). Researchers are therefore not only concerned with if or when X affects O: they are also concerned with how (through what process) X affects O. In modern social psychological research, researchers therefore often attempt to measure mediating variables and then to conduct mediational analysis, for which there are well-established procedures (see Judd & Kenny, 1981a, b; Kenny, Kashy, & Bolger, 1998; Preacher & Hayes, 2004). 
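The statistical signature of mediation can be shown with a minimal simulation (illustrative Python using plain least-squares regression; the coefficients and sample size are invented, and this is a bare sketch of the logic rather than a full mediational analysis of the kind described in the sources cited above). X affects the mediator Z, and Z in turn affects O, so the simple X–O slope is substantial but shrinks towards zero once Z is controlled:

```python
import random

def slope(y, x):
    """Simple least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def ols_two(y, x1, x2):
    """Slopes of y on x1 and x2 jointly, via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Full mediation: X affects Z, and Z (not X directly) affects O.
rng = random.Random(1)
X = [rng.gauss(0, 1) for _ in range(5000)]
Z = [0.8 * x + rng.gauss(0, 1) for x in X]
O = [0.7 * z + rng.gauss(0, 1) for z in Z]

total_effect = slope(O, X)           # close to 0.8 * 0.7 = 0.56
direct_effect, _ = ols_two(O, X, Z)  # close to zero once Z is controlled
```

The total effect of X on O is roughly the product of the two links in the chain (0.8 × 0.7), while the direct effect, with Z statistically taken into account, is close to zero. This is exactly the pattern depicted in Figure 2.12(b).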
By conducting an experiment we may establish that there is a causal relation between X and O; but we also measure Z, and find that the relation between X and Z is also very high, as is the relation between Z and O. We can then examine whether, once the X–Z and Z–O relationships are statistically taken into account, the originally established relationship between X and O becomes smaller, or disappears. This is the type of situation in which one can infer that the relationship between X and O is mediated by Z (Baron & Kenny, 1986; see Leader in the Field, David A. Kenny). This type of relationship is illustrated in Figure 2.12. In part (a), X has a direct, positive effect on O. In part (b), when the relationships between X and Z, and between Z and O, are taken into account, the direct effect of X on O becomes non-significant. Instead, the indirect effect of X on O via Z is positive and significant. FIGURE 2.12(a) and (b) Diagram to illustrate that the effect of variable X on variable O is mediated by variable Z. When Z is not taken into account, X has a significant effect on O (a). Here X has a direct effect on O. When Z is taken into account (b), the effect of X on O is non-significant. Here X has an indirect effect on O, via Z. ---------- mediating variable a variable that mediates the relation between two other variables. ---------- LEADER IN THE FIELD David A. Kenny (b. 1946) took his undergraduate degree at the University of California, Davis, followed by both his MA and PhD at Northwestern University, where his adviser was the methodological pioneer, Donald T. Campbell (see previous Leader in the Field). He then taught at Harvard University, before moving to the University of Connecticut, where he is now Board of Trustees Distinguished Professor. He has published many influential methodological papers and books, and is interested in the study of naturalistic social behaviour and models of such behaviour. 
He has specialized in the analysis of dyadic processes (where each person’s behaviour affects the other person’s, with the result that they have to be treated as interdependent, rather than independent, pieces of data). His paper on moderation and mediation (co-authored with Reuben Baron; Baron & Kenny, 1986) is one of the most-cited articles in the field. He was honoured with the Donald T. Campbell Award by the Society of Personality and Social Psychology in 2006, and is a Fellow of the American Academy of Arts and Sciences. Threats to validity in experimental research What is the difference between internal and external validity? In a research context, validity refers to the extent to which one is justified in drawing inferences from one’s findings. Experimental research attempts to maximize each of three types of validity: internal validity, construct validity and external validity. ---------- validity a measure is valid to the extent that it measures precisely what it is supposed to measure. ---------- Internal validity refers to the validity of the conclusion that an observed relationship between independent and dependent variables reflects a causal relationship, and is promoted by the use of a sound experimental design. We have already seen that the use of a control group greatly enhances internal validity, but even if one uses a control group there remain many potential threats to internal validity (Brewer, 2000; Cook & Campbell, 1979). Chief among these is the possibility that the groups being compared differ with respect to more than the independent variable of interest. ---------- internal validity refers to the validity of the inference that changes in the independent variable result in changes in the dependent variable.
---------- For example, let’s assume that in Milgram’s obedience research a different experimenter had been used for each of the four conditions described earlier, such that experimenter 1 ran all participants in one condition, experimenter 2 ran all participants in another condition, and so on. It might seem efficient to divide the work among different experimenters, but to do so in this way poses a major threat to the internal validity of the experiment. This is because the four conditions would no longer differ solely in terms of the proximity of the ‘victim’; they would also have different experimenters running them. Thus the differing amounts of obedience observed in the four conditions might reflect the impact of the proximity variable, or the influence of the different experimenters (or, indeed, some combination of these two factors). The problem is that there would be an experimental confound between the physical proximity variable and a second variable, namely experimenter identity. It is impossible to disentangle the effects of confounded variables. ---------- construct validity the validity of the assumption that independent and dependent variables adequately capture the abstract variables (constructs) they are supposed to represent. ---------- Even when we are confident that the relationship between X and O is a causal one, in the sense that internal validity is high, we need to consider carefully the nature of the constructs involved in this relationship. Construct validity refers to the validity of the assumption that independent or dependent variables adequately capture the variables (or ‘constructs’) they are supposed to represent. Even if the researcher has reason to feel satisfied with the construct validity of the independent variable, there remains the question of whether the dependent variables actually assess what they were intended to assess. 
There are three main types of threat to the construct validity of dependent variables in social psychological experimentation: social desirability, demand characteristics and experimenter expectancy. ---------- experimental confound when an independent variable incorporates two or more potentially separable components it is a confounded variable. When an independent variable is confounded, the researcher’s ability to draw causal inferences is seriously compromised. ---------- Social desirability refers to the fact that participants are usually keen to be seen in a positive light, and may therefore be reluctant to provide honest reports of anything which they think would be regarded negatively. Equally, participants may ‘censor’ some of their behaviours so as to avoid being evaluated negatively. To the extent that a researcher’s measures are affected by social desirability, they fail to capture the theoretical construct of interest. An obvious way to reduce social desirability effects is to make the measurement process unobtrusive: if participants do not know what it is that is being measured, they will be unable to modify their behaviour. ---------- social desirability refers to the fact that research participants are likely to want to be seen in a positive light and may therefore adjust their responses or behaviour in order to avoid being negatively evaluated. ---------- An alternative strategy is to measure individual differences in the tendency to make socially desirable responses, and then to control for this statistically. Paulhus (1984, 1991) has developed a measure known as the Balanced Inventory of Desirable Responding (BIDR). This is a 40-item self-report questionnaire designed to measure the tendency to give socially acceptable or desirable responses. It consists of two sub-scales, one measuring self-deceptive enhancement and the other measuring impression management. Examples of items from each sub-scale are shown in Individual Differences 2.1.
If scores on a self-report measure (like attitudes to an ethnic minority group) are correlated with BIDR scores, this suggests that self-reports are being biased in a socially desirable direction. We should note, however, that it is better to try to eliminate completely the tendency to make socially desirable responses (rather than measuring it, and controlling for it statistically), simply because we do not know precisely what scales such as the BIDR measure. Demand characteristics (see Chapter 1) are cues that unintentionally convey the experimenter’s hypothesis to the participant. Individuals who know that they are being studied will often have hunches about what the experimenter is expecting to find. They may then attempt to provide the expected responses. When behaviour is enacted with the intention of fulfilling the experimenter’s hypotheses, it is said to be a response to the demand characteristics of the experiment. Orne (1962, 1969) suggested ways of pinpointing the role demand characteristics play in any given experimental situation. For example, he advocated the use of post-experimental enquiry, in the form of an interview, preferably conducted by someone other than the experimenter, the object being to elicit from participants what they believed to be the aim of the experiment and the extent to which this affected their behaviour. Clearly, researchers should do all they can to minimize the operation of demand characteristics, for example by using unobtrusive measures, that is, measures that are so subtle that participants are unaware of the fact that they are being taken, or by telling participants that the purpose of the experiment cannot be revealed until the end of the study and that in the meantime it is important that they do not attempt to guess the hypothesis. 
A cover story that leads participants to believe that the purpose of the study is something other than the real purpose is a widely used means of lessening the impact of demand characteristics. ---------- post-experimental enquiry a technique advocated by Orne (1962) for detecting the operation of demand characteristics. The participant is carefully interviewed after participation in an experiment, the object being to assess perceptions of the purpose of the experiment. ---------- ---------- unobtrusive measures (also called non-reactive measures) measures that the participant is not aware of, and which therefore cannot influence his or her behaviour. ---------- ---------- cover story a false but supposedly plausible explanation of the purpose of an experiment; the intention is to limit the operation of demand characteristics. ---------- INDIVIDUAL DIFFERENCES 2.1 MEASURING THE TENDENCY TO RESPOND IN SOCIALLY DESIRABLE WAYS Paulhus (1984, 1991) has developed a measure known as the Balanced Inventory of Desirable Responding (BIDR). On the basis of detailed pre-testing, and prediction of actual differences in behaviour, Paulhus was able to create two sub-scales, one measuring self-deceptive enhancement and the other measuring impression management. Self-enhancement refers to the tendency to think of oneself in a favourable light, whereas impression management refers to a deliberate attempt to distort one’s responses in order to create a favourable impression with others. Below are example items from each sub-scale. Self-enhancement My first impressions of people usually turn out to be right I always know why I like things I never regret my decisions I am a completely rational person Respondents who agree with such items are deemed to have positively biased images of themselves (on the grounds that it is unlikely that these statements are true for most people).
Impression management I never swear I never conceal my mistakes I never drop litter in the street I never read sexy books or magazines Respondents who agree with such items are deemed to be motivated to present themselves in positive ways (again, it is unlikely that these statements are true for most people). Scores on these scales can be used to detect respondents who may be attempting to present themselves in a favourable light. If their scores on self-enhancement or impression management scales are correlated with their responses to other measures, it seems likely that the latter responses have been affected by social desirability, and these effects can be partialled out using statistical techniques. Experimenter expectancy refers to the experimenter’s own hypothesis or expectations about the outcome of the research. This expectancy can unintentionally influence the experimenter’s behaviour towards participants in a way that increases the likelihood that they will confirm the experimenter’s hypothesis. Rosenthal (1966) called this type of influence the experimenter expectancy effect (see Chapter 1). The processes mediating experimenter expectancy effects are complex, but non-verbal communication is centrally involved. An obvious way of reducing these effects is to keep experimenters ‘blind’ to the hypothesis under test, or at least blind to the condition to which a given participant has been allocated; other possibilities include minimizing the interaction between experimenter and participant, and automating the experiment as far as possible. Indeed, in much current social psychological research, the entire experiment, including all instructions to the participants, is presented via a computer. This obviously limits the opportunity for experimenters to communicate their expectancies (either verbally or non-verbally). 
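The ‘partialling out’ of socially desirable responding mentioned in Individual Differences 2.1 can be sketched as a partial correlation. In the illustrative Python below (all variable names and simulated data are invented for the example), two self-report measures share nothing but a desirability bias, so their raw correlation is inflated; removing the shared BIDR-like score deflates it towards zero:

```python
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sa = sum((v - ma) ** 2 for v in a) ** 0.5
    sb = sum((v - mb) ** 2 for v in b) ** 0.5
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (sa * sb)

def partial_corr(x, y, control):
    """Correlation of x and y after partialling out the control variable."""
    rxy, rxc, ryc = corr(x, y), corr(x, control), corr(y, control)
    return (rxy - rxc * ryc) / (((1 - rxc ** 2) * (1 - ryc ** 2)) ** 0.5)

# Two self-report measures whose only shared component is a desirability bias.
rng = random.Random(7)
desirability = [rng.gauss(0, 1) for _ in range(4000)]
measure_a = [d + rng.gauss(0, 1) for d in desirability]
measure_b = [d + rng.gauss(0, 1) for d in desirability]

raw = corr(measure_a, measure_b)                             # inflated by the bias
adjusted = partial_corr(measure_a, measure_b, desirability)  # near zero
```

Here the raw correlation is around .5 even though the two measures have no genuine relationship; once the desirability score is partialled out, the adjusted correlation collapses. This is also why, as noted above, eliminating the bias at source is preferable to correcting for it afterwards: the correction is only as good as the desirability measure itself.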
Even if the experimenter manages to avoid all these threats to internal and construct validity, an important question remains: to what extent can the causal relationship between X and O be generalized beyond the circumstances of the experiment? External validity refers to the generalizability of a finding beyond the circumstances in which it was observed by the researcher. One important feature of the experimental circumstances, of course, is the type of person who participates in the experiment. In many cases participants volunteer their participation, and to establish external validity it is important to consider whether results obtained using volunteers can be generalized to other populations. There is a good deal of research on differences between volunteers and non-volunteers in psychological studies (see Rosenthal & Rosnow, 1975). The general conclusion is that there are systematic personality differences between volunteers and non-volunteers. Such findings are explained in terms of volunteers’ supposedly greater sensitivity to and willingness to comply with demand characteristics. The external validity of studies based only on volunteers’ behaviour is therefore open to question, and the solution to this problem is to use a ‘captive’ population, preferably in a field setting. One factor that limits the influence of this problem is the use of ‘participant pools’ in many psychology departments at large universities. Typically, these pools consist of first- and (sometimes) second-year undergraduate students who have to accumulate a set number of participant credits as part of their course. Thus, participants in studies that recruit from these pools are not, strictly speaking, volunteers. ---------- external validity refers to the generalizability of research findings to settings and populations other than those involved in the research. 
---------- Another criticism of social (and indeed other) psychological experiments is that the participants are often university students. Sears (1986) examined research articles published in major social psychology journals in 1985 and found that 74 per cent were conducted with student participants. Although students are certainly unrepresentative of the general population, being younger, more intelligent and more highly educated than the average citizen, this in itself is not a threat to the validity of the research. This is because the goal of much social psychological research is to understand the process(es) underlying a phenomenon (such as attitude change or stereotyping), rather than to describe the general population (a goal for which survey research is much better suited). In any case, there is often little reason to suppose that the processes underlying a phenomenon such as attitude change or stereotyping differ in some fundamental way between students and non-students. Social psychological experiments on the Internet What are the advantages and disadvantages of web-based experiments? A relatively new development in psychological research is the use of the Internet to recruit participants and conduct experiments (Internet experiments). People are invited to participate in the research by visiting a website where the server runs the whole study, from allocating participants to an experimental condition to debriefing them about the nature and purpose of the study once they have completed the experimental task. Birnbaum (2000) noted that the number of experiments listed on sites such as the one maintained by the American Psychological Society has grown very rapidly, by around 100 per cent per year, and that many of these studies are social psychological.
---------- Internet experiment an experiment that is accessed via the Internet; participants access the experiment via the web, receive instructions and questions on their computer screen and provide responses via their keyboard or touch screen. ---------- What are the primary advantages and disadvantages of such web-based experiments? A major advantage is the ease with which quite large amounts of data can be collected in a relatively short time. Other advantages are that participants are recruited from different countries, from different age groups and – now that access to the Internet is more widespread – from different socioeconomic backgrounds. An obvious disadvantage is that the researcher loses a degree of control over the running of the experiment. Participants complete the study in different physical settings, at different times of the day and night, and probably with differing levels of motivation and seriousness. There are three further issues that arise. The first concerns the representativeness of those who choose to participate in an Internet study (they tend to be white, to live in the US or Europe, and to be relatively young – but not as young as those who take part in laboratory experiments). The second issue concerns the effect of linguistic competence on the reliability and validity of responses (most studies posted on the web are in English, and although the majority of respondents tend to be from the US or other English-speaking countries, some are not). The third issue is that those who choose to participate in such studies are of course volunteers, raising the possibility that they differ systematically from the general population with respect to certain personality attributes. However, it is worth noting that Gosling, Vazire, Srivastava, and John (2004) examined several preconceptions about web-based studies and concluded that most of them were myths, without empirical foundation (see Table 2.2).
Table 2.2 How well do some common criticisms levelled at web-based studies stand up to empirical scrutiny? Source: Adapted from Table 1 of Gosling et al. (2004). American Psychologist, 59, 93–104. Adapted with permission of APA.

Preconceptions about Internet methods

1. Preconception: Internet samples are not demographically diverse. Finding: Mixed. Internet samples are more diverse than traditional samples in many domains (e.g. gender), but they are not completely representative of the population.

2. Preconception: Internet samples are maladjusted, socially isolated, or depressed. Finding: Myth. Internet users do not differ from non-users on markers of adjustment and depression.

3. Preconception: Internet participants are unmotivated. Finding: Myth. Internet methods provide ways of motivating participants (e.g. by providing feedback).

4. Preconception: Internet data are compromised by the anonymity of respondents. Finding: Fact. However, Internet researchers can take steps to eliminate repeat responders.

5. Preconception: Internet-based findings differ from those obtained using other methods. Finding: Myth? Evidence so far suggests that Internet-based findings are consistent with those obtained using traditional methods, but more data are needed.

Gosling and colleagues (2004) compared a large Internet sample (N = 361,703) with a set of 510 published traditional samples and were led to the conclusions summarized above. Despite the potential problems associated with running experiments on the web, the evidence suggests that Internet studies yield results that parallel those of conventional experiments (see Gosling et al., 2004). It is clear that this way of conducting experiments is going to continue to expand very rapidly. Before embarking on such research it is important to consult sources such as Nosek, Banaji, and Greenwald (2002) and Reips (2002), who offer advice about how to avoid the potential pitfalls. Problems with experimentation What are the main criticisms that have been levelled at the use of experiments in social psychology?
The experimental method is central to social psychology’s status as an empirical science, largely because it provides the ‘royal road’ to causal inference (Aronson et al., 1998). Nevertheless, there are numerous critiques of experimentation in social psychology, some of which relate to the knowledge that experiments provide, and some that question the assumptions that underlie experimentation. These assumptions – shared with other disciplines that employ the scientific method – include that the subject or object under study exists independently of the researcher, and that the researcher occupies an ‘objective’ and ‘neutral’ stance when it comes to theory and analysis of the subject. Indeed, some radical critiques of experimentation – such as those that inform discourse analysis, as discussed above – reject these assumptions entirely, questioning the extent to which there is an objective, knowable reality that can be studied scientifically (e.g. Potter & Wetherell, 1987). Other critiques take a less radical stance, but nevertheless question the role of the researcher in experimental settings, and emphasize how the broader social context can shape the meaning and value of experimental findings. One problem concerns what Gergen (1978) has called the cultural embeddedness of social events, by which he means that a laboratory experimental demonstration that independent variable X has an impact on dependent variable O needs to be qualified by adding that the wider social circumstances in which X was manipulated may have played a role in producing the effects on O. Smith, Bond, and Kağıtçıbaşı (2006) review many social psychological experiments, including the Milgram obedience experiment, that have been conducted in different countries. It is not unusual for these experiments to produce different findings in different cultural settings (see Chapter 15).
More generally, cross-cultural research often highlights how supposedly universal psychological tendencies are actually linked to cultural norms and values. For example, Miller (1984) studied the tendency to attribute other people’s behaviour to dispositional factors rather than contextual factors – known as the fundamental attribution error – among Americans and Indian Hindus. She found that while American adults did tend to explain others’ behaviour in terms of dispositional factors rather than situational or contextual factors, Indian adults in contrast offered contextual explanations more frequently than dispositional explanations. These differences were attributed to cultural differences in how the self is conceptualized in everyday life. A second issue is that although the ostensible goal of social psychological experimentation is the accumulation of scientific knowledge, in the form of laws or principles of social behaviour that are valid across time, there is some reason to doubt whether experimentation (or, indeed, any other method) is capable of generating evidence that could be the basis of such laws. To understand why this is the case in social sciences but not in natural sciences, bear in mind that the relationship between researcher and the object of study is radically different in these two types of science. In contrast to the natural sciences, the objects of investigation in social sciences are people, who of course attribute meaning and significance to their own actions. Social psychology cannot therefore be neatly separated from what it studies. Laypeople are able to acquire social psychological knowledge and use it to modify their actions in a way that atoms, elements and particles cannot.
One implication of this fact is that even well-supported social psychological theories should not be regarded as embodying ‘laws’ that hold good across time: if learning about a theory leads people to modify the behaviour that the theory tries to explain, the theory has limited temporal validity. It is also worth noting, however, that some of the problems of accumulation of knowledge in social psychology can be addressed through the use of meta-analysis. Meta-analysis is a technique for statistically integrating the results of independent studies of the same phenomenon in order to establish whether findings are reliable across a number of independent investigations (see Cooper, 1989; Hedges & Olkin, 1985; Johnson & Eagly, 2000). The increasing use of meta-analysis in social psychology (where relevant, one is cited in every chapter of this book) has shown, without doubt, that many social psychological claims have, in fact, been confirmed over multiple experiments, often conducted over many decades. ---------- meta-analysis a set of techniques for statistically integrating the results of independent studies of a given phenomenon, with a view to establishing whether the findings exhibit a pattern of relationships. ----------
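The core of one common meta-analytic approach, the fixed-effect (inverse-variance) method, is a simple weighted average of the study effect sizes. The sketch below is illustrative Python with invented numbers, not a reanalysis of any study cited in this chapter:

```python
def fixed_effect_meta(effects, standard_errors):
    """Pool independent study effect sizes by inverse-variance weighting.

    Precise studies (small standard errors) receive large weights, and the
    pooled estimate's standard error shrinks as studies accumulate, which
    is how meta-analysis builds confidence across independent replications.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Three hypothetical studies of the same phenomenon.
pooled, pooled_se = fixed_effect_meta([0.5, 0.3, 0.4], [0.1, 0.2, 0.1])
```

With these invented inputs the pooled effect is about 0.43, and its standard error (about 0.07) is smaller than that of any single study, illustrating how integrating independent investigations yields a more reliable estimate than any one of them alone.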