PhD Executive Methods Lecture Notes PDF
Bryan D. Edwards
Summary
These lecture notes cover research methods, including topics on science, variables and measurement, research validity, experimental designs, and more. The notes are intended for a PhD-level executive program.
Full Transcript
Research Methods Lecture Notes
Bryan D. Edwards

TABLE OF CONTENTS
Topic 1: Science
Topic 2: Variables and Measurement
Topic 3: Research Validity
Topic 4: Experimental Research Designs
Topic 5: Experimental Control
Topic 6: Quasi-Experimental Designs
Topic 7: Observational Designs
Topic 8: Correlational Designs
Topic 9: Survey Research
Topic 10: Longitudinal and Cross-Sectional Designs
Topic 11: Meta-Analysis
Topic 12: Ethics in Research

Topic #1 SCIENCE

Methods of Acquiring Knowledge
1. Science: merely one of several ways or methods of acquiring knowledge about behavior.
2. Other (nonscientific) methods, some of which are discussed in the text.
   A. Intuition: spontaneous perception or judgment not based on mental steps (e.g., psychics; women's intuition; bad vibes).
   B. Common sense: a type of intuition; practical intelligence shared by a large group of persons (e.g., "Spare the rod, spoil the child"; "Old dogs can't learn new tricks", although the elderly can and do learn).
      - The problem is that common sense changes from time to time and place to place according to the whims of the culture.
      - It does not explain why something works (it is pragmatic, not theoretical) and cannot generate new knowledge.
   C. Mysticism: belief in insight gained by means of a private experience such as an altered state of consciousness (e.g., hallucinogens).
   D.
      Authority: acceptance of information because it is acquired from a highly respected source (e.g., a physician's recommendation of aspirin; government; religious authorities; and many so-called experts).
      - Science is grounded in intellectual freedom from the dogmas of authority; however, if you did not have faith in authority you would not be in this class listening to me.
- These methods all have limitations that make them inappropriate or unsuitable. For example, intuition is extremely subjective.

Working Assumptions of Science
Science is based on the following set of assumptions:
1. Realism: the philosophy that objects perceived have an existence outside the mind.
2. Rationality: the view that reasoning and logic, and NOT authority, intuition, "gut feelings", or faith, are the basis for solving problems.
3. Regularity: a belief that phenomena exist in recurring patterns that conform with universal laws. The world follows the same laws at all times and in all places.
4. Causality/determinism: the doctrine that all events happen because of preceding causes.
5. Discoverability: the belief that it is possible to learn solutions to questions posed, and that the only limitations are time and resources.

Characteristics of the Scientific Approach
1. Science is empirical: knowledge is obtained through objective observation. Instead of debating with logic, just make observations.
2. Science is objective: observations are made in such a way that any person having normal perception and being in the same place at the same time would arrive at the same observation; it is limited to those phenomena shared by all.
3. Science is self-correcting: it is committed to change based on empirical evidence. The empirical nature of science guarantees that new knowledge will be discovered that contradicts previous knowledge (quite different from law, religion, tradition).
   - e.g., we once thought that the environment caused much of behavior, but we now know that at least some of our behavior is hereditary.
   - e.g., Newton's laws were replaced by the theory of relativity.
4. Science is progressive: compare it to other things that change but are not necessarily better (fashion, music, literature, art).
5. Science is tentative: we never claim to have the whole truth on any question because new information may make current knowledge obsolete at any time.
6. Science is parsimonious: we should use the simplest explanation possible to account for a given phenomenon ("Occam's razor").
7. Science is concerned with theory: science tries to understand why something works, whereas technology is focused on making something work (tech is usually far ahead of science!).

Role of Theories
1. Organizing knowledge and explaining laws
2. Predicting new laws
3. Guiding research

Advantages of the Scientific Method
1. The primary advantage of science is that it is based on objective observation, that is, observation that is independent of opinion or bias.
2. It allows us to establish the superiority of one belief over another.

Processes (Objectives) of Science
1. Description: e.g., leadership types, company strategies, market share.
2. Explanation (development of theories): e.g., goal-setting increases motivation; self-regulation leads to success.
3. Prediction (formulated from theories): the search for causes; this is the "why".
4. Control: ultimately we seek to control our environments; we can use description, explanation, and prediction to solve problems.

Elements of the Scientific Process
1. Control (the single most important element of the scientific process): the ability to remove or account for alternative explanations (or variables) for obtained relationships.
2. Operational definition: defining variables/constructs in such a way that they are measurable; this also serves to eliminate confusion in communication.
   Operational definitions are empirical referents that indicate/denote how a variable is to be measured.
   A construct that cannot be operationally defined cannot be studied. Contrast charismatic leadership and social skills with age and accident involvement. It is more difficult to define charisma specifically (e.g., eye contact, voice inflection, likeability?) than accident involvement (e.g., number of recordable accidents). How would you define and measure charisma in a way that it can be studied? It is not impossible, merely difficult. Other constructs, such as one's faith, are nearly impossible to operationalize.
   - Are BEAUTY and ATTRACTIVENESS the same?
   - Is it possible to reliably measure FIRM PERFORMANCE in a way that everyone would agree on? How about AGE?
3. Replication: the reproduction of the results of a study.

Science also follows a basic research sequence in an attempt to answer questions.

[Figure: RESEARCH SEQUENCE]

Topic #2 VARIABLES AND MEASUREMENT

VARIABLES
Some definitions of variables include the following:
1. A symbol that can assume a range of numerical values.
2. Some property of an organism/event that has been measured.
3. An aspect of a testing condition that can change or take on different characteristics with different conditions.
4. An attribute of a phenomenon.

Types of Variables
1. Independent and Dependent variables
   A. An independent variable is the condition manipulated or selected by the researcher to determine its effect on behavior.
      - The independent variable is the ANTECEDENT variable and has at least 2 forms or levels that define the variation.
   B. A dependent variable is a measure of the behavior of the participant that reflects the effects of the independent variable.
   - A variable is not ALWAYS an independent or dependent variable.
     The definition of a variable as independent or dependent will change (or stay the same) as a function of the particular study.
2. Continuous and Discrete variables
   A. A continuous variable is one that falls along a continuum and is not limited to a certain number of values (e.g., distance or time).
   B. A discrete variable is one that falls into separate categories with no intermediate values possible (e.g., male/female, alive/dead, French/Dutch).
   - A distinction can be drawn between naturally and artificially discrete variables (e.g., the male/female dichotomy of sex is natural, while the young/old dichotomy of age is artificial).
   Commonly used methods to generate artificially discrete variables are (1) mean split, (2) median split, and (3) extreme groups.
   Continuous and discrete distinctions are important because this information usually influences the choice of statistical procedures or tests.
   A. Pearson's correlation: assumes that both variables are continuous.
   B. Point-biserial: most appropriate when one variable is measured in the form of a true dichotomy, and we cannot assume a normal distribution.
   C. Biserial: most appropriate when one variable is measured in the form of an artificial dichotomy, and we can assume a normal distribution.
   D. Phi coefficient (Φ): used when both variables are measured as dichotomies.
3. Quantitative and Qualitative variables
   A. A quantitative variable is one that varies in amount (e.g., reaction time or speed of response).
   B. A qualitative variable is one that varies in kind (e.g., college major or sex).
   The distinction between quantitative and qualitative variables can be rather fine at times (e.g., normal/narcissistic and introversion/extroversion).
   It is usually, but NOT always, true that quantitative variables are continuous, whereas qualitative variables are discrete.

MEASUREMENT
- Definition of measurement: the assignment of numbers to events or objects according to rules that permit important properties of the events or objects to be represented by properties of the number system.
- Measurement is closely associated with the concept of operational definitions.
- To scientifically study anything, we must be able to measure it.
- To do so, we first operationally define the construct, and then use measurement rules to quantify it. This permits us to study the construct. The key is that properties of the events are represented by properties of the number system.

Levels of Measurement
A scale is a measuring device to assess a person's score or status on a variable. The five basic types of scales (levels of measurement) are:
1. Labels
   Numbers are used as a way of keeping track of things, without any suggestion that the numbers can be subjected to mathematical analyses.
   - Examples include Social Security numbers and participant ID numbers.
2. Nominal scale
   Objects or people are grouped without any specified quantitative relationships among the categories.
   - Examples include coding all males as 1 and all females as 2.
3. Ordinal scale
   Most common type = rank order.
   People or objects are ordered from "most" to "least" with respect to an attribute.
   There is no indication of "how much", in an absolute sense, of the attribute any of the objects possesses.
   There is no indication of how far apart the objects are with respect to the attribute.
   Rank ordering is basic to all higher forms of measurement and conveys only meager information.
   - Examples include college football polls and top 50 "Best Places to Work" lists.
4.
   Interval scale
   Measures how much of a variable or attribute is present.
   Rank order of persons/objects is known with respect to an attribute.
   How far apart the persons/objects are from one another with respect to the attribute is known (i.e., the intervals between persons/objects are known).
   Does not provide information about the absolute magnitude of the attribute for any object or person.
   - Examples include how well you like this course, where 1 = do not like at all and 5 = like very much.
5. Ratio scale
   Has the properties of the preceding 3 scales in addition to a true zero point.
   Rank order of persons/objects is known with respect to an attribute.
   How far apart the persons/objects are from one another with respect to the attribute is known (i.e., the intervals between persons/objects are known).
   The distance from a true zero point (or rational zero) is known for at least one of the objects/persons.
   Ratio scales are extremely rare in the behavioral and organizational sciences.
   - Examples include Kelvin temperature, which has a nonarbitrary zero point (0 K = particles have zero kinetic energy), and speed (zero = no motion).

Evaluation of Measurement Methods and Instruments
- The extent to which data obtained from a method fit a mathematical model.
   A. Reliability: consistency over time, place, occasion, etc.
   B. Validity: the extent to which a method measures what it is supposed to measure.

Reliability
- Reliability refers to the consistency of scores obtained by the same person when examined with the same test (or equivalent forms) on different occasions, times, places, etc.
- For a measurement to be of any use in science, it must have both reliability and validity.
- Reliability, like validity, is based on correlations.
- Correlation coefficients (reliability [rxx] and validity [rxy]) can be computed by the formula: [formula not reproduced in this transcript]
- Correlation coefficients measure the degree of relationship or association between two variables.
- A correlation coefficient is a point on a scale ranging from -1.00 to +1.00. The closer this number is to either of these limits, the stronger the relationship between the two variables.

Methods for Assessing the Reliability of a Test
All things being equal, the more items a test has, the more reliable it is going to be.
1. Test-retest reliability: repeated administration of the same test.
2. Alternate-forms reliability: a measure of the extent to which 2 separate forms of the same test are equivalent.
3. Split-half, odd-even (or random split) reliability: the primary issue here is one of obtaining comparable halves.
4. Kuder-Richardson (KR20) reliability and coefficient alpha: these are measures of inter-item consistency (i.e., the consistency of responses to all items on the test).
   - This is an indication of the extent to which each item on the test measures the same thing as every other item.
   - The more homogeneous the domain (test), the higher the inter-item consistency.
   - KR20 is used for right/wrong, true/false items.
   - Cronbach's alpha is used for Likert-type scales.
5. Scorer reliability or inter-rater reliability: the extent to which 2 or more raters are consistent.

Test and Measurement Validity
- The validity of a test concerns WHAT it measures and HOW WELL it does so.
- It tells us what can be inferred from test scores.
- The validity of a test cannot be reported in general terms.
- Validity depends on the USE of the test; no test can be said to have "high" or "low" validity in the abstract.
- Test validity must be established with reference to the particular use for which the test is being considered (i.e., the appropriateness of inferences drawn from data).
- For example, the SAT may be valid for predicting performance in college, but will it validly predict aggressive behavior?
- Validity is a key (maybe the key) criterion in the evaluation of a test or measure.
  The validity of a test or measure is the extent to which inferences drawn from the test scores are appropriate.
  Note: The square root of a test's reliability sets the upper limit of its validity.

"Types" of Test Validity
- Criterion-related
- Content-related
- Construct-related

1. Criterion-related validity: the effectiveness of a test in predicting an individual's behavior in specific situations. That is, the test or measure is intended as an indicator or predictor of some other behavior (that typically will not be observed until some future date).
   With criterion-related procedures, performance on the test, predictor, or measure is checked against a criterion (i.e., a direct and independent measure of that which the test is designed to predict).
   As mentioned earlier, validity is assessed using a correlation coefficient. As such, validity coefficients can range from -1.0 to +1.0. The absolute value is used to compare different validity coefficients in terms of magnitude.
   Types of criterion-related validity
   A. Concurrent
   B. Predictive
   C. Postdictive
   - Differences between these "types" of criterion-related validity have to do with differences in the time frames in which criterion and predictor data are collected.
2. Content-related validity: for some tests and measures, validity depends primarily on the adequacy with which a specified content domain is sampled.
   Content-related validity involves the degree to which a predictor covers a representative sample of the behavior being assessed (e.g., classroom tests). It is established through a systematic examination of test content to determine whether it covers a representative sample of the behavior domain being measured.
   Content-related validity is typically rational and nonempirical, in contrast to criterion-related validity, which is empirical. Content validity is based on expert judgment and is not evaluated with a correlation coefficient.
   The content domain to be tested should be fully described in advance in very specific terms.
3. Construct-related validity: the construct-related validity of a test or measure is the extent to which the test may be said to measure a theoretical construct or trait.
   A construct is a label for a theoretical dimension on which people are thought to differ. A construct represents a hypothesis (usually only half-formed) that a variety of behaviors will correlate with one another in studies of individual differences and/or will be similarly affected by experimental treatments.
   Types of construct-related validity
   A. Convergent validity: different measures of the same construct should be correlated or related to each other.
   B. Divergent or discriminant validity: different measures of different constructs should not be correlated or related to each other.
   - The MULTI-TRAIT/MULTI-METHOD MATRIX is the best method for assessing the construct-related validity of a test or measure.
     - In the example below, the Edwards Workplace Stress measure is the measure being validated.

                      METHOD
   TRAIT              Paper-and-pencil                               Physiological
   Stress             A: Edwards Workplace Stress Scale              B: Cortisol levels, heart rate, blood pressure
   Verbal ability     C: Wesman Personnel Classification Test        D: Physiological indicators of brain neural
                         (verbal subscale)                              information processing efficiency (EEG brain wave
                                                                        patterns) as a measure of verbal ability

   A and B = converge (convergent validity)
   A and C = diverge (divergent/discriminant validity)
   C and D = converge (convergent validity)
   B and D = diverge (divergent/discriminant validity)
   A and D = diverge (divergent/discriminant validity)
4. Face validity: the extent to which a test or measure looks like it measures what it is supposed to; the test-taker is in the best position to evaluate the face validity of a test or measure.
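The reliability and validity coefficients described in this topic can be illustrated with a short computation. The sketch below is not part of the original notes: the item scores and criterion values are invented, and the helper names (`pearson_r`, `cronbach_alpha`) are hypothetical. It assumes the standard Pearson product-moment definition of a correlation and the usual formula for coefficient alpha, then applies the note above that the square root of reliability sets the upper limit of validity.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(items):
    """Coefficient alpha; `items` holds one list of scores per item,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - sum_item_variances / pvariance(totals))


# Invented data: 5 respondents answering a 4-item Likert-type scale (1-5),
# plus an invented criterion score for each respondent.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 1, 3],
    [4, 3, 5, 2, 5],
]
criterion = [7, 5, 9, 3, 8]

totals = [sum(scores) for scores in zip(*items)]
alpha = cronbach_alpha(items)        # internal-consistency estimate of rxx
r_xy = pearson_r(totals, criterion)  # criterion-related validity coefficient
ceiling = sqrt(alpha)                # upper limit of validity implied by rxx

print(f"coefficient alpha (rxx): {alpha:.3f}")
print(f"validity coefficient (rxy): {r_xy:.3f}")
print(f"upper limit of validity, sqrt(rxx): {ceiling:.3f}")
```

With only five invented respondents the numbers are purely illustrative; the point is the order of operations: estimate reliability first, then read any validity coefficient against the ceiling that reliability implies.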

Topic #3 RESEARCH VALIDITY

Validity is defined as the APPROPRIATENESS of INFERENCES drawn from DATA. A key criterion in evaluating any test, measure, or piece of research is validity.
1. Data/observations: not impressions or opinions.
2. Drawing inferences.
3. Appropriateness: implies the purposes for which inferences are drawn.
   - e.g., polygraph: lying behavior or symptoms of anxiety?

The concept of validity is used in two different ways:
1. Test and measurement validity
   A. Criterion-related
   B. Content-related
   C. Construct-related
2. Research or experimental validity
   A. Internal
   B. External
   C. Statistical conclusion
   D. Construct

Research Validity: a conclusion based on research is valid when it corresponds to the actual state of the world. Four types of research validity are commonly recognized: internal validity, external validity, statistical conclusion validity, and construct validity.

1. Internal Validity: the extent to which we can infer that a relationship between two variables is causal or that absence of a relationship implies absence of cause. That is, is the system of research internally consistent? Do the relationships obtained follow from the research design?
   A study has internal validity if a cause-effect relationship actually exists between the independent and dependent variables. The difficulty is determining whether the observed effect is caused only by the IV, since the DV could have been influenced by variables other than the IV.
   A. Extraneous variable (EV): any variable other than the IV that influences the DV.
   B. Confounding: when an extraneous variable systematically varies with variations or levels of the IV.
   Internal validity is driven by the quality of the research design, which is defined by control (of confounding variables).
   Threats
   A.
      History (events outside the lab): the observed effects between the independent and dependent variable might be due to an event that takes place between the pretest and posttest when this event is not the treatment of research interest (e.g., effects of success/failure [IV] on job satisfaction [DV], with the success condition run on a sunny day and the failure condition on a gloomy, dark, cold, rainy day [weather = EV]).
   B. Maturation: a source of error in an experiment related to the amount of time between measurements; concerned with naturally occurring changes (e.g., in employee development research, employees tend to get better with more experience, independent of any specific employee development program).
   C. Testing: effects due to the number of times particular responses are measured, that is, familiarity with the measuring instrument (e.g., increased scores on a 2nd test).
   D. Mortality/Attrition: the dropping out of some participants before a study is completed, causing a threat to validity (e.g., effect of empowerment [IV] on performance [DV]; attrition as a result of feelings of low empowerment would be an EV because only those who felt empowered would be left at the end of the study).
   E. Selection: many studies compare two or more groups on some dependent variable after the introduction of an IV. "Simpler" studies like surveys just assess attitudes or opinions on an issue. In either case, sampling or selection into the study is critical. Samples must be comparable (in multi-group designs) and/or must represent the population (e.g., in a survey of attitudes towards endangered species, one would probably obtain very different results as a function of sampling from individuals in the logging industry vs. members of the Society for the Protection of Baby Seals).
   F.
      Regression effects: the tendency of participants with extreme scores on a first measure to score closer to the mean on a second testing; a statistical threat (e.g., the highest performing firms in a given year tend to do worse in subsequent years).
   These threats are controlled for primarily by randomization.
2. External Validity: the inference that presumed causal relationships can be generalized to and across alternate measures of cause and effect, and across different types of persons, settings, and times. That is, how generalizable are the findings?
   The concern is whether the results of the research can be generalized to another situation, specifically to other participants, settings, and times.
   Threats
   A. Other participants (interaction of selection and treatment): population validity.
      Behavioral and organizational research often uses convenience samples ("the experimentally accessible population"), and participants are often chosen by availability. The question is: how representative is the typical sample of the focal group of interest?
   B. Other settings (interaction of setting and treatment): ecological validity.
   C. Other times (interaction of history and treatment): temporal validity.
   External validity may be increased by random sampling for representativeness.
   - Note the importance of trade-offs between internal and external validity.
3. Statistical Conclusion Validity: the appropriateness of inferences (or conclusions) made from data as a result (or function) of conclusions drawn from statistical analysis. That is, are the IV and DV statistically related?
   Threats
   A. Low statistical power: power is the ability of a statistical test to identify relationships when they are actually present. All else being equal, the larger the sample size, the greater the power.
   B. Violated assumptions of statistical tests.
   C. Reliability of measures.
   These threats can be addressed by having adequate power, meeting the assumptions of tests, and using reliable measures.
4.
   Construct Validity: has to do with the labels that can be placed on what is being observed and the extent to which those labels are theoretically relevant.
   Construct validity is a question of whether the research results support the theory underlying the research. That is, is there another theory that could adequately explain the same results? For example, is emotional intelligence a better label than communication skills for what is being studied? If the labels being used are irrelevant to the theory being researched, then the study can be said to lack construct validity.
   Threats
   A. Loose connection between theory and experiment.
   B. Evaluation apprehension: the tendency of research participants to alter their behavior because they are being studied (e.g., Hawthorne effect).
   C. Experimenter expectancies ("good-subject response"): the tendency of research participants to act according to what they think the researcher wants.
   "Good-subject response" and, to some extent, evaluation apprehension can be controlled/minimized by using the following procedures:
   A. Double-blind procedures
   B. Single-blind procedures
   C. Deception

- The four types of research validity are interrelated and NOT independent of one another. For example:
   - Statistical conclusion validity is necessary for the demonstration of the other types of validity.
   - Internal validity must be achieved for construct validity to be obtained.
   - External validity depends in part upon the demonstration of at least statistical conclusion validity and internal validity.

Topic #4 EXPERIMENTAL RESEARCH DESIGNS

As a strict technical definition, an experiment is a study or research design in which we MANIPULATE variables. This also implies that experimental designs are characterized by RANDOM ASSIGNMENT to groups, treatments, etc. That is, the researcher has high levels of control over the WHO, WHAT, WHEN, WHERE, and HOW of the study.
MANIPULATION and RANDOM ASSIGNMENT are the defining characteristics of experimental designs.

RANDOM SAMPLING AND RANDOM ASSIGNMENT
- Experimental designs call for random sampling (all research designs do) and, at the very least, random assignment to groups.
- The use of random sampling and random assignment gives the strongest case for causal inferences.
1. Random Sampling: the process of choosing a "representative" group from an entire population such that every member of the population has an equal and independent chance of being selected into the sample. Achievement of random sampling is rare in applied research.
2. Random Assignment (Randomization): a control technique that equates groups of participants by ensuring every member (of the sample) an equal chance of being assigned to any group.
   Randomization is of concern in experimental research where there is some manipulation or treatment imposed. It is a systematic procedure for avoiding bias in assignment to conditions or groups. If we avoid that bias, we can assert that any differences between groups (conditions) prior to the introduction of the IV are due solely to chance.
   The principal concern is whether differences between groups AFTER the introduction of the IV are due solely to chance fluctuations or to the effect of the IV plus chance fluctuations.

EXAMPLES OF SOME EXPERIMENTAL RESEARCH DESIGNS
1. Control experiment with control group and experimental group

                      PRETEST   TREATMENT   POSTTEST
   GROUP I            YES       YES         YES
   GROUP II           YES       NO          YES

2. Control experiment with no control group

                      PRETEST   TREATMENT   POSTTEST
   GROUP I            YES       A1          YES
   GROUP II           YES       A2          YES

3. Control experiment with control condition within-subjects

   ALL PARTICIPANTS   PRETEST   TREATMENT   POSTTEST
   CONDITION I        YES       YES         YES
   CONDITION II       YES       NO          YES

4.
   Solomon Four-Group design
   Generally accepted as the best between-subjects design, but requires a large number of participants.

                      PRETEST   TREATMENT   POSTTEST
   GROUP I            YES       YES         YES
   GROUP II           NO        YES         YES
   GROUP III          YES       NO          YES
   GROUP IV           NO        NO          YES

   Some Possible Comparisons
   A. Effect of treatment (I and II) vs. no treatment (III and IV)
   B. Effect of pretest (I and III) vs. no pretest (II and IV)
   C. Effect of pretest on treatment (I vs. II)

Why pretest?
1. Equivalence of groups
2. Baseline
3. Effects of testing or practice effects

What does one do if there are pretest/baseline differences?
1. Difference scores
2. Analysis of covariance (ANCOVA)
3. Partial/semi-partial correlations

EXAMPLES OF SOME RESEARCH DESIGNS TO AVOID
1. One-group posttest only design

                      TREATMENT   POSTTEST
   GROUP I            YES         YES

2. One-group pretest-posttest design

                      PRETEST   TREATMENT   POSTTEST
   GROUP I            YES       YES         YES

3. Posttest only design with nonequivalent control groups

                      ALLOCATION TO GROUPS                          TREATMENT   POSTTEST
   GROUP I            NONEQUIVALENT, NATURALLY OCCURRING GROUPS     YES [A1]    YES
   GROUP II           NONEQUIVALENT, NATURALLY OCCURRING GROUPS     NO [A2]     YES

   Nonequivalent control group: a group of participants that is not randomly selected from the same population as the experimental group.

WITHIN- AND BETWEEN-SUBJECTS DESIGNS
1. Within-subjects design: a research design in which each participant experiences every condition of the experiment.
   A. Advantages
      1. Do not need as many participants
      2. Equivalence is certain
   B. Disadvantages
      1. Effects of repeated testing
      2. Dependability of treatment effects
      3. Irreversibility of treatment
2. Between-subjects design: a research design in which each participant experiences only one of the conditions in the experiment.
   A. Advantages
      1. Effects of testing are minimized
   B. Disadvantages
      1. Equivalency is less assured
      2. Greater number of participants needed

SUMMARY OF KEY CONCEPTS: EXPERIMENTAL DESIGNS
1.
   Control
   Any means used to rule out possible threats to a piece of research.
   Techniques used to eliminate or hold constant the effects of extraneous variables.
2. Control Group
   Participants in a control condition.
   Participants not exposed to the experimental manipulation.
3. Experimental Group
   Participants in an experimental condition.
4. Control Condition
   A condition used to determine the value of the dependent variable without the experimental manipulation. Data from the control condition provide a baseline or standard against which to compare behavior under changing levels of the independent variable.
5. Experimental Condition
   Treatment condition in which participants are exposed to a non-zero value of the independent variable; a set of antecedent conditions created by the experimenter to test the impact of various levels of the independent variable.
6. Within-subjects Design
   Research design in which each participant serves in each treatment condition.
7. Between-subjects Design
   Research design in which different participants take part in each condition of the experiment.

CAUSAL INFERENCES
- An advantage that experimental designs have over other research designs is that they permit us to make causal inferences.
- Causation implies the ability to make statements about the absence or presence of cause-effect relationships.
- While there are several methods (some of which are discussed in the text) to experimentally identify causality, there are three conditions that must be met to infer cause.
   A. Contiguity: between the presumed cause and effect.
   B. Temporal precedence: the cause has to precede the effect in time.
   C. Constant conjunction: the cause has to be present whenever the effect is obtained.
- The ability to make causal inferences depends on the extent to which alternative causes or explanations are ruled out.
- Cause is a necessary and sufficient condition.
An event that only causes an effect sometimes is NOT a cause. ! The assessment of causation technically demands the use of manipulation. ! Caveats to determining causality A. Concerning cause–effect relationships, it cannot be said that they are true. We can only say that they have NOT been falsified. B. The use of correlational methods to infer casual relationships should be avoided. C. Although one might find that r =/ 0 or that the regression equation is significant, this does NOT prove a causal relationship (i.e., that X caused Y to change). At best we can only say that there is a linear relationship between X and Y. EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 27 ANALYSIS OF VARIANCE (ANOVA), FACTORIAL DESIGNS, MAIN EFFECTS, AND INTERACTIONS 1. Analysis of Variance (ANOVA)—is a statistical procedure used to compare two or more means simultaneously; it is used to study the joint effect of two or more IVs. The ANOVA is based on the F–statistic and can be thought of as an extension of the t– test. P t2=F P t test = 2 means P ANOVA = 2 or more means One–factor or simple ANOVA—employs only one IV. We might use two or more levels of the variable but there is only ONE IV. EXAMPLE OF (SIMPLE) ANOVA SUMMARY TABLE SOURCE df SS MS F Factor A a–1 SSa SSa/dfa MSa/MSe Error dftot–dfa SSe SSe/dfe Total N–1 where a = number of levels The assumption is that: individual's score = base level + treatment effect + effects of error where treatment effect = Factor A or IV The above design is also referred to or described as a one–factor (experimental) design, primarily because only one IV is manipulated. "Factor" is merely another term for "IV". EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 28 2. Factorial Design—a design in which 2 or more variables, or factors are employed in such a way that all of the possible combinations of selected values of each variable are used. 
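As a rough illustration of "all of the possible combinations", the sketch below enumerates the cells of a factorial design in Python. The factor names, level labels, and the helper function are hypothetical placeholders and do not come from the notes.

```python
from itertools import product

def factorial_cells(factors):
    """Return every combination of levels, one tuple per cell of the design."""
    return list(product(*factors.values()))

# Hypothetical 2x3 design: IVA has 2 levels, IVB has 3 levels.
design = {"IVA": ["A1", "A2"], "IVB": ["B1", "B2", "B3"]}
cells = factorial_cells(design)
for cell in cells:
    print(cell)
print(len(cells), "cells in total")
```

The number of cells is the product of the numbers of levels, which is why a design is named by its levels (2x3, 2x2x2, and so on).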
Examples of Factorial Designs

2x2 factorial design
- 2 levels IVA
- 2 levels IVB

2x2x2 factorial design
- 2 levels IVA
- 2 levels IVB
- 2 levels IVC

2x3 factorial design
- 2 levels IVA
- 3 levels IVB

An issue that arises when we use factorial designs is that of main effects and interactions.

3. Main Effect: the effect of one IV averaged over all levels of the other IV.

That is, the effect of IVA independent of IVB, or holding IVB constant; it can also be described as the difference between the means of A1 and A2 across levels of B.

A main effect is really no different from a t-test for differences between means.

4. Interaction: when the effect of one IV depends upon the level of the other IV.

Two or more variables are said to interact when they act upon each other. Thus, an interaction of IVs is their joint effect upon the DV, which cannot be predicted simply by knowing the main effect of each IV separately.

Main effects are qualified by interactions (interpreted within the context of interactions). Thus, the occurrence of an interaction is analyzed by comparing differences among cell means rather than among main effect means.

Graphical plots are used merely to illustrate the results of the ANOVA test. We plot graphs to aid us in interpreting the ANOVA results after we have run the test.

- Example: number of job leads by social media and whether or not your resume is posted. An experimental design investigating this issue would be a basic 2x2 factorial design.
     IVA = 2 types of social media (Facebook and LinkedIn)
     IVB = 2 levels of resume posted (yes and no)
  - Which conditions would lead to the most job leads? What will the data look like?
  - Main effects or an interaction?
  - Plot the data.
  - What would an interaction look like? How would you interpret a resume/social media interaction?
  - Examples of data plots (figures a-g) illustrate main effects and interactions of variables A and B.
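The arithmetic behind main effects and interactions can be sketched with cell means from the 2x2 job-leads example. The cell means below are invented purely for illustration, and `summarize_2x2` is a hypothetical helper; a real analysis would run the ANOVA on raw scores to test whether these differences are significant.

```python
def summarize_2x2(means, a_levels, b_levels):
    """Compute main-effect and interaction contrasts from 2x2 cell means.

    means[(a, b)] is the mean DV for level a of IVA and level b of IVB.
    """
    a1, a2 = a_levels
    b1, b2 = b_levels
    # Main effect of A: difference between A levels, averaged over levels of B
    main_a = (means[a2, b1] + means[a2, b2]) / 2 - (means[a1, b1] + means[a1, b2]) / 2
    # Main effect of B: difference between B levels, averaged over levels of A
    main_b = (means[a1, b2] + means[a2, b2]) / 2 - (means[a1, b1] + means[a2, b1]) / 2
    # Interaction: effect of B at A2 minus effect of B at A1
    interaction = (means[a2, b2] - means[a2, b1]) - (means[a1, b2] - means[a1, b1])
    return {"main_A": main_a, "main_B": main_b, "interaction": interaction}

# Made-up mean number of job leads per cell (assumption, not data from the notes)
means = {
    ("Facebook", "no"): 2, ("Facebook", "yes"): 4,
    ("LinkedIn", "no"): 3, ("LinkedIn", "yes"): 11,
}
result = summarize_2x2(means, ("Facebook", "LinkedIn"), ("no", "yes"))
print(result)
```

With these made-up means, posting a resume adds 8 leads on LinkedIn but only 2 on Facebook, so the interaction contrast is 6 and the lines in a plot of the cell means would not be parallel. An interaction contrast of 0 would correspond to parallel lines.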
Can you look at a data plot (AKA Figure) and interpret the main effects and interactions?

[Figures (a)-(g): line plots of the DV (y-axis) against levels A1, A2, A3 of factor A (x-axis), with separate lines for B1 and B2. Panels: (a) A main effect significant; (b) B main effect significant; (c) Interaction effect significant; (d) A and B main effect significant; (e) A main effect and interaction significant; (f) B main effect and interaction significant; (g) A and B main effect and interaction significant.]

RESEARCH SETTING

Behavior research usually takes place in one of two settings: lab or field.

A. The major distinction between lab and field has to do with the "naturalness" or "artificiality" of the setting. Field research typically employs a real-life setting (e.g., in a company).
B. Another important concept associated with naturalness and artificiality is that of control. Lab settings tend to permit higher degrees of control than field settings.

1. Lab Experiment
   A. Advantages:
      1. Is the best method for inferring causality. Permits the elimination of, or control for, other explanations of observed behavior.
      2. Measurement of behavior is very precise.
      3. Experiments can be replicated by other researchers because experimental conditions are measured and recorded.
   B. Disadvantages:
      1. There is a lack of realism, i.e., the degree of similarity between experimental conditions and the natural environment is limited.
      2. Some phenomena cannot be studied in the lab.
      3. Some variables may have a weaker impact in the lab than they do in the natural environment.

2. Field Experiment
   A. Advantages:
      1. Very realistic
      2. Results are highly generalizable
      3. Suggestions of causal inference are possible
      4. Broader research issues dealing with complex behavior in real-life contexts can be addressed.
   B. Disadvantages:
      1. Variables are less precisely measured
      2. Individuals or groups may refuse to participate
      3. Often cannot gain access to the "natural" (business, organization, home, or other) environment.

Topic #5
EXPERIMENTAL CONTROL

The central issue is, of course, one of research validity.

Key questions in research:
1. What are the threats to the validity of a contemplated piece of research?
2. What means are available to neutralize these threats?

Control: any means used to rule out possible threats to a piece of research.

EXPERIMENTAL CONTROL

- The emphasis is on the ability to restrain or guide sources of variability (extraneous variables) in research.
- With experimental control, we are concerned with our ability to rule out possible threats to our study and also the extent to which we can rule out alternative explanations of the experimental results.
- Extraneous Variable: a variable other than the independent or dependent variables; a variable that is not the focus of the experiment and may confound the results if not controlled.

Strategies to achieve/enhance control
1. Random assignment to groups
2. Subject as own control
3. Instrumentation of response
4. Matching
5. Building nuisance variables into the experiment
6. Statistical control

1. Random Assignment To Groups: rules out plausible alternative interpretations due to chance. Controls for both known and unknown effects.

2. Subject As Own Control

Within-subjects designs and single-subject designs.

One of the most powerful control techniques is to have each participant experience every condition of the experiment. Since each individual is unique and varies in several ways that may affect the experimental outcome, using participants as their own control rules out these variations.
Potential Problems:
A. Practice effects.
B. Irreversibility of treatment effects.
C. Dependability of treatments

- Order Effects: changes in participants' performance resulting from the position in which a condition appears in the study.
  e.g., movie preference, warm-up or practice effects in learning studies
- Sequence Effects: changes in participants' performance resulting from interactions among the conditions themselves.
  e.g., taste tests, estimating weights

Methods of Control

A. Block Randomization: a control procedure in which the order of conditions is randomized, with each condition being presented once before any condition is repeated.
   e.g., BCAD ADCB
   This procedure controls for both order and sequence effects.
B. Counterbalancing: a method of control in which the conditions are presented in one order the first time and then in another order.
   - Reverse (ABC CBA)
   - Intrasubject
   - Intragroup
   - Complete
   - Incomplete

3. Instrumentation Of Response: using a measurement or instrumentation process that is objective, standardized, sensitive, and also reliable and valid.

Ideally, one should be able to take subjective states (e.g., stress) and use some objective instrumentation to generate measures or scores.
   e.g., for stress: cortisol levels versus self-report ratings

4. Matching: a procedure whereby participants are matched on some variables or characteristics of interest.

Matching is typically used when it is not possible to use random assignment. So matching is basically a procedure that attempts to obtain equivalent groups in the absence of random assignment.

However, if the topic domain permits random assignment after participants have been obtained through self-selection or natural occurrence, then matching is implemented before participants are randomly assigned to groups.

For matching to be useful, one has to suspect that there is an important variable or characteristic on which participants differ, that it can be measured, and that participants can be matched on it. One also has to suspect that there is a relationship between the matched variable and the DV.
   e.g., a study to assess the effectiveness of an educational intervention program. What are some variables or dimensions we might want to match people on?

Therefore, matching controls for only known effects.

5. Moderator Variable: a nuisance variable (another IV) that moderates or influences the relationship between the IV and DV.

A control for moderator variables is to build them into the experiment by measuring them, essentially adding another IV.

6. Statistical Control
   A. first determine if the design permits analysis by accepted statistical methods.
   B. increase the statistical power of a design (e.g., increase sample size).
   C. use a specialized statistical technique to enhance control (e.g., ANCOVA, partial correlations).
   D. increase the number of trials.

GREATER DEGREES OF CONTROL RESULT IN HIGHER LEVELS OF INTERNAL VALIDITY

EXPERIMENTER EFFECTS

- Rosenthal (1976) has extensively documented the existence of experimenter effects.
- An experiment is typically a social interaction/situation among people.
- The issue is that the social nature of an experiment usually results in the taking of roles, which are played by the experimenter and participants. Further, the interaction between the two can influence the outcome of the study.
- In other words, if the psychological study is a social situation involving interactions among people (researcher and participants), then the social conditions of the study also involve roles played by the researcher and participant. Thus, the interactions between these two can influence the outcome of the study.
- Basically, experimenter effects tend to be strongest when the experimenter factor is related to the experimental task that is being performed by participants.
- For example:
  1. If the research focuses on judgments of organizational commitment, then whether the experimenter is employed by the organization could influence responses.
  2. If the study concerns smoking, then whether the experimenter appears to be a smoker or not may be an important variable.

A type of experimenter effect is experimenter expectancy.

The experimenter is seldom a neutral participant in the research project. She/he is likely to have some expectations about the outcome of the study, often with strongly vested interests in the meeting of these expectations.

The issue is whether the experimenter will somehow behave in ways that bias the research results in the expected direction.

The concept of experimenter expectancy is different and separate from intentional bias or fraud. With expectancies, the interest is in some less than conscious or unconscious nonverbal way of communicating expectations.

ISSUE: what are the mechanisms that produce these effects?

Some plausible hypotheses that have been evaluated and discarded are:
A. obvious efforts to influence participants
B. cheating
C. systematic errors in recording and analyzing data

The suggestion is that nonverbal communications of some sort are responsible. This seems to be supported by the fact that expectancy effects have also been noticed in animal studies.
   e.g., Clever Hans, the counting horse (who remembers this story?)

Some other forms of experimenter-generated communication to research participants that may also result in experimenter bias may be less subtle and more blatant, although still unintended, as illustrated in the participant instructions for a conflict resolution study presented below.

Conflict Resolution

Presented below are nine stories involving three different kinds of conflict: between individuals in personal relationships, between individuals in organizations, and between countries.
You should read the story about each conflict, and read the seven styles of conflict resolution that follow each story. Your task will be to rate the desirability of each style of conflict resolution from 1 (poor) to 10 (excellent). In other words, how desirable is each of the proposed strategies for resolving the conflict presented in the story?

In general, there is no one correct solution for a conflict-resolution situation. It is worth remembering, however, that more intelligent people tend to try to defuse rather than exacerbate the conflicts in which they find themselves. There are many ways to defuse the conflict, only some of which are given below. A key listing of the conflict resolution appears after the problems.

Reducing experimenter bias

1. More than one experimenter should be included in the design of the study.

2. Standardization of the experimenter's behavior in participant-experimenter interactions.

   This can be done by standardizing the instructions and procedures, and by maintaining a constant environmental setting.

   The use of end-of-session checklists that are completed by both the experimenter and participants to document their perceptions of the session has also been recommended.

3. Double-blind procedures: those that call for keeping the experimenter, as well as the participant, in the dark about the expected outcomes or hypothesis being tested.

   Double-blind procedures serve to minimize bias due to experimenter expectations. Both the participant and experimenter are kept in the dark and are unaware of the research hypotheses and conditions.

   In contrast, single-blind procedures are those in which (only) the participant is not fully informed of either the nature of the study or the conditions under which she/he is participating. These procedures may sometimes even take the form of deception.

   Single-blind procedures serve to reduce participant expectations but not experimenter bias or expectancies.
4. Automation: automate the procedures, treatments, and all other aspects of the experiment as much as is feasible.

   The advantage of automation is uniformity.
   The disadvantage is inflexibility.

Topic #6
QUASI-EXPERIMENTAL DESIGNS

The central issue is, of course, one of research validity.

Q-designs are research procedures in which participants must be and are selected for different conditions from pre-existing groups (e.g., you cannot randomly assign people to be male or female; Atlanta office or Philadelphia office; manager or nonmanager; low or high SES).

They are studies in which levels of the IV are selected from pre-existing values and not created through manipulation by the researcher.

In true experimental designs, participants are randomly assigned to experimental and control groups; whereas with Q-experimental designs, they are NOT!!

A Q-experiment DOES NOT permit the researcher to control the assignment of participants to conditions or groups.

RANDOM ASSIGNMENT TO GROUPS IS THE BASIC DIFFERENCE BETWEEN TRUE AND Q-EXPERIMENTAL DESIGNS.

Q-designs are characterized by lower levels of control over the WHO, WHAT, WHEN, WHERE, and HOW of the study.

Although the presence of uncontrolled or confounded variables reduces the internal validity of Q-experiments, it does not necessarily render them invalid.

Basically, the likelihood that confounding variables are responsible for the study outcome must be evaluated.

Types Of Q-Experimental Designs

1. Nonequivalent Control Group Designs: research designs having both experimental and control groups, but the participants are NOT randomly assigned to these groups.
   - This is the most typical type of Q-design.
   - Problems with this type of design have to do with how to compare the results between groups when they are not equivalent to begin with.
EXAMPLE: EFFECT OF WORK SCHEDULES ON PRODUCTIVITY

            ALLOCATION TO GROUPS    PRETEST   TREATMENT   POSTTEST
GROUP I     ANY NONRANDOM           YES       YES         YES
GROUP II    METHOD                  YES       NO          YES

Q-designs that employ nonequivalent control groups with pre- and posttest may or may not be interpretable. Interpretability depends on whether the pattern of results obtained can be accounted for by possible differences between the groups or by something else in the study.

Examples of nonequivalent control group designs

A. Delayed Control Group Designs: nonequivalent control group designs in which the testing of one group is deferred.
   - i.e., the two groups are tested sequentially with an appreciable time interval between the two

B. Mixed Factorial Designs: have one between-subjects variable and one within-subjects variable (e.g., study of trait [between] and state regulatory focus [within] and impact on job performance).

                                  State Regulatory Focus
                                  Prevention    Promotion
   Trait          Prevention      S1            S1
   Regulatory                     S2            S2
   Focus                          ...           ...
                                  S20           S20
                  Promotion       S1            S1
                                  S2            S2
                                  ...           ...
                                  S20           S20

2. Designs Without Control Groups

A. Interrupted Time-Series Designs: this design allows the same group to be compared over time by considering the trend of the data before and after the treatment.

   - A variation of the interrupted time-series design, which is really NOT a design without control groups, is the Multiple Time-Series Design. This is a time-series design in which a control and an experimental group are included to rule out HISTORY as a rival hypothesis (e.g., compare drunk driving accidents in Michigan to Wisconsin before and after Michigan raised the drinking age from 18 to 21; Wisconsin is the control group because it retained the 18-year-old limit during the months of the study).

B. Repeated Treatment Designs: this research design allows the same group to be compared by measuring participants' responses before and after repeated treatments.

QUESTIONS
- Can we make causal inferences based on Q-designs?
- How strong will these inferences be?

Topic #7
OBSERVATIONAL DESIGNS

Research methodology in which the researcher observes and records ongoing behavior.

NOTE: Observational designs can be either experimental or nonexperimental. That is, we can have observational designs with manipulation!! The manner in which the data are collected is the defining characteristic of observational designs or research.
- Data collection forfeits some degree of control.

Some Salient Issues in the Use of Observational Designs

1. Sampling of observations: must be randomized or nonsystematic.

2. The observer as a test: must meet all the psychometric requirements of a good test; that is, he/she must be:
   A. reliable (scorer and test-retest)
   B. valid
   C. standardized
   D. objective

3. When there is more than one observer, observer characteristics become a possible extraneous variable and must be controlled:
   A. train raters to standardize
   B. build into design as a moderator variable
   C. select for good observers

4. Levels of observation
   A. naturalistic/complete observation: research conducted in such a way that the participants' behavior is disturbed as little as possible by the observation process (e.g., observing customer care representatives or mystery shopper).
   B. observer-participant: observations are made such that there is no interaction, but the participants are aware of the observer's presence (e.g., camera crews on the hit TV show Cops).
   C. participant-observer: researchers participate in naturally occurring groups and record their behaviors (e.g., an outside consultant is contracted to deliver and evaluate training; inspectors in a work crew on the job site).
   D. complete participant: observations made within the observer's own group (e.g., whistleblowers, company blogs, CEO letters to shareholders).

- These four levels fall along a continuum from the least intrusive/reactive (naturalistic observation) to the most intrusive/reactive (complete participant).
- Potential Problems
  A. intrusiveness
  B. reactivity
  C. issues of privacy

QUESTION
Can we make causal inferences from observational designs? And if we can, how strong will they be?

Topic #8
CORRELATIONAL DESIGNS

These are research designs where we measure two or more variables and attempt to determine the degree of relationship between them. These designs are characterized by the following:
1. no manipulation
2. low control
3. no causal inferences

Do not confuse correlational designs with correlations. One can have one without the other, i.e., one does not automatically imply the other.

All things being equal, the mere fact that a study uses correlations to analyze data does not necessarily make it a correlational design.

Consider the news article below describing a relationship between gum disease and heart attacks. Does the article imply causation from the data in this correlational study? What other third variables could account for the relationship between gum disease and heart attacks?

   Gum-disease link possible
   By LEIGH HOPPER
   Houston Chronicle Medical Writer

   NEW ORLEANS - Gum disease and a common virus that infects more than half of American adults by age 40 may be linked to heart attacks, possibly by causing inflammation, researchers said Sunday at the American Heart Association meeting.

   Although the nature of the connections is not clear, scientists say the findings point to new directions in treatment and prevention of heart disease.

   People who have a heart attack are more likely to have serious inflammation of gum tissue known as periodontal disease than people with no known heart disease, according to a study of 38 men and women who were admitted to a hospital with a first heart attack. The patients were compared with 38 volunteers without coronary artery disease.

   Eighty-five percent of the heart-attack patients had periodontal disease compared with 29 percent in the comparison group, said the study's leader, Dr. Efthymios Deliargyris, a cardiologist at the University of North Carolina.

   Researchers believe high levels of C-reactive protein, or CRP, may be a link between the two conditions. CRP is a sign, or marker, of inflammation and is elevated in heart-attack patients.

   In another experiment, scientists found hints that cytomegalovirus, a virus that lies dormant in many adults, may have a direct relationship to the risk a woman has of developing atherosclerosis, the buildup of fatty deposits on artery walls.

   Researchers looked at 87 women being evaluated for heart disease because of chest pain or other abnormalities. Among the women who tested negative for cytomegalovirus, 13 percent had coronary artery disease. Among those who tested positive for cytomegalovirus, up to 68 percent had diseased arteries.

TYPES (characterized by time frames)
1. Predictive
2. Concurrent
3. Postdictive
   - selective sampling
   - archival data: no control over how data were collected; unknown quality

A. Archival research: refers to research conducted using data that the researcher had no part in collecting. Archival data are those that exist in public records, or archives. The researcher simply examines or selects the data for analysis.

Limitations
1. Most archival data are collected for unscientific reasons by non-experimental researchers and thus may not be very useful, may be incomplete, and are subject to bias (e.g., SEC filings).
2. Because archival research is by nature carried out after the fact, ruling out alternative hypotheses for particular observed correlations may be difficult.

Topic #9
SURVEY RESEARCH

Measurement and assessment of opinions, attitudes, etc., usually by means of questionnaires and sampling methods.

Important Measurement Issues and Potential Problems with Questionnaires and Other Self-Report Measures

1. First determine the purpose of the questionnaire.
   - ask the target participants for useful information.
   - anticipate questions of interpretation that may arise.

2. Determine the types of questions.
   A. open-ended: permits the respondents to answer in their own words (e.g., fill-in-the-blank, essay, short answer).
   B. closed-ended: limits the respondents to alternatives determined in advance by the designers (e.g., multiple choice, ranking, matching, T/F).

3. Item writing

Potentially, the questions and items themselves can have a major influence on how people will respond.

A. Determine the format of the item
   - true/false
   - multiple choice
   - Likert scales

B. Address only a single issue per item; avoid double-barreled questions
   - e.g., "How much confidence do you have in President Obama to handle domestic and foreign policy?"
   - difficult for respondents to answer and even more difficult to interpret the responses because it is asking about 2 issues

C. Loaded items generate/produce specified responses
   - e.g., "College students should receive grades in their courses because this prepares them for the competitive world outside of college."
   - e.g., “Don’t you agree that elementary school teachers should earn more money than they currently earn?”

D. The topic or issue may be "sensitive", which can also have a major influence on how people respond, so avoid bias. Under these conditions, effects of loading are even more pronounced.
P e.g., two items that measure attitude towards taxes: "Are you in favor of job-killing tax hikes?" "Do you think that the rich should also pay their fair share of taxes?" E. Make the alternatives clear. P e.g., two items from the Texas Recycles Day Survey used to find out how much students know about recycling on and off campus at Texas A&M University. 9. Have you seen recycling bins on–campus? Yes No 10. If you answered yes, how often do you see them? All the time and everywhere Eventually Once upon a time F. Avoid the use of negations or double negations P e.g., All of the following are examples of family friendly polices EXCEPT: P e.g., Do you favor or oppose not allowing flexible schedules? [what exactly is being asked of respondents here?] EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 53 G. Be very careful with percentages. P e.g., What percentage of companies offer tuition reimbursement for employee development? *A. 50% B. 40% C. 30% P 50% was the keyed response to this item. Because 40% and 30% are subsumed under 50% then technically all three responses are correct. H. Avoid using response options "all of the above" or "none of the above" I. For comparison of survey responses over time the items have to be worded exactly the same each time the survey is administered. J. There are a variety of ways in which participants' own characteristics may inadvertently alter the research outcome. 1. Response Styles—tendencies to respond to questionnaire items in specific ways regardless of content. a. willingness to answer—some people will not answer items/questions they are unsure about (will leave them blank). Others will go right ahead and guess. P can usually control for this with strong instructions to answer ALL questions. b. position preference—when in doubt pick (C). P control for this by randomization of alternatives. c. 
acquiescence or yea– and nay–saying—tendency to consistently agree or disagree with questionnaire statements or questions regardless of content. P controlled for by using method of matched pairs (repeat item and reverse); also controlled by using bi–directional responses. EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 54 EXAMPLE (OF ACQUIESCENCE) Based on the role play instructions, please respond to each question by circling the appropriate number along the line. Please circle the number along the line corresponding to how you think you would feel if you were John Smith. Good 1 2 3 4 5 6 7 Bad Tense 1 2 3 4 5 6 7 Relaxed Pleased 1 2 3 4 5 6 7 Displeased Competent 1 2 3 4 5 6 7 Incompetent Happy 1 2 3 4 5 6 7 Unhappy 2. Response Sets—tendency to respond to a questionnaire or test content with a particular goal in mind. The primary example of this is social desirability (SD)—the most common response set. P social desirability—tendency to present self in a socially desirable manner; tendency to choose specified responses even if they do not represent ones true tendency or opinion. a. Self–deception occurs when an individual unconsciously views him/herself in an inaccurately favorable light; lack of self–awareness. b. Impression management refers to a situation in which an individual consciously presents him/herself falsely to create a favorable impression. P social desirability is the tendency to overreport socially desirable personal characteristics and to underreport socially undesirable characteristics P Also a tendency to present self in test–taking situations in a way that makes self look positive with regard to culturally derived norms and standards P e.g., which of the Big Five factors would you expect to be most susceptible impression management/faking effects and why? conscientiousness EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 55 EXAMPLE (OF SOCIAL DESIRABILITY) Sample items from a test used to select/hire firefighters 5. 
My friends think I am slightly absent–minded and impractical. A. Yes B. Uncertain C. No 10. I prefer a job with opportunity to learn new skills. A. a lot of B. some C. little or no P Control—administer MARLOWE–CROWNE or other SD scale and partial out or drop from sample EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 56 EXAMPLE—SAMPLE ITEMS FROM MARLOW–CROWNE (RESPONDED TO AS TRUE OR FALSE). 1. Before voting I thoroughly investigate the qualifications of all the candidates. 8. My table manners at home are as good as when I eat out in a restaurant. 18. I don’t find it particularly difficult to get along with loud–mouthed, obnoxious people. 29. I have almost never felt the urge to tell someone off. Anonymous and private collection of data may also help. Major Survey Techniques 1. face–to–face interviews 90% 2. telephone interviews 80% 3. mail 30-40% 4. web-based -30-40% 5. magazine 1–2% Overall response rate for survey research is 30% The quality of the data is a direct function of the response rate. SAMPLING ! The key to the meaningfulness of any survey is the soundness of the sampling procedure used to generate respondents. ! Examples of inadequate results from poor sampling A. Classic Coke B. Major League Baseball players are selected to the All–Star game by fans at ballparks. EDWARDS—RESEARCH METHODS LECTURE NOTES PAGE 57 Types of Sampling Procedures 1. Uncontrolled—researcher has no control in the selection of respondents P e.g., magazines, radio and TV call–ins P usually a very small sample—about 2% P usually biased in favor of more vocal individuals motivated to respond 2. Haphazard sampling—sampling procedure where the researcher may have some control over selection into study but it is still basically a hit–and–miss method for selecting participants P e.g., TV station sending crew out to interview people on the street with instructions to include at least 5 women, 3 teenagers, and one person in a business suit 3. 
Probability sampling: sampling procedures in which the researcher makes an effort to ensure that each person in the population has an equal chance of being represented
  A. Simple random: sample chosen from an entire population such that every member of the population has an equal and independent chance of being selected.
  B. Stratified random: sample is chosen to proportionally represent certain segments in the larger population
  C. Cluster: sample is selected by using clusters or groupings from the population
  - e.g., sampling every employee in every 10th department rather than every 10th employee

Topic #10
LONGITUDINAL AND CROSS–SECTIONAL DESIGNS

These designs are of particular interest in developmental and/or gerontological psychological research, where age and/or long time lags are of interest or are important.

1. Cross–sectional Designs
These are research designs in which different cohorts or individuals are tested at a given point in time. Cross–sectional designs are between–subjects designs. The primary advantage of cross–sectional designs is that they are very economical.

2. Longitudinal Designs
These are research designs in which a cohort is selected and studied over a relatively long period of time with repeated measurements. The same sample/group of individuals is studied over time. Longitudinal designs are typically within–subjects or repeated measurement designs. HOWEVER, they can also be between–subjects or independent group designs. This would be the case if, in studying a given cohort, we selected a different sample from that same cohort at each time of measurement. This is still a longitudinal design because we are studying the same cohort, and it is a between–subjects design because at each time of measurement we are selecting a different sample from that cohort.
An advantage of longitudinal designs is their strength in allowing us to assess the change in variables/constructs over time.

3. Time Lag Designs
These designs are more popular in social psychology and sociology. They permit us to investigate changes across or differences between cohorts. They furnish us with cohort descriptive data because they are intended to map out changes across cohorts holding age constant.
- They use several cross–sectional designs over time
- They still do not totally eliminate confounding

Specific Threats to Internal Validity Faced by Longitudinal and Cross–Sectional Designs

1. Selective survival
This is intrinsic to both cross–sectional and longitudinal designs. This threat is more critical in the older adult years. This threat is associated with changes in the population composition across time because the weaker, less competent, and less adjustable individuals have typically died off, been fired from the company, etc. This makes it difficult to make any retrospective or prospective inferences because the population is NOT the same (at different times).

2. Selective dropout
This applies to longitudinal research only. This is the situation in which participants drop out of the study sample. They might, for instance, move away, lose interest in the study, die, quit their job, etc. So individuals who continue to participate may be inherently "different".

3. Practice effects or retest effects
This applies to repeated measures longitudinal designs where the same individual is tested and retested on the same psychological behavior and test over a long period of time. The problem is one of participants becoming task– or testwise. Also, if the particular task or test requires the use of particular skills, then with practice gained from repeated testing over a long period of time, participants become very skilled.
A vivid example of this is the Berkeley Growth Study, a longitudinal study of intelligence begun in the 1930s. Over less than 20 years, participants were tested more than 40 times on the same test or on different versions of it. It seems obvious that performance on these IQ tests was inflated by practice.

4. History or cohort/generation effects
Cohort: a group that has some characteristic(s) in common; usually thought of in terms of different age groups.
Cohort effect: the variable by which the cohort is grouped confounds the IV
- e.g., look at the effects of age on the use of Facebook; age is confounded with date of birth, so the group that grew up in the 2000s grew up using computers and Facebook but their grandparents did not.

Topic #11
META–ANALYSIS

Meta–analysis can be described as a set of statistical methods for quantitatively aggregating the results of several primary studies to arrive at an overall summary statement or conclusion about the relationship between specified variables.

Calculating the Effect Size Statistics (d or r)
- In meta–analysis, cumulating the effects across studies requires that outcomes from all studies be converted to a common metric. Two of the most common effect size statistics or metrics are d and r. For convenience, we shall focus on d.
- The d statistic provides a measure of the strength of a treatment or independent variable (e.g., different training methods). The effect size statistic, d, is the standardized difference between two means. Thus, in experimental designs, it represents the observed difference between the experimental and the control group in standard deviation units. A positive d value indicates that the experimental group performed better than the control group on the dependent variable.
Conversely, a negative d value indicates that the control group performed better than the experimental group, and a zero d value indicates no difference between the groups. Small, medium, and large effect sizes are typically operationalized as 0.20, 0.50, and 0.80, respectively. Thus, a medium effect size represents a difference of half a standard deviation between the means.
  - In terms of r, values of .10, .30, and .50 are considered small, medium, and large effects, respectively.
- As shown in Formula 1, the d statistic is calculated as the difference between the means of the experimental ($M_E$) and control ($M_C$) groups divided by a measure of the variation.

$$d = \frac{M_E - M_C}{S_W} \tag{1}$$

- The measure of variation used in the above formula is $S_W$, which is the pooled, within–group standard deviation.
- For studies that report actual means and standard deviations for the experimental and control groups, effect sizes can be calculated directly using these statistics. For studies that report other statistics (e.g., correlations, t statistics, or univariate two–group F statistics), the appropriate conversion formulas can be used to convert them to ds.

Cumulating Effect Sizes Across Studies
- The mean sample–weighted effect size ($\bar{d}$) can be calculated using Formula 2 below:

$$\bar{d} = \frac{\sum d_i N_i}{N_T} \tag{2}$$

where $\bar{d}$ is the mean effect size; $d_i$ is the effect size for each study; $N_i$ is the sample size for each study; and $N_T$ is the total sample size across all studies. Sample weighting assigns studies with larger sample sizes more weight and reduces the effect of sampling error, since sampling error generally decreases as the sample size increases.
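Formulas 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the author's procedure. The notes do not spell out how the pooled within–group standard deviation is computed, so the usual (n - 1)-weighted form is assumed here, and the function names `pooled_sd`, `effect_size_d`, and `weighted_mean_d` are hypothetical. The input values are the fifteen studies from the worked example in these notes.

```python
import math

# Each tuple: (Ne, Me, SDe, Nc, Mc, SDc), taken from the notes' worked example.
STUDIES = [
    (29, 4.38, 0.62, 31, 4.36, 0.50),
    (30, 4.70, 0.74, 30, 4.30, 0.92),
    (44, 4.41, 0.69, 44, 4.34, 0.78),
    (36, 4.44, 0.73, 40, 4.30, 0.69),
    (17, 4.53, 0.51, 18, 4.28, 0.96),
    (30, 4.57, 0.82, 30, 4.10, 0.96),
    (32, 4.75, 0.44, 28, 4.54, 0.58),
    (30, 4.77, 0.50, 30, 4.57, 0.63),
    (44, 4.70, 0.46, 44, 4.52, 0.70),
    (42, 4.69, 0.56, 40, 4.63, 0.54),
    (18, 4.50, 0.62, 18, 4.28, 0.96),
    (30, 4.77, 0.50, 30, 4.60, 0.56),
    (45, 4.85, 0.40, 45, 4.68, 0.50),
    (60, 4.79, 0.42, 50, 4.37, 0.45),
    (25, 4.90, 0.70, 25, 4.27, 0.69),
]


def pooled_sd(n_e, sd_e, n_c, sd_c):
    """Pooled within-group SD (assumed (n - 1)-weighted form)."""
    numerator = (n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(numerator / (n_e + n_c - 2))


def effect_size_d(n_e, m_e, sd_e, n_c, m_c, sd_c):
    """Formula 1: standardized difference between the group means."""
    return (m_e - m_c) / pooled_sd(n_e, sd_e, n_c, sd_c)


def weighted_mean_d(studies):
    """Formula 2: sample-weighted mean d. Returns (mean_d, total_n)."""
    total_n = 0
    weighted_sum = 0.0
    for n_e, m_e, sd_e, n_c, m_c, sd_c in studies:
        n_i = n_e + n_c  # total sample size of this study
        weighted_sum += effect_size_d(n_e, m_e, sd_e, n_c, m_c, sd_c) * n_i
        total_n += n_i
    return weighted_sum / total_n, total_n


mean_d, total_n = weighted_mean_d(STUDIES)
print(f"N_T = {total_n}, mean d = {mean_d:.2f}")
```

With these inputs the sketch gives a total sample of 1015 and a mean d of about 0.39, which agrees with the worked example. Each d is weighted by its study's total sample size, so the larger studies pull the mean harder than the small ones.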
- EXAMPLE

Study   Ne   Me     SDe    Nc   Mc     SDc    Ntot   Sw     d      dN
1       29   4.38   0.62   31   4.36   0.50     60   0.56   0.04     2.14
2       30   4.70   0.74   30   4.30   0.92     60   0.83   0.48    28.75
3       44   4.41   0.69   44   4.34   0.78     88   0.74   0.10     8.37
4       36   4.44   0.73   40   4.30   0.69     76   0.71   0.20    15.00
5       17   4.53   0.51   18   4.28   0.96     35   0.78   0.32    11.29
6       30   4.57   0.82   30   4.10   0.96     60   0.89   0.53    31.59
7       32   4.75   0.44   28   4.54   0.58     60   0.51   0.41    24.71
8       30   4.77   0.50   30   4.57   0.63     60   0.57   0.35    21.10
9       44   4.70   0.46   44   4.52   0.70     88   0.59   0.30    26.74
10      42   4.69   0.56   40   4.63   0.54     82   0.55   0.11     8.94
11      18   4.50   0.62   18   4.28   0.96     36   0.81   0.27     9.80
12      30   4.77   0.50   30   4.60   0.56     60   0.53   0.32    19.21
13      45   4.85   0.40   45   4.68   0.50     90   0.45   0.38    33.79
14      60   4.79   0.42   50   4.37   0.45    110   0.43   0.97   106.48
15      25   4.90   0.70   25   4.27   0.69     50   0.70   0.91    45.32
Total                                         1015                 393.23

Ne & Nc = sample sizes of experimental and control groups, respectively
Me & Mc = means of experimental and control groups, respectively
SDe & SDc = standard deviations of experimental and control groups, respectively
Ntot = total sample for specified study
Sw = pooled, within–group standard deviation
dN = d multiplied by Ntot

$$\bar{d} = \frac{\sum d_i N_i}{N_T} = \frac{393.23}{1015} = 0.39$$

Advantages of meta–analysis
1. Allows researchers to give more weight to studies with larger samples and, therefore, more stable effect size estimates.
2. Focuses on the magnitude of effects instead of the statistical significance of effects; significance tests have recently been critiqued as a major stumbling block to scientific progress in psychology and the social sciences in general.
3. Uses a common metric to aggregate effect sizes across studies. Therefore, meta–analysis summarizes IV/DV effects across multiple studies to reach a population–level overall conclusion about specified effects.

Topic #12
ETHICS IN RESEARCH

SOME TERMS, CONCEPTS, AND DEFINITIONS
1.
Ethical: A piece of behavioral research is ethical when the benefits and relevance of the research balance the costs in time and the risks of harm to participants, when their interests and well–being are respected, and when they are properly informed about the nature of the research and the voluntary nature of their participation.
2. Research Ethics: guidelines for research decision–making.
3. Informed Consent and Voluntary Participation: an agreement with participants that clarifies the nature of the research and the responsibilities of each party.
- should be documented in writing
- ensures that participation is VOLUNTARY
4. Deception
- The first and primary concern should be the welfare of participants.
- Is the use of deception absolutely necessary to accomplish research objectives?
- Weigh potential costs against potential gains.
- When deception is used, one should not only debrief, but should desensitize as well.
5. Debriefing or Dehoaxing: debriefing participants about any deception that was used in the study. This also increases their understanding.
6. Desensitizing: eliminating any undesirable influences that the experiment may have had on participants, i.e., debriefing participants about their behavior.
- One should not debrief or desensitize when doing so would result in more harm than NOT doing so.
7. Ethical Dilemma: investigator's conflict in weighing potential cost to the participant against potential gain to be accrued from the research project.