PSYC 71 Study Guide
Summary
This guide provides an overview of research methods in psychology: hypotheses, measurement, validity and reliability, sampling, experimental design and confounds, descriptive statistics, and the core inferential tests (z, t, ANOVA, chi-square, correlation, regression), closing with open-science practices.
GOOD HYPOTHESIS
● Logical: follows from premises that were derived from a theory
● Empirically testable: all variables can be observed (measured)
● Refutable: able to be shown to be false
● Positive: proposes the existence of something
● Specific: generates testable predictions for specific situations

MEASUREMENTS
● Measurement: a systematic procedure for assigning scores or values to individuals so that the scores or values represent some characteristic of the individuals
● Much of what we want to know about in psychology is not directly observable, but can be measured indirectly in other ways
  ○ Constructs are the ideas we care about which cannot be observed directly
  ○ Variables are the things we can measure which tell us about constructs
● The way you choose to measure a construct in a particular study is called an operational definition
  ○ The same construct can be (and often is) operationally defined differently in different studies!

TYPES OF MEASUREMENT
● Self-report: participant reports on their own thoughts, feelings, or behaviors
● Behavioral: researcher observes and records some aspect of a participant's behavior
● Physiological: researcher measures some part or aspect of a participant's body

WAYS TO ASSESS VALIDITY
● Construct validity: did I measure what I meant to measure?
● Internal validity: can I make a causal claim?
● External validity: do my results generalize?
● Statistical validity: how well does the data support my claim?

CONSTRUCT VALIDITY
● Subjective assessments of construct validity
  ○ Face validity: appears valid
  ○ Content validity: represents all parts of a construct
● Empirical assessments of construct validity
  ○ Convergent validity: relates to other measures of the same construct
  ○ Discriminant validity: does not relate to measures of a different construct
  ○ Predictive validity: predicts a relevant outcome

CONSTRUCT VALIDITY FOR INDEPENDENT VARIABLES
● How well was your independent variable (IV) manipulated?
  ○ Manipulation check: extra dependent variable (DV) added to check if the experimental manipulation worked as intended
  ○ Pilot study: simple study with a separate group of participants to confirm the effectiveness of an experimental manipulation

ADDITIONAL CONSIDERATIONS
● Reliability: how consistent is your measure?
  ○ Test-retest reliability
  ○ Interrater reliability
  ○ Internal consistency
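These notes contain no code, but the reliability ideas above lend themselves to a quick illustration. A minimal Python sketch of test-retest reliability, using invented scores; the data and the choice of a Pearson correlation as the index are illustrative assumptions, not taken from the guide:

```python
import numpy as np

# Hypothetical scores for 8 participants on the same questionnaire,
# administered at time 1 and again at time 2.
time1 = np.array([12, 18, 9, 22, 15, 11, 20, 16])
time2 = np.array([13, 17, 10, 21, 14, 12, 19, 18])

# Test-retest reliability is commonly indexed by the correlation
# between the two administrations: near 1 means a consistent measure.
r = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest reliability r = {r:.2f}")
```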
● Measurement artifacts: were your measures affected by the circumstances or design of the study?
  ○ Experimenter bias
  ○ Demand characteristics
  ○ Socially desirable responding
  ○ Range effects (ceiling, floor)

DIRECTIONALITY PROBLEM (TEMPORAL PRECEDENCE)
● When two variables are related, it is unclear which variable affects (causes a change in) the other

THIRD VARIABLE PROBLEM (INTERNAL VALIDITY)
● When some third variable can explain the observed relationship between two variables

POPULATIONS VS SAMPLES
● Population: group of interest
● Sample: individuals observed
  ○ Representative sample: a sample that closely mirrors or resembles the population
  ○ Biased sample: a sample that differs in important characteristics from the population, often due to selection bias

TWO KINDS OF SAMPLING
● Probability sampling
  ○ Every member of a target population is identified
  ○ Every member has a certain non-zero probability of being selected
  ○ Selection is random, based on probabilities of being selected
● Nonprobability sampling
  ○ Sampling in which one or more of the three above criteria are not met

TYPES OF PROBABILITY SAMPLING (a code sketch follows these lists)
● Simple random: all members have an equal chance of being selected
● Cluster: identify pre-existing clusters, randomly select some of the clusters, sample everyone in the selected clusters
● Systematic: put members of a population in an order, pick a random starting point, and then choose every nth member
● Stratified random: identify pre-existing groups, sample all of the groups equally
● Proportionate stratified random: identify pre-existing groups, sample all of the groups proportionately

TYPES OF NONPROBABILITY SAMPLING
● Convenience: recruit participants who are easily accessible
● Quota: identify pre-existing subgroups, recruit a specific number of participants from each group (using a non-random selection method)
● Snowball: ask participants to help recruit more participants by asking people they know to also participate
● Purposive: select participants on the basis of some characteristic they share (using a non-random selection method)
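A minimal Python sketch of three of the probability sampling schemes listed above; the 100-person sampling frame and the A/B group labels are invented for illustration:

```python
import random

# Hypothetical sampling frame: 100 people, each belonging to group A or B.
population = [f"person_{i}" for i in range(100)]
groups = {"A": population[:60], "B": population[60:]}

# Simple random: every member has an equal chance of selection.
simple = random.sample(population, k=10)

# Systematic: random starting point, then every nth member (n = 10 here).
start = random.randrange(10)
systematic = population[start::10]

# Proportionate stratified random: sample each group in proportion to its
# share of the population (60% A, 40% B, so 6 + 4 for a sample of 10).
stratified = random.sample(groups["A"], k=6) + random.sample(groups["B"], k=4)

print(simple, systematic, stratified, sep="\n")
```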
SAMPLING METHOD IMPACTS EXTERNAL VALIDITY
● Sampling method: how is your sample selected from your population?
  ○ Question of external validity
● Sample size: how many people do you sample?
  ○ Question of statistical validity
● Random sampling: sampling method in which you identify all of the members of your population and select a random subset
  ○ Question of external validity
● Random assignment: researchers place participants into groups at random
  ○ Question of internal validity

CENTRAL TENDENCY
● A key component of describing a distribution is to indicate where it is "centered"
● Three common measures of central tendency are...
  ○ Mean = the "balancing point" of the distribution = arithmetic average
  ○ Median = the central observation = 50th percentile
  ○ Mode = the most frequent value

MEASURES OF SPREAD
● Variability: the degree to which scores in a distribution are spread out or clustered together
  ○ The goal here is to capture how much scores deviate from the mean
● Deviation: difference between an individual score and the mean
● Sum of squares (SS): the sum of all squared deviations
● Variance (s²): mean of the squared deviations
● Standard deviation (s): square root of the variance
● Mean > median → positively skewed
● Mean < median → negatively skewed
● Ordinal variables → cannot find mean or standard deviation
● Nominal variables → cannot find median, mean, or standard deviation

ESTIMATION
● An estimator is a process that generates an estimate of a population parameter
  ○ Population parameters are represented with Greek letters (μ, σ, etc.)
● An estimate is the actual numerical guess of what the population parameter is
  ○ Sample statistics are represented with Roman letters (x̄, s, etc.)

SAMPLING ERROR
● Estimates of a parameter vary from sample to sample
  ○ Sample size (n): the number of participants in a study
  ○ Sampling error: difference between the sample statistic and the true population parameter
  ○ Law of large numbers: the larger the sample, the more representative it is of the population, and so the less sampling error we expect

DISTRIBUTION TRIAD
● Population distribution: distribution of all possible scores from all possible individuals in the population
● Sample distribution: distribution of observed scores measured from one sample of observed individuals
● Sampling distribution: distribution of all possible values of a sample statistic measured from all possible samples of size n

CENTRAL LIMIT THEOREM (CLT)
● The mean of all sample means (μx̄) is equal to the population mean (μ)
● The standard deviation of all sample means (σx̄) is smaller than the population standard deviation (σ), and gets smaller as the sample size (n) increases
● The distribution of sample means (x̄) is roughly normal when the sample size (n) is large, even if the population is highly non-normal (see the simulation sketch below)

CHARACTERISTICS OF SAMPLING DISTRIBUTIONS OF MEANS
● The sampling distribution of sample means always has a mean equal to μ (the population mean), no matter the shape of the population from which the samples are drawn
● As sample size (n) increases, the standard deviation of the sampling distribution decreases by a factor of the square root of n: σx̄ = σ/√n
● The sampling distribution will be normally distributed IF the population is normally distributed or the sample size is sufficiently large (> ~30)

NORMAL DISTRIBUTIONS
● All normal distributions share the same density function, f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²)), but they differ according to 2 parameters: μ and σ
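The CLT claims above can be checked with a short simulation. A minimal sketch, assuming an exponential population (a deliberately non-normal choice; all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, highly non-normal population: exponential with
# mean mu = 1 and standard deviation sigma = 1.
mu, sigma, n, reps = 1.0, 1.0, 50, 10_000

# Draw many samples of size n and record each sample mean.
sample_means = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

# CLT predictions: mean of sample means ~ mu,
# sd of sample means ~ sigma / sqrt(n).
print(sample_means.mean())      # close to 1.0
print(sample_means.std())       # close to sigma / sqrt(n)
print(sigma / np.sqrt(n))       # ~0.141
```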
Z SCORES
● Z-score: a representation of a score's deviation from a mean in terms of standard deviations
  ○ Sign (+ or −) indicates whether the score is above or below the mean
  ○ Value indicates the number of standard deviations the score is from the mean
● A linear transformation preserves the relative position of the scores, while changing the center and the scale of the distribution
● z = (x − μ) / σ

USING Z SCORES
● Describing scores in distributions... with a single number
  ○ We can tell whether a score is high, low, or average from just a z-score!
● Equating and rescaling entire distributions
  ○ Mean: always 0
  ○ Standard deviation: always 1
  ○ Shape: same as original distribution
● Making scores from non-equivalent distributions comparable
  ○ A z-score of 1.2 has the same meaning, no matter what the original distribution
  ○ Z-scores are "distribution-free units"

HYPOTHESIS TESTING
● Hypothesis test: a statistical method that uses sample data to evaluate a hypothesis about a population
  ○ Based on the probability of observing some sample statistic if a certain hypothesis is assumed to be true
● Null hypothesis (H0)
  ○ In the population there is no change, no difference, or no relationship
    ■ For an experiment, this means no effect of treatment
  ○ Any difference observed is due to sampling error only
● Alternative hypothesis (H1)
  ○ In the population there is a change, a difference, or a relationship
    ■ For an experiment, this means an effect of treatment
  ○ Any difference observed is due to sampling error AND a real effect

UNLIKELY ENOUGH TO REJECT H0?
● Alpha: the probability value that we use to determine which sample outcomes are considered very unlikely if the null hypothesis is true
  ○ This probability level is chosen by the researcher! (does not have to be 0.05)
● Critical region: the region of the sampling distribution that contains the sample outcomes that are considered very unlikely if the null hypothesis is true
  ○ This area depends on: what alpha is and whether the test is one- or two-tailed
● Critical value: the value(s) that define the boundaries of the critical region(s)
  ○ These values depend on: what alpha is and whether the test is one- or two-tailed

TEST STATISTICS
● z (z-test): compare sample mean to population mean (σ known)
● t (t-test): compare 2 means, or compare sample mean to population mean (σ unknown)
● F (ANOVA): compare 2+ means
● r (correlation): evaluate relationship between 2 quantitative variables
● χ² (chi-square): evaluate relationship between 2 categorical variables

P-VALUE (PROPORTION MORE EXTREME VALUE)
● A p-value is a way of describing how extreme a score is in a distribution: the proportion of a distribution more extreme than a given score
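A minimal sketch of the z-test and p-value logic above; the example numbers (μ = 100, σ = 15, n = 25, sample mean = 106) are invented for illustration:

```python
import math
from scipy import stats

# Hypothetical example: population mu = 100, sigma = 15 (known),
# and one sample of n = 25 with mean 106.
mu, sigma, n, sample_mean = 100, 15, 25, 106

# z = (sample mean - population mean) / standard error of the mean
se = sigma / math.sqrt(n)
z = (sample_mean - mu) / se            # 2.0

# Two-tailed p-value: proportion of the normal distribution more
# extreme than |z| in either direction.
p = 2 * stats.norm.sf(abs(z))          # ~0.0455
print(z, p)

# With alpha = .05 two-tailed, the critical values are +/- 1.96,
# so z = 2.0 falls in the critical region and H0 is rejected.
print(stats.norm.ppf(0.975))           # ~1.96
```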
CHARACTERISTICS OF A TRUE EXPERIMENT
● Manipulation:
  ○ The experimenter actively changes some quantity or quality of an independent variable in order to observe the subsequent effect it has on a dependent variable
  ○ Establishes temporal precedence
● Control:
  ○ Alternative explanations are eliminated by controlling confounds (any other variables that are systematically different across conditions)
  ○ Improves internal validity

KEY TERMS IN EXPERIMENTS
● Independent variable (IV):
  ○ Any variable that the researcher intentionally manipulates (i.e., actively changes) across conditions. The "cause"
● Dependent variable (DV):
  ○ Any variable that the researcher measures as an outcome of the study. The "effect"
● Extraneous variable:
  ○ Any variable in the context of the study that has some relationship to the DV but is not an IV or DV
● Confound:
  ○ An extraneous variable that is correlated with levels of the IV (i.e., different in different conditions) and can provide an alternative explanation of the results
  ○ A confounded experiment lacks internal validity

COMMON SOURCES OF CONFOUNDS
● Environment: setting or context differs across treatment conditions
  ○ Example: crowding, concentration, & temperature
● Individual differences: assignment to conditions results in groups with different personal characteristics
  ○ Example: pool temperature, endurance, & swimming experience
● Time-related: treatment conditions occur at different times and experience over time causes a change in the dependent variable
  ○ Example: cell phones, safe driving, & practice

METHODS OF CONTROLLING CONFOUNDS
● Randomization: use a random assignment process to avoid a systematic relationship between the potential confound and the conditions of the study, so that differences are due only to chance
  ○ Often used to control for individual differences, sometimes for time-related confounds
● Hold constant: do not allow the potential confound to vary at all across participants or conditions of the study
  ○ Often used to control environmental confounds, sometimes individual differences
● Matching/counterbalancing: ensure that the average value of the confounding variable is the same across conditions of the study by matching participants or counterbalancing materials
  ○ Often used to control time-related confounds, sometimes for individual differences

DESIGN CHOICES
● Between-subjects design: each participant does only one condition of a study
  ○ Eliminates all passage-of-time confounds because each person is measured only once
  ○ Introduces individual differences as a potential confound that must be controlled, because differences between people in the groups can provide alternative explanations of the results
● Within-subjects design: each participant does all conditions of a study
  ○ Eliminates all individual-differences confounds because each person is compared to themselves
  ○ Can introduce time-related confounds because each person is measured more than once, so changes between conditions can result from alternative explanations

GENERAL FORM OF T-STATISTIC
● t = (sample statistic − population parameter) / estimated standard error of the statistic
● t-tests look at the ratio of the difference between 2 means to overall error

TYPES OF T-TESTS
● One sample: used to compare the mean of one sample to the mean of a population
  ○ Tells us how extreme the observed sample mean is relative to all the possible sample means we could have observed if the null hypothesis were true
  ○ Similar to a z-test
    ■ Only difference is whether the population standard deviation is known (z-test) or must be estimated from the sample data (t-test)
● Dependent measures: used to compare the means of two conditions in a within-subjects or matched-pairs design
  ○ Tells us if the observed mean difference is significantly different from zero (the population mean difference under H0)
● Independent measures: used to compare the means of two groups in a between-subjects design
  ○ Tells us if the observed difference of means is significantly different from zero
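A minimal sketch of an independent-measures t-test using scipy.stats; the two groups' scores are invented for illustration:

```python
from scipy import stats

# Hypothetical between-subjects data: test scores for two groups.
treatment = [78, 85, 90, 72, 88, 81, 79, 86]
control   = [70, 75, 80, 68, 74, 77, 71, 73]

# Independent-measures t-test: is the difference between the two
# group means significantly different from zero?
t, p = stats.ttest_ind(treatment, control)
print(f"t = {t:.2f}, p = {p:.4f}")   # reject H0 if p < alpha

# The dependent-measures version (within-subjects / matched pairs)
# would use stats.ttest_rel(cond1, cond2); the one-sample version
# uses stats.ttest_1samp(sample, popmean).
```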
WHY DO WE USE DEGREES OF FREEDOM (N − 1)?
● When n deviations from the sample mean are used to estimate variability in the population, only n − 1 are free to vary
  ○ Because of the restriction that the sum of all deviations must equal zero
● Only n − 1 sample deviations supply information for estimating variability
● Not making this adjustment would cause an underestimate of variability in the population

WHAT IS ANOVA?
● ANOVA stands for Analysis of Variance
  ○ Analysis here means "breaking apart": attributing the variance in the data to its different sources
● ANOVA looks at the ratio of variance between 3+ groups to variance within groups (due to random error)
● The test statistic for ANOVA is F
  ○ H0: all means are equal
  ○ H1: not all means are equal; at least one group is different from the others
  ○ Reported as F(df treatment, df error) = F ratio, p = p-value
● ANOVA does not tell us which groups are different and how
  ○ Run post-hoc tests when the ANOVA is significant

DIFFERENT TYPES OF ONE-WAY ANOVA
● One-way (between-subjects) ANOVA
  ○ One factor with three or more levels, where participants are in only one of the conditions
  ○ Example: (1) placebo, (2) old medication, (3) new medication for treating depression
● One-way repeated-measures ANOVA (within-subjects)
  ○ One factor with three or more levels, where participants are in all of the conditions
  ○ Example: people taste and rate all 4 different kinds of wine

FACTORIAL NOTATION
● Written out like: # x # x # ...
  ○ Number of factors (IVs) = number of terms in the expression
  ○ Number of levels in each factor = the specific value of each term
  ○ Number of conditions = the product of the terms
● Example: 2 x 2
  ○ 2 IVs, each with 2 levels = 4 conditions
● Mixed factorial design
  ○ One or more of the factors is/are manipulated between-subjects, and one or more is/are manipulated within-subjects
  ○ Each participant experiences more than one but not all conditions

STATISTICAL EFFECTS IN FACTORIAL DESIGNS
● Main effects: the effect of one factor on average across all levels of the other factor(s); difference between marginal means
● Simple effects: the effect of one factor within a level of another factor
● Interactions: the effect of one factor depends on the levels of the other factor(s); difference between simple effects
  ○ Parallel lines (in a plot of the condition means) indicate NO interaction

CHI-SQUARE TEST OF INDEPENDENCE
● When to use a chi-square test
  ○ ANOVA and t-tests: categorical predictors (x), continuous outcome (y)
  ○ Chi-square test: categorical predictor (x), categorical outcome (y)
    ■ Cannot compute a mean (or standard deviation) of a categorical variable
● Compares observed frequencies to expected frequencies
  ○ If they are "close enough", the test statistic is small and the null is retained
  ○ If they are "different enough", the test statistic is large and the null is rejected
● Calculating the p-value
  ○ Probability is drawn from a chi-square distribution
  ○ Shape of the distribution is affected by degrees of freedom, df = (r − 1)(c − 1)

CORRELATION
● Correlation: continuous predictor (x), continuous outcome (y)
  ○ Not interesting to compare means of two different variables
  ○ Tells us how well the data fit the line
● Assesses how consistently a change in x predicts a change in y
  ○ Look at whether two continuous variables covary
    ■ When one variable deviates from its mean, the other variable should deviate in the same or directly opposite way
● Covariance vs variance
  ○ Covariance = mean cross-product (sum of products over df)
  ○ Variance = mean squared deviation (sum of squares over df)
● Pearson's correlation coefficient (r) is a standardization of covariance
  ○ Degrees of freedom (df): n − 2

LINEAR REGRESSION
● Linear regression: use continuous x to predict continuous y
  ○ Tells us what that correlation line actually is (and more)
  ○ y = mx + b
● Least squares: an approach to fitting a model (line) to data where the sum of the squared distances to the data is minimized
  ○ Criterion: Ŷi = b0 + b1·Xi, choosing b0 and b1 so that Σ(Yi − Ŷi)² is a minimum
● Residual error: an individual's residual is the difference between that individual's observed value and the value predicted by the model
● Intercept = the expected value of y when x is zero; b0
● Slope = the expected change in y for each 1-unit change in x; b1
● Regression coefficient vs correlation coefficient
  ○ The slope is unbounded because there is no limit on how much larger the sum of products can be relative to the sum of squares of x
  ○ The correlation coefficient is bounded between −1 and 1 because the covariance can never be larger than the square root of the product of the variances of x and y
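A minimal numpy sketch of the correlation and least-squares formulas above; the x/y data are invented, and np.polyfit is used only as a cross-check:

```python
import numpy as np

# Hypothetical paired data: hours studied (x) and exam score (y).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([55, 60, 58, 67, 70, 74, 73, 80], dtype=float)
n = len(x)

# Covariance = sum of cross-products over df; variance = sum of squares over df.
dx, dy = x - x.mean(), y - y.mean()
cov = (dx * dy).sum() / (n - 1)

# Pearson's r standardizes the covariance by the two standard deviations.
r = cov / (x.std(ddof=1) * y.std(ddof=1))

# Least-squares slope and intercept:
# b1 = sum of products / sum of squares of x; b0 = ybar - b1 * xbar.
b1 = (dx * dy).sum() / (dx ** 2).sum()
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)   # observed minus model-predicted values
print(r, b1, b0)
print(np.polyfit(x, y, 1))      # cross-check: returns [b1, b0]
```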
CONFIRMATORY VS EXPLORATORY RESEARCH
● Confirmatory research
  ○ A priori hypotheses, data independent, hypothesis testing, p-values interpretable
● Exploratory research
  ○ Post hoc hypotheses, data contingent, hypothesis generating, p-values not interpretable
● Presenting exploratory research as confirmatory increases the publishability of results at the cost of the credibility of results

QUESTIONABLE RESEARCH PRACTICES (QRPs)
● Underreporting
  ○ Problem: including multiple DVs but only reporting the DVs that support the hypothesis
  ○ Why is this a problem?
    ■ Familywise error rate (FWER): probability of making one or more false alarms when performing multiple pairwise comparisons; multiple comparisons lead to "alpha escalation" (see the sketch at the end of this guide)
    ■ Misrepresents exploratory research as confirmatory
● P-hacking
  ○ Problem: researchers make decisions during data analysis that lead to statistically significant results
    ■ Exploiting "researcher degrees of freedom"
  ○ Why is this a problem?
    ■ The goal of data analysis should be determining how well the data support the hypothesis, NOT obtaining a significant p-value
● HARKing
  ○ Problem: hypothesizing after the results are known

TRANSPARENCY SOLUTION
● Methods
  ○ Researchers are expected to disclose every study detail:
    ■ How they determined their sample size
    ■ All data exclusions (if any)
    ■ All manipulations
    ■ All measures
  ○ Readers are better able to evaluate the strength of the evidence
● Results
  ○ Researchers publicly share data files, how they prepared the data, and how they computed composite scores
  ○ Helps address underreporting and p-hacking
● Hypotheses
  ○ Preregistration:
    ■ Document decisions in advance and post to public repositories with a time stamp
    ■ Antidote to HARKing
  ○ Registered report:
    ■ A step beyond preregistration
    ■ Write a plan (introduction, method, data analysis) and submit it to a journal before commencing data collection
    ■ "Conditional acceptance" if the study is deemed good
    ■ Addresses publication bias (a.k.a. the "file drawer problem")
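To make the "alpha escalation" point from the QRPs section concrete, a short sketch of the familywise error rate across m tests, assuming α = .05 (the formula 1 − (1 − α)^m applies when the tests are independent):

```python
# Familywise error rate: probability of at least one false alarm
# across m independent tests, each run at alpha = .05.
alpha = 0.05
for m in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(f"{m:2d} comparisons -> FWER = {fwer:.2f}")
# 1 -> 0.05, 5 -> 0.23, 10 -> 0.40, 20 -> 0.64: "alpha escalation"
```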