Final Exam Review - Psych 2910
This document is a review of lecture materials for a psychology course (Psych 2910). It covers topics such as knowledge, scientific methods, and research, with examples of different approaches and types of research.
Lecture 2: Knowledge
Metaphysical Systems:
Supernatural Explanations:
- Attribute behaviour to non-physical forces such as spirits and deities
Animism: natural phenomena are alive and influence behaviour
- Possessing an eagle’s feather confers some eagle properties to the owner
Mythology and Religion: Also has the idea that non-physical forces determine what people do
- Built on different assumptions than science
Astrology: Human behaviour is determined by the activity of celestial bodies; astronomical phenomena and events predict human behaviour
Philosophical Systems:
- Became more focused on logic and empirical observation
Empiricism (David Hume): the only sound basis for knowing is to make observations
Positivism (Auguste Comte): the only sound basis for knowing is to make observations with only sense perceptions
Methods of Acquiring Knowledge:
Intuition: A thing that one knows or considers likely from instinctive feeling rather than conscious reasoning
- Absence of conscious reasoning
- Major problem: conclusion drawn without sufficient evidence
Authority: Believing what “authority” figures tell you
- Major problem: is the “authority” really an “authority”?
Scientific Skepticism: the act of suspending judgment (the opposite of jumping to conclusions) when evaluating an explanation or claim.
- It allows scientists to consider all possibilities and systematically question all information during an investigation.
Science: Unlike most other methods of acquiring knowledge, science demands that you change your view when appropriate new evidence is forthcoming.
Determinism: The universe is orderly
- What held yesterday will hold today
- Events have meaningful, systematic causes (cause and effect)
Causation:
Covariation of cause and effect: When a cause is present, the effect is present; when a cause is absent, the effect is absent
Temporal precedence: Cause precedes effect
Elimination of Alternative Explanations: Nothing other than the stated causal variable could be responsible
Goals of scientific psychology:
- Describe behaviour
- Predict behaviour
- Determine the cause of behaviour
- Explain behaviour
Types of Research:
Basic: Answer fundamental questions
Applied: Address practical problems
Four Keys of Research:
1. Replication: Describe research in sufficient detail that someone else can replicate the study and the results
2. Testability/Falsifiability: Not interested in ideas that cannot be tested or falsified (e.g., most of Freudian theory)
3. Peer Review: Try to ensure studies with flaws are not published
4. Adversarial Process: A process by which two adversaries, or people with opposing views, sit together, try to understand each other's theories, and try to come up with experiments, experimental designs, and predictions that compare and test different hypotheses.
Pseudoscience:
- Hypotheses are not falsifiable
- Methodology is not scientific
- Evidence is anecdotal or appeals to authority
- Lack of citations to peer-reviewed articles
- Not published in a peer-reviewed journal
- Claims not revised to account for new data
- Ignores conflicting evidence
Biorhythms: The claim that human behaviour is governed by physical, emotional, and intellectual cycles lasting 23, 28, and 33 days, respectively
Homeopathy: The claim that a substance that causes the symptoms of a disease in healthy people cures similar symptoms in sick people
Phrenology: The claim that personality traits can be predicted from bumps on the skull
Critical Evaluation:
- Evaluate the source of the data (reputable peer-reviewed journal? potential conflicts of interest? source of funding?)
- Evaluate the methods (was the study conducted appropriately?)
- Evaluate the analyses (are the statistical analyses correct?)
- Evaluate the conclusions (do the conclusions follow from the analyses?)
Reputable peer-reviewed journal: Often associated with scholarly/professional societies, well-known universities, well-known major publishers, articles that are cited in top journals

Lecture 3: Intro to Statistics
Why Statistics?
- We don’t trust ourselves: Biases affect our perception of whether data show a strong result; need to understand the metric of the data to be able to interpret it
- When we study complex systems, we cannot control for every extraneous variable (statistics can provide a different kind of control)
- Some phenomena cannot be studied directly (statistics can find patterns in the data that are not obvious to the naked eye)
Statistics: A way of understanding data; a decision-making process (data = plural, datum = singular)
Descriptive Statistics: numbers that describe data (mean, median, standard deviation)
Inferential Statistics: making inferences (educated guesses) about a population based on data from a sample
Theory: a general statement about the relation between two or more variables
- Organize and explain: Provides a framework for organizing otherwise disparate findings and offers an explanation
- Generate new knowledge: Yields new predictions (e.g., if you cannot change an action, you will change the belief)
Hypothesis: a testable prediction about specific events
Key questions:
- Who was studied? (Random sampling? Biased sampling?)
- Why did subjects participate? (Money? Employees?)
- Compared to...? (Was there an appropriate control group?)
- How many? (Results based on 2 people? 2,000 people? or rats?)
- How were the questions worded?
- Are statements about causation appropriate?
- Who paid for the study?
- Is the study published in a reputable peer-reviewed journal?
Population: The set of all individuals of interest
Sample: The individuals from the population who were tested
Independent Variable: The variable that the experimenter manipulates
Dependent Variable: The variable that the experimenter measures
Construct: An internal attribute or characteristic that cannot be directly observed
Operational Definition: specifies concrete, replicable procedures designed to represent a construct (e.g., hunger: hours since last meal)
- Allows other researchers to use the same manipulation/study the same variable
Discrete variable: Indivisible categories (e.g., number of children in a family)
Continuous variable: height, weight, age, time (measurement is discrete, but variable is continuous)
Dichotomous variable: only 2 values
Measurement Scales:
- Need different analyses for different types of scales
Nominal Scale: Conveys information only about the category
- Can calculate frequency
- Cannot have an intermediate value (e.g., 1.5)
- The value of ‘1’ is not necessarily better or worse than ‘2’
Ordinal Scale: Nominal scale + information about order
- Says nothing about the size of the intervals, only the order of the intervals
Interval Scale: Ordinal scale + information about interval
- Zero point is arbitrary rather than an absence (0° C does not mean the absence of temperature)
- Probably interval scales: IQ, personality measures, clinical tests
- Likely violates the assumption of equal intervals (but for good tests, intervals are approximately equal)
Ratio Scale: Interval scale + absolute zero
- Zero means an absence

Lecture 4: Central Tendency
Mean: Arithmetic mean or “average”
- The sum of the scores divided by the number of scores
- Computationally simple
- Most frequently reported
- Can be misleading (affected by extreme scores)
Characteristics:
- Changing a score changes the mean
- Adding or subtracting a score changes the mean (except when the score is the mean)
- Adding (or subtracting) a constant value to each score adds (or subtracts)
that value to the mean
- Multiplying (or dividing) each score by a constant value multiplies (or divides) the mean by that value
- The sample mean is an unbiased estimator of the population mean (no tendency to under- or overestimate population value)
Mode: The most frequent value
- The score or category that has the greatest frequency
- If two scores are tied for most frequent, then the distribution is bimodal
- If more than two scores are tied for most frequent, there is no mode to report
When should the mode be used?
- Nominal scales
- Discrete variables
Median: The score that divides the distribution in half
- The 50th percentile
- Half the scores are above and half the scores are below
- If there is an odd number of scores, the median is the middle value
- If there is an even number of scores, the median is the point midway between the two middle scores
- If the data are not cooperative, you need to calculate the precise median
- Computationally difficult
- Frequently not reported when it should be (Excel and most other spreadsheets do not report the precise median)
When should the precise median be used?
- Extreme scores or skewed distributions
- Undetermined values (e.g., timed out)
- Open-ended distributions (e.g., 5 or more)
- Ordinal scale (e.g., rank)
Real Limits: Boundaries of intervals for scores on a continuous scale
Notation: sample statistics use the Roman alphabet; population parameters use the Greek alphabet
Geometric mean: usually used when scores can change relatively, e.g., filters will trap dust in an amount relative to the amount of air flowing through them
Harmonic mean: usually used for rates (e.g., speed)

Lecture 5: Variability
Variability: A quantitative measure of the degree to which scores in a distribution are spread out or clustered together
Range: The difference between the high score and low score
- Indicates some additional info about scores:
- “small” = closer to mean
- “large” = further away from mean
- Influenced by extreme scores
Interquartile Range: Range covered by middle 50% (Q3-Q1)
Standard Deviation: Describes the “average” or “typical” or “representative” distance from the mean
- The most widely used measure of variability

Lecture 6: Descriptive Statistics
Displays of Data:
- Document sources of statistical data and their characteristics
- Make appropriate comparisons
- Charts and tables may not always be effective: Displaying cause and effect
- Recognize the inherent multivariate nature of problems
- Inspect and evaluate an alternative hypothesis
Frequency Distribution: an organized tabulation of the number of individuals located in each category on the scale of measurement
Three common distributions:

Lecture 7: Ethics
CIHR: Canadian Institutes of Health Research
NSERC: Natural Sciences and Engineering Research Council
SSHRC: Social Sciences and Humanities Research Council of Canada
Key Concerns:
- Is the subject being put at risk?
- Is the subject fully informed about the research and able to provide informed consent?
“Experiments” conducted in Germany during World War II led to:
- Nuremberg Code of Ethics (1947)
- Helsinki Declaration (1964)
- Belmont Report (1979)
Milgram Study:
Learner: actually a confederate
Teacher: actually the subject
Experimenter: ordered the Teacher to “correct” the Learner by administering increasingly high levels of electric shock
Results: Full obedience: Up to highest shock level
- Overall: 65%
- Adult women 65%
- Proximity (same room) 40%
- Touch proximity 30%
- Experimenter absent 22.5%
Hofling (1966):
- Nurses were telephoned by a physician they did not know
- Ordered to administer a nonprescribed drug in double the maximum dosage to a patient
- 21 out of 22 nurses (95.5%) followed the doctor's orders
Core Principles:
Respect for persons
- choose participation freely and without interference
Concern for welfare
- minimize risks and maximize benefits
- individual should be free to decide whether the balance of risks/benefits is acceptable
Justice
- Treat people fairly and equitably
Risk vs. Benefit Analysis: Psychological and physiological harm vs.
benefits to both individual and society
Potential Risks:
Physical Harm: Sleep deprivation, injecting drug or placebo
Stress: Providing unfavourable feedback about personality, intelligence, etc., to lower self-esteem; asking about traumatic events
Loss of privacy (right to control information) and confidentiality (keep info secret or anonymous)
Legal issues
- data collected in Canada but hosted on US servers may violate Canadian ethics policies
Informed Consent: Present all the information necessary to make a free and informed decision about whether to participate
Vulnerable populations:
- Infants and children
- Alzheimer’s research
- Prisoners
- Students in a classroom
Explicit coercion:
- $200 for 1 hour of participation
- Reduced sentence for participation
Implicit coercion:
- Honours student to roommate: “I need only one more person in my study to finish my thesis.”
- Instructor to students: “I’m running a study and need volunteers to participate.”
Withholding Information:
- Providing too much information can invalidate the study
- Researchers usually withhold some information about the hypothesis under investigation
- Generally considered OK as long as the info does not affect their decision to partake in the study
Deception: Active misrepresentation of information
- Most common in social psychology
- Usually in form of cover story
- Usually not nearly as severe as Milgram (because ethics boards won’t allow it)
Debriefing: Opportunity for researchers to deal with potentially harmful effects of withholding information, deception, and effects due to participation itself
- If research alters physiological or psychological state, the debriefing process is when original state is restored or additional help (medical, psychological) is offered
HREB: Health Research Ethics Board
ICEHR: Interdisciplinary Committee on Ethics in Human Research
ACC: Animal Care Committee
Minimal Risk: Risks of harm are no greater than those encountered in everyday life
Greater than
Minimal Risk: Vulnerable populations; sensitive questions
Canadian Council on Animal Care: sponsored by CIHR and NSERC to “oversee the ethical use of animals in science in Canada”
Animal Care Committee: Must have at least one of each of the following on the committee:
- experienced scientist
- institutional member who does not use animals
- experienced veterinarian
- community member
Animal Research: 3 Rs
Replacement - Replace use of animals with other techniques if possible (modelling, meta-analysis, simulation)
Reduction - Minimize the number of animals used
Refinement - Refine procedures to minimize stress and pain
Diederik A. Stapel: A Dutch social psychologist who fabricated data in dozens of papers
Marc Hauser: Evolutionary biology and cognitive neuroscience at Harvard
Daryl Bem: A US social psychologist who reported evidence that people could predict the future at a better-than-chance rate under some circumstances
Plagiarism: Using someone else’s work without giving credit to the original source
Not making data available:
- Hinders independent re-analysis
- Prevents additional analyses (e.g., meta-analysis)

Lecture 8: Percentiles
- The 90th percentile is always better than the 30th percentile
- 90th: 90% of scores are at or below this point
- 30th: 30% of scores are at or below this point
Real Limits:
- Scores are usually measurements of a continuous variable but measurement is usually discrete; this is why we have to consider upper and lower limits
Precision of Estimate:
Raw Data:
- The middle value is 64
- There are 12 scores above 64
- There are 12 scores below 64
- Therefore 64 is the median
Frequency Table:
- Can get different answers depending on the interval width
- Larger interval width = less precision (less info)
- Smaller interval width = more precision (more info)

Lecture 9: Standard Scores (z-scores)
Percentile: Useful because it indicates where a score falls within a distribution
Standard deviation: measure of data dispersion relative to the mean; smaller
is better (data closer to the mean)
Metric:
- The metric of a variable is how we understand what the numbers mean (e.g., cm vs. inches)
- Often with interval measures, the metric is defined in terms of the mean and standard deviation and is often arbitrary
- e.g. IQ
- We can rescale the metric by doing a linear transformation
Linear Transformation:
- When you multiply, divide, add, or subtract a constant from each score in a distribution, the mean and/or standard deviation can change
- Changing the metric like this won’t affect any inferential statistics
Add or Subtract:
- If you add or subtract a constant to each score in a distribution then the mean of that distribution will change by that constant (the standard deviation will not change)
Multiply or divide:
- If you multiply or divide each score in a distribution by a constant then the mean and standard deviation of that distribution will change by the same multiple
Standard score:
- A number in a set where the mean and standard deviation of the set are already known (i.e. a pre-defined metric)
z-score:
- The classic standard score
- Mean = 0
- Standard deviation = 1
- Standardizing = converting to z-scores
- Tells you how many standard deviations each score is away from the mean
- A z-score near 0 indicates that the score is at or near the mean
- A z-score of -1 indicates that the score is 1 standard deviation below the mean
- A z-score beyond ±3 is extremely large and should be examined to determine if it is an outlier
Two transforms:
- Frequently do both
- Raw score → z-score
- Z-score → raw score
- To transform to a new distribution → convert to z-score → convert to raw score
Standard vs. normal:
Normal distribution: the distribution has a bell shape; the standardized version is often called the z distribution, and z-scores are used to find points on the distribution
- Approx. 68% of scores are within ±1 standard deviation of the mean
- Approx.
95% of scores are within ±2 standard deviations of the mean
- Approx. 99.7% of scores are within ±3 standard deviations of the mean
- Therefore -3 < most z-scores < +3

Lecture 10: Sampling
Population: all potential participants
Sample: the people in the study
Good samples have two characteristics:
1. They are representative: each member of the population has an equal chance of being in the sample
2. They have more people: larger samples are more likely to be representative of a population than smaller samples
- Statistics cannot correct for a non-representative sample
Sampling error:
- The natural discrepancy or amount of error between a sample statistic and a population parameter
- If you take two random samples from the same population:
- Means are likely to differ
- Which sample gives the best estimate?
How can we determine the probability of obtaining any specific sample?
- Examine the distribution of sample means
- Describes the entire set of all possible sample means for any sized sample
Distribution of sample means:
- Central limit theorem (CLT): regardless of the original distribution of the data, the distribution of sample means will approach a normal distribution as sample size increases (usually around n = 30 or more)
- It allows us to use normal probability techniques even when the original data aren't normally distributed.
- It allows us to make inferences about populations based on sample statistics
- The collection of sample means for all possible random samples of a particular size (n) from a population
1. Sample means should pile up near the population mean
2. Pile of sample means should be similar to a normal distribution
3. The larger the sample size, the closer the sample means should be to the population mean
- Will be almost perfectly normal if either of the following holds:
1. The population from which the samples are selected is a normal distribution
2. The number of scores (n) in each sample is relatively large (approx.
30 or more)
Expected value of M:
- The mean of the distribution of sample means is equal to the population mean and is called the expected value of M
- Mean of sampling distribution equals the mean of the sampled population
Standard error of M:
- Describes distribution of sample means
- When small, all sample means are close together
- Measures how well an individual sample mean represents the entire distribution
- As sample size increases, standard error systematically decreases
- Measures how much the sample mean is expected to vary from the true population mean
- What factors can lead to a sampling error? Small sample size, non-representative samples, random variability
Central tendency = mean
Variability of distribution = standard error
- Standard deviation of distribution of sample means = standard error

Lecture 11: Inferential statistics
Descriptive statistics: describe the characteristics of a dataset, such as means, medians, modes, and standard deviations. They focus on presenting the data as they are.
Inferential statistics: making appropriate conclusions about populations based on samples drawn from the population
Hypothesis: a testable prediction about specific events
Research hypothesis: administering Drug A will change systolic (maximum) blood pressure compared to administering a placebo
Null hypothesis: assume the absence of something
- Easy to disprove negative assumptions
- Can be impossible to disprove positive assumptions
p-value: the probability of obtaining results at least as extreme as those observed, assuming that the null hypothesis is true. The lower the p-value, the greater the statistical significance of the observed difference.
What is the p-value?
- The probability of results at least this extreme when the null is true; rejecting a true null is a Type 1 error, and the significance level (alpha) sets how often that happens

Lecture 12: One-Sample t-tests and Dependent Samples t-tests
- We can use the z-test whenever we can trust that the sampling distribution is normal, and we can calculate the standard error.
- However, if we don’t know the population standard deviation, then we cannot calculate the standard error of the mean.
- If we do not know the population standard deviation, we can estimate it from s (i.e., we can estimate the population standard deviation by using the sample standard deviation).
- By estimating, we introduce some error into our calculation
- When we do this, we change the shape of the sampling distribution of the mean. It is no longer z-distributed. Instead, it follows the t distribution.
The t-distribution (use when population standard deviation is unknown):
- The t-distribution is a squashed version of the normal distribution. Its tails are fatter.
- The t-distribution is a family of distributions. The curve is slightly different for each degree of freedom (degrees of freedom for a one-sample t-test is n – 1)
- As degrees of freedom increase, the t-distribution becomes slightly less squished. At infinite degrees of freedom, it is the same as the normal distribution.
Independent groups: between-subject design
- IV is manipulated between subjects
- a given subject is either in the control group or in the experimental group
Dependent groups: within-subject design
- IV is manipulated within subjects
- a given subject is in both the control condition and the experimental condition
- The Picture Superiority Effect study is a within-subjects design: everybody experienced all the conditions
Difference Scores (dependent samples t-test):
- If the same person is measured on the same measure in two different conditions, then we can calculate a difference score for each person, which is their score in one condition minus their score in the other condition.
- By doing this, we take two sets of scores and reduce them to one set of scores, the difference scores.
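The difference-score logic can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure: the function name `paired_t` and the midterm/final scores are made up for the example, and only the standard library is used. It reduces two sets of scores to one set of difference scores, then divides the mean difference by the standard error of the differences.

```python
import math
import statistics


def paired_t(cond_a, cond_b):
    """Return (t, df) for a dependent-samples t-test built from difference scores."""
    if len(cond_a) != len(cond_b):
        raise ValueError("each participant needs a score in both conditions")
    # One difference score per participant: two sets of scores become one.
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)   # sample SD of the differences (n - 1 denominator)
    se_d = sd_d / math.sqrt(n)       # standard error of the mean difference
    t = mean_d / se_d                # null hypothesis: mean difference = 0
    df = n - 1                       # based on the number of difference scores
    return t, df


# Hypothetical data: five students measured twice (illustrative only).
final = [78, 85, 69, 90, 74]
midterm = [72, 80, 70, 84, 70]

t, df = paired_t(final, midterm)
print(f"t({df}) = {t:.3f}")
```

The resulting t would then be compared against the critical value for df = n – 1 from a t table (about 2.776 for df = 4, two-tailed, alpha = .05).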
Notes on dependent samples t-tests:
- The degrees of freedom you use is not based on the number of scores that you have (i.e., 81 midterms and 81 finals = 162 scores), but instead on the number of difference scores you have.
- The SD that we use to calculate the standard error is the SD of the difference scores.

Lecture 13: t-Test for Independent Groups
A controlled experiment has:
1. Random sampling from the population
2. Random assignment of the sampled participants to conditions
How to compare the mean of an experimental group and the mean of a control group?
- The independent t-test
Assumptions:
Independent Groups: Participants in each group must be different, a between-subject design
- If participants are in both groups (a within-subject design), use a dependent group t-test
Normality of Dependent Variable: The sample dependent variable should be normally distributed
Homogeneity of Variance: The variance of the two groups will be equal or approximately equal
An in situ design has:
1. Random sampling from the population to form multiple samples
2. Each sample is in a different group
Independent sample t-test:
- The logic of the independent samples t-test is the same as the one sample t-test. You take the difference between the means, divide it by the standard error, and check it against a critical t value defined by df.
- The main difference is that your standard error is not the standard error of the mean, but the standard error of the difference of means
Pooled Variance: The pooled variance is a weighted average of the two sample variances
- It is weighted by df, not by n
Directional (one-tailed) Test:
- Use non-directional (two-tailed) test unless you have a compelling reason to do otherwise
- Will have difficulty convincing reviewers/editors that one-tailed is genuine
- Bias is to view one-tailed as last attempt to change a non-significant result into a significant result
Two-tailed:
- More conservative
- Less likely to result in Type I error
- More likely to miss a real (but small) difference
One-tailed:
- More liberal
- More sensitive to real (but small) differences
- Also more likely to result in Type I error

Lecture 14: Correlation
Correlation: A statistical method that determines the degree of linear relationship between two variables
- A bivariate statistic
- r is a sample correlation
- ρ (rho) is a population correlation
Causation: Correlation does not let you make any statements about causation
4 Correlations:
Pearson’s r - two continuous variables
Spearman’s ρ (rho) - two ordinal rankings
Point-biserial rpb - one continuous and one dichotomous variable
Phi correlation - two dichotomous variables
Before calculating a correlation, plot the data to see if there are any potential issues:
- Restricted range
- Outlier
- Nonlinearity
Correlation: Assumptions
The relationship is linear:
- always examine scatter plot first
- r will be low and/or nonsignificant for nonlinear
- Underestimates population correlation
- Possibility of Type II Error
Homoscedasticity:
- On average, all points are equally distant from the line (Contrast with: Heteroscedastic)
- Underestimates population correlation
- Possibility of Type II Error
r² - Coefficient of Determination:
1. The proportion of variance in one variable accounted for by the other variable
Ex.
if a and b are correlated with r = .50:
- r² = .50 × .50 = .25
- Variable a explains 25% of the variation in b
- 75% of the variance in b remains unexplained (uncommon, unshared, or unexplained variance)
2. A measure for comparing the strength of two or more r values
Regression Analysis:
- Correlation looks at two variables symmetrically (r for x and y is same as r for y and x)
- Regression relabels them:
- x = independent variable/predictor
- y = dependent variable/criterion
- Focus: How changes in x affect predicted score in y
Simple regression:
- One x
- Math performance (y) as a function of socioeconomic status (x).
- Related to correlation
- The means of y and x are always on the regression line
Use x scores to predict y scores (y = a + bx)
- y = dependent variable
- a = y intercept (point where line crosses y axis when x = 0)
- b = slope of the line
- x = independent variable
- Determine line by least squares (smallest sum of squared deviations of the points from the line)
Multiple regression:
- 2 or more xs (e.g., x1, x2, ...)
- Math performance (y) as a function of sex (x1), age (x2), and socioeconomic status (x3)
Spearman’s Correlation: The degree of relationship for two sets of ranked (ordinal) data
Point-Biserial Correlation: rpb is the degree of relationship between a dichotomous variable and a continuous variable
- The computational formula for point biserial gives the same result as the Pearson r formula
- Easier to compute
Phi (𝜙) Correlation: Degree of relationship between two dichotomous variables
Correlation Summary:
1. Look at the scatter plot (linear, range, outliers)
2.
Choose the appropriate test:
- Pearson: 2 continuous variables (compute t for significance)
- Spearman: 2 ordinal rankings (use table for Spearman’s values for significance)
- Point Biserial: 1 continuous and 1 dichotomous (compute t for significance)
- Phi (𝜙): 2 dichotomous (compute χ² for significance)

Lecture 15: Probability
Analytical Probability: Refers to the process of calculating the likelihood of an event or outcome using mathematical methods and formal reasoning.
Relative Frequency: The probability of an event happening is the number of times that it has happened in the past divided by the total number of times that it could have happened
Subjective Probability: The perception of the probability of an event occurring
Independence: When the occurrence of one event does not affect the other events
Mutually exclusive: When the occurrence of one event precludes the occurrence of another event
Exhaustive: When the total of the events represents all possible outcomes
The Addition Rule (OR): The total probability of any of a set of mutually exclusive events happening is equal to the sum of their separate probabilities
- If we have two separate events (event A and event B) that are mutually exclusive, the probability of either A or B happening is equal to the probability of A happening plus the probability of B happening
Sampling with replacement: After a sample is drawn, it is replaced back into population and has the same probability of being drawn again (i.e., the total number of possible events remains the same).
Sampling without replacement: After a sample is drawn, it is NOT replaced back into population and cannot be drawn again (i.e., the total number of possible events decreases by one).
Multiplication Rule (AND): The probability of a set of independent events occurring is equal to the product of their separate probabilities.
- If we have two separate events (event A and event B) that are independent, the probability of both A and B happening is equal to the probability of A happening times the probability of B happening.

Lecture 16: Research Design Fundamentals
Situational variable: characteristics of situation, environment
Response variable: responses or behaviours
Participant or subject variables: characteristics of a person
Mediating variables: variable between situational and response
- Bystander (non-) Intervention
- In an emergency, the more people present (situational variable), the less likely any one person will intervene (response variable)
Converging Operations: If several different operational definitions all converge on the same conclusion, it strengthens that conclusion
Non-experimental: observe, measure variables and determine if there is a relationship (vary together)
- Correlational studies do not let you infer causation
- Correlational studies do tell you about possible relations between variables
- Correlational studies do let you make predictions
Is there a relation between exercise and happiness?
- Operationally define exercise
- Operationally define happiness
- Measure both variables
- Find exercise is positively related to happiness
- Those who exercise more are also happier
- Positive: Both variables change in the same direction
- Often called correlational study
Experimental: direct manipulation and control of variables
- Conceptualize a cause and effect relationship between variables
Does thinking about fast food cause people to behave impatiently?
- IV: Seeing or not seeing fast food logos
- DV: Time spent reading paragraph
Causation: If two variables (A and B) are correlated with each other, there are several possible explanations
- A could cause B
- B could cause A
- A third variable, C, could cause both A and B
- A and B may really be unrelated (spurious correlation)
- Attempt to eliminate influence of all variables except for the one(s) being manipulated
Experimental Control:
- The only difference is the IV
- All other variables are held constant
- If held constant, it cannot affect the outcome
Third Variable: Varies with both measured variables
- Ex. Heat linked to both ice cream sales and aggravated assaults
Confounding Variable: Varies with one variable
- Ex. If subjects in the Saw FF Logos condition were tested right before lunch and subjects in the No FF Logos condition were tested right after lunch, hunger would vary along with the IV
Issues with experiments:
Artificial Setting: Lab-based studies control everything so the situation may be atypical
Solution: Compare results with a field study
Unethical to use experimental methods in many situations:
Solution: use pre-existing assignment
Participant Variables: Age, sex, gender, ethnicity, nationality, birth order
Solution: Use a non-experimental design
Descriptive Research: Experimentation may not be needed
- A survey may suffice to accurately describe the findings
Prediction: If accurate prediction is the focus, experimentation may not be necessary
- Look for a relation between measures (Ex.
Predict success in a programme based on test scores)
Causation: If stating causation is important, then experimentation is essential
- Can’t state anything definitive about causation from correlational methods
Converging Methods: Like converging operations, converging methods are often the best route
- The weakness of one study is unlikely to be the same as the weakness of a slightly different study
- If all studies (correlational and experimental) converge on the same conclusions, then the conclusions are strengthened
Law: a universal statement of the nature of things that allows reliable predictions
Festinger’s Cognitive Dissonance: When two beliefs or beliefs/actions are dissonant (inconsistent), people experience an aversive state of arousal
- Highly motivated to reduce aversive arousal by changing beliefs/actions
- Usually cannot undo actions
- Therefore, change beliefs
Francis Bacon (1561-1626)
- Method of induction: reasoning from specifics to general principles
- careful, systematic observations
- e.g., how cats and dogs interact, develop general explanation, can predict future interactions
- Reason from specific to general
David Hume (1711-1776)
- Problem of induction: How many observations are needed?
- The next observation could disprove the previous ones
- Accounts based on observations can be erroneous (This sheep is white; this sheep is white... All sheep are white)
Deduction: Reasoning from the general to the specific
- Use a general statement (theory) to make a prediction (hypothesis) that is then tested against observations
- It is possible to disprove a theory, but it is impossible to prove a theory
Science:
- The goal is to develop increasingly more accurate theories and accounts OR
- The goal is to develop increasingly less inaccurate theories
Scientific Process:
1. Recognize or identify a problem: A recognized gap in knowledge or an important question in response to some event/issue
2.
Review the literature: Once a question has been identified, researchers must determine whether similar studies have been conducted (focus on peer-reviewed journals)
3. Theoretical Considerations: During your literature review, learn about particular theories associated with the issue
- Organize: What do all the findings relating to unemployment and health have in common?... (stress, anxiety, low self-esteem, etc.)
- Generate Predictions: Provide a “road map” to relevant areas of study: Do health outcomes of unemployment vary by gender, type of work, social support, income relief, etc.?
4. Hypothesis Development: A more specific version of a theory that tries to organize particular data and relations between independent and dependent variables within a portion of a larger, comprehensive theory
Research hypothesis: An even more specific version that makes a prediction
5. Research Design: The “general plan” for determining dependent variables, selecting participants, assigning participants to experimental conditions, controlling extraneous variables, and collecting data
- These can be different kinds of studies: experiments, intervention studies, correlational studies, case studies
- Plan the statistical analysis/analyses (a common, usually fatal, error is not planning the statistical analyses)
6. Obtain approval from the relevant ethics board and conduct the study
7. Data analysis and statistical decisions:
- Is there a difference between groups?
- Is there a relation among variables?
- Is the difference or the strength of the relation large enough that we can be confident it is real?
8. Interpret results in light of past research and theory:
- Do findings support previous work, or do they challenge established ideas/theories?
- How can we reconcile differences in outcomes?
9/10.
Report writing and research dissemination:
- Research results need to be communicated to the community if they are to be helpful to the general advancement of science
- Empirical reports must follow certain predefined formats so that they are easily understood by others (Abstract, Introduction, Method, Results, Discussion)
Dissemination: Researchers rely primarily on refereed journal articles
- Typically reviewed by 3 (or more) independent researchers
- Editor makes decisions based on reviews
- Process (from submission to publication) can take up to a year
- Problem: People, including researchers, aren’t very good at logical reasoning
- People tend to propose tests that confirm rather than disconfirm a belief/hypothesis (also known as confirmation bias, positive bias, or superstitions)
- Scientific reasoning (these numbers follow a rule: 2 4 6)
- Self-fulfilling prophecy (child labelled as “bright” or “dumb”)
Wason Selection Task:
- If a card has a vowel on one side, it must have an even number on the other.
- Which 2 cards do you turn over to disprove this rule?
OR
- If a card has salt on one side, it must be pink on the other.
- Which 2 cards do you turn over to disprove this rule?
Falsification: Some researchers are known for proposing strong tests of their theories and falsifying them
- More commonly, other researchers will be more than happy to falsify your theories for you
Qualification: More usual outcome than falsification
- Modify an existing theory rather than rejecting it
- An approach in which researchers identify boundary conditions to a theory
- Inherently more complicated
- Still prone to verification/positive bias/confirmation bias
NSERC: Natural Sciences and Engineering Research Council
SSHRC: Social Sciences and Humanities Research Council
CIHR: Canadian Institutes of Health Research
Lecture 17: Measurement
Reliability: How consistent are the measurements?
The measure has two components:
1. True Score: The person’s true score on the measure, e.g., IQ = 100
2.
Measurement (random) Error: Deviations from the true score
If an instrument has more reliability, it has less measurement error
- Repeated tests yield similar results
Generally measure each person only once:
- Critically important that you use a reliable measure
How do you know if your measure is reliable?
- Assess the stability of the score
Correlation: Pearson product-moment correlation coefficient:
- r = +1: Perfect positive correlation (As X increases, Y increases)
- r = –1: Perfect negative correlation (As X increases, Y decreases)
- r = 0: No correlation (As X increases, Y may increase or decrease)
Test–Retest Reliability: if a test is measuring what it is supposed to, then if somebody took the test again, they should get the same score (assuming that what you are measuring hasn’t changed)
- Very little difference between test and retest scores
- High correlation between the test and retest scores
Problem: A person may remember responses from the first test, so the score on the second test may be higher
Solution: Use an alternate form of the test
Problem: What you are measuring may change over time (e.g., mood)
Solution: Use some other method of assessing reliability
Internal Consistency: When you have multiple items to measure a construct, do all your items measure it equally well? Do they correlate?
- r varies from -1 (perfect negative) to 0 (no) to +1 (perfect positive) relation
Cronbach’s α: α varies from 0 (no reliability) to 1 (perfect)
Inter-Rater Reliability: When you have multiple items that have to be judged, does one judge differ from another judge?
- Number of times an infant looks at an object
- Judgments of creativity
Similar to test-retest reliability: look for very little mean difference and high correlation
Validity: How well does it measure what it’s supposed to be measuring?
- Multiple types of validity
- Providing evidence for construct validity is time-consuming, expensive, and difficult
- Use existing measures rather than create your own
Construct Validity: How well do the manipulated (independent) variable and measured (dependent) variable reflect the variables the researcher hoped to manipulate or measure?
- Hunger: Time spent reading restaurant guide vs. hours of food deprivation
- Ability to succeed in graduate school: GRE
Face Validity: The measure appears to be measuring what it is supposed to be measuring
- If interested in extraversion, the questions include topics having to do with socializing, parties, involvement with organizations, etc.
Face validity is largely worthless:
- Rapid eye movements are an accurate (valid) predictor of dreaming but have no face validity
- Flashing words very briefly (~50ms) leads to accurate (valid) findings about reading but has no face validity
Content Validity: The measure captures all the necessary aspects of the construct and nothing else
Example: Psychopathy Checklist-Revised (PCL-R)
Psychopathy: interpersonal callousness, unemotionality, impulsivity, antisocial behaviour
The PCL-R has questions on just these 4 aspects; it has no questions on aggression because aggression is not part of psychopathy
Predictive Validity: Scores on the measure accurately predict future behaviour
Concurrent Validity: Scores on the measure accurately predict current behaviour
- Scores on GRE Math should be related to performance in the math course currently being taken
Convergent Validity: Scores on the measure are related to scores on similar instruments
- The score on IQ Test 1 should correlate highly with the score on IQ Test 2
- The score on Personality Test 1 should correlate highly with the score on Personality Test 2
Discriminant Validity: Scores on the measure are not related to scores on instruments that measure something different
- The score on IQ Test 1 should not correlate with the score on Personality
Test 1
- The score on Personality Test 2 should not correlate with the score on IQ Test 2
Internet Questionnaires/Polls: Usually “made up”:
- No assessment of reliability (Do you get the same score each time you take it?)
- No assessment of validity (Is it measuring what it purports to measure?)
- Because you cannot assess the instrument, you cannot assess the results
Reactivity: Awareness of being measured changes an individual’s behaviour
- Measuring eye blink rate
- Measuring respiration rate
- Questions about dishonesty
- Questions about socially unacceptable behaviours/thoughts/beliefs
Study-wise Validity:
Internal Validity: The extent to which the results of an experiment can be attributed to the manipulation of the independent variable rather than to some other, confounding, variable
- In other words, are the conclusions about what happened in the study valid?
- Usually highest in experimental studies
External Validity: The extent to which the conclusions of a study would still hold outside of the context of the study
- In other words, do the results generalize to people and contexts in the real world outside of the particular study?
- Usually highest in certain types of non-experimental studies
Lecture 18: Observation
Most psychology research is quantitative: collecting and analyzing numerical data
But some is qualitative: more descriptive, typically more in-depth information from fewer people
- Not necessarily mutually exclusive
Example: How are teenagers affected by working after school?
Quantitative: statistical analyses
- A questionnaire was administered to random samples
- compute means, perform statistical analyses
- predict GPA by # of hours worked, % of students working, etc.
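As a rough sketch of the quantitative route, the snippet below computes means, a Pearson correlation, and a simple least-squares line for predicting GPA from hours worked. The data values are made up purely for illustration and do not come from any study; the function names `pearson_r` and `least_squares` are likewise assumptions. A relation found this way supports prediction only and says nothing definitive about causation.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares line predicting y from x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical questionnaire responses (invented for this example).
hours_worked = [0, 5, 10, 15, 20]      # hours worked per week
gpa = [3.6, 3.5, 3.1, 3.2, 2.8]        # grade point average

r = pearson_r(hours_worked, gpa)
slope, intercept = least_squares(hours_worked, gpa)

print("mean hours:", mean(hours_worked))
print("mean GPA:", round(mean(gpa), 2))
print("r:", round(r, 3))

# Predicted GPA for a student working 12 hours per week.
predicted = intercept + slope * 12
print("predicted GPA at 12 hours:", round(predicted, 2))
```

The same two functions would apply to any pair of numeric measures from a questionnaire; only the hypothetical lists would change.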
Qualitative: interpreting people’s experiences within a specific context
- series of focus groups with 6-10 teenagers
- data might be focused on “themes” emerging from groups
Naturalistic Observation:
- Originated with animal studies and anthropological studies
- Observe “subjects” (lions, birds, people) in naturally occurring environments (ex. malls, hotels, jobs)
- Generally don’t have a hypothesis
- Try to describe a complete and accurate picture
- Can generate hypotheses to be tested in further studies
- Can always use quantitative methods in conjunction with naturalistic observation (ex. record how many times one activity follows another)
Issues:
Participation vs. nonparticipation: Should the researcher participate or not?
- May gain additional insights, see additional things in participating
- But may lose objectivity
Concealed observation: The presence of an observer may change behaviour
- Concealed observation raises ethical issues
- Can also raise legal issues
Limits:
- Useful for gaining an initial understanding of complex behaviour
- Provide a knowledge base from which hypotheses can be generated and then tested
- Lack of control usually makes it difficult to test hypotheses directly
- Data interpretation cannot be planned and is not simple
Systematic Observation: Focus on one or more specific behaviours
- Less global, usually more quantitative
Example: children and empathy
- 18- to 36-month-old children played with an experimenter who (a) hurt self, shows pain or (b) broke toy, shows sadness
- The behaviour of children coded:
- Prosocial (comfort experimenter), unresponsive, etc.
- Children more affected by others’ sadness than pain
Issues:
- Reliability: assessed by inter-rater reliability
- Reactivity: Can be reduced by concealed observation
- Sampling: How to sample behaviours?
- Observe ~70 hockey games rather than 1 or 2
- Record all comments by spectators, then code
Case Studies: Detailed investigations involving a single subject
- Allow an in-depth analysis and exploration.
- Often used to generate theories in a new field that has not been investigated before.
- Also used to investigate rare cases to see what they might tell us about more general cognition or behaviour.
Phineas Gage: Damage to specific brain areas is associated with specific kinds of deficits
S. D.: Amphetamine-induced dopaminergic excitation - suggests what it might be like if senses are enhanced.
Peter Tripp: Stayed awake for 200 hours - extreme levels of sleep deprivation come close to being intolerable (but perhaps not so extreme for everyone)
Issues:
Victor of Aveyron: Feral children - Why feral in the first place?
Piaget’s children: Upper-middle-class Swiss
Freud: Non-scientific theory overshadowed the potential usefulness of case studies
Archival Research: Use previously compiled information
- Qualitative: look for common themes
- Quantitative: numerical analyses
Numerous sources:
- Statistics Canada
- Polling organizations
- Public records
- Mass communications
Content Analysis:
- Like a coding system
- Organized procedure for quantifying information:
Number of times keywords used
Ratio of positive to negative emotions conveyed
- Use inter-rater reliability to assess