PSYC 3F40 Midterm Review Topics List October 2024
Summary
This document is a midterm review for PSYC 3F40, covering topics such as the purpose of science, characteristics of science, scientific theories, ethical obligations in research, views on animal research, and basic writing skills. It includes examples, explanations, and a summary of the material.
**PSYC 3F40 Midterm Review Topics List October 2024**

BASICS OF SCIENCE

**1. Purpose of Science**

The purpose of science is to understand nature by identifying laws or theories that explain and predict observed and new facts. Science seeks to make sense of facts rather than simply listing them, aiming to explain facts systematically so that new insights can emerge.

**2. Main Characteristics of Science**

- **Empirical Questions:** Science addresses questions that can be answered through observation and measurement, rather than philosophical or value-based inquiries.
- **Systematic Observation:** Data are gathered through systematic, repeatable observation to avoid biased conclusions.
- **Public Scrutiny:** Scientific knowledge is open to public verification and critique, avoiding exclusivity or reliance on private insights.

**3. Purpose of Scientific Theories**

Scientific theories aim to:

- **Explain Known Observations Simply:** Using parsimony (simplicity), the theory with fewer assumptions or simpler mechanisms is generally preferred if it still explains the data (Occam's Razor).
- **Predict New Observations:** Theories should lead to predictions (hypotheses) that are testable (falsifiable) and could potentially be disproven by evidence. Examples include Mendeleev's periodic table and Einstein's predictions about light bending.

**4. Difference Between the Scientific Method and Other Knowledge Approaches**

The scientific method relies on empirical testing and verification through observation and experimentation, which enables continuous improvement or replacement of theories. Other approaches, such as common sense, logic, or authority, lack empirical testing, making them less progressive and sometimes inaccurate (e.g., Aristotle's belief that heavy rocks fall faster).

**5. Ethical Obligations of Researchers to Participants**

Researchers must adhere to these ethical obligations:

- **Informed Consent:** Participants must be informed about what they'll experience and must agree to it voluntarily.
- **Right to Withdraw:** Participants can leave the study at any time without penalty.
- **Minimal Harm:** Any potential harm should be minimized, and participants should not be subjected to unnecessary discomfort or embarrassment.
- **Anonymity/Confidentiality:** Personal information must remain anonymous or confidential, especially for sensitive topics.

**6. Views on Ethics in Animal Research**

Views among psychologists about animal research ethics vary:

- Some advocate for a total ban on animal research.
- Others support a ban only on highly harmful studies.
- A third group supports continued animal research, emphasizing humane treatment, minimal harm, and conducting it only for scientifically significant purposes.

WRITING

**"Classic prose style" (from 1600s France):**

- The writer knows ("sees") something that the reader doesn't, and shows it to the reader as clearly, concisely, and simply as possible (without compromising accuracy).
- The writer's (and reader's) attention is on the information, not on the writer ("it's not about you").
- Writer and reader both assume that truth exists and can be known (at least approximately), and they both want to know the truth.
- The writer treats the reader as an equal who doesn't (yet) know what the writer knows, but who wants to know.
  (Note: depending on exactly who the reader is, e.g., another researcher vs. a layperson, there will be less or more to explain to the reader.)
- The writer always remembers that the reader doesn't yet know what the writer is explaining. (When you're deeply immersed in something, it's easy to forget this and to write in a way that assumes knowledge the reader doesn't yet have.)

**Grammar and Usage Points:**

- A **noun** is a word that represents a person, place, thing, or idea.
- An **adjective** is a word that describes or modifies a noun. It provides more information about the noun's characteristics, such as size, color, quantity, etc.
- A **verb** is a word that describes an action, occurrence, or state of being. Verbs are essential for forming sentences as they convey what the subject is doing.
- **Plurals and Possessives**: Differentiate between plurals (e.g., "dogs") and possessive forms (e.g., "dog's" for singular possessive, "dogs'" for plural possessive).
- **Ambiguous Comparisons**: Clarify comparisons to avoid confusion; also distinguish "than" (for comparisons) from "then" (indicating sequence).
- **Tricky Words**: Understand the specific meanings of words like *criterion/criteria*, *phenomenon/phenomena*, and *principal/principle*.
  - Our principal (main) challenge is to uphold this principle (rule).
- **Less vs. Fewer**: Use "less" for uncountable quantities (e.g., less time) and "fewer" for countable quantities (e.g., fewer questions).
- **i.e. vs. e.g.**: Use *i.e.* to clarify ("that is") and *e.g.* to provide examples, both followed by a comma and used within parentheses.
- **Affect vs. Effect**: Use *affect* as a verb (to influence) and *effect* as a noun (the outcome); note that *affect* can also be a noun referring to emotion, and *effect* can occasionally be used as a verb meaning to bring about.
- **Comma Usage**: Place commas correctly based on sentence structure, including the use of the Oxford comma before "and" in lists.
- **Hyphenation**: Hyphenate compound adjectives before nouns (e.g., 16-item questionnaire) but avoid hyphens when the phrase follows the noun.
- **Should/Could/Would Have**: When you combine "should", "could", or "would" (or other conditional terms) with something from the past, use "have", NOT "of" (even though it sounds like "of" when speaking).
  - E.g., Catherine of Aragon should have signed a prenuptial agreement.
- **Since and While**: Use "since" and "while" only when referring to time periods, not as synonyms for "because" and "although".
  - Just use them for time: "while that was going on", "since that period".
- **Compound Predicates**: Most experts (and the APA style guide) say you shouldn't put a comma into a "compound predicate" (i.e., where you say at least two things about the same subject, whose name is given only once).
  - E.g., according to them, the sentence below shouldn't have any comma:
  - "Octopuses achieved higher typing speeds than did all other animals, and handled more telephone calls as well." (A comma would be acceptable only if the subject were repeated after it, e.g., "..., and octopuses handled more telephone calls as well.")

**Writing Techniques:**

- **Parallel Construction**: Ensure grammatical consistency in lists or comparisons within sentences (e.g., "Participants were asked to refrain from talking, making eye contact, and working quietly...").
- **Active vs. Passive Voice**: Prefer active voice for clarity, but use passive voice when the focus should be on the action's recipient.
  - A sentence is written in the active voice if the subject performs the action described by the verb. Examples:
    - I made mistakes.
    - The jury convicted him.
  - A sentence is written in the passive voice if the action described by the verb is being done to the subject. Examples:
    - Mistakes were made.
    - He was convicted.
- **Standard English**: Avoid slang or culturally specific language in scientific writing, as clarity and precision are paramount.

**Scientific Writing Tips:**

- **Clarity and Simplicity**: Use precise yet simple language; if two words are equally precise, choose the simpler one.
- **Organization**: Organize content logically at all levels (sections, paragraphs, and sentences), with smooth transitions.
- **Truth and Reader Awareness**: Aim for clear communication by recognizing the reader's perspective and knowledge level.

**Specific Writing Rules:**

- **Modifiers and Ambiguity**: Ensure modifiers clearly relate to the words they modify to avoid confusion (e.g., "Journalists using cameras were observed" vs. "Cameras were used to observe the journalists").
- **Use of Colons and Semicolons**: Colons introduce lists, while semicolons separate long items in a list or two independent clauses.
- **That vs. Which**: Use "that" for restrictive clauses (essential to the meaning) and "which" for non-restrictive clauses (extra information).
  - Consider these two sentences:
    - (1) The results, which contradicted the hypotheses, were not reported.
    - (2) The results that contradicted the hypotheses were not reported.
  - In non-restrictive (1), all of the results weren't reported, and these results contradicted the hypotheses. (The "which" part between the commas is a "non-restrictive clause".)
  - In restrictive (2), only some of the results weren't reported, and these were the ones that contradicted the hypotheses. (The "that" part is a "restrictive clause".)

**Main Sections of Empirical Reports:**

1. **Abstract:** A concise summary (typically limited to 120-200 words) that includes what was examined, the methods used, key findings, and their significance. It should avoid unnecessary details and focus on essential information.
2. **Introduction:** Introduces the topic and provides background by discussing relevant previous research and theories. It should lead the reader to the research questions without leaving them confused. The introduction must be unbiased and should conclude with the hypotheses or main exploratory questions.
3. **Method:** Details how the research was conducted, allowing for reproducibility. It typically includes subsections on participants, materials, and procedures. If lengthy, some content can be placed in an appendix or cited from previous work.
4. **Results:** Presents the findings, starting with descriptive statistics and focusing on significant results related to hypotheses. This section should highlight key findings without excessive commentary, which is reserved for the discussion.
5. **Discussion:** Interprets the findings, relates them to hypotheses and previous research, and discusses limitations and alternative interpretations. It should suggest directions for future research and conclude with key insights.

**Other Sections:** Additional components of a manuscript include the title page, references, author notes, footnotes, tables, and figures, all following APA style or the specific guidelines of the relevant journal or institution.

MEASUREMENT

Understand methods of measurement (direct observation [various kinds], self-report/rating, observer report/rating)

**Types of Measurement Methods:**
1. **Direct Observation:** Involves watching and recording behaviors directly, which can be done:
   - In natural or artificial settings,
   - Either live or through recordings,
   - Overtly (participants aware) or covertly (participants unaware),
   - Through ratings or counts (e.g., time spent or frequency of actions).

   Examples include observing children in playgrounds, mock juries, lab participants, or social media activity. Physiological tests and maximum-performance tests (like IQ tests) are also forms of direct observation. While direct observation is objective and verifiable, it can be costly and difficult to interpret.
2. **Reports and Ratings:** Gathered through self-report or observer/informant report (e.g., friends, family) using questionnaires or interviews, which may be structured or open-ended. Reports can cover:
   - Behavior tendencies (e.g., "I worry about things"),
   - Trait ratings (e.g., "I am a nervous person").

   While efficient, reports are subjective and may carry biases; combining self-reports with multiple observers can enhance accuracy.
3. **Biodata:** Uses objective records, such as school or court records, to provide insights.

**Choosing a Measurement Method:** The best method depends on the nature of the research. For example, personality studies favor self-reports, while cognitive studies rely on direct observation.

Understand levels of measurement (nominal, ordinal, ratio, interval) [NORI]

1. **Nominal:**
   - **Definition:** Categorical values represented by arbitrary numbers, such as names or labels that do not imply quantity.
   - **Example:** Province codes (e.g., Alberta = 1, BC = 2, Yukon = 13).
   - **Characteristics:** Means or medians cannot be calculated because the values lack numeric meaning; you can only compare categories on other variables.
2. **Ordinal:**
   - **Definition:** Indicates which people have higher or lower levels of the variable and shows relative ranks.
   - **Example:** Race placements (e.g., 1st, 2nd, 3rd) or percentiles in test scores.
   - **Characteristics:** Can determine median and range, but not mean or standard deviation, as differences between ranks aren't necessarily equal.
3. **Ratio:**
   - **Definition:** Has rank, consistent intervals, and a true zero, allowing for meaningful ratios between values.
   - **Example:** Height, weight, or temperature in Kelvin.
   - **Characteristics:** Ratios are meaningful (e.g., 180 cm is twice as tall as 90 cm), making this scale useful for various quantitative analyses. Psychological research does not usually need ratio properties.
4. **Interval:**
   - **Definition:** Shows rank with consistent, meaningful differences between values, though with no true zero.
   - **Example:** Temperature in Celsius, IQ scores, or school exam scores.
   - **Characteristics:** Allows for calculations of mean and standard deviation. However, ratios aren't meaningful (e.g., 20°C isn't "twice as warm" as 10°C). A score of 0 doesn't necessarily mean a complete absence of the variable, so you can't form ratios between scores (e.g., a friend who scored 50% higher than you on a test is not thereby 50% smarter).

**Application in Research:** In psychological research, data should ideally be at least ordinal and preferably interval level for meaningful analysis, though ratio-level data is rarely necessary and often unattainable.

Understand reliability and the kinds of error variance associated with the three kinds of reliability

**Overview:**

- Measurements of a characteristic may vary, even with the same type of measurement.
  Examples include scores on two exams for the same subject, or ratings from different judges for a performance.
- For any measurement, there are two types of variance:
  - **True Variance**: Reflects consistency across similar measurements (i.e., variance shared with other measurements).
  - **Error Variance**: Unique to the specific measurement, representing measurement inconsistencies.
- **Reliability** measures how much of a measurement's variance is "true," indicated by its correlation with similar measurements (e.g., two questionnaires or two blood samples).

**Sources of Error:**

1. **Occasion of Measurement**: Consistency of scores across different times. Weak agreement suggests higher error variance.
2. **Questions (Items)**: Consistency across different sets of questions. Greater item-specific variability increases error.
3. **Raters**: Variability across different raters' judgments.

**Types of Reliability:**

1. **Test-Retest Reliability (Stability)**:
   - Measures consistency across time.
   - Calculated by correlating scores of the same individuals on two different occasions.
   - Test-retest reliability (stability) indicates how much the scores (i.e., the differences between scores) are consistent across occasions (true variance), as opposed to fluctuating a lot across occasions (error variance).
2. **Internal-Consistency Reliability**:
   - Measures consistency across items (e.g., different questions) within a test.
   - Calculated by correlating similar versions of a test, or estimated from one version.
   - High internal consistency shows that scores are primarily based on shared item content (true variance), with less influence from item-specific error variance.
3. **Inter-Rater (Inter-Observer) Reliability**:
   - Measures consistency across different raters.
   - Calculated by correlating ratings from two similar groups of raters (when possible); often you cannot find two panels of judges, so you estimate it within one group.
   - High inter-rater reliability indicates consensus among raters (true variance), while variability in ratings suggests higher error variance due to individual rater differences.

**Factors Enhancing Reliability:**

1. **Correlation Among Measurement Parts**: When items or raters correlate well with each other, indicating high true variance.
2. **Number of Items/Raters**: Aggregating many items or raters improves reliability by increasing true variance and canceling out error variance.
   - **Example**: More items (higher K) and a higher average inter-item correlation (mean r) lead to higher reliability scores:

| K (# items) | Mean r among items | Reliability |
|-------------|--------------------|-------------|
| 5           | 0.10               | 0.36        |
| 5           | 0.25               | 0.63        |
| 20          | 0.10               | 0.69        |
| 20          | 0.25               | 0.87        |

**Reliability Range:**

- Reliability is the proportion of total variance that is true variance and not error variance.
- Reliability values range from 0 to 1, with higher values indicating better reliability. However, no strict minimum threshold exists; higher values are generally preferred.

Understand the Spearman-Brown formula and be able to use it (and more generally, understand how reliability depends on the number of items/judges and their intercorrelations)

Reliability increases as the number of items (e.g., questions on a test) or judges (e.g., raters assessing a characteristic) increases, and as the intercorrelations among these items or judges strengthen. Here's why:
1. **Number of Items/Judges**:
   - When more items or judges are included, true variance tends to accumulate, providing a clearer picture of the characteristic being measured.
   - **Aggregation Principle**: Each added item or judge contributes to the total variance. True variance adds constructively (strengthening reliability), while error variance (the "noise") tends to cancel out as more items or judges are combined.
   - *Example*: With 20 items or judges, the overall score tends to be more reliable than with only 5 items or judges, because random errors across individual items or ratings are averaged out.
2. **Intercorrelations (Mean r among Items or Judges)**:
   - High intercorrelations mean that each item or judge is measuring the same underlying construct consistently. This reflects high true variance.
   - **Spearman-Brown Formula**: Reliability can be estimated with the Spearman-Brown formula, which shows how increasing the number of items/judges and their intercorrelations increases the overall reliability of the measurement.
   - Higher intercorrelations (e.g., an average correlation of 0.25 compared to 0.10) indicate that items or judges are aligned in assessing the trait, reducing random error and improving reliability.

Know what coefficient alpha (Cronbach's alpha) is and how it relates to the Spearman-Brown formula

**Reliability** refers to the extent to which similar measurements of a characteristic agree with each other, as indicated by the correlations among those measurements.

**Factors Influencing Reliability**

1. **Number of Items or Judges**: Reliability is higher when there are more items (e.g., questions in a survey) or judges (e.g., raters in an evaluation).
2. **Correlation Among Items or Judges**: Higher correlations between items or judges indicate a stronger common element, increasing reliability.

**Common Element Concept**: With more items or judges, the unique elements of any given item or judge become less significant. Aggregating items helps improve overall reliability.

**Reliability Formulas**

1. **Spearman-Brown Formula**: Used to calculate internal-consistency or inter-rater reliability:

   Reliability = k·r / (1 + (k − 1)·r)

   where:
   - k = number of items or judges
   - r = average correlation between items or judges

   Aggregating multiple items or judges increases overall reliability, compensating for individual items with lower reliability.

2. **Coefficient Alpha (Cronbach's Alpha)**: More commonly used in research, particularly for internal-consistency reliability of scales or tests:

   α = (k / (k − 1)) × (1 − Σs²item / s²scale)

   where:
   - k = number of items
   - Σs²item = the sum of the item variances
   - s²scale = the total scale variance

   Alpha is the proportion of scale variance that is due to item covariance (i.e., not due to the items' unique variances), out of the maximum possible.

**Special Considerations**

- **Subscales**: For scales containing distinct subscales, other reliability measures (e.g., omega) can better separate variance common to all items from variance unique to individual subscales.
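To make the two formulas concrete, here is a minimal sketch in plain Python (standard library only; the data and the helper names `spearman_brown` and `cronbach_alpha` are illustrative, not from the course). It reproduces the K / mean-r reliability table above and computes alpha for a hypothetical 3-item scale.

```python
def spearman_brown(k, mean_r):
    """Reliability of a k-item composite, given the mean inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Reproduce the K / mean-r / reliability table:
for k, r in [(5, 0.10), (5, 0.25), (20, 0.10), (20, 0.25)]:
    print(f"k = {k:2d}, mean r = {r:.2f} -> reliability = {spearman_brown(k, r):.2f}")
# 0.36, 0.63, 0.69, 0.87 -- matching the table above.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores across respondents."""
    k = len(items)
    scale_scores = [sum(scores) for scores in zip(*items)]  # total score per person
    sum_item_var = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_var / variance(scale_scores))

# Hypothetical 3-item scale answered by 5 respondents:
items = [[4, 3, 5, 2, 4],
         [5, 3, 4, 2, 5],
         [4, 2, 5, 3, 4]]
print(f"alpha = {cronbach_alpha(items):.2f}")  # high, since the items covary strongly
```

Note the connection: when all items have equal variances and equal intercorrelations, alpha and the Spearman-Brown value coincide; alpha is the more general estimate computed from one administration of the scale.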
Understand validity (& how different from reliability) and its three kinds (content; construct [both convergent and discriminant]; criterion)

**Validity** refers to how well a measurement assesses the intended characteristic. While high reliability means consistency, validity ensures that we are measuring the correct characteristic.

- High reliability is required for validity, but reliability alone does not guarantee validity (the measurement could be consistently measuring the wrong characteristic).

**Types of Validity**

1. **Content Validity**
   - **Purpose**: Ensures that the measurement items relate to all aspects of the intended characteristic and not to other, unrelated characteristics.
   - **Examples**:
     - A sports questionnaire should exclude non-sport-related questions.
     - An impulsivity scale should capture only the intended aspect of impulsivity.
   - **Evaluation**: Typically assessed qualitatively by comparing items to the intended characteristic's aspects (not often quantified).
2. **Construct Validity**
   - **Purpose**: Determines if scores correlate as expected with other measurements.
   - **Components**:
     - **Convergent Validity**: Scores on the measurement correlate strongly (positively or negatively) with other measurements of the same or opposite characteristic.
       - Example: Sibling reports and self-reports on friendliness should correlate well.
     - **Discriminant Validity**: Scores should show weak correlations with measurements of different, unrelated characteristics.
       - Example: Self-ratings of intelligence might unintentionally capture traits of humility or self-esteem. Weak correlations with humility or self-esteem measures would support discriminant validity, showing that the intelligence rating isn't "contaminated" by these other factors.
   - **Evaluation**: Often involves comparison with measurements based on different methods (e.g., self-reports with observer ratings).
3. **Criterion Validity**
   - **Purpose**: Examines if the measurement correlates with important outcome variables (criteria), often for practical purposes like selection or diagnosis.
   - **Examples**:
     - Admissions test scores correlating with GPA.
     - Psychopathy assessments correlating with re-offending rates.
   - **Notes**: Even a modest correlation in criterion validity can be useful in predictions.

**Additional Validity Types in Experimental Contexts**

- **Internal Validity**: An experiment has internal validity if the independent variable is the only factor that can influence the dependent variable.
  - **Example 1**: In a study examining the effect of a new teaching method on student test scores, internal validity is achieved if only the teaching method varies between groups. Controlling for other factors like classroom environment, teacher experience, or study time ensures that differences in scores can be attributed solely to the teaching method.
  - **Example 2**: In a clinical trial testing a medication's effectiveness on blood pressure, high internal validity means all participants follow the same diet and activity level, so changes in blood pressure can be directly linked to the medication rather than lifestyle differences.
- **External Validity**: Refers to whether an experiment's findings generalize to other settings, including real-world scenarios beyond the lab.
  - **Example 1**: If a study on memory techniques conducted with college students in a lab setting shows strong results, external validity is tested by applying these techniques to older adults or high school students in everyday settings to see if they still improve memory.
  - **Example 2**: An experiment on stress reduction strategies may show effectiveness in a controlled setting with minimal distractions. To ensure external validity, researchers might examine whether these strategies work in more variable, high-stress real-life environments, such as workplaces or schools.
Understand the idea of the "multi-trait multi-method matrix" as a way of examining construct (i.e., convergent and discriminant) validity and the problem of method variance

The "multi-trait multi-method matrix" (MTMM) is a systematic approach to assessing construct validity, specifically *convergent* and *discriminant* validity, while controlling for the effects of *method variance* (differences that arise from using various measurement methods). This approach helps ensure that a measurement tool accurately captures the intended trait and is not simply reflecting the method used to measure it.

**Components of the MTMM Matrix**

1. **Multiple Traits**: Different characteristics or constructs being measured (e.g., friendliness, intelligence).
2. **Multiple Methods**: Different ways of measuring each trait (e.g., self-reports, observer ratings).

**How MTMM Assesses Construct Validity**

1. **Convergent Validity**:
   - **Goal**: Show that measurements of the same trait by different methods correlate strongly.
   - **Example**: If friendliness is measured by both a self-report and an observer rating, these scores should correlate well, demonstrating that the methods are capturing the same underlying trait.
2. **Discriminant Validity**:
   - **Goal**: Show that measurements of different traits by the same or different methods do not correlate strongly.
   - **Example**: Friendliness measured by self-report should not correlate highly with intelligence measured by self-report, indicating the traits are distinct.

**Method Variance and the MTMM Matrix**

- **Method Variance Problem**: Variance due to the method used (rather than the trait) can make it seem as if measurements are correlated, even if they're not measuring the same trait.
  - **Example**: Self-reports may show artificially high correlations across different traits (e.g., friendliness and intelligence) due to shared method variance.
- By organizing data in the MTMM matrix, you can assess:
  - High correlations between same-trait, different-method pairs (good convergent validity).
  - Low correlations between different-trait, same-method pairs (good discriminant validity).

**Construct Validity**

**Aspects of Construct Validity:**

1. **Convergent Validity:** Assesses whether a measure correlates strongly with different measures of the same construct, including strong negative correlations with measures of opposite constructs.
2. **Discriminant Validity:** Examines whether a measure correlates weakly (near zero) with measures of different, theoretically unrelated constructs.

**Importance of Construct Validity:**

- Construct validity may be poor if a measurement reflects the method of measurement more than the trait or characteristic being assessed.
- **Examples:**
  - Does a multiple-choice course test measure achievement in the current course or just multiple-choice test-taking ability?
  - Does a self-report personality scale measure its intended trait or just a tendency to provide socially desirable responses?

**Method for Assessing Construct Validity:**

- To investigate these concerns, you can measure two or more traits using two or more methods. This allows for the evaluation of convergent and discriminant correlations in a **multi-trait, multi-method matrix** (see the sketch below, followed by two worked examples).
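Before the worked examples, here is a toy simulation (hypothetical data and variable names; plain Python, not course code) of the MTMM logic. Each score is modeled as trait + method bias + noise, so same-trait/different-method correlations come out high (convergent), while different-trait/same-method correlations stay low but not zero, the nonzero part being exactly shared method variance.

```python
import random

random.seed(1)

def corr(x, y):
    """Pearson r between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

n = 500
# Latent trait levels (independent of each other) and method biases:
friendly  = [random.gauss(0, 1) for _ in range(n)]
intell    = [random.gauss(0, 1) for _ in range(n)]
self_bias = [random.gauss(0, 0.6) for _ in range(n)]  # shared by all self-reports
obs_bias  = [random.gauss(0, 0.6) for _ in range(n)]  # shared by all observer reports

noise = lambda: random.gauss(0, 0.5)
m = {
    "friendly_self": [t + b + noise() for t, b in zip(friendly, self_bias)],
    "friendly_obs":  [t + b + noise() for t, b in zip(friendly, obs_bias)],
    "intell_self":   [t + b + noise() for t, b in zip(intell, self_bias)],
    "intell_obs":    [t + b + noise() for t, b in zip(intell, obs_bias)],
}

print("convergent   r(friendly_self, friendly_obs):", round(corr(m["friendly_self"], m["friendly_obs"]), 2))
print("convergent   r(intell_self,   intell_obs):  ", round(corr(m["intell_self"], m["intell_obs"]), 2))
print("discriminant r(friendly_self, intell_self): ", round(corr(m["friendly_self"], m["intell_self"]), 2))
print("discriminant r(friendly_obs,  intell_obs):  ", round(corr(m["friendly_obs"], m["intell_obs"]), 2))
# Expected pattern: convergent r around .6 (high), same-method discriminant r
# around .2 (low); that residual .2 is the method variance the MTMM exposes.
```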
**Example: High-School Testing**

- Suppose high-school students are tested in Science and Social Studies using both a multiple-choice test and a written-answer test.
- Question to consider: Does performance reflect course-related knowledge or just proficiency in that testing format?
- **Expected Outcomes:** If the course content is more significant than the type of test, then:
  - **Convergent correlations** between multiple-choice (MC) and written (Writ) tests for the same course should be high:
    - High r between Sci MC & Sci Writ.
    - High r between SS MC & SS Writ.
  - **Discriminant correlations** between tests of different subjects should be lower:
    - Lower r between Sci MC & SS MC.
    - Lower r between Sci Writ & SS Writ.
- If different types of tests measuring the same course do not correlate much, this indicates poor convergent validity.
- If tests of the same type measuring different courses correlate too highly, this suggests poor discriminant validity.

**Additional Example: Personality Traits**

- Suppose individuals are measured for the personality traits of Extraversion (X) and Conscientiousness (C) using both self-report and a spouse's observer report.
- Question to consider: Do scores reflect actual trait levels or just tendencies for self-reports or spouse reports to describe the individual favorably?
- **Expected Outcomes:** If the trait is more important than the reporting person:
  - **Convergent correlations** between self-reports and observer reports for the same trait should be high:
    - High r between self-report X & observer report X.
    - High r between self-report C & observer report C.
  - **Discriminant correlations** between Extraversion and Conscientiousness as measured by the same report type should be lower:
    - Lower r between self-report X & self-report C.
    - Lower r between observer report X & observer report C.
- If different types of reports measuring the same trait do not correlate much, this indicates poor convergent validity.
- If reports of the same kind measuring different traits correlate too highly, this suggests poor discriminant validity.

SIGNIFICANCE TESTS AND EFFECT SIZES

- If it would be very unlikely, were H0 true, to get results this extreme in a certain direction in a sample of this size, then you reject H0 and conclude that HA is supported.
- But if it is not so unlikely, then you do not reject H0, and you conclude that HA is not supported.
- A criminal trial starts with the H0 that the accused is innocent → is the evidence strong enough to reject H0?

Understand what p-values (significance level) mean

- Significance level (p) → how likely would a result this extreme be if H0 were true?
- How unlikely is "very unlikely"? By convention, usually p < .05.

Understand power and how to increase it

- Larger samples make it easier to detect small effects at a significance level of p < .05.
- When measurements are more reliable (i.e., less error variance), the true effect becomes clearer, which can enhance power.
- Use stronger or more extreme interventions, such as higher doses or opposite groups, to make the effect more detectable.

**Problems with p-values**

- P-values can vary dramatically from one sample to another, so do not overinterpret differences between p-values.
- For studies with power = .50, you will sometimes get p < .001 in one sample and p > .50 in another.
- For example, p = .03 vs. p = .09 is actually a very small difference, even though the values are on opposite sides of the usual alpha = .05 cut-off.
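The p-value instability described above is easy to see in a simulation. Here is a small sketch (standard library only; the sample size, effect size, and seed are arbitrary choices, and a z-test with known sd = 1 stands in for the usual t-test). With two groups of n = 50 and a true effect of d = 0.39, power is close to .50:

```python
import math
import random

random.seed(3)

def two_tailed_p(n, d):
    """Draw two samples (sd = 1, true mean difference = d); return the z-test p-value."""
    a = [random.gauss(d, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = diff / math.sqrt(2 / n)               # SE of a difference of means, sd = 1
    return math.erfc(abs(z) / math.sqrt(2))   # two-tailed normal p-value

ps = sorted(two_tailed_p(50, 0.39) for _ in range(1000))
print(f"smallest p: {ps[0]:.4f}, largest p: {ps[-1]:.3f}")
print(f"proportion significant (power): {sum(p < .05 for p in ps) / len(ps):.2f}")
# Typical run: p ranges from below .001 to near 1.0 while power sits near .50,
# so the same true effect can yield p = .03 in one sample and p = .09 in another.
```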
**Confidence Intervals (CIs)**

- CIs use **sample effect sizes** to estimate a range where the **population effect size** likely falls, often with **95% confidence**.

**Interpreting a 95% CI**:

- A 95% CI implies that if we took many random samples of the same size, **95% of those intervals would contain the true population value**.
- **Note**: This holds true only if each sample is randomly selected. If you choose a sample with a large effect size, the CI might **overestimate** the population effect.

**Impact of Sample Size on CI Precision**:

- **Small Samples**: Lead to **wider CIs**, providing a less precise estimate of the population effect size.
- **Large Samples**: Lead to **narrower CIs**, giving a more accurate estimate of the population effect size.

**Advantages of CIs and Effect Sizes over p-values**:

- **Less Variation Across Samples**: CIs and effect sizes tend to **fluctuate less than p-values** between samples, making them more reliable.
- **Importance of Effect Size**: CIs provide insight into **how large or meaningful an effect is**, not just whether it is statistically significant.

**Common effect sizes**

- r = relation between two variables
- d = difference between two groups on one variable

**Correlation (r)**

- The correlation coefficient, Pearson's r, indicates how much two variables tend to "go together".
- It controls for the standard deviations of the variables, so the original units of measurement don't matter: you could find the correlation between totally different pairs of variables, and the correlation coefficient would still have the same meaning.

  r = Σ(zX · zY) / (n − 1)

**Range of r**: Possible values for r range from -1 to +1.

- **+1**: Perfect positive correlation: high scores on X are associated with high scores on Y.
- **0**: No correlation: no linear relationship between X and Y.
- **-1**: Perfect negative correlation: high scores on X are associated with low scores on Y.

**Intuitive Understanding**:

- When most people are above average (or below average) on both variables, r will be positive.
- If one variable is above average and the other below average for most people, r will be negative.

**Predicting One Variable from Another**

- If you know someone's **z-score** (standardized score) for variable X and the correlation rXY between X and Y, you can **estimate their z-score for Y** using the formula:

  zŶ = (zX)(rXY)

  where Ŷ is the estimated value and zŶ is the estimated z-score for Y, given zX and rXY.
- **Example**:
  - Suppose my z-score for "charm" is -2.0.
  - If the correlation between "charm" and "wit" is r = 0.5, then zŶ = (−2.0)(0.5) = −1.0.
  - This suggests a less extreme (closer to the mean) z-score for "wit" compared to "charm."

**Practical Implications**

- **Accuracy**: These predictions are generally accurate across a group of people. However, for any individual the estimate can be off unless r is very high, meaning people with similar zX scores can still have quite different zY scores.
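A minimal sketch of the prediction rule just described (`predict_zy` is an illustrative helper, not course code): standardize X, multiply by rXY, and the result is the predicted z-score on Y, which "regresses" toward the mean whenever |r| < 1.

```python
def predict_zy(z_x, r_xy):
    """Predicted z-score on Y, from a z-score on X and the correlation rXY."""
    return z_x * r_xy

# The charm/wit example from above:
print(predict_zy(-2.0, 0.5))   # -1.0: less extreme than the charm score

# Regression toward the mean: the weaker the correlation, the closer the
# prediction is pulled to the average (z = 0).
for r in (0.9, 0.5, 0.1):
    print(f"r = {r}: predicted zY for zX = -2.0 is {predict_zy(-2.0, r):+.1f}")
```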
**Group Difference (d): Cohen's d**

- For differences between the means of two groups, it is sometimes useful to control for standard deviations → how large is the difference between the two groups relative to the standard deviation?

  d = (x̄A − x̄B) / s

- s = standard deviation of each group, assumed to be the same (if it's not the same, average the two in some way).
- When the means of two groups are equal, Cohen's d is 0 because there is no difference to standardize. But as the difference between the means grows relative to the within-group standard deviation, d increases, and it can exceed +1 or -1. For example:
  - If the means differ by one standard deviation unit, d will be approximately 1 (or -1 for a negative difference).
  - If the difference is greater than one standard deviation, d will exceed +1 or -1, indicating a large effect size.

Is this effect small, medium, or large?

| Effect size   | r   | d    |
|---------------|-----|------|
| Small effect  | .10 | 0.20 |
| Medium effect | .30 | 0.50 |
| Large effect  | .50 | 0.80 |

...but this depends on the variables being measured.

Explain whether you can increase power without increasing type I error rate (alpha).

- Can we keep alpha constant while increasing power? Yes: you can increase the sample size, so the confidence intervals become narrower and you can reject H0 even with a small effect.
- You can also increase the reliability of the measurements.

Explain whether you can increase power without increasing type II error (beta).

- Power = 1 − beta, so power and type II error always sum to 1: whenever power goes up, beta goes down. They have an opposite (inverse) relationship, so increasing power never increases type II error; it necessarily decreases it.

Understand how p-values depend on sample size and effect size (i.e., what is needed to find a low p-value).

- If a small sample yields a low p-value, the effect size in that sample must have been large; with a large sample, even a small effect can produce a low p-value.

SAMPLING

Understand why representative samples matter

- A representative sample reflects the characteristics of the population, so that findings can be generalized to that population.
- **Random Sampling:** A random sample ensures that every member of the population has an equal chance of being included, which enhances the likelihood that the sample will reflect the characteristics of the population.
- **Consequences of Non-Representative Samples:** Non-representative samples can lead to biased results, where certain groups are over- or underrepresented, potentially skewing conclusions and recommendations based on the data.
- In summary, the representativeness of a sample is crucial for ensuring that research findings are valid and applicable to the broader population.

**Random Sampling Methods**

Researchers aim to obtain representative samples by employing various random sampling methods (a sketch of the stratified approach follows this list). Here are the main types:

1. **Simple Random Sampling:**
   - Definition: Every individual in the population has an equal chance of being selected.
   - Goal: To achieve a representative sample by minimizing selection bias.
   - Process: Researchers may follow up on "missed" individuals to ensure that the sample is comprehensive.
2. **Stratified Random Sampling:**
   - Definition: The population is divided into distinct subpopulations (strata), and simple random sampling is conducted within each stratum.
   - Goal: To ensure that all categories (e.g., age, sex, ethnicity) are proportionately represented in the sample.
   - Example: Public opinion surveys may sample individuals from various age groups to accurately capture diverse perspectives.
3. **Cluster Sampling:**
   - Definition: If a complete list of the population is not available, researchers sample entire clusters (subgroups) and then randomly select individuals within those clusters.
   - Goal: To simplify the sampling process when only subgroups (e.g., census districts, school boards) are accessible.
   - Process: Researchers take a random sample of clusters based on their size and subsequently conduct simple random sampling within those clusters.
4. **Multi-Stage Sampling:**
   - Definition: A more complex form of cluster sampling, where clusters themselves may be subdivided into smaller clusters, and random samples are taken at multiple stages.
   - Goal: To enhance the efficiency of sampling in large or complex populations.
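Here is the promised sketch of stratified random sampling (a hypothetical population and illustrative helper name `stratified_sample`; plain Python): split the population into strata and sample each stratum in proportion to its size, so every category is represented.

```python
import random

random.seed(7)

# Hypothetical population of 1,000 people tagged by age group:
population = ([("18-34", i) for i in range(450)] +
              [("35-54", i) for i in range(350)] +
              [("55+",   i) for i in range(200)])

def stratified_sample(pop, n):
    """Draw n people, allocating draws to each stratum proportionally."""
    strata = {}
    for group, person in pop:
        strata.setdefault(group, []).append(person)
    sample = []
    for group, members in strata.items():
        k = round(n * len(members) / len(pop))   # proportional allocation
        sample += [(group, p) for p in random.sample(members, k)]
    return sample

sample = stratified_sample(population, 100)
for group in ("18-34", "35-54", "55+"):
    print(group, sum(1 for g, _ in sample if g == group))
# Output: 45, 35, 20 -- the sample mirrors the population proportions exactly,
# whereas a simple random sample of 100 would only match them on average.
```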
Know how the confidence interval for means and for proportions varies according to sample size

**1. Impact of Sample Size on Results:**

- Even with a random sample, smaller samples may yield results that are significantly different from the population due to chance.
- Larger samples provide better estimates: the sampling error (expressed by the width of confidence intervals) decreases in proportion to the square root of the sample size.

**2. Size vs. Ratio:**

- For large populations, the absolute size of the sample is more important than the ratio of the sample size to the population size.
- Example: Surveying 2,000 Americans from a population of 330 million provides more reliable insights than surveying 1,000 Canadians from a population of 39 million, despite the latter being a higher proportion of its population.

**3. Effect of Sample Size on Confidence Intervals (Proportions):**

- When estimating a population proportion, the confidence interval narrows as the sample size increases:
  - **N = 100:** 95% CI = p ± 0.100
  - **N = 1,000:** 95% CI = p ± 0.032
  - **N = 10,000:** 95% CI = p ± 0.010
- A sample of 1,000 provides results accurate to within about 3 percentage points 19 times out of 20, and even more accurate when the proportion (p) is far from 0.50.

**4. Estimating Population Means:**

- When estimating a population mean, the confidence intervals also become narrower with larger samples:
  - **N = 25:** 95% CI = x̄ ± 0.40 sd
  - **N = 100:** 95% CI = x̄ ± 0.20 sd
  - **N = 400:** 95% CI = x̄ ± 0.10 sd
  - **N = 1,600:** 95% CI = x̄ ± 0.05 sd
- As the sample size increases, the standard error of the mean decreases, resulting in greater accuracy.

**5. Confidence Intervals for Differences Between Means:**

- When estimating differences between means across samples, confidence intervals are wider because each sample carries its own error.

**6. Oversampling for Small Subpopulations:**

- In cases where a researcher wants to study a small subpopulation, but random sampling doesn't yield a large enough sample, they can "oversample" that group.
- After collecting data, researchers can adjust their estimates for the whole population by underweighting the oversampled group.
- **Example**: If a study is investigating a health condition that primarily affects a small percentage of the population (e.g., a rare disease), researchers may oversample individuals from that group to ensure there are enough cases to analyze effectively.
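A quick sketch (standard library only; the helper names are illustrative) reproducing the CI half-widths in sections 3 and 4 above. For a proportion near .50, the 95% half-width is about 1.96·√(p(1−p)/N), roughly 1/√N; for a mean, it is about 1.96·sd/√N, roughly 2 sd/√N.

```python
import math

def proportion_halfwidth(n, p=0.5):
    """Approximate 95% CI half-width for a sample proportion."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

def mean_halfwidth(n, sd=1.0):
    """Approximate 95% CI half-width for a sample mean, in sd units."""
    return 1.96 * sd / math.sqrt(n)

for n in (100, 1_000, 10_000):
    print(f"proportion, N = {n:>6}: p ± {proportion_halfwidth(n):.3f}")
for n in (25, 100, 400, 1_600):
    print(f"mean,       N = {n:>6}: x̄ ± {mean_halfwidth(n):.2f} sd")
# Approximately matches the tables: ± 0.098, 0.031, 0.010 for proportions and
# ± 0.39, 0.20, 0.10, 0.05 sd for means. Note the square-root law: quadrupling
# N only halves the interval width.
```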
**Presenting Descriptive Data in Research**

When presenting descriptive data in research, even if the study is not primarily descriptive, it's beneficial to report descriptive statistics and frequency information, such as graphs and tables. This information should be accurate, concise, and easy to understand. Frequency distributions can be shown in tables or graphs if they are inherently interesting, while less interesting distributions should still be examined for any unusual patterns. Typically, reporting key descriptive statistics, like means and standard deviations, in a table is advisable for clarity.

CORRELATIONAL RESEARCH

**Correlational Studies**

- Researchers examine how much two or more variables "covary" or correlate with each other.
- These studies can take place in natural or artificial settings, but there is no manipulation of variables.
- Larger samples give a more accurate indication of the correlation in the population; the confidence interval shrinks with the square root of the sample size:
  - N = 25 → 95% CI = r ± 0.40
  - N = 100 → 95% CI = r ± 0.20
  - N = 400 → 95% CI = r ± 0.10
  - N = 1,600 → 95% CI = r ± 0.05
- When r is far from 0, these CIs become narrower and asymmetrical.
- The numbers above also correspond to the critical values beyond which a correlation will be statistically significant in a two-tailed test with alpha = .05. E.g., for N = 100, p < .05 whenever the sample r is farther from 0 than about ±.20.
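A closing sketch (standard library only) of the approximation behind the list above, assuming the rough r ± 2/√N rule of thumb that the listed values follow when r is near 0; the same quantity serves as the approximate two-tailed .05 critical value.

```python
import math

def r_halfwidth(n):
    """Approximate 95% half-width (and critical r) for a correlation, r near 0."""
    return 2 / math.sqrt(n)

for n in (25, 100, 400, 1_600):
    print(f"N = {n:>5}: 95% CI = r ± {r_halfwidth(n):.2f}")
# N = 25 -> ±0.40, N = 100 -> ±0.20, N = 400 -> ±0.10, N = 1600 -> ±0.05,
# matching the list above; e.g., with N = 100, any |r| > .20 reaches p < .05.
```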