Midterm Psychological Statistics PDF
Summary
This document covers psychological statistics, specifically hypothesis testing, effect size, and types of errors. It discusses the variables involved in research, scientific guesses about relationships between variables, the four steps of a hypothesis test, and examples. It also addresses errors in hypothesis testing, namely Type I and Type II errors.
Full Transcript
MODULE 2 MIDTERM: Psychological Statistics
This reviewer is NOT FOR SALE.

Lesson 6: Introduction to Statistical Inference: Hypothesis Testing, Effect Size and Types of Errors

Hypothesis Testing

Hypothesis
- Identifies the specific variables involved in the study/research and describes how they are related.
- A scientific guess about the relationship between two (or more) variables.

Hypothesis test
- A statistical method that uses sample data to evaluate a hypothesis about a population.

Hypothesis Testing
- The application of statistical analysis and logic to gathered data to determine the answer to a research question.

The Four Steps of a Hypothesis Test
Step 1: State the hypothesis.
Step 2: Set the criteria for a decision.
Step 3: Collect data and compute sample statistics.
Step 4: Make a decision.

The alpha level, or the level of significance, is a probability value that is used to define the concept of “very unlikely” in a hypothesis test.

The critical region is composed of the extreme sample values that are very unlikely (as defined by the alpha level) to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected.

Boundaries for the Critical Region
- To determine the exact location for the boundaries that define the critical region, we use the alpha-level probability and the unit normal table.

Comparison of z-Score for Population and for Samples

Z-score for Population
- Every score (X) has a z-score that describes its position in the distribution of scores.

Z-score for Sample
- Every sample mean (M) has a z-score that describes its position in the distribution of sample means.

Example
1. A sample of n = 4 scores is selected from a normal distribution with a mean of μ = 40 and a standard deviation of σ = 16. The sample mean is M = 42. Find the z-score for this sample mean and determine the probability of obtaining a sample mean larger than M = 42. That is, find p(M > 42) for n = 4.

Errors in Hypothesis Testing

Type I Error (False Positive)
- Occurs when a researcher rejects a null hypothesis that is actually true. In a typical research situation, a Type I error means the researcher concludes that a treatment does have an effect when in fact it has no effect.

Type II Error (False Negative)
- Occurs when a researcher fails to reject a null hypothesis that is really false. In a typical research situation, a Type II error means that the hypothesis test has failed to detect a real treatment effect.
- The probability of a Type II error is represented by the symbol β (beta).

EXTRA INFO FOR Errors in hypothesis testing

Type I error
- no virus but result is positive = false positive
- What was correct was treated as wrong.

Type II error
- with virus but result is negative = false negative
- What was wrong was treated as correct.

Assumptions for Hypothesis Tests with z-Scores

Random Sampling
- Helps to ensure that the sample is representative.

Independent Observations
- Two events (or observations) are independent if the occurrence of the first event has no effect on the probability of the second event.

The Value of σ Is Unchanged by the Treatment
- Recall that adding (or subtracting) a constant changes the mean but has no effect on the standard deviation.

Normal Sampling Distribution
- The z-table (unit normal table) can be used only if the distribution of sample means is normal.

Directional (One-Tailed) Hypothesis Tests

Directional hypothesis test, or a one-tailed test
- The statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean.

Reporting the Results of the Statistical Test

Statistical tests are used in hypothesis testing.

Result
- Is said to be significant or statistically significant if it is very unlikely to occur when the null hypothesis is true. That is, the result is sufficient to reject the null hypothesis.
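As a rough check on the worked example in this lesson (n = 4, μ = 40, σ = 16, M = 42), the sketch below computes the standard error σM = σ / √n, the z-score for the sample mean, and p(M > 42). It then compares z with two-tailed critical boundaries to show how a significance decision would be made. Python's standard-library `statistics.NormalDist` stands in for the unit normal table. The function name `z_for_sample_mean` and the choice of α = .05 are assumptions made for illustration and do not come from the reviewer.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(M, mu, sigma, n):
    """Return the z-score of a sample mean: z = (M - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)  # sigma_M, the standard error of M
    return (M - mu) / standard_error


# Values from the worked example: n = 4, mu = 40, sigma = 16, M = 42.
z = z_for_sample_mean(M=42, mu=40, sigma=16, n=4)

# Probability of a sample mean larger than 42 (upper tail of the normal curve).
p_larger = 1 - NormalDist().cdf(z)

# Assumed alpha level; a two-tailed test puts alpha / 2 in each tail.
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# The null hypothesis is rejected only if z falls in the critical region.
in_critical_region = abs(z) > z_critical

print(f"z = {z:.2f}")                      # z = 0.25
print(f"p(M > 42) = {p_larger:.4f}")       # p(M > 42) = 0.4013
print(f"critical z = ±{z_critical:.2f}")   # critical z = ±1.96
print(f"reject H0: {in_critical_region}")  # reject H0: False
```

A sample mean only a quarter of a standard error above μ is far from the critical region, so under these assumed settings the result would not be reported as significant.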
Significant Result
- In statistical tests, a significant result means that the null hypothesis has been rejected, which means that the result is very unlikely to have occurred merely by chance.

Significance
- Is a term that tells us how confident we can be that a difference or relationship exists.

Factors That Influence a Hypothesis Test

The Variability of the Scores
- Higher variability can reduce the chances of finding a significant treatment effect by reducing the z-score value.

The Number of Scores in the Sample
- Increasing the number of scores in the sample produces a smaller standard error and a larger value for the z-score.

Two serious limitations with using a hypothesis test to establish the significance of a treatment effect:
1. The focus of a hypothesis test is on the data rather than the hypothesis.
2. Demonstrating a significant treatment effect does not necessarily indicate a substantial treatment effect.

Measuring Effect Size

Measure of effect size
- Is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used.
- Cohen (1988) recommended that effect size can be standardized by measuring the mean difference in terms of the standard deviation: Cohen’s d = mean difference / standard deviation.

Cohen’s (1988) criteria for evaluating the size of a treatment effect
- d = 0.2: small effect
- d = 0.5: medium effect
- d = 0.8: large effect

The Power of a Statistic

Power (the power of a statistical test)
- Is the probability that the test will correctly reject a false null hypothesis.
- Using a larger sample increases power.
- Parametric tests have more power than non-parametric tests.

Calculating the Power of a Statistic
- Helps you to determine if your sample size is large enough.

Power and Effect Size
- Consider what would happen if the treatment effect were only 4 points: a smaller treatment effect shifts the distribution of sample means less, so the power of the test is lower.
- Measures of effect size such as Cohen’s d and measures of power both provide an indication of the strength or magnitude of a treatment effect.

Other Factors that Affect Power

Sample Size
- A larger sample produces greater power for a hypothesis test.

Alpha Level
- Reducing the alpha level for a hypothesis test also reduces the power of the test.

One-Tailed vs. Two-Tailed Tests
- If the treatment effect is in the predicted direction, then changing from a regular two-tailed test to a one-tailed test increases the power of the hypothesis test.

The t Statistic

Limitation of z-scores
- The z-score requires that we know the value of the population standard deviation (or variance), which is needed to compute the standard error.
- In most situations, however, the standard deviation for the population is not known.
- When the variance (or standard deviation) for the population is not known, we use the corresponding sample value in its place.

Estimated Standard Error (sM)
- Is used as an estimate of the real standard error σM when the value of σ is unknown.

t statistic
- Is used to test hypotheses about an unknown population mean, μ, when the value of σ is unknown.

Formulas:
- sM = s / √n = √(s² / n)
- t = (M − μ) / sM

Degrees of freedom
- Describes the number of scores in a sample that are independent and free to vary. For a single sample, df = n − 1.

A t distribution
- Is the complete set of t values computed for every possible random sample of a specific sample size (n), that is, for a specific number of degrees of freedom (df).

As the value for df increases, the t distribution becomes more similar to a normal distribution.

The Influence of Sample Size and Sample Variance
- The estimated standard error is directly related to the sample variance: the larger the variance, the larger the error.
- The estimated standard error is inversely related to the number of scores in the sample: the larger the sample, the smaller the error.

Assumptions of the t Test
1. The values in the sample must consist of independent observations.
2. The population sampled must be normal.
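The t statistic described in this part of the lesson replaces the unknown σ with the sample standard deviation. A minimal sketch of that calculation follows. The scores, the hypothesized mean of 10, and the helper name `one_sample_t` are hypothetical and chosen only so the arithmetic is easy to follow. The critical value 2.306 is the standard t-table entry for df = 8 with a two-tailed α = .05.

```python
from math import sqrt


def one_sample_t(scores, mu):
    """Return (t, df, estimated standard error) for a single-sample t test."""
    n = len(scores)
    M = sum(scores) / n                     # sample mean
    SS = sum((x - M) ** 2 for x in scores)  # sum of squared deviations
    df = n - 1                              # degrees of freedom
    s2 = SS / df                            # sample variance
    s_M = sqrt(s2 / n)                      # estimated standard error
    t = (M - mu) / s_M
    return t, df, s_M


# Hypothetical scores (not from the reviewer) and a hypothesized mean of 10.
scores = [9, 9, 11, 13, 13, 13, 15, 17, 17]
t, df, s_M = one_sample_t(scores, mu=10)

# t-table value for df = 8 with a two-tailed alpha of .05.
t_critical = 2.306
reject_h0 = abs(t) > t_critical

print(f"t({df}) = {t:.2f}, estimated standard error = {s_M:.2f}")
# t(8) = 3.00, estimated standard error = 1.00
print(f"reject H0: {reject_h0}")  # reject H0: True
```

Doubling the spread of these scores would quadruple the sample variance and double the estimated standard error, which halves t. This is the direct relationship between variance and error noted in the lesson.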
Measuring Effect Size for the t Statistic

Estimated Cohen’s d
- estimated d = mean difference / sample standard deviation = (M − μ) / s

Measuring the Percentage of Variance
- An alternative method for measuring effect size is to determine how much of the variability in the scores is explained by the treatment effect.
- The concept behind this measure is that the treatment causes the scores to increase (or decrease), which means that the treatment is causing the scores to vary.
- If we can measure how much of the variability is explained by the treatment, we will obtain a measure of the size of the treatment effect.

The Percentage of Variance Accounted for, r²
- r² = t² / (t² + df)

Confidence Intervals for Estimating μ

Confidence interval
- An interval, or range of values, centered around a sample statistic. The logic behind a confidence interval is that a sample statistic, such as a sample mean, should be relatively near to the corresponding population parameter.

T-Test for Related Samples

Basic Elements of the t Statistic
- t = (MD − μD) / sMD, where sMD is the estimated standard error for the mean difference and df = n − 1.

Assumptions of the Related-Samples t Test
1. Independent observations (within treatment).
2. The population distribution of difference scores (D values) must be normal.

Effect Size and Confidence Intervals for the Repeated-Measures/Related Samples t

Effect Size
- Calculating Cohen’s d: estimated d = MD / s
- Measuring the Percentage of Variance Explained, r²: r² = t² / (t² + df)

Experiment: Swearing and pain tolerance

Hypothesis Test

Step 1: State the hypotheses, and select the alpha level.

Two-tailed
H0: μD = 0 (There is no difference between the two conditions.)
H1: μD ≠ 0 (There is a difference.)
α = .05

One-tailed
H0: μD ≥ 0 (There is no decrease with swearing.)
H1: μD < 0 (Pain ratings decrease.)
α = .01

Step 2: Locate the critical region.

For the two-tailed test, n = 9, so the t statistic has df = n − 1 = 8. For α = .05, the critical value listed in the t distribution table is ±2.306.

For the one-tailed test, the change in this example is in the predicted direction (the researcher predicted lower ratings and the sample mean shows a decrease).
With n = 9, we obtain df = 8 and a critical value of t = 2.896 for a one-tailed test with α = .01. Thus, any t statistic beyond 2.896 (positive or negative) is sufficient to reject the null hypothesis.

Step 3: Calculate the t-statistic.
1. Compute the sample variance.
2. Use the sample variance to compute the estimated standard error.
3. Use the sample mean (MD) and the hypothesized population mean (μD) along with the estimated standard error to compute the value for the t statistic.

Step 4: Make a decision.
Two-tailed: Reject H0
One-tailed: Reject H0

Analysis of Variance

Analysis of variance (ANOVA)
- Is a hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments (or populations).

The major advantage of ANOVA is that it can be used to compare two or more treatments, whereas a t test is limited to comparing two.

Basic Structure of ANOVA

A typical situation in which ANOVA would be used: three separate samples are obtained to evaluate the mean differences among three populations (or treatments) with unknown means.

Factorial ANOVA
- In analysis of variance, the variable (independent or quasi-independent) that designates the groups being compared is called a factor.
- The individual conditions or values that make up a factor are called the levels of the factor.

Testwise alpha level
- Is the risk of a Type I error, or alpha level, for an individual hypothesis test.

Experimentwise alpha level
- Is the total probability of a Type I error accumulated from all of the individual tests in the experiment.

Why ANOVA instead of t-Test?

The advantage of ANOVA is that it performs all three comparisons simultaneously in one hypothesis test. Thus, no matter how many different means are being compared, ANOVA uses one test with one alpha level to evaluate the mean differences and thereby avoids the problem of an inflated experimentwise alpha level.

Statistical Hypotheses for ANOVA

In statistical terms, we want to decide between two hypotheses:
H0: The telephone condition has no effect.
H1: The telephone condition does affect driving.

Components of ANOVA

Analysis of Variance
- Divides the total variability into two basic components:
1. Between-Treatments Variance. Measures how much difference exists between the treatment conditions.
2. Within-Treatment Variance. Provides a measure of the variability inside each treatment condition, or how big the differences are when H0 is true.

For the independent-measures ANOVA, the F-ratio has the following structure:
F = variance between treatments / variance within treatments

Variance in the Denominator
- Measures the mean differences that would be expected if there is no treatment effect.

ANOVA Notation and Formulas

k = number of treatment conditions, that is, the number of levels of the factor. k = 3.
n = number of scores in each treatment. n = 5 for all the treatments.
N = total number of scores in the entire study. N = 3(5) = 15 scores in the entire study.
T = sum of the scores (ΣX) for each treatment condition.
G = sum of all the scores in the research study (the grand total).
SS and M for each sample.
ΣX² for the entire set of N scores.

The Structure and Sequence of Calculations for the ANOVA

Analysis of Sum of Squares (SS)
- Partitioning the sum of squares (SS) for the independent-measures ANOVA: SS total = SS between treatments + SS within treatments.

ANOVA Summary Table

Post hoc tests (or posttests)
- Are additional hypothesis tests that are done after an ANOVA to determine exactly which mean differences are significant and which are not.

Tukey’s Honestly Significant Difference (HSD) Test
- Allows you to compute a single value that determines the minimum difference between treatment means that is necessary for significance.

Honestly significant difference, or HSD
- This value is used to compare any two treatment conditions.
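To make the idea of a single minimum difference concrete, the sketch below applies the usual equal-sample-size form of Tukey's test, HSD = q · √(MS within / n). Every number here is hypothetical: the treatment means, MS within, and n are invented for illustration, and q = 3.77 is assumed to be the Studentized range table value for k = 3 treatments, df within = 12, and α = .05. Check q against a table before relying on it.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(q, ms_within, n):
    """Return Tukey's HSD = q * sqrt(MS_within / n) for equal sample sizes."""
    return q * sqrt(ms_within / n)


# Hypothetical study: k = 3 treatments, n = 5 scores each, df_within = 12.
treatment_means = {"A": 1.0, "B": 4.0, "C": 1.0}
ms_within = 16 / 12  # hypothetical MS_within (SS_within = 16, df_within = 12)
q = 3.77             # assumed table value for k = 3, df = 12, alpha = .05

hsd = tukey_hsd(q, ms_within, n=5)

# Any pair of means that differs by more than HSD is significantly different.
significant_pairs = [
    (a, b)
    for a, b in combinations(treatment_means, 2)
    if abs(treatment_means[a] - treatment_means[b]) > hsd
]

print(f"HSD = {hsd:.2f}")  # HSD = 1.95
print(significant_pairs)   # [('A', 'B'), ('B', 'C')]
```

With these made-up values, treatment B differs from both A and C by 3 points, which exceeds the HSD, while A and C do not differ at all.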
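The Scheffé Test
- Has the distinction of being one of the safest of all possible post hoc tests (smallest risk of a Type I error) because it uses an extremely cautious method for reducing the risk of a Type I error.

Assumptions for the Independent-Measures ANOVA
1. The observations within each sample must be independent.
2. The populations from which the samples are selected must be normal.
3. The populations from which the samples are selected must have equal variances (homogeneity of variance).

Summary of ANOVA Formulas
- SS total = ΣX² − G²/N, with df total = N − 1
- SS between treatments = Σ(T²/n) − G²/N, with df between = k − 1
- SS within treatments = ΣSS inside each treatment, with df within = N − k
- MS = SS / df for each source
- F = MS between / MS within

Computing Effect Size for ANOVA
- The data produced a between-treatments SS of 70 and a total SS of 116. Thus, η² = SS between treatments / SS total = 70/116 ≈ 0.60, so about 60% of the variance is accounted for by the treatment differences.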
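The ANOVA notation from this lesson (k, n, N, T, G, ΣX²) can be traced through a short sketch that partitions SS, forms the F-ratio, and computes the effect size η² = SS between / SS total. The scores below are hypothetical (three treatments with n = 5 each) and the function name `one_way_anova` is an assumption for illustration. The formulas are the standard independent-measures ones, not necessarily the exact layout the reviewer used.

```python
def one_way_anova(samples):
    """Independent-measures ANOVA for a list of treatment samples."""
    k = len(samples)                                  # number of treatments
    N = sum(len(s) for s in samples)                  # total number of scores
    G = sum(sum(s) for s in samples)                  # grand total
    sum_x2 = sum(x ** 2 for s in samples for x in s)  # sum of X^2 over all N

    ss_total = sum_x2 - G ** 2 / N
    # T = sum(s) is the treatment total; T^2 / n is summed over treatments.
    ss_between = sum(sum(s) ** 2 / len(s) for s in samples) - G ** 2 / N
    # SS inside each treatment, measured around that treatment's own mean.
    ss_within = sum(
        sum((x - sum(s) / len(s)) ** 2 for x in s) for s in samples
    )

    df_between, df_within = k - 1, N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "SS_total": ss_total,
        "SS_between": ss_between,
        "SS_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
        "eta_squared": ss_between / ss_total,
    }


# Hypothetical scores for k = 3 treatments, n = 5 each (N = 15).
samples = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]
result = one_way_anova(samples)

print(f"SS: between = {result['SS_between']:.0f}, "
      f"within = {result['SS_within']:.0f}, total = {result['SS_total']:.0f}")
# SS: between = 30, within = 16, total = 46
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
# F(2, 12) = 11.25
print(f"eta squared = {result['eta_squared']:.2f}")  # eta squared = 0.65
```

SS between and SS within add up to SS total, which is the partitioning the lesson describes. The same η² calculation applied to the reviewer's own figures (70 and 116) gives roughly 0.60.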