MM Tut 9 & 10 PDF
Maastricht University
Summary
This document contains tutorials on statistical concepts, including the better/worse-than-average effect, measures of central tendency, reliability, Six Sigma methods, and Null Hypothesis Significance Testing (NHST).
Full Transcript
# Tutorial 9

## Better/Worse-Than-Average Effect

- For tasks perceived as easy or familiar, people generally think they are above average in their abilities.
- For harder tasks, people might rate themselves as worse than average: they anchor on their perception of their absolute performance level and insufficiently adjust for the fact that the task is difficult for everyone.

## Diminishing Sensitivity

- The value derived from gains decreases as the amount of gain increases.
- That is, the first €100 gained is valued more than an additional €100 on top of it.

## Loss Aversion

- When faced with a sure loss, people tend to prefer the risky option, reflecting a tendency to take risks to avoid that loss.

> "The most important determinant of a great data scientist isn't knowing lots of complicated techniques. It's having common sense and curiosity, a knack for asking good questions, and the ability to tell a good story with data." (Steven Levitt)

# Why does reliability matter?

- Multi-item scales increase the precision with which a theoretical construct is measured.
- Reliability in multi-item scales ensures that the scale consistently measures the concept it is intended to measure, providing confidence in the results obtained.

## Cronbach's Alpha

| Reliability Interpretation | Cronbach's Alpha |
|---------------------------|--------------------|
| Unacceptable Reliability | < 0.5 |
| Poor Reliability | 0.5-0.6 |
| Questionable Reliability | 0.6-0.7 |
| Acceptable Reliability | 0.7-0.8 |
| Good Reliability | 0.8-0.9 |
| Excellent Reliability | > 0.9 |

## Measures of Central Tendency

- **Mean**: Calculated by adding up all the values in a dataset and then dividing by the number of values.
- **Median**: The middle value in a dataset when the values are arranged in ascending or descending order. If there is an even number of values, the median is the average of the two middle numbers.
- **Mode**: The value that is observed most often in the data.
- The mean and median are roughly the same when a distribution is symmetrical.
- The mean and median are different when a distribution is skewed.

## Standard Deviation

- Quantifies the amount of variation or dispersion in a set of values.

# Measure of Dispersion

## Six Sigma

- A quality management methodology that aims to improve business and manufacturing processes by minimizing defects and variability.
- Its goal is to achieve a level where the process mean and the nearest specification limit are six standard deviations apart, resulting in extremely high quality and efficiency with only about 3.4 defects per million opportunities.

## Box Plots

- Visualize key summary information about the data: the median, the quartiles, the spread, and potential outliers.

## QQ-Plot

- Plots the quantiles of your dataset against the quantiles of a theoretical distribution (like the normal distribution).
- Deviations from a straight line in a QQ-plot indicate departures from the theoretical distribution.

# Tutorial 10

## Null Hypothesis Significance Testing (NHST)

- A statistical method used to determine if there is enough evidence in a sample of data to reject a null hypothesis.

## Test Statistics

- Measure the degree of agreement between the sample data and the null hypothesis. Different tests use different test statistics.

## P-Values

- Probabilities that express how extreme the observed data are, assuming the null hypothesis is true. The p-value has the same interpretation across different statistical tests.
- The cut-off (alpha level) between non-significant and significant results is commonly set at 0.05.
- The alpha level is the complement of the confidence level (confidence level = 1 - alpha), so an alpha level of 0.05 corresponds to a 95% confidence level.
- *The p-value is the probability of obtaining an observed, or more extreme, test statistic under the assumption that the null hypothesis is true.*

| **Null Hypothesis (H0)** | **p-value** | **Alternative Hypothesis (H1)** |
|:-------------------------|:------------|:--------------------------------|
| The mean of the outcome variable is the same across different levels of the predictor variable. | If p > .05, we fail to reject H0 | The mean of the outcome variable differs between the levels of the predictor variable. |
| | If p ≤ .05, we reject H0 | |

- *In NHST, a null hypothesis can only be rejected or failed to be rejected, but never be accepted!*

## Directional Hypothesis

| **Null Hypothesis (H0)** | **Alternative Hypothesis (H1)** |
|:-------------------------|:--------------------------------|
| The mean of the outcome variable is the same across different levels of the predictor variable. | The predictor variable decreases the outcome variable. |
| The mean of the outcome variable is the same across different levels of the predictor variable. | The predictor variable increases the outcome variable. |
| The mean of the outcome variable is the same across different levels of the predictor variable. | The mean of the outcome variable differs between the levels of the predictor variable. |

# t-tests

| **Test** | **One-sample** | **Paired** | **Independent** | **ANOVA** | **Chi^2 test for independence** |
|:--------|:---------------|:----------|:------------------|:--------|:---------------------------------|
| Predictor variable | Categorical | Categorical | Categorical | Categorical | Categorical |
| Number of levels in predictor variable | 1 | 2 | 2 | 2 or more | 2 or more |
| Outcome variable | Continuous | Continuous | Continuous | Continuous | Categorical |
| A1: Independent observations | Independent observations | Independent pairs | Independent observations | Independent observations | Independent observations |
| A2: Normality | Normally distributed outcome variable | Normally distributed outcome variable | Normally distributed outcome variable | Normally distributed outcome variable | No assumption of normality |
| Additional Assumptions | | Paired observations | Homogeneity of variances | Homogeneity of variances | Expected counts of 5 or more in at least 80% of cells |

## One sample t-test

- When to use it: You want to compare a sample mean to a test value.
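
As a rough illustration of the one-sample case, the test statistic is the distance between the sample mean and the test value, measured in standard errors. The sketch below uses only the Python standard library; the function name `one_sample_t` and the example scores are hypothetical and not part of the tutorial. It stops at the t statistic and degrees of freedom, because turning them into a p-value needs the t distribution, which would typically come from a statistics package.

```python
import math
import statistics


def one_sample_t(sample, test_value):
    """Return (t statistic, degrees of freedom) for a one-sample t-test.

    Hypothetical helper for illustration; assumes independent observations
    and a roughly normally distributed outcome variable.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all observations are identical; t is undefined")
    standard_error = sd / math.sqrt(n)
    t_stat = (mean - test_value) / standard_error
    return t_stat, n - 1


# Made-up satisfaction scores compared against a test value of 5.0
scores = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2, 5.7]
t_stat, df = one_sample_t(scores, test_value=5.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

A larger absolute t means the sample mean lies further from the test value relative to the sampling noise, which corresponds to a smaller p-value.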
## Paired sample t-test

- When to use it: You compare means across pairwise observations.

## Independent sample t-test

- When to use it: You have two separate and independent groups and you want to compare their means.

## Violating Assumptions

- Independence -> Be aware that results might be biased.
- Normality -> Non-parametric test
- Unequal variances -> Welch correction

## ANOVA

- When to use it: You have more than two separate and independent groups and you want to compare their means.

## Chi^2 Test for Independence

- When to use it: You have more than two separate and independent groups and want to compare their proportions based on a categorical or ordinal variable.
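
To make the chi-square comparison of proportions more concrete, here is a minimal sketch of Pearson's chi-square statistic for a contingency table, written with plain Python and no continuity correction. The function name `chi_square_independence` and the counts are hypothetical. It also reports the share of cells with an expected count of at least 5, which is the "at least 80% of cells" rule listed among the assumptions; converting the statistic into a p-value would need the chi-square distribution from a statistics package.

```python
def chi_square_independence(observed):
    """Pearson chi-square statistic for a table of observed counts.

    `observed` is a list of rows, one row per group (hypothetical layout).
    Returns (statistic, degrees of freedom, expected counts,
    share of cells whose expected count is at least 5).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    cells = [e for row in expected for e in row]
    share_ok = sum(e >= 5 for e in cells) / len(cells)
    return statistic, df, expected, share_ok


# Made-up counts: rows are three groups, columns are "bought" / "did not buy"
table = [[30, 20], [25, 25], [15, 35]]
stat, df, expected, share_ok = chi_square_independence(table)
print(f"chi^2 = {stat:.3f}, df = {df}, cells with expected >= 5: {share_ok:.0%}")
```

If fewer than 80% of the cells reach an expected count of 5, the approximation behind the test becomes unreliable and the result should be treated with caution.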