Questions and Answers
Which centrality measure is most appropriate for a continuous variable with a right skewed distribution?
What graphical representation is best suited for displaying the relationship between two continuous variables?
What is a key assumption of the independent sample T-Test?
When analyzing categorical variables, which test would you use to determine if there is a significant association between them?
Which of the following statements is true regarding left skewed distributions?
Which analysis method is appropriate for ordinal data paired across two samples?
What is the main requirement for using the Wilcoxon Test?
What do you interpret when the p-value is low, according to hypothesis testing?
What does a p-value less than 0.05 indicate?
Which factor does NOT influence the width of a 95% Confidence Interval?
What does it mean if the 95% Confidence Interval includes 0 for the difference in means?
Which type of sampling ensures that each individual has an equal chance of being selected?
What is the main weakness of stratified sampling?
In effect modification, how is the relationship between a risk factor and an outcome primarily identified?
What does a 95% Confidence Interval that does not include 1 indicate in terms of risk?
What sampling method involves selecting every kth individual from a list after setting a random starting point?
Which condition describes confounding in research?
What does a sample size formula n = 1.96 x m^2 calculate?
Study Notes
Descriptive Statistics
- Continuous Data Distributions: Data can be either odd-shaped (skewed) or bell-shaped (approximately normal).
- Odd-Shaped Data: Use the median to describe centrality and quartiles to describe spread, because the mean is pulled toward outliers in the tail.
- Bell-Shaped Data: Use the mean for central tendency and the standard deviation for spread.
- Skewness impact: Right skew (positive): mean > median; Left skew (negative): mean < median.
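As a rough illustration of the points above, the sketch below compares the mean and median of a small right-skewed sample. The data values are invented for the example and are not from the notes.

```python
import statistics

# Hypothetical right-skewed sample (for example, hospital stay in days).
# One large value sits in the right tail.
stays = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]

mean = statistics.mean(stays)
median = statistics.median(stays)

# Quartile cut points (Q1, Q2, Q3) describe spread for odd-shaped data.
q1, q2, q3 = statistics.quantiles(stays, n=4)

print("mean:", mean)
print("median:", median)
print("quartiles:", q1, q2, q3)

# Right skew: the tail drags the mean above the median.
print("mean > median:", mean > median)
```

The single value of 30 moves the mean well above the median, while the median and quartiles barely react to it.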
Categorical vs. Continuous Data
- Categorical Data: Two or more response categories, non-numeric, ordered (ordinal) or unordered (nominal). Examples include country of birth (nominal) or Likert scales (ordinal).
- Continuous Data: Data can take any value in a given range, numerical, with infinite resolution. Examples include weight, age, and height.
Data Visualization
- Categorical Data (1 variable): Bar Chart
- Categorical Data (2 variables): Clustered Bar Chart
- Continuous Data (1 variable): Box Plot or Histogram
- Continuous Data (2 variables): Scatterplot
- Continuous & Categorical Data: Side-by-side Boxplots or Histograms, one per category
Hypothesis Testing Assumptions
- Independent Samples T-test:
  - Continuous outcome that is approximately normally distributed in each group (check histograms).
  - Independent samples (individuals are independent both within and between samples).
  - Equal variances in the two groups.
- Chi-Square Test:
  - Categorical data.
  - Independent observations (individuals within/between groups).
  - Expected frequency of at least 5 in each cell.
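- Sign Test:
  - Ordinal or continuous data.
  - Paired samples.
  - Pairs independent of one another.
  - Only the direction (sign) of each paired difference is used, so the differences need not be symmetrically distributed; symmetry is an assumption of the Wilcoxon signed-rank test.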
- Mann-Whitney U test:
  - Two independent groups.
  - Rankable data (ordinal data, or continuous data that are not normally distributed).
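- Wilcoxon Test:
  - Paired samples.
  - Rankable data (ordinal or continuous), so that the paired differences can be ranked.
  - Paired differences roughly symmetric around their median.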
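To make one of these assumption checks concrete, here is a minimal sketch of the chi-square expected-frequency check and test statistic for a contingency table. The function name `chi_square` and the 2 x 2 counts are hypothetical and chosen only for illustration. The statistic is the usual Pearson form, the sum of (observed - expected)² / expected.

```python
def chi_square(observed):
    """Return (statistic, expected counts, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    # Expected count in each cell under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]

    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, expected, dof


# Hypothetical 2 x 2 table: rows = exposure groups, columns = outcome yes/no.
table = [[30, 20], [20, 30]]
stat, expected, dof = chi_square(table)

# Assumption check: every expected count should be at least 5.
assumption_ok = all(e >= 5 for row in expected for e in row)

print("expected counts:", expected)
print("assumption met:", assumption_ok)
print("chi-square statistic:", stat, "with", dof, "degree(s) of freedom")
```

With 1 degree of freedom, a statistic above about 3.84 corresponds to p < 0.05.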
Interpretation of p-values
- p < 0.05: Reject the null hypothesis; statistically significant.
- p ≥ 0.05: Fail to reject the null hypothesis; not statistically significant.
- Lack of significance: Does not confirm equivalence, may indicate insufficient data. Consider increasing sample size.
Confidence Intervals (95%)
- Formula: Estimate ± (1.96 x Standard Error), often rounded to 2 x Standard Error
- Interpretation: 95% confident that true population value is within the limits.
- Factors impacting width: Sample size and standard deviation.
- Consistency with p-values: 95% CI includes zero, p ≥ 0.05 (lack of significant difference); CI does not include zero, p < 0.05 (significant difference).
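The interval calculation can be sketched as follows for a sample mean. The helper name `ci95_mean` and the data are hypothetical, and the sketch uses the normal multiplier 1.96; for small samples a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics


def ci95_mean(values, z=1.96):
    """Approximate 95% confidence interval for a mean: estimate +/- z * SE."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


# Hypothetical measurements, invented for illustration.
sample = [12.1, 9.8, 11.4, 10.2, 13.0, 10.9, 11.7, 9.5]

lower, upper = ci95_mean(sample)
print("95% CI:", lower, upper)

# The same values observed four times over: a larger n gives a narrower interval.
lower_big, upper_big = ci95_mean(sample * 4)
print("width with n=8:", upper - lower)
print("width with n=32:", upper_big - lower_big)
```

The second call shows the effect of sample size on width: the spread of the data is similar, but the interval narrows because the standard error falls as n rises.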
Standard Deviation vs. Standard Error
- Standard Error (SE): Measure of precision (of an estimate); the standard deviation of estimates across repeated samples. It shrinks as n grows; for a sample mean, SE = SD / √n.
- Standard Deviation (SD): Measure of spread (of the data itself).
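A small simulation can make the distinction tangible. The sketch below draws many samples from an assumed normal population (mean 50, SD 10, both invented for the example) and compares the standard deviation of the sample means with SD / √n.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50, 10   # assumed population parameters (illustrative)
n = 25                      # size of each sample
replicates = 2000           # number of repeated samples

sample_means = []
for _ in range(replicates):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

# SD describes the spread of individual observations;
# SE describes the spread of the estimate (here, the sample mean).
empirical_se = statistics.stdev(sample_means)
theoretical_se = POP_SD / math.sqrt(n)

print("spread of the data (SD):", POP_SD)
print("spread of sample means (empirical SE):", empirical_se)
print("SD / sqrt(n):", theoretical_se)
```

The empirical spread of the sample means should land close to SD / √n, which is far smaller than the SD of the data itself.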
Sample Size Estimation
- Formula: n = (1.96 x SD / m)² where m = margin of error and SD = standard deviation (for estimating a mean with 95% confidence)
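As a worked example, the sketch below assumes the standard sample size formula for estimating a mean with 95% confidence, n = (1.96 x SD / m)², rounded up to a whole person. The function name and the input values are hypothetical.

```python
import math


def sample_size_for_mean(sd, margin, z=1.96):
    """Smallest n so that z * sd / sqrt(n) is no larger than the margin of error."""
    return math.ceil((z * sd / margin) ** 2)


# Illustrative inputs: assumed SD of 15 units, desired margin of error of 3 units.
n_wide = sample_size_for_mean(sd=15, margin=3)

# Halving the margin of error roughly quadruples the required sample size.
n_narrow = sample_size_for_mean(sd=15, margin=1.5)

print("n for margin 3:", n_wide)
print("n for margin 1.5:", n_narrow)
```

Because the margin of error sits inside the square, tighter precision is expensive: halving m needs about four times as many participants.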
Confounding vs. Effect Modification
- Confounding: Distortion of the effect of a risk factor on an outcome by a third variable. A confounder is associated with both the exposure and the outcome but is not on the causal pathway between them. Example: Age confounding the relationship between smoking and lung cancer.
- Effect Modification: Different relationship between exposure and outcome depending on a third variable. Example: Gender modifying the effect of a medicine on blood pressure.
- Identification: Compare odds ratios or risk ratios across categories (strata) of the third variable. Clearly different stratum-specific ratios suggest effect modification.
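The comparison of ratios across categories can be sketched with stratum-specific risk ratios. All counts below are invented for illustration, and `risk_ratio` is a hypothetical helper, not a library function.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical counts per stratum:
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "women": (30, 100, 10, 100),
    "men": (12, 100, 10, 100),
}

stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude (unstratified) risk ratio from the pooled counts.
pooled = [sum(values) for values in zip(*strata.values())]
crude_rr = risk_ratio(*pooled)

for name, rr in stratum_rr.items():
    print(name, "risk ratio:", round(rr, 2))
print("crude risk ratio:", round(crude_rr, 2))
```

Here the risk ratio is about 3 in one stratum and about 1.2 in the other, the pattern expected under effect modification. Stratum-specific ratios that agree with each other but differ from the crude ratio would instead point toward confounding.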
Probability Sampling
- Simple Random Sampling: Each individual has an equal chance of selection. Example: Random name selection. Strengths: minimizes bias. Weaknesses: small subgroups may be underrepresented; requires a complete list of the population, which can be difficult to obtain.
- Stratified Sampling: Randomly sampling from subgroups/strata with shared characteristics. Example: GPA study across degrees (English, Science, Engineering). Strengths: precise representation of all subgroups. Weaknesses: requires detailed population information; not easy to survey.
- Systematic Sampling: Select every kth unit after a random starting point. Example: Every 10th name. Strengths: even distribution/representation. Weaknesses: sensitive to patterns within the population.
- Cluster Sampling: Randomly selecting entire clusters/groups. Example: Randomly selecting schools. Strengths: suits large populations spread over a wide area. Weaknesses: increased sampling error.
- Two-Stage Sampling: Two levels of selection (for example, randomly selecting clusters, then randomly selecting individuals within them). Example: survey of streaming platform users. Strengths: cost-effective, wide-area representation. Weaknesses: biased results when the population is unevenly distributed across clusters.
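The selection mechanics of these methods can be sketched with Python's `random` module. The population, strata, and cluster names are invented, and the helper functions are hypothetical illustrations rather than a survey-sampling library.

```python
import random

rng = random.Random(0)  # fixed seed so the example is repeatable

population = list(range(100))  # hypothetical sampling frame of 100 individuals


def simple_random(frame, n):
    """Every individual has the same chance of selection."""
    return rng.sample(frame, n)


def systematic(frame, k):
    """Random start within the first k units, then every kth unit."""
    start = rng.randrange(k)
    return frame[start::k]


def stratified(strata, n_per_stratum):
    """Independent simple random sample inside each stratum."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}


def cluster(clusters, n_clusters):
    """Randomly choose whole clusters and keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


srs = simple_random(population, 10)
sys_sample = systematic(population, 10)

degrees = {"English": population[:30], "Science": population[30:70], "Engineering": population[70:]}
strat = stratified(degrees, 5)

schools = {f"school_{i}": population[i * 20:(i + 1) * 20] for i in range(5)}
clus = cluster(schools, 2)

print("simple random:", srs)
print("systematic:", sys_sample)
print("stratified:", strat)
print("cluster sample size:", len(clus))
```

Two-stage sampling would combine the last two ideas: pick clusters at random, then draw a simple random sample within each chosen cluster instead of keeping every member.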
Identifying Statistical Significance with Difference in Means/Risk and Confidence Intervals
- Difference in Mean: 95% CI does not include 0 → significant difference; includes 0 → no significant difference.
- Risk (risk ratio or odds ratio): 95% CI does not include 1 → significant difference in risk; includes 1 → no significant difference in risk.
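The two rules above differ only in the null value: 0 for a difference, 1 for a ratio. The sketch below illustrates this with a risk ratio and an approximate 95% CI computed on the log scale, a common large-sample method that is assumed here rather than taken from the notes. The counts and function names are hypothetical.

```python
import math


def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale.

    a / n1 = cases / total in the exposed group,
    c / n0 = cases / total in the unexposed group.
    """
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def excludes(lower, upper, null_value):
    """True when the interval does not contain the null value (significant)."""
    return not (lower <= null_value <= upper)


# Hypothetical study A: 30/100 exposed cases vs 10/100 unexposed cases.
rr_a, lo_a, hi_a = risk_ratio_ci(30, 100, 10, 100)
# Hypothetical study B: 12/100 exposed cases vs 10/100 unexposed cases.
rr_b, lo_b, hi_b = risk_ratio_ci(12, 100, 10, 100)

print("A:", round(rr_a, 2), (round(lo_a, 2), round(hi_a, 2)), "significant:", excludes(lo_a, hi_a, 1))
print("B:", round(rr_b, 2), (round(lo_b, 2), round(hi_b, 2)), "significant:", excludes(lo_b, hi_b, 1))

# For a difference in means the null value is 0 instead of 1.
print("difference CI (-0.4, 2.1) significant:", excludes(-0.4, 2.1, 0))
```

Study A's interval lies entirely above 1, so the difference in risk is significant; study B's interval spans 1, so it is not.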
Description
This quiz covers the fundamentals of descriptive statistics, including data distributions, key differences between categorical and continuous data, and visualization techniques. Understand the appropriate measures of central tendency and spread for various data types, as well as how to effectively represent data visually.