Descriptive Statistics Overview

Questions and Answers

Which centrality measure is most appropriate for a continuous variable with a right skewed distribution?

  • Mode
  • Standard Deviation
  • Mean
  • Median (correct)

What graphical representation is best suited for displaying the relationship between two continuous variables?

  • Bar Chart
  • Histogram
  • Scatterplot (correct)
  • Box Plot

What is a key assumption of the independent sample T-Test?

  • Observations can be dependent
  • The data must be categorical
  • Variance does not need to be equal
  • Samples must be independent (correct)

When analyzing categorical variables, which test would you use to determine if there is a significant association between them?

  • Chi-Square Test (correct)

Which of the following statements is true regarding left skewed distributions?

  • Mean is less than Median (correct)

Which analysis method is appropriate for ordinal data paired across two samples?

  • Sign Test (correct)

What is the main requirement for using the Wilcoxon Test?

  • Two paired samples (correct)

What do you interpret when the p-value is low, according to hypothesis testing?

  • There is strong evidence against the null hypothesis (correct)

What does a p-value less than 0.05 indicate?

  • Reject the null hypothesis (correct)

Which factor does NOT influence the width of a 95% Confidence Interval?

  • Margin of error (correct)

What does it mean if the 95% Confidence Interval includes 0 for the difference in means?

  • There is no statistically significant difference between the groups (correct)

Which type of sampling ensures that each individual has an equal chance of being selected?

  • Simple random sampling (correct)

What is the main weakness of stratified sampling?

  • Requires a complete list of the population (correct)

In effect modification, how is the relationship between a risk factor and an outcome primarily identified?

  • By comparing odds ratios or relative risks (correct)

What does a 95% Confidence Interval that does not include 1 indicate in terms of risk?

  • Significant difference in risk (correct)

What sampling method involves selecting every kth individual from a list after setting a random starting point?

  • Systematic sampling (correct)

Which condition describes confounding in research?

  • It distorts the effect of a risk factor on an outcome (correct)

What does the sample size formula n = (1.96 x SD / m)² calculate?

  • Required sample size (correct)

    Study Notes

    Descriptive Statistics

    • Continuous Data Distributions: Data can be either odd-shaped or bell-shaped.
    • Odd-Shaped Data: Use the median to describe centrality and the quartiles to describe spread, because the mean is pulled toward outliers and the long tail.
    • Bell-Shaped Data: Use the mean for central tendency and the standard deviation for spread.
    • Skewness impact: Right skew (positive): mean > median; Left skew (negative): mean < median.
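The mean-versus-median rule above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the `incomes` values are hypothetical and chosen only to produce a right-skewed sample.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
incomes = [22, 25, 27, 30, 31, 33, 35, 38, 40, 250]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
# Quartile cut points (Q1, Q2, Q3) describe spread for skewed data.
q1, q2, q3 = statistics.quantiles(incomes, n=4)

print(f"mean={mean:.1f}, median={median:.1f}, quartiles={q1:.1f} to {q3:.1f}")
# In a right-skewed sample the mean sits above the median.
print("mean > median:", mean > median)
```

The single large value drags the mean well above the median, which is why the median and quartiles are the safer summary for this kind of data.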

    Categorical vs. Continuous Data

    • Categorical Data: Two or more response categories, non-numeric, either ordered (ordinal) or unordered (nominal). Examples include country of birth (nominal) or Likert scales (ordinal).
    • Continuous Data: Data can take any value in a given range, numerical, with infinite resolution. Examples include weight, age, and height.

    Data Visualization

    • Categorical Data (1 variable): Bar Chart
    • Categorical Data (2 variables): Clustered Bar Chart
    • Continuous Data (1 variable): Box Plot or Histogram
    • Continuous Data (2 variables): Scatterplot
    • Continuous & Categorical Data: Two Boxplots or two Histograms

    Hypothesis Testing Assumptions

    • Independent Samples T-test:
      • Normal distribution of continuous variables (check histograms).
      • Independent samples (individuals within samples and between samples).
      • Equal variances.
    • Chi-Square Test:
      • Categorical data.
      • Independent observations (individuals within/between groups).
      • Expected frequencies (in each category) > 5.
    • Sign Test:
      • Ordinal or continuous data.
      • Paired samples.
      • Independent observations (each pair is independent of the other pairs).
    • Mann-Whitney U test:
      • Two independent groups.
      • Rankable data (ordinal data, or continuous data that are not normally distributed).
    • Wilcoxon (Signed-Rank) Test:
      • Paired samples.
      • Rankable data (ordinal or continuous).
      • Symmetrical differences (the differences between paired observations are symmetrically distributed).
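As an illustration of how one of the paired tests above works, here is a minimal sketch of an exact two-sided sign test written with the standard library only. The function name `sign_test_p_value` and the `before`/`after` scores are hypothetical; a statistics package would normally be used in practice.

```python
from math import comb

def sign_test_p_value(before, after):
    """Two-sided exact sign test for paired observations (ties are dropped)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Under the null hypothesis each non-zero difference is positive
    # with probability 0.5, so the count of positives is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired ordinal scores (for example, ratings on a 1-5 scale).
before = [3, 2, 4, 3, 2, 3, 4, 2]
after = [4, 3, 5, 4, 3, 4, 5, 1]

p = sign_test_p_value(before, after)
print(f"p = {p:.4f}")
```

Only the direction of each paired difference is used, not its size, which is why the test suits ordinal data.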

    Interpretation of p-values

    • p < 0.05: Reject the null hypothesis; statistically significant.
    • p ≥ 0.05: Fail to reject the null hypothesis; not statistically significant.
    • Lack of significance: Does not confirm equivalence, may indicate insufficient data. Consider increasing sample size.

    Confidence Intervals (95%)

    • Formula: Estimate ± (1.96 x Standard Error), commonly rounded to Estimate ± (2 x Standard Error).
    • Interpretation: We can be 95% confident that the true population value lies within the limits.
    • Factors impacting width: Sample size and standard deviation.
    • Consistency with p-values (for a difference): 95% CI includes zero, p ≥ 0.05 (lack of significant difference); CI does not include zero, p < 0.05 (significant difference).
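The interval formula above can be sketched for a sample mean as follows. This assumes the normal-approximation multiplier 1.96 and SE = SD / √n; the helper name `mean_ci_95` and the data are hypothetical, and for small samples a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics

def mean_ci_95(values):
    """Approximate 95% CI for a mean: estimate ± 1.96 x standard error."""
    n = len(values)
    estimate = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation
    se = sd / math.sqrt(n)          # standard error of the mean
    margin = 1.96 * se
    return estimate - margin, estimate + margin

small = [10, 12, 14, 16, 18]        # hypothetical measurements
large = small * 4                   # same spread of values, four times the sample size

lo_s, hi_s = mean_ci_95(small)
lo_l, hi_l = mean_ci_95(large)
print(f"n={len(small)}: {lo_s:.2f} to {hi_s:.2f}")
print(f"n={len(large)}: {lo_l:.2f} to {hi_l:.2f}")
```

Both intervals are centred on the same mean, but the larger sample gives the narrower one, illustrating how sample size affects width.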

    Standard Deviation vs. Standard Error

    • Standard Error (SE): Measure of the precision of an estimate; it is the standard deviation of the estimate across repeated samples and shrinks as n grows (for a mean, SE = SD / √n).
    • Standard Deviation (SD): Measure of spread (of the data itself).

    Sample Size Estimation

    • Formula: n = (1.96 x SD / m)² where m = margin of error and SD = standard deviation
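A small worked example, assuming the usual form of the sample size formula for estimating a mean, n = (1.96 x SD / m)², rounded up to a whole person. The function name `required_sample_size` and the input values are hypothetical.

```python
import math

def required_sample_size(sd, margin_of_error, z=1.96):
    """Sample size needed to estimate a mean to within ± margin_of_error.

    Assumes n = (z * SD / m)^2, rounded up to the next whole number.
    """
    return math.ceil((z * sd / margin_of_error) ** 2)

# Hypothetical inputs: SD of 15 units, desired margin of error of 3 units.
n_wide = required_sample_size(sd=15, margin_of_error=3)
# Halving the margin of error roughly quadruples the required sample size.
n_narrow = required_sample_size(sd=15, margin_of_error=1.5)

print(n_wide, n_narrow)
```

Because the margin of error is squared in the denominator, tighter precision becomes expensive quickly.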

    Confounding vs. Effect Modification

    • Confounding: Distortion of the effect of a risk factor on an outcome by a third variable that is related to both the exposure and the outcome but is not on the causal pathway. Example: Age confounding the relationship between smoking and lung cancer.
    • Effect Modification: The relationship between exposure and outcome differs depending on a third variable. Example: Gender modifying the effect of a medicine on blood pressure.
    • Identification: Compare odds ratios or risk ratios across categories of the third variable to see whether the relationship changes.
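The comparison of odds ratios across categories can be sketched like this. All counts are hypothetical, and the helper `odds_ratio` is illustrative; judging the difference by eye is only a first step, since a formal analysis would also consider confidence intervals or a test of interaction.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from a 2x2 table: (a * d) / (b * c)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Hypothetical 2x2 counts per stratum:
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "men": (30, 70, 10, 90),
    "women": (12, 88, 10, 90),
}

# Crude table: add the strata together cell by cell.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

for name, table in strata.items():
    print(f"{name}: OR = {odds_ratio(*table):.2f}")
print(f"crude: OR = {odds_ratio(*crude):.2f}")
```

In this made-up example the odds ratio is much larger in one stratum than in the other, which is the pattern that suggests effect modification. Stratum-specific odds ratios that are similar to each other but differ from the crude value would instead point toward confounding.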

    Probability Sampling

    • Simple Random Sampling: Each individual has an equal chance of selection. Example: Random name selection. Strengths: minimizes bias. Weaknesses: small subgroups may be underrepresented; requires a complete list of the population, which can be difficult to obtain.
    • Stratified Sampling: Random sampling within subgroups (strata) that share a characteristic. Example: GPA study across degrees (English, Science, Engineering). Strengths: precise representation of all subgroups. Weaknesses: requires detailed population information; not easy to survey.
    • Systematic Sampling: Select every kth unit after a random starting point. Example: Every 10th name. Strengths: even distribution/representation. Weaknesses: sensitive to patterns within the population.
    • Cluster Sampling: Randomly selecting entire clusters/groups. Example: Randomly selecting schools. Strengths: practical for large populations spread over a wide area. Weaknesses: increased sampling error.
    • Two-Stage Sampling: Two levels of selection: clusters are sampled first, then individuals within the chosen clusters. Example: Survey of streaming platform users. Strengths: cost-effective, wide-area representation. Weaknesses: results can be biased when the population is unevenly distributed across clusters.
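The sampling schemes listed above can be sketched with Python's standard `random` module. The population, the strata labels, the cluster size of 10, and the fixed seed are all hypothetical choices made so the example is reproducible.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [f"person_{i:03d}" for i in range(100)]

# Simple random sampling: every individual has the same chance of selection.
simple = rng.sample(population, 10)

# Systematic sampling: every kth individual after a random starting point.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sampling: sample 10% at random within each (hypothetical) stratum.
strata = {
    "English": population[:50],
    "Science": population[50:80],
    "Engineering": population[80:],
}
stratified = {name: rng.sample(members, len(members) // 10)
              for name, members in strata.items()}

# Cluster sampling: pick whole clusters at random and keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [p for cluster in rng.sample(clusters, 2) for p in cluster]

# Two-stage sampling: pick clusters, then sample individuals within each.
two_stage = [p for cluster in rng.sample(clusters, 3)
             for p in rng.sample(cluster, 4)]

print(len(simple), len(systematic), len(cluster_sample), len(two_stage))
```

Note that the stratified sample keeps the 50/30/20 proportions of the strata, while the cluster sample depends entirely on which two clusters happen to be drawn.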

    Identifying Statistical Significance with Difference in Means/Risk and Confidence Intervals

    • Difference in Mean: 95% CI does not include 0 → significant difference; includes 0 → no significant difference.
    • Risk (ratio measures such as relative risk or odds ratio): 95% CI does not include 1 → significant difference in risk; includes 1 → no significant difference in risk.
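To make the "does the interval include 1" rule concrete, here is a minimal sketch that computes a risk ratio with an approximate 95% CI on the log scale (a common large-sample method, used here as an assumption rather than as the method of these notes). The function name `risk_ratio_ci_95` and all counts are hypothetical.

```python
import math

def risk_ratio_ci_95(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    # Large-sample standard error of log(RR).
    se_log = math.sqrt(1 / exposed_cases - 1 / exposed_total
                       + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper

# Hypothetical study A: 30/100 exposed cases versus 15/100 unexposed cases.
rr_a, lo_a, hi_a = risk_ratio_ci_95(30, 100, 15, 100)
# Hypothetical study B: 12/100 exposed cases versus 10/100 unexposed cases.
rr_b, lo_b, hi_b = risk_ratio_ci_95(12, 100, 10, 100)

for label, lo, hi in (("A", lo_a, hi_a), ("B", lo_b, hi_b)):
    verdict = "significant" if not (lo <= 1 <= hi) else "not significant"
    print(f"study {label}: 95% CI {lo:.2f} to {hi:.2f} -> {verdict}")
```

The null value is 1 for ratios because a ratio of 1 means equal risk in both groups, just as a difference of 0 means equal means.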

    Description

    This quiz covers the fundamentals of descriptive statistics, including data distributions, key differences between categorical and continuous data, and visualization techniques. Understand the appropriate measures of central tendency and spread for various data types, as well as how to effectively represent data visually.