Study Notes

Introduction to Statistics

  • Statistics is a field of study dealing with the collection, analysis, interpretation, presentation, and organization of data.
  • It is used to understand patterns, trends, and relationships within data, often in order to make predictions or decisions.

Types of Data

  • Categorical Data: Data that fits into categories or groups, like gender, color, or type.
  • Quantitative Data: Data that can be measured numerically, like height, weight, or temperature.

Sampling

  • Sampling is a process where a researcher selects one or more cases from a larger group (population) for study.
  • Important for studying populations too large to collect data on every member.
  • Crucial for generalizing findings to the entire population.

Sampling Methods

  • Simple Random Sampling (SRS): Every member of the population has an equal chance of being selected, and every possible sample of a given size is equally likely. Can be done with or without replacement.
  • Systematic Sampling: Every kth member of a population is selected.
  • Stratified Random Sampling: The population is divided into subgroups (strata). Then random samples are drawn from each stratum.
  • Cluster Sampling: A sampling method where the population is divided into groups (clusters). Then entire clusters are randomly selected.
  • Convenience or Accidental Sampling: The researcher selects the most accessible individuals or cases.
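
The probability-based methods above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the population of 100 numeric IDs, the two strata, the ten clusters, and all function names are hypothetical choices made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS without replacement: every subset of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 IDs
strata = {"low": population[:50], "high": population[50:]}  # hypothetical strata
clusters = {f"c{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.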

Data Collection Methods

  • Questionnaires: A structured set of questions used to collect data from individuals. Can be answered in person, by mail, phone, or online.
  • Recording: Writing down or otherwise capturing data gathered through direct observation.
  • Qualitative Methods: Methods used to find information by watching, listening, or reading.

Sample Size

  • Sample size is the number of individuals selected for observation.
  • The required sample size depends on:
    • Precision (the acceptable amount of error)
    • Population homogeneity (the variability in the population)
    • Sampling fraction (the number of elements in the sample relative to the population)

Sampling Fraction Adjustment

  • n' = n / [1 + (n/N)]
  • n': adjusted sample size
  • n: estimated sample size without adjustment
  • N: population size
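
As a worked example of the adjustment, the sketch below applies the formula exactly as stated in these notes. The function name and the values n = 400 and N = 2000 are hypothetical.

```python
def adjusted_sample_size(n, N):
    """Apply the adjustment n' = n / (1 + n/N) from the notes."""
    if n <= 0 or N <= 0:
        raise ValueError("n and N must be positive")
    return n / (1 + n / N)

# Hypothetical values: unadjusted estimate of 400 from a population of 2000.
n_adj = adjusted_sample_size(400, 2000)  # 400 / 1.2, about 333.3
```

When N is very large relative to n, the ratio n/N is close to 0 and the adjusted size is almost the same as the unadjusted one.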

Non-Probability Sampling

  • Availability sampling: Uses readily available and accessible participants.
  • Snowball sampling: Participants refer other participants to take part.
  • Quota sampling: Samples are selected to match characteristics of the population across multiple subgroups.
  • Purposive sampling: Selects participants based on specific characteristics relevant to the study.

Spurious Relationships

  • A spurious relationship exists where two variables appear to have a relationship, but that relationship is actually caused by a third variable.
  • Controlling for other variables is important for understanding the true relationship.
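
A small simulation can illustrate the idea. In this sketch, x and y are both driven by a third variable z and have no direct link to each other. "Controlling for z" is done here with the first-order partial correlation, which is one common approach and an assumption of this example rather than something the notes specify. The data are simulated, not real.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(42)  # fixed seed so the simulation is repeatable
z = [rng.gauss(0, 1) for _ in range(2000)]   # the third variable
x = [zi + rng.gauss(0, 0.5) for zi in z]     # depends only on z plus noise
y = [zi + rng.gauss(0, 0.5) for zi in z]     # depends only on z plus noise

raw = pearson(x, y)                # strong apparent relationship
controlled = partial_corr(x, y, z) # close to zero once z is accounted for
```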

Data Display

  • Graphs: Used to present and visualize the distribution of data.
    • Bar charts: Useful for displaying categorical data.
    • Pie charts: Useful for displaying categorical data (parts of one whole).
    • Histograms: Used for frequency distribution of quantitative data.
    • Time series plots: Used to show how a variable changes over time.
    • Dot plots: Show each observation as a dot along a number line; useful for small data sets.
    • Stem plots: Display individual data points in a systematic way.

Variables

  • Individuals: Objects or entities being observed. Can be people, animals or things.
  • Variables: Characteristics of an individual. Can take various values or categories.
    • Quantitative: Measured numerically. Examples: Height, weight, temperature.
    • Categorical: Fits into categories. Examples: Eye color, gender.
  • Categorical types: Nominal (unordered categories), ordinal (ranked categories)

How to Determine Variable Type

  • Ask what is being measured of each individual.
  • Is it a numerical value, or a descriptive category?

Measures of Center

  • Mean: Average of all values in a data set.
  • Median: Center point of a data set when ordered.
  • Mode: Most frequent value in the data set.
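
The three measures can be computed with Python's standard `statistics` module. The data set below is hypothetical.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data set

mean = statistics.mean(scores)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(scores)  # average of the two middle values, 3 and 5
mode = statistics.mode(scores)      # 3 appears most often
```

With an even number of values, the median is the average of the two middle values after sorting.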

Measures of Spread

  • Range: Difference between the highest and lowest values.
  • Interquartile Range (IQR): Difference between the third and first quartiles (Q3 - Q1) of a data set.
  • Standard Deviation: Square root of the variance; roughly the typical distance between a data point and the mean.
  • Variance: Sum of squared deviations from the mean, divided by the degrees of freedom (n - 1 for a sample).
  • Semi-Interquartile Deviation: Half the difference between the third and first quartiles.
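
A minimal sketch of these measures on a hypothetical data set follows. Textbooks and software use several slightly different rules for locating quartiles; this example assumes the "inclusive" method of `statistics.quantiles`, so the quartiles may differ a little from those found with another convention.

```python
import math
import statistics

data = [1, 3, 5, 7, 9, 11, 13]  # hypothetical sample

data_range = max(data) - min(data)  # 13 - 1 = 12

# Quartiles under the "inclusive" convention (an assumption of this sketch).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1       # 10 - 4 = 6
semi_iqr = iqr / 2  # 3

variance = statistics.variance(data)  # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(data)      # square root of the variance
```

Here the squared deviations from the mean of 7 sum to 112, so the sample variance is 112 / 6, about 18.67, and the standard deviation is about 4.32.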

Box Plots

  • Box plots visually display the five-number summary (Min, Q1, Median, Q3, Max) of a set of data.
  • Helpful for identifying outliers.

Outliers

  • Outlier: An observation that is substantially different from most of the other data points.
  • Potential issues: An outlier can strongly influence the calculated mean and standard deviation.
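
The sketch below computes a five-number summary and flags outliers for a hypothetical data set. The notes do not say how outliers are identified; the 1.5 × IQR rule used here is a common convention and is an assumption of this example, as is the "inclusive" quartile method.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 60]  # hypothetical data; 60 is unusually large

# Five-number summary (quartiles use the "inclusive" convention).
q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
five = (min(data), q1, med, q3, max(data))

# Assumed rule: values more than 1.5 * IQR beyond the quartiles are outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Compare the mean and median with and without the flagged values.
trimmed = [x for x in data if x not in outliers]
mean_with, mean_without = statistics.mean(data), statistics.mean(trimmed)
median_with, median_without = statistics.median(data), statistics.median(trimmed)
```

In this example, removing the single value 60 moves the mean from 13.625 to 7, while the median only moves from 8 to 7. The mean is far more sensitive to the outlier than the median.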

Choosing a Summary Statistic

  • Use the mean and standard deviation for symmetrical distributions without outliers.
  • Use the median (with the IQR as the measure of spread) for non-symmetrical (skewed) distributions and those with outliers.

Hypothesis Testing

  • A table showing the possible outcomes of a hypothesis test is included.
  • Type I error: Rejecting a null hypothesis that is actually true.
  • Type II error: Failing to reject (accepting) a null hypothesis that is actually false.
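
The four possible outcomes can be enumerated with a short sketch. The function name and labels are hypothetical and are not taken from the lesson's table.

```python
def classify_outcome(null_is_true, reject_null):
    """Label the outcome of a test given the truth and the decision made."""
    if null_is_true and reject_null:
        return "Type I error"      # rejected a true null hypothesis
    if not null_is_true and not reject_null:
        return "Type II error"     # failed to reject a false null hypothesis
    return "Correct decision"

# Build the 2 x 2 outcome table as a dictionary keyed by (truth, decision).
outcomes = {
    (truth, decision): classify_outcome(truth, decision)
    for truth in (True, False)
    for decision in (True, False)
}
```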

Confidence Intervals

  • Specific methods for calculating 96% and 70% confidence intervals (CI) using a given standard deviation and mean are included.
  • Confidence Intervals (CI) give a range within which a true population value is estimated to lie with a specified confidence level.
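
The lesson's own calculation method is not reproduced in these notes. As a hedged sketch, the code below assumes the usual normal-based interval for a mean with a known standard deviation, mean ± z × SD / √n. The mean of 100, SD of 15, and sample size of 36 are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, level):
    """Two-sided CI for a mean with known SD, using the normal distribution."""
    # For a 96% interval, 2% is left in each tail, so use the 98th percentile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical inputs: sample mean 100, known SD 15, sample size 36.
ci96 = z_confidence_interval(mean=100, sd=15, n=36, level=0.96)
ci70 = z_confidence_interval(mean=100, sd=15, n=36, level=0.70)
```

The critical values are about 2.05 for 96% and about 1.04 for 70%, so the 96% interval is wider. Higher confidence requires a wider range.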

Z-scores and Probabilities

  • Explains the z-score transformation of normal distributions, z = (value - mean) / standard deviation, and how to interpret percentiles.
  • Includes z-score ranges for different grades (A, B, C, etc.)
  • Explains how to interpret z-score values from tables of percentiles.
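
A short sketch of the transformation and the percentile lookup follows, using the standard normal distribution from Python's `statistics` module in place of a printed table. The score of 85, mean of 70, and standard deviation of 10 are hypothetical, and the grade cut-offs from the lesson are not reproduced here.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd

def percentile_from_z(z):
    """Percentage of a normal distribution that falls below the given z-score."""
    return NormalDist().cdf(z) * 100

# Hypothetical exam: mean 70, standard deviation 10, a student scores 85.
z = z_score(85, mean=70, sd=10)  # (85 - 70) / 10 = 1.5
pct = percentile_from_z(z)       # about 93.3
```

A z-score of 1.5 means the score is 1.5 standard deviations above the mean, higher than roughly 93% of scores in a normal distribution.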

Probability

  • A quantitative assessment of the likelihood of an uncertain event occurring.
  • Always between 0 and 1 (inclusive), [0,1].

Sensitivity, Specificity, PPV, and NPV

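  • Sensitivity: Percentage of individuals who have the condition and test positive (true positives among all actual positives).
  • Specificity: Percentage of individuals who do not have the condition and test negative (true negatives among all actual negatives).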
  • Positive Predictive Value (PPV): Percentage of true positives among positive test results.
  • Negative Predictive Value (NPV): Percentage of true negatives among negative test results.
  • Important in evaluating the usefulness of tests (e.g. screening tests).
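
All four measures come from the same 2 × 2 table of test results against true status. The sketch below uses hypothetical counts for a screening test applied to 1,000 people, 100 of whom have the condition.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the four measures, as proportions, from the counts in a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives detected among those with the condition
        "specificity": tn / (tn + fp),  # negatives detected among those without it
        "ppv": tp / (tp + fp),          # true positives among positive results
        "npv": tn / (tn + fn),          # true negatives among negative results
    }

# Hypothetical counts: 90 true positives, 45 false positives,
# 10 false negatives, 855 true negatives.
metrics = diagnostic_metrics(tp=90, fp=45, fn=10, tn=855)
```

With these counts, sensitivity is 90% and specificity is 95%, yet PPV is only about 67%, because the condition is uncommon and false positives make up a third of all positive results.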
