Questions and Answers
What is the range of values a proportion can take?
- From -1 to 1
- From 0 to 1 (correct)
- From -100% to 100%
- From 1 to 10
What does the parameter 'p' represent in the binomial distribution?
- Number of successes out of n trials
- Probability of success in each trial (correct)
- Total number of trials
- Average number of successes
Which of the following statements about the binomial distribution is true?
- It can be analyzed using the normal distribution.
- It is a discrete probability distribution. (correct)
- It is a continuous distribution.
- It assumes trials are dependent.
If the prevalence of type 2 diabetes (T2DM) is 11.3%, how many individuals would you expect to find with T2DM in a sample of 50 Americans?
In the context of a binomial distribution, what can be said about the frequency representation of diseases?
What does a p-value of less than 0.05 in the Shapiro-Wilk test indicate?
What is the significance of the central limit theorem?
What is represented by the standard error (SE)?
In a Q-Q plot, what does it indicate if the points lie along the line y = x?
Which of the following best describes the distribution of a variable in the sample?
What is the main purpose of biostatistics?
Which of the following describes a characteristic of a sample in biostatistics?
What type of research must be randomized to ensure unbiased results?
What is the consequence of using a small and biased sample in research?
What distinguishes continuous numeric variables from discrete ones?
Which type of variable has inherent ordering?
What is a key difference between sample quantities and population quantities in biostatistics?
What is a dichotomous variable?
What is the purpose of the sample mean and sample variance in relation to a population?
Which measures are more appropriate for analyzing skewed or non-normal data?
In a probability distribution, what characterizes a random variable?
What is the shape of a standard normal distribution?
To standardize a normally-distributed variable, which formula is used?
What percentage of the probability lies within ±1 standard deviation from the mean in a normal distribution?
Which of the following pairs are the parameters that define a normal distribution?
What does a cumulative probability in a probability distribution represent?
What does the null hypothesis (H0) signify in the context of comparing two population means?
What is a key characteristic of the Wilcoxon-Mann-Whitney test?
What does a p-value less than 0.05 typically indicate in hypothesis testing?
What would be the appropriate interpretation if the confidence interval (CI) includes the null value of zero?
In the formula for the difference between two population means, $d = \mu_1 - \mu_2$, what does $d = 0$ signify?
What does the standard error (SE) of the sample mean represent?
In the formula for a 95% confidence interval, what does Z represent?
What additional distribution is used for calculating confidence intervals for smaller sample sizes?
Which assumption is NOT required for performing a one-sample t-test?
What is the primary purpose of calculating a confidence interval?
If the assumptions for a t-test are violated, which method might you need to use?
What does a 95% confidence level imply about the confidence interval?
What is indicated by a confidence interval that is wide?
Flashcards
Biostatistics
The application of statistical methods to biomedical research to collect, classify, analyze, and interpret data.
Empirical Science
Knowledge based on observation and experience, using natural or experimental observations, and inductive reasoning to reach generalizations.
Clinical Research
Research aiming to understand and improve human health, often involving human subjects.
Randomized Clinical Trials
Sample
Population
Categorical Variables
Numeric Variables
Proportion
Binomial Distribution
Prevalence
Probability Distribution
Binomial Distribution Parameters
Population mean (µ)
Population variance (σ²)
Sample mean (X)
Sample variance (S²)
Normal distribution
Z-score
Random variable
Shapiro-Wilk Test
Q-Q Plot
Central Limit Theorem
Standard Error (SE)
Mann-Whitney test
Null hypothesis (H0)
Alternative hypothesis (H1)
P-value
95% Confidence Interval
Confidence Interval Equation (Large Samples)
t-distribution
Degrees of Freedom (t-test)
Assumptions of t-test
Sample Variability
Study Notes
Biostatistics Lectures 1-4 Summary
- Biostatistics involves collecting, classifying, analyzing, and interpreting data from biomedical research to generate medical knowledge.
- Science is empirical, based on observations and experiences, using inductive reasoning to generalize from observations.
- Basic and clinical research are interconnected. Clinical research must be randomized to ensure unbiased results.
- In biostatistics, samples are subsets of a population, but the sample itself is not the primary focus.
- Samples are studied to draw inferences about a population of interest.
- Larger samples give more precise estimates. Small samples are prone to random error, which is the difference between sample and population means that arises from sampling alone. Biased samples add systematic error, which a larger sample does not remove.
- Sample quantities are known and measured, while population quantities are unknown and estimated.
- Clinical research involves a study design, data collection, data analysis, and data processing. Software like R facilitates reproducible research.
- Reproducible research involves using code throughout the entire research process and presents a direct link between analysis and final results.
- Rectangular data, typically in tabular format, includes a patient ID, age, sex, vaccination status, and COVID-19 status, among other variables.
- Units of observation are patients or other similar entities described by the data.
- Observations (records) are rows in the table, and variables (or fields) are columns in the table, representing characteristics of the observation unit or entity.
- Variables are classified as categorical or numeric. Numeric variables include continuous and discrete types. Categorical variables include nominal, ordinal, and dichotomous.
- Frequency tables present the number and percentage of participants in each category, useful for categorical and grouped numeric variables.
- Frequency tables may also be cumulative.
- Contingency tables (cross-tabulations) examine the relationship between two categorical variables, examining marginal totals and category-specific proportions.
- Categorical variables are plotted with pie charts and bar plots, which illustrate relative frequencies (summing to 100%) using appropriate choices of color and ordering.
- Histograms and box-and-whisker plots illustrate numerical (continuous or discrete) variables; for histograms, bins have the same width with no space between; for box-and-whisker plots, features include median, hinges, whiskers, and outliers.
- Measures of location include mean (average), median, quantiles, and mode.
- Measures of spread include variance, standard deviation, range, and interquartile range (IQR).
- Distributions describe the pattern of values in sample or population data and can be generalized to probability distributions for an individual.
- A normal distribution, N(μ, σ), is described by its mean (μ) and standard deviation (σ). It is bell-shaped and symmetric, with mean = median = mode = μ. The standard normal distribution is the special case with μ = 0 and σ = 1.
- For a normal distribution, the total probability over the whole range from −∞ to ∞ is 1, and the area under the curve over a given range of values is the probability of a value falling in that range (about 68% within ±1σ of the mean).
- Determining (testing) if data follows a normal distribution involves contextual knowledge and statistical tests (e.g., Shapiro-Wilk test) or Q-Q plots.
- The central limit theorem states that the means of repeated random samples of sufficiently large size approximately follow a normal distribution, even when the underlying variable isn't normally distributed.
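- There are three distributions to consider: the distribution of the variable in the population, its distribution in the sample, and the distribution of the sample mean. Each has its associated statistics (mean and standard deviation); the standard deviation of the sample mean is the standard error.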
- Confidence intervals give a range of values within which a population quantity, such as a mean, a proportion, or the difference between two proportions, is likely to lie.
- Confidence intervals are calculated from the sample estimate, its standard error, and a critical value (Z for large samples, t for smaller ones).
- Statistical tests such as t-tests and non-parametric tests (Mann-Whitney) are used to compare sub-sample means, with p-values guiding the decision. The null hypothesis is that the means are the same; the alternative hypothesis is that they differ. A p-value is the probability of observing the result obtained, or a more extreme one, if the null hypothesis is true, and it informs whether the null hypothesis is rejected or not rejected.
- A proportion is calculated as A/(A+B), the number with the characteristic divided by the total. The analysis of proportions does not use the normal or t-distributions but special methods such as the exact (Clopper-Pearson) method.
- The binomial distribution provides probabilities for discrete outcomes in binary trials.
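The notes on the normal distribution can be made concrete with a short Python sketch. It standardizes a value with z = (x − μ) / σ and reads probabilities off the standard normal curve using the standard library's `statistics.NormalDist`. The mean of 120 and standard deviation of 15 are hypothetical values chosen for illustration, and the helper names `z_score` and `prob_within` are not from the lectures.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma


# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def prob_within(k: float) -> float:
    """Probability that a normal variable falls within k SDs of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Hypothetical example values, not taken from the lectures.
mu, sigma = 120.0, 15.0
z = z_score(135.0, mu, sigma)      # 135 is one SD above the mean
within_one_sd = prob_within(1.0)   # roughly 0.68
within_two_sd = prob_within(2.0)   # roughly 0.95

print(f"z-score of 135: {z:.2f}")
print(f"P(within 1 SD): {within_one_sd:.4f}")
print(f"P(within 2 SD): {within_two_sd:.4f}")
```

Because standardization maps any normal variable onto the standard normal, the same `prob_within` values apply whatever μ and σ are.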
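The central limit theorem and the standard error can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with rate 1 (right-skewed, mean 1, standard deviation 1), and the sample size of 50 and the 2,000 repetitions are arbitrary choices, not values from the lectures.

```python
import math
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the simulation is repeatable

SAMPLE_SIZE = 50   # observations per sample (assumed)
N_SAMPLES = 2000   # number of repeated samples (assumed)


def draw_sample(n: int) -> list[float]:
    """Draw n values from a skewed population: exponential, mean 1, SD 1."""
    return [random.expovariate(1.0) for _ in range(n)]


# Distribution of the sample mean across repeated samples.
sample_means = [mean(draw_sample(SAMPLE_SIZE)) for _ in range(N_SAMPLES)]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)             # empirical standard error
theoretical_se = 1.0 / math.sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

# If the sample means are close to normal, about 68% of them should lie
# within one standard error of their overall mean.
share_within_one_se = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / N_SAMPLES

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {theoretical_se:.3f}")
print(f"share within 1 SE:    {share_within_one_se:.3f}")
```

The spread of the sample means tracks σ/√n, which is why larger samples give narrower sampling distributions even though the underlying variable stays skewed.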
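The confidence-interval notes can be sketched as the large-sample formula: sample mean ± Z × SE, where SE is the sample standard deviation divided by √n. The function name `mean_ci` and the simulated measurements are assumptions made for illustration. For small samples a t critical value would replace Z, which the Python standard library does not provide.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_ci(values: list[float], confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample confidence interval for a mean: mean +/- Z * SE."""
    n = len(values)
    sample_mean = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    # Critical value Z: about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Simulated, hypothetical measurements (not data from the lectures).
random.seed(2)
values = [random.gauss(120.0, 15.0) for _ in range(200)]

ci95 = mean_ci(values, 0.95)
ci99 = mean_ci(values, 0.99)

print(f"sample mean: {mean(values):.2f}")
print(f"95% CI: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"99% CI: ({ci99[0]:.2f}, {ci99[1]:.2f})")
```

A higher confidence level uses a larger critical value and so gives a wider interval; a larger sample shrinks the standard error and narrows it.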
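The binomial distribution can be sketched for the T2DM question in the quiz, taking n = 50 and p = 0.113 from that question and assuming independent individuals. The expected count is n × p, and the probability of exactly k cases is C(n, k) p^k (1 − p)^(n − k). The helper name `binomial_pmf` is chosen for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent binary trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 50, 0.113  # sample size and prevalence from the T2DM quiz question

expected_cases = n * p  # mean of a binomial distribution
prob_no_cases = binomial_pmf(0, n, p)
prob_at_most_5 = sum(binomial_pmf(k, n, p) for k in range(6))
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"expected number of cases: {expected_cases:.2f}")
print(f"P(no cases):      {prob_no_cases:.4f}")
print(f"P(at most 5):     {prob_at_most_5:.4f}")
print(f"sum over all k:   {total_probability:.4f}")
```

The expected value of 5.65 is not itself a possible outcome, since counts are discrete; it is the long-run average over many samples of 50.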