Biostatistics Lectures 1-4 Summary

Questions and Answers

What is the range of values a proportion can take?

From -1 to 1
From 0 to 1 (correct)
From -100% to 100%
From 1 to 10

What does the parameter 'p' represent in the binomial distribution?

Number of successes out of n trials
Probability of success in each trial (correct)
Total number of trials
Average number of successes

Which of the following statements about the binomial distribution is true?

It can be analyzed using the normal distribution.
It is a discrete probability distribution. (correct)
It is a continuous distribution.
It assumes trials are dependent.

If the prevalence of type 2 diabetes (T2DM) is 11.3%, how many individuals would you expect to find with T2DM in a sample of 50 Americans?

About 6 individuals (50 × 0.113 = 5.65)

In the context of a binomial distribution, what can be said about the frequency representation of diseases?

They are represented as proportions. (C)

What does a p-value of less than 0.05 in the Shapiro-Wilk test indicate?

There is a deviation from normality. (C)

What is the significance of the central limit theorem?

It establishes that the means of repeated random samples are approximately normally distributed when the sample size is large enough, whatever the shape of the original distribution. (C)

What is represented by the standard error (SE)?

The standard deviation of the sampling distribution of the sample mean. (A)

In a Q-Q plot, what does it indicate if the points lie along the line y = x?

The data are approximately normally distributed. (D)

Which of the following best describes the distribution of a variable in the sample?

It provides the estimated mean and standard deviation. (B)

What is the main purpose of biostatistics?

To generate medical knowledge through data analysis (B)

Which of the following describes a characteristic of a sample in biostatistics?

It is used to infer about the population (B)

What type of research must be randomized to ensure unbiased results?

Clinical Research (C)

What is the consequence of using a small and biased sample in research?

Higher chance of random error (B)

What distinguishes continuous numeric variables from discrete ones?

Continuous variables can be converted to other units of measurement (B)

Which type of variable has inherent ordering?

Ordinal Variable (C)

What is a key difference between sample quantities and population quantities in biostatistics?

Sample quantities are known and being measured, while population quantities are unknown and being estimated (D)

What is a dichotomous variable?

A variable that takes only two possible values (D)

What is the purpose of the sample mean and sample variance in relation to a population?

To estimate the population mean and population variance (C)

Which measures are more appropriate for analyzing skewed or non-normal data?

Median and quantiles (B)
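
A small illustration of why the median and quantiles suit skewed data: a single extreme value moves the mean a long way but leaves the median untouched. The data below are hypothetical (an invented list of hospital stay lengths), and the sketch uses only Python's standard `statistics` module.

```python
import statistics

# Hypothetical right-skewed data: one extreme value pulls the mean upward.
length_of_stay_days = [1, 2, 2, 3, 3, 4, 50]

mean_value = statistics.mean(length_of_stay_days)
median_value = statistics.median(length_of_stay_days)

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3,
# where Q2 is the median.
q1, q2, q3 = statistics.quantiles(length_of_stay_days, n=4)

print(f"mean   = {mean_value:.2f}")
print(f"median = {median_value}")
print(f"Q1, Q3 = {q1}, {q3}")
```

Here the mean is far above most of the observations, while the median and quartiles still describe the bulk of the data.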

In a probability distribution, what characterizes a random variable?

Its values depend on outcomes of a random phenomenon (B)

What is the shape of a standard normal distribution?

Symmetrical and unimodal (D)

To standardize a normally-distributed variable, which formula is used?

$Z = \frac{X - \mu}{\sigma}$ (D)
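
The standardization formula can be sketched in a few lines of Python. The values used here (mean 100, standard deviation 15, observation 130) are hypothetical, chosen only to make the arithmetic easy to follow; `z_score` is an illustrative helper, not part of any library. The same sketch checks the probability within one standard deviation of the mean using the standard library's `NormalDist`.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: the number of standard deviations x lies from mu."""
    return (x - mu) / sigma

# Hypothetical example: X ~ Normal(mu=100, sigma=15), observed x = 130.
z = z_score(130.0, mu=100.0, sigma=15.0)

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of falling within one standard deviation of the mean.
within_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Cumulative probability P(Z <= z) for the standardized value.
below_z = standard_normal.cdf(z)

print(f"z = {z}")
print(f"P(-1 <= Z <= 1) = {within_one_sd:.4f}")
print(f"P(Z <= z)       = {below_z:.4f}")
```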

What percentage of the probability lies within ±1 standard deviation from the mean in a normal distribution?

68% (D)

Which of the following pairs are the parameters that define a normal distribution?

Mean and standard deviation (C)

What does a cumulative probability in a probability distribution represent?

The probability of a value being less than or equal to a given value (A)

What does the null hypothesis (H0) signify in the context of comparing two population means?

Any difference in sample means is due to random error. (D)

What is a key characteristic of the Wilcoxon-Mann-Whitney test?

It assesses whether two samples come from the same distribution. (A)

What does a p-value less than 0.05 typically indicate in hypothesis testing?

There is a statistically significant difference between population means. (B)

What would be the appropriate interpretation if the confidence interval (CI) includes the null value of zero?

The difference in population means is not statistically significant at the corresponding level; a true difference of zero cannot be ruled out. (D)

In the formula for the difference between two population means, $d = \mu_1 - \mu_2$, what does $d = 0$ signify?

There is no difference between the population means. (D)

What does the standard error (SE) of the sample mean represent?

The population standard deviation divided by the square root of the sample size (A); the variability of the sample mean around the population mean (C)

In the formula for a 95% confidence interval, what does Z represent?

The z-value corresponding to the confidence level (C)

What additional distribution is used for calculating confidence intervals for smaller sample sizes?

t-distribution (B)

Which assumption is NOT required for performing a one-sample t-test?

The samples must have equal sizes (A)

What is the primary purpose of calculating a confidence interval?

To estimate the range in which a population parameter lies (C)

If the assumptions for a t-test are violated, which method might you need to use?

Bootstrap methods (A)

What does a 95% confidence level imply about the confidence interval?

If the sampling were repeated many times, about 95 out of every 100 intervals constructed this way would contain the population mean (A)

What is indicated by a confidence interval that is wide?

Greater variability in the sample data (C)

Flashcards

Biostatistics

The application of statistical methods to biomedical research to collect, classify, analyze, and interpret data.

Empirical Science

Knowledge based on observation and experience, using natural or experimental observations, and inductive reasoning to reach generalizations.

Clinical Research

Research aiming to understand and improve human health, often involving human subjects.

Randomized Clinical Trials

Clinical research methods that impartially assign subjects to different treatment groups to minimize bias and increase reliability.

Sample

A subset of a population selected for study, used to infer about the characteristics of the larger population.

Population

The entire group of individuals or objects that are of interest.

Categorical Variables

Variables that represent categories or groups, like blood type or gender.

Numeric Variables

Variables that measure quantities numerically; they are further divided into continuous and discrete types.

Proportion

A fraction representing the number of individuals with a specific characteristic divided by the total number of individuals.

Binomial Distribution

A discrete probability distribution giving the probability of each possible number of successes in a fixed number of independent trials, each with the same probability of success.

Prevalence

The proportion of a population that has a particular disease or condition at a specific time.

Probability Distribution

A function showing the probability of different outcomes of a random variable.

Binomial Distribution Parameters

The Binomial Distribution is determined by 'n' (number of trials) and 'p' (probability of success).
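
As a rough sketch of how 'n' and 'p' determine the distribution, the snippet below evaluates the binomial probability mass function for the T2DM quiz question above (n = 50 people, p = 0.113 prevalence). The helper name `binomial_pmf` is illustrative only; the expected count is simply n × p.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Values taken from the T2DM quiz question: 50 people, prevalence 11.3%.
n, p = 50, 0.113

# Expected number of successes for a binomial variable is n * p.
expected_cases = n * p

# The probabilities over all possible counts 0..n must sum to 1.
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))

# The single most probable count (the mode of the distribution).
most_likely = max(range(n + 1), key=lambda k: binomial_pmf(k, n, p))

print(f"expected cases    = {expected_cases:.2f}")
print(f"most likely count = {most_likely}")
print(f"P(X = 6)          = {binomial_pmf(6, n, p):.4f}")
```

Because the count must be a whole number, the expected value of 5.65 is an average over many samples of 50 rather than a count that any one sample can show.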

Population mean (µ)

The average value of a variable in the entire population.

Population variance (σ²)

A measure of the spread or variability of a variable in the entire population.

Sample mean (X̄)

The average value of a variable in a sample taken from a population.

Sample variance (S²)

A measure of the spread or variability of a variable in a sample taken from a population.
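
A minimal sketch of computing the sample mean and sample variance with Python's standard library, using a small hypothetical dataset. Note that `statistics.variance` divides by n − 1 (the usual sample variance), while `statistics.pvariance` divides by n and is appropriate only when the data are the whole population.

```python
import statistics

# Hypothetical sample of eight measurements.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sample_mean = statistics.mean(sample)

# Sample variance: sum of squared deviations divided by (n - 1).
sample_variance = statistics.variance(sample)

# Same squared deviations divided by n, for comparison.
divide_by_n_variance = statistics.pvariance(sample)

print(f"sample mean            = {sample_mean}")
print(f"sample variance (n-1)  = {sample_variance:.4f}")
print(f"variance dividing by n = {divide_by_n_variance}")
```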

Normal distribution

A probability distribution that is symmetric and bell-shaped; characterized by its mean (µ) and standard deviation (σ).

Z-score

A value representing how many standard deviations a data point is from the mean of a normal distribution.

Probability distribution

A function that describes the possible outcomes of a random variable and their associated probabilities.

Random variable

A variable whose value is a numerical outcome of a random phenomenon or experiment.

Normal Distribution

A common probability distribution where data tends to cluster around a central value, often bell-shaped.

Shapiro-Wilk Test

A statistical test used to determine if a dataset follows a normal distribution.

Q-Q Plot

A plot used to visually assess if a dataset follows a normal distribution.

Central Limit Theorem

The means of repeated samples from any distribution will be approximately normally distributed, even if the original data isn't.
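
The theorem can be illustrated with a small simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a strongly skewed distribution chosen only for illustration), draws many samples of size 50, and compares the spread of the sample means with the theoretical standard error σ/√n. The sample size, repetition count, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

sample_size = 50      # observations per sample (arbitrary choice)
repetitions = 2000    # number of repeated samples (arbitrary choice)

# Population: exponential with rate 1, so mean = 1 and sd = 1 (right-skewed).
sample_means = []
for _ in range(repetitions):
    sample = [random.expovariate(1.0) for _ in range(sample_size)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theoretical standard error: population sd divided by sqrt(sample size).
theoretical_se = 1.0 / sample_size ** 0.5

print(f"mean of sample means = {mean_of_means:.3f}  (population mean is 1)")
print(f"sd of sample means   = {sd_of_means:.3f}")
print(f"theoretical SE       = {theoretical_se:.3f}")
```

A histogram of `sample_means` would look close to bell-shaped even though the individual observations are heavily skewed.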

Standard Error (SE)

The standard deviation of the distribution of sample means. A measure of how much sample means differ from the population mean.

Mann-Whitney test

A non-parametric test used to compare two samples, assessing if they come from the same distribution.

Null hypothesis (H0)

The hypothesis that there is no difference between two population means.

Alternative hypothesis (H1)

The hypothesis that there is a difference between two population means.

P-value

The probability of observing results as extreme as, or more extreme than, those actually observed, assuming the null hypothesis is true.
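
As a hedged sketch of this definition, the function below computes a two-sided p-value for a test statistic that is assumed to follow the standard normal distribution under the null hypothesis (a large-sample z-test; that assumption is ours and is not stated on the card). The name `two_sided_p_value` is illustrative.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) when Z is standard normal under the null hypothesis."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail  # both tails count as "as or more extreme"

for z in (0.0, 1.0, 1.96, 3.0):
    print(f"z = {z:4.2f}  ->  p = {two_sided_p_value(z):.4f}")
```

A statistic of about 1.96 sits right at the conventional 0.05 threshold; larger absolute values give smaller p-values.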

Proportion

The ratio of the number of items with a specific characteristic to the total number of items.

95% Confidence Interval

A range computed from sample data by a procedure that, over repeated sampling, captures the true population mean 95% of the time.

Confidence Interval Equation (Large Samples)

CI = X̄ ± Z × SE, where X̄ is the sample mean, Z is the Z-value for the desired confidence level (e.g., 1.96 for 95%), and SE is the standard error.
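
The large-sample formula can be sketched as follows. The summary statistics (sample mean 120, sample standard deviation 15, n = 100) are hypothetical, and `confidence_interval` is an illustrative helper; the Z-value is looked up from the standard normal distribution rather than hard-coded.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Large-sample CI for a mean: sample mean +/- Z * SE."""
    # Z-value leaving (1 - level) / 2 in each tail, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Standard error: standard deviation divided by sqrt(sample size).
    se = sample_sd / sqrt(n)
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical summary statistics from a sample of 100 people.
lower, upper = confidence_interval(sample_mean=120.0, sample_sd=15.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

With these numbers the standard error is 1.5, so the interval is roughly 120 ± 2.94. For small samples the Z-value would be replaced by a value from the t-distribution, which this sketch does not cover.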

Standard Error (SE)

A measure of the variability of sample means around the population mean, calculated as standard deviation divided by the square root of the sample size.

t-distribution

Used to calculate confidence intervals for smaller sample sizes when the population standard deviation is unknown.

Degrees of Freedom (t-test)

A parameter in the t-distribution calculation, equal to sample size minus 1 (n-1).

Assumptions of t-test

The data are approximately normally distributed, the observations are independent, and, when two groups are compared, the groups have equal variances.

Sample Mean (X̄)

The average of the values observed in a sample.

Sample Variability

The spread of values in a dataset, indicating how much the individuals in the dataset differ from one another.

Study Notes