Biostatistics Lectures 1-4 Summary

Questions and Answers

What is the range of values a proportion can take?

  • From -1 to 1
  • From 0 to 1 (correct)
  • From -100% to 100%
  • From 1 to 10

What does the parameter 'p' represent in the binomial distribution?

  • Number of successes out of n trials
  • Probability of success in each trial (correct)
  • Total number of trials
  • Average number of successes

Which of the following statements about the binomial distribution is true?

  • It can be analyzed using the normal distribution.
  • It is a discrete probability distribution. (correct)
  • It is a continuous distribution.
  • It assumes trials are dependent.

If the prevalence of type 2 diabetes (T2DM) is 11.3%, how many individuals would you expect to find with T2DM in a sample of 50 Americans?

    • About 6 individuals (50 × 0.113 = 5.65)
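
The expected count in a binomial setting is n × p. As a minimal sketch of that arithmetic, assuming the 50 people are sampled independently with the stated prevalence (the function and variable names are illustrative, not from the lectures):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 50       # sample size from the question
p = 0.113    # prevalence of T2DM from the question

# The mean of a binomial distribution is n * p.
expected = n * p
print(f"Expected number with T2DM: {expected:.2f}")

# Probability of every possible count from 0 to n; these sum to 1.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The single most likely count need not equal the rounded expected value.
most_likely = max(range(n + 1), key=lambda k: pmf[k])
print(f"Most likely count: {most_likely}")
```

The expected value is a long-run average, so it does not have to be a whole number even though any single sample yields a whole count.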

    In the context of a binomial distribution, what can be said about the frequency representation of diseases?

    • They are represented as proportions.

    What does a p-value of less than 0.05 in the Shapiro-Wilk test indicate?

    • There is a deviation from normality.

    What is the significance of the central limit theorem?

    • It establishes that the means of repeated random samples are approximately normally distributed when the sample size is sufficiently large.
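
The central limit theorem can be checked with a small simulation. The sketch below assumes an exponential distribution as the skewed source; the sample size, number of repeats, and seed are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_mean(n: int) -> float:
    """Mean of n draws from a right-skewed exponential distribution (mean 1, SD 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

n = 40           # size of each sample (illustrative choice)
repeats = 5000   # number of repeated samples (illustrative choice)
means = [sample_mean(n) for _ in range(repeats)]

mean_of_means = statistics.fmean(means)      # close to the population mean
median_of_means = statistics.median(means)   # close to the mean if roughly symmetric
sd_of_means = statistics.stdev(means)        # empirical standard error
theoretical_se = 1.0 / n ** 0.5              # sigma / sqrt(n), with sigma = 1

print(f"mean of sample means:   {mean_of_means:.3f}")
print(f"median of sample means: {median_of_means:.3f}")
print(f"SD of sample means:     {sd_of_means:.3f}")
print(f"sigma / sqrt(n):        {theoretical_se:.3f}")
```

Although single exponential draws are strongly skewed, the sample means cluster symmetrically around the population mean, and their spread is close to the standard error.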

    What is represented by the standard error (SE)?

    • The standard deviation of the sampling distribution of the sample mean.

    In a Q-Q plot, what does it indicate if the points lie along the line y = x?

    • The data are consistent with a normal distribution.

    Which of the following best describes the distribution of a variable in the sample?

    • It provides the estimated mean and standard deviation.

    What is the main purpose of biostatistics?

    • To generate medical knowledge through data analysis

    Which of the following describes a characteristic of a sample in biostatistics?

    • It is used to infer about the population

    What type of research must be randomized to ensure unbiased results?

    • Clinical research

    What is the consequence of using a small and biased sample in research?

    • Higher chance of random error

    What distinguishes continuous numeric variables from discrete ones?

    • Continuous variables can be converted to other units of measurement

    Which type of variable has inherent ordering?

    • Ordinal variable

    What is a key difference between sample quantities and population quantities in biostatistics?

    • Sample quantities are known and being measured, while population quantities are unknown and being estimated

    What is a dichotomous variable?

    • A variable that takes only two possible values

    What is the purpose of the sample mean and sample variance in relation to a population?

    • To estimate the population mean and population variance

    Which measures are more appropriate for analyzing skewed or non-normal data?

    • Median and quantiles
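
A short sketch shows why the median and quantiles suit skewed data. The length-of-stay values below are hypothetical numbers made up for illustration, not data from the lectures.

```python
import statistics

# Hypothetical hospital lengths of stay in days (made-up values);
# one very long stay skews the data to the right.
stays = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]

mean_stay = statistics.mean(stays)      # pulled upward by the outlier
median_stay = statistics.median(stays)  # unaffected by how extreme the outlier is

# Quartiles split the ordered data into four equal parts.
q1, q2, q3 = statistics.quantiles(stays, n=4)
iqr = q3 - q1                           # interquartile range

print(f"mean = {mean_stay}, median = {median_stay}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

Here the mean lies above the third quartile, so it describes almost none of the patients, while the median and IQR still summarize the typical stay.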

    In a probability distribution, what characterizes a random variable?

    • Its values depend on outcomes of a random phenomenon

    What is the shape of a standard normal distribution?

    • Symmetrical and unimodal

    To standardize a normally-distributed variable, which formula is used?

    • $Z = \frac{X - \mu}{\sigma}$

    What percentage of the probability lies within ±1 standard deviation from the mean in a normal distribution?

    • About 68%
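
Standardization and the probability within ±k standard deviations can be checked with Python's built-in `statistics.NormalDist`. This is a minimal sketch; the helper names and the blood-pressure numbers are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)  # the standard normal distribution

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize a value: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def prob_within(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for any normal distribution."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Probability within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within ±{k} SD: {prob_within(k):.4f}")

# Hypothetical example: a value of 150 when the mean is 120 and the SD is 15.
z = z_score(150, mu=120, sigma=15)
print(f"z = {z}, P(X <= 150) = {std_normal.cdf(z):.4f}")
```

The `cdf` call returns a cumulative probability, that is, the probability of a value less than or equal to the given one.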

    Which of the following pairs are the parameters that define a normal distribution?

    • Mean and standard deviation

    What does a cumulative probability in a probability distribution represent?

    • The probability of a value being less than or equal to a given value

    What does the null hypothesis (H0) signify in the context of comparing two population means?

    • Any difference in sample means is due to random error.

    What is a key characteristic of the Wilcoxon-Mann-Whitney test?

    • It assesses whether two samples come from the same distribution.

    What does a p-value less than 0.05 typically indicate in hypothesis testing?

    • There is a statistically significant difference between population means.

    What would be the appropriate interpretation if the confidence interval (CI) includes the null value of zero?

    • There may be no statistically significant difference in population means.

    In the formula for the difference between two population means, $d = \mu_1 - \mu_2$, what does $d = 0$ signify?

    • There is no difference between the population means.

    What does the standard error (SE) of the sample mean represent?

    • The population standard deviation divided by the square root of sample size

    In the formula for a 95% confidence interval, what does Z represent?

    • The z-value corresponding to the confidence level
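
A z-based confidence interval for a mean takes the form estimate ± Z × SE, with SE = σ/√n. The sketch below assumes the population standard deviation is known and the sample is large; the function name and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean: float, sigma: float, n: int,
                          level: float = 0.95) -> tuple:
    """Confidence interval for a mean, assuming a known population SD (sigma)."""
    # Z is the critical value leaving (1 - level) / 2 in each tail (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sigma / sqrt(n)  # standard error of the sample mean
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical example: mean 120, SD 15, n = 100, so SE = 1.5.
lower, upper = z_confidence_interval(sample_mean=120, sigma=15, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

With smaller samples or an estimated standard deviation, the critical value would come from the t-distribution instead, which makes the interval wider.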

    What additional distribution is used for calculating confidence intervals for smaller sample sizes?

    • t-distribution

    Which assumption is NOT required for performing a one-sample t-test?

    • The samples must have equal sizes

    What is the primary purpose of calculating a confidence interval?

    • To estimate the range in which a population parameter lies

    If the assumptions for a t-test are violated, which method might you need to use?

    • Bootstrap methods

    What does a 95% confidence level imply about the confidence interval?

    • In repeated sampling, about 95 out of 100 intervals constructed this way will contain the population mean

    What is indicated by a confidence interval that is wide?

    • Greater variability in the sample data

    Study Notes

    Biostatistics Lectures 1-4 Summary

    • Biostatistics involves collecting, classifying, analyzing, and interpreting data from biomedical research to generate medical knowledge.
    • Science is empirical, based on observations and experiences, using inductive reasoning to generalize from observations.
    • Basic and clinical research are interconnected. Clinical research must be randomized to ensure unbiased results.
    • In biostatistics, samples are subsets of a population, but the sample itself is not the primary focus.
    • Samples are studied to draw inferences about a population of interest.
    • Larger samples increase the likelihood of accurate results. Small or biased samples may lead to random errors. Random error is the difference between sample and population means due to sampling.
    • Sample quantities are known and measured, while population quantities are unknown and estimated.
    • Clinical research involves a study design, data collection, data analysis, and data processing. Software like R facilitates reproducible research.
    • Reproducible research involves using code throughout the entire research process and presents a direct link between analysis and final results.
    • Rectangular data are typically stored in tabular format; an example table might include patient ID, age, sex, vaccination status, and COVID-19 status, among other variables.
    • Units of observation are patients or other similar entities described by the data.
    • Observations (records) are rows in the table, and variables (or fields) are columns in the table, representing characteristics of the observation unit or entity.
    • Variables are classified as categorical or numeric. Numeric variables include continuous and discrete types. Categorical variables include nominal, ordinal, and dichotomous.
    • Frequency tables present the number and percentage of participants in each category, useful for categorical and grouped numeric variables.
    • Frequency tables may also be cumulative.
    • Contingency tables (cross-tabulations) examine the relationship between two categorical variables, examining marginal totals and category-specific proportions.
    • Plotting categorical variables involves pie charts and bar plots, illustrating relative frequencies (up to 100%) with appropriate colors/ordering choices.
    • Histograms and box-and-whisker plots illustrate numerical (continuous or discrete) variables; for histograms, bins have the same width with no space between; for box-and-whisker plots, features include median, hinges, whiskers, and outliers.
    • Measures of location include mean (average), median, quantiles, and mode.
    • Measures of spread include variance, standard deviation, range, and interquartile range (IQR).
    • Distributions describe the pattern of values in sample or population data and can be generalized to probability distributions for an individual.
    • A normal distribution, N(μ, σ), is described by its mean (μ) and standard deviation (σ); it is bell-shaped and symmetric, with mean = median = mode = μ. For the standard normal distribution, μ = 0 and σ = 1.
    • The total area under the normal curve from −∞ to ∞ is 1; the area over a given range of values is the probability of observing a value in that range (e.g., about 68% within ±1 standard deviation of the mean).
    • Determining (testing) if data follows a normal distribution involves contextual knowledge and statistical tests (e.g., Shapiro-Wilk test) or Q-Q plots.
    • The central limit theorem shows that the means of repeated random samples follow an approximately normal distribution when the sample size is large enough, even when the underlying variable is not normally distributed.
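    • There are three distributions to consider: the distribution of the variable in the population, its distribution in the sample, and the sampling distribution of the sample mean. The associated statistics are the mean, the standard deviation, and, for the sampling distribution of the mean, the standard error.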
    • Confidence intervals give a range of values within which a population quantity (a mean, a proportion, or a difference between two means or two proportions) is likely to lie.
    • Confidence intervals are calculated as the estimate plus or minus a critical value (from the z- or t-distribution) multiplied by the standard error.
    • Statistical tests such as t-tests and non-parametric tests (e.g., Wilcoxon-Mann-Whitney) are used to compare two groups, for example their means. The null hypothesis is that the population means are the same; the alternative hypothesis is that they differ. The p-value is the probability of observing a result at least as extreme as the one obtained if the null hypothesis were true, and it guides the decision to reject or not reject the null hypothesis.
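    • A proportion is calculated as A/(A+B), where A is the number of individuals with the characteristic and B the number without it. In these notes, the analysis of proportions does not use the normal or t-distributions but special methods such as the exact (Clopper-Pearson) method.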
    • The binomial distribution gives the probability of each possible number of successes in n independent binary trials, each with success probability p.
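
The exact (Clopper-Pearson) interval mentioned in the notes can be sketched directly from binomial tail probabilities. This is a minimal illustration under stated assumptions, not the lecture's implementation: the helper names are hypothetical, the limits are found by simple bisection rather than beta quantiles, and the 6-out-of-50 sample is a made-up example.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_increasing(f, target: float, tol: float = 1e-10) -> float:
    """Bisection on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k: int, n: int, level: float = 0.95) -> tuple:
    """Exact confidence interval for a proportion with k successes in n trials."""
    alpha = 1 - level
    # Lower limit: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2,
    # i.e. P(X > k) equals 1 - alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

# Hypothetical sample: 6 cases observed among 50 people.
cases, n = 6, 50
lower, upper = clopper_pearson(cases, n)
print(f"proportion = {cases / n:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric around the observed proportion, which is one reason exact methods are preferred over a normal approximation when counts are small.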

    Description

    This quiz encompasses key concepts from Biostatistics Lectures 1-4. It covers the roles of data collection, sampling, and the importance of randomization in clinical research. The relationship between basic and clinical research is also highlighted, emphasizing the need for larger samples to improve accuracy in studies.
