BIOL 208 - Lab 0: Statistics in Biology PDF
Document Details
Uploaded by HardWorkingLute
Summary
This document provides an overview of statistical methods in biology. It covers concepts such as measures of central tendency (mean, median, mode), measures of spread (standard deviation, variance), and significance tests (p-values). The document also discusses different data types (continuous and discrete) and how they relate to the choice of statistical test.
Full Transcript
# Lab 0: Part 3: Statistics in Biology

## Overview

- This lab focuses on the interpretation of statistical test results, rather than the calculations themselves.
- It covers both descriptive statistics (e.g., mean, confidence intervals) and more advanced statistical tests.

## Objectives

- Identify why statistics are used in biology.
- Distinguish between measures of central tendency (mean, median, mode) and measures of spread (range, standard deviation, variance).
- Calculate mean, standard deviation, and 95% confidence interval.
- Apply the appropriate statistical test to a given data set.
- Distinguish between continuous and discrete data sets.
- Describe what a p-value means.
- Interpret p-values to assess statistical significance.

## Statistics in Biology

### Why do I need to learn statistics for a biology course?

- Our brains are good at finding patterns, even when they don't actually exist.
- Statistics provide tools to rule out the possibility of finding "false" patterns.
- Ecologists collect data with a high degree of variability.
- Descriptive statistics are essential for communicating the spread or range of your data.

### Measures of central tendency and spread

- Measures of central tendency describe where most of the values in a dataset tend to lie:
  - Mean (average)
  - Median
  - Mode
- Measures of spread describe how widely the values are dispersed:
  - Range
  - Standard Deviation
  - Variance

**Mean ($\bar{x}$):**

- The most widely used measure of central tendency.
- Calculated by the formula: $\bar{x} = \sum x_i / n$
  - $\sum x_i$ = the sum of all individual observations
  - $n$ = the total number of observations
- Can be calculated using Excel's "Descriptive Statistics" tool (part of the Analysis ToolPak) or the formula "=AVERAGE(...)".

**Variance ($s^2$):**

- The most statistically useful measure of spread.
- Calculated as the sum of the squared deviations of the measurements from the mean, divided by $n - 1$: $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$

**Standard deviation ($s$):**

- Square root of the variance.
- Calculated in Excel using the formula "=STDEV(...)" or "=STDEV.S(...)".

### Normal Distribution

- In a normal distribution, most observations cluster around the mean, producing a symmetric, bell-shaped curve.
- 68.3% of observations lie within one standard deviation of the mean.
- 95.4% of observations lie within two standard deviations of the mean.
- 99.7% of observations lie within three standard deviations of the mean.

### Samples, Populations, and Confidence Intervals

- It's usually impossible to measure the entire population, so we use samples to make estimates.
- Confidence intervals place a range around the sample mean that is likely to contain the true population mean, taking into account sample size and variability.

**Confidence Intervals:**

- A confidence interval is a range around a sample estimate that is likely to contain the true population value.
- Its size depends on:
  - The size ($n$) and variability ($s^2$) of your sample.
  - The statistically defined confidence level (usually 95%).
- A larger sample implies a smaller confidence interval, because the standard error shrinks as $n$ grows.
- A less variable sample implies a smaller confidence interval, as the sample mean is likely closer to the true population mean.
- A more variable sample implies a larger confidence interval, as the sample mean could be further from the true population mean.

**Standard Error of the Mean ($S_{\bar{x}}$):**

- Used to calculate confidence intervals.
- Formula: $S_{\bar{x}} = s / \sqrt{n}$, or equivalently $S_{\bar{x}} = \sqrt{s^2 / n}$.

**Confidence Interval Formula:**

- $\bar{x} \pm (t \times S_{\bar{x}})$
  - $\bar{x}$ = sample mean.
  - $t$ = value obtained from a statistical t-table (based on the degrees of freedom, $n - 1$, and the desired confidence level).
  - $S_{\bar{x}}$ = Standard Error of the Mean.

### Significance and Interpreting p-values

- Statistical tests help determine the significance of trends observed in sample data.
- **P-value:** The probability of getting a result at least as extreme as your own, given that the null hypothesis is true.
- **Null Hypothesis:** States that there is no difference between groups or no relationship between variables.
- If the p-value is less than the alpha value (typically α = 0.05, so p < 0.05), we reject the null hypothesis and conclude that the result is statistically significant.
- If the p-value is greater than or equal to the alpha value (p ≥ 0.05), we fail to reject the null hypothesis and conclude that the result is not statistically significant.

### Types of Data

- **Measurement Data (Continuous Data):**
  - Can fall anywhere on a continuous number line (e.g., 12.5 cm, 152.75 kg, 20.8 °C).
  - Includes means/averages calculated from count/enumeration data.
- **Count/Enumeration Data (Discrete Data):**
  - Represents counted values that can only be whole numbers (e.g., 0, 1, 2, 3, 4).

### Summary

- Understand the importance of statistics in biology for analyzing data and making informed conclusions.
- Differentiate between measures of central tendency and measures of spread.
- Calculate mean, standard deviation, and confidence intervals.
- Identify the appropriate statistical test based on the type of data and research question.
- Interpret p-values to assess statistical significance.
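
### Worked sketch: descriptive statistics and a 95% confidence interval

The lab describes these calculations in terms of Excel formulas. As a rough cross-check, the same steps (mean, variance, standard deviation, standard error, confidence interval, and the p-value decision rule) can be sketched in Python using only the standard library. The sample values, the function names `describe` and `is_significant`, and the constant `T_CRITICAL_95_DF4` are illustrative assumptions, not part of the lab. The t value 2.776 is the usual two-tailed t-table entry for 95% confidence with 4 degrees of freedom; a different sample size needs a different table entry.

```python
import math
import statistics

# Hypothetical sample: five leaf lengths in cm (made-up values for illustration).
leaf_lengths_cm = [12.5, 13.1, 11.8, 12.9, 12.2]

# Assumed t-table value: two-tailed, 95% confidence, df = n - 1 = 4.
T_CRITICAL_95_DF4 = 2.776


def describe(sample, t_critical):
    """Return mean, variance, SD, SEM, and confidence limits for a sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    # statistics.variance divides by n - 1 (sample variance, like Excel's STDEV.S squared).
    variance = statistics.variance(sample)
    sd = math.sqrt(variance)
    # Standard error of the mean: s / sqrt(n).
    sem = sd / math.sqrt(n)
    # Half-width of the interval: t * SEM.
    margin = t_critical * sem
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "sem": sem,
        "ci_low": mean - margin,
        "ci_high": mean + margin,
    }


def is_significant(p_value, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis only when p < alpha."""
    return p_value < alpha


if __name__ == "__main__":
    for name, value in describe(leaf_lengths_cm, T_CRITICAL_95_DF4).items():
        print(f"{name}: {value:.3f}")
    print("p = 0.03 significant?", is_significant(0.03))
    print("p = 0.05 significant?", is_significant(0.05))
```

Hardcoding the t value keeps the sketch dependency-free and mirrors looking it up in a printed t-table; a statistics package could compute it instead. Note that `is_significant(0.05)` returns `False`, matching the rule that p ≥ 0.05 means we fail to reject the null hypothesis.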