Lecture Part 6 - Bio Statistics PDF
Polytechnic University of the Philippines
Summary
These notes cover topics in biostatistics, specifically focusing on normal distribution, different types of distributions, and hypothesis testing. The document also includes examples and exercises related to these topics.
Full Transcript
MODULE 6

OBJECTIVES: After successful completion of this module, you should be able to:
✦ Identify regions under the normal curve corresponding to different standard normal values.
✦ Compute probabilities using the standard normal table and Excel.
✦ Know when to use the normal distribution and the t-distribution.
✦ Differentiate the null and alternative hypotheses.
✦ Formulate the appropriate null and alternative hypotheses.
✦ Explain the logic of hypothesis testing.

Polytechnic University of the Philippines, College of Science, Department of Mathematics and Statistics

Normal Distribution
✦ The normal distribution is sometimes called the bell curve because the graph of its probability density looks like a bell.
✦ It is also known as the Gaussian distribution, after the German mathematician Carl Friedrich Gauss.

Properties of Normal Curve
2. Because the mean, median, and mode are equal, the normal curve has a single peak, and the highest point occurs at x = μ.
[Figure: normal curve with inflection points marked at μ − σ and μ + σ]
4. The area under the normal curve is 1.
5. The area under the normal curve to the right of μ equals the area under the curve to the left of μ, which equals 0.50.
6. The normal curve approaches, but never touches, the x-axis as it extends farther and farther away from the mean.
[Figure: normal curve with total area = 1 and area 0.50 on each side of μ]

Mean:
✦ Changing the mean shifts the entire curve left or right on the X-axis.
Standard Deviation:
✦ Changing the standard deviation either tightens or spreads out the width of the distribution along the X-axis.
[Figure: pairs of normal curves for three cases: μ1 = μ2, σ1 < σ2; μ1 < μ2, σ1 < σ2; μ1 < μ2, σ1 = σ2]
Larger standard deviations produce distributions that are more spread out.
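The effect of μ and σ on the curve can be checked numerically. The following is only a small sketch, not part of the original slides: it assumes Python 3.8 or later and the standard-library class statistics.NormalDist, and the function name curve_summary is made up for illustration.

```python
from statistics import NormalDist


def curve_summary(mu, sigma):
    """Summarize a normal curve with mean mu and standard deviation sigma.

    Hypothetical helper for illustration only.
    """
    dist = NormalDist(mu, sigma)
    return {
        "peak_location": mu,                 # highest point occurs at x = mu
        "peak_height": dist.pdf(mu),         # density at the peak
        "area_left_of_mean": dist.cdf(mu),   # should be 0.50
        # symmetry: the density one unit below and one unit above the mean
        "density_below": dist.pdf(mu - 1.0),
        "density_above": dist.pdf(mu + 1.0),
    }


if __name__ == "__main__":
    narrow = curve_summary(mu=0.0, sigma=1.0)
    wide = curve_summary(mu=0.0, sigma=2.0)     # same mean, larger sigma
    shifted = curve_summary(mu=3.0, sigma=1.0)  # same sigma, larger mean

    # A larger sigma gives a lower, more spread-out curve.
    print(narrow["peak_height"] > wide["peak_height"])
    # Changing only the mean moves the peak but keeps its height.
    print(abs(narrow["peak_height"] - shifted["peak_height"]) < 1e-12)
    # Half of the area lies on each side of the mean.
    print(narrow["area_left_of_mean"], wide["area_left_of_mean"])
```

The peak height drops as σ grows because the total area under the curve must stay equal to 1.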

Determine whether each graph represents a normal curve.
[Figure: four graphs labeled A, B, C, and D]
None of the graphs represents a normal curve.

Role of Area under a Normal Curve
Suppose that a random variable X is normally distributed with mean μ and standard deviation σ. The area under the normal curve for any interval of values of the random variable X represents either
✦ the proportion of the population with the characteristic described by the interval of values, or
✦ the probability that a randomly selected individual from the population will have the characteristic described by the interval of values.

Standard Normal Distribution
A normal random variable having mean value μ = 0 and standard deviation σ = 1 is called a standard normal random variable, and its density curve is called the standard normal curve. It will always be denoted by the letter Z.

Standardizing a Normal Random Variable
The normal random variable of a standard normal distribution is called a standard score or a z-score. Every normal random variable X can be transformed into a z-score via the following equation:

z = (x − μ) / σ

where X is a normal random variable, μ is the mean of X, and σ is the standard deviation of X.
Probabilities for a standard normal random variable are computed using the Standard Normal Distribution Table, which shows the cumulative probability associated with a particular z-score.

Remember!
Positive z-scores indicate how far above the mean a score falls, and negative z-scores indicate how far below the mean a score falls. Whether positive or negative, a z-score that is larger in absolute value means the score is farther from the mean, and a z-score that is smaller in absolute value means the score is closer to the mean.

[Table 1: Standard Normal Distribution Table, positive side, P(Z < z)]

Patterns for Finding Areas under a Standard Normal Curve
A. Area to the right of a negative z value or to the left of a positive z value.
Use Table 1 directly.
B. Area between z values on either side of 0.
[Figure: area between z1 and z2 = area to the left of z2 − area to the left of z1; the area to the left of a negative z value is read as 1 − Area]
C. Area between z values on the same side of 0.
[Figure: area between z1 and z2 = area to the left of z2 − area to the left of z1; areas to the left of negative z values are read as 1 − Area]
D. Area to the right of a positive z value or to the left of a negative z value.
[Figure: area to the right of z1 = total area (Area = 1) − area to the left of z1]
E. Area between a given z value and 0.
[Figure: area between 0 and z1 = area to the left of z1 − area to the left of 0 (Area = 0.50)]

Example 1:
Scores on a standardized college entrance examination (CEE) are normally distributed with mean 510 and standard deviation 60. A selective university considers for admission only applicants with CEE scores over 560. Find the proportion of all individuals who took the CEE who meet the university's CEE requirement for consideration for admission.

Solution:
Given: μ = 510, σ = 60, and x = 560
Area = P(X > 560)
Step 1: Draw a normal curve and shade the desired area.
[Figure: normal curve with X-axis marks at 450, 510, and 570; the area to the right of 560 is shaded]

By-hand Approach! (Using Table 1)
Step 2: Convert the value of x to a z-score.
P(X > 560) = P(Z > z)
= P(Z > (560 − 510) / 60)
= P(Z > 0.83)
= 1 − P(Z ≤ 0.83)
= 1 − 0.7967
= 0.2033
Use the Complement Rule and determine one minus the area.
[Figure: standard normal curve with Z-axis marks from −2 to 2; Area = P(Z > 0.83) = 0.2033 is shaded to the right of 0.83]
The proportion of all CEE scores that exceed 560 is 0.2033 or 20.33%.

Technology Approach!
Step 2: Use Excel to determine the area under any normal curve.
Use “TRUE” for cumulative since we want the area under the normal curve.
The proportion of all CEE scores that exceed 560 is 0.2033 or 20.33%.

Example 2:
A pediatrician obtains the heights of her three-year-old female patients. The heights are approximately normally distributed, with mean 38.72 inches and standard deviation 3.17 inches. Determine the proportion of the three-year-old females that have a height less than 35 inches.

Solution:
Given: μ = 38.72, σ = 3.17, and x = 35
Area = P(X < 35)
Step 1: Draw a normal curve and shade the desired area.
[Figure: normal curve with X-axis marks at 35.55, 38.72, and 41.89; the area to the left of 35 is shaded]

By-hand Approach! (Using Table 1)
Step 2: Convert the value of x to a z-score.
P(X < 35) = P(Z < z)
= P(Z < (35 − 38.72) / 3.17)
= P(Z < −1.17)
= 1 − P(Z ≥ −1.17)
= 1 − 0.8790
= 0.1210
Use the Complement Rule and determine one minus the area.
[Figure: standard normal curve with Z-axis marks from −2 to 2; Area = P(Z < −1.17) = 0.1210 is shaded to the left of −1.17]
The proportion of the pediatrician’s three-year-old females who are less than 35 inches tall is 0.1210 or 12.10%.

Technology Approach!
Step 2: Use Excel to determine the area under any normal curve.
Use “TRUE” for cumulative since we want the area under the normal curve.
The proportion of the pediatrician’s three-year-old females who are less than 35 inches tall is 0.1210 or 12.10%.

Example 3:
A pediatrician obtains the heights of her three-year-old female patients. The heights are approximately normally distributed, with mean 38.72 inches and standard deviation 3.17 inches. Determine the probability that a randomly selected three-year-old girl is between 35 and 40 inches tall, inclusive.

Solution:
Given: μ = 38.72, σ = 3.17, and 35 ≤ X ≤ 40
Area = P(35 ≤ X ≤ 40)
Step 1: Draw a normal curve and shade the desired area.
[Figure: normal curve with X-axis marks at 35.55, 38.72, and 41.89; the area between 35 and 40 is shaded]

By-hand Approach! (Using Table 1)
Step 2: Convert the values of x to z-scores.
P(35 ≤ X ≤ 40) = P(z1 ≤ Z ≤ z2)
= P((35 − 38.72) / 3.17 ≤ Z ≤ (40 − 38.72) / 3.17)
= P(−1.17 ≤ Z ≤ 0.40)
= P(Z ≤ 0.40) − [1 − P(Z ≥ −1.17)]
= 0.6554 − [1 − 0.8790]
= 0.6554 − 0.1210
= 0.5344
[Figure: standard normal curve with Z-axis marks from −2 to 2; Area = P(−1.17 ≤ Z ≤ 0.40) is shaded between −1.17 and 0.40]
The probability that a randomly selected three-year-old female is between 35 and 40 inches tall is 0.5344.

Technology Approach!
Step 2: Use Excel to determine the area under any normal curve.
Use “TRUE” for cumulative since we want the area under the normal curve.

What is t-Distribution?
The t-distribution (also called Student’s t-distribution) is a family of distributions that look almost identical to the normal distribution curve, only a bit shorter and fatter.
✦ The t-distribution is used instead of the normal distribution when you have small samples.
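As a cross-check on Examples 1 to 3, the same normal-curve areas can be computed in code instead of with Table 1 or Excel. The sketch below is an illustration and not part of the original slides: it assumes Python 3.8 or later and the standard-library class statistics.NormalDist, and the helper names (z_score, area_left, area_right, area_between) are made up for this example. The round_z option mimics a table lookup by rounding z to two decimals first.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)  # the standard normal variable Z


def z_score(x, mu, sigma):
    """Standardize x: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def area_left(x, mu, sigma, round_z=False):
    """P(X < x) for a normal X with mean mu and standard deviation sigma.

    With round_z=True the z-score is rounded to two decimals first,
    which imitates reading a printed standard normal table.
    """
    z = z_score(x, mu, sigma)
    if round_z:
        z = round(z, 2)
    return STD_NORMAL.cdf(z)


def area_right(x, mu, sigma, round_z=False):
    """P(X > x), using the complement rule."""
    return 1.0 - area_left(x, mu, sigma, round_z)


def area_between(a, b, mu, sigma, round_z=False):
    """P(a <= X <= b) as the difference of two left-tail areas."""
    return area_left(b, mu, sigma, round_z) - area_left(a, mu, sigma, round_z)


if __name__ == "__main__":
    # Example 1: mu = 510, sigma = 60, P(X > 560)
    print(round(area_right(560, 510, 60, round_z=True), 4))   # table-style
    print(round(area_right(560, 510, 60), 4))                 # unrounded z

    # Example 2: mu = 38.72, sigma = 3.17, P(X < 35)
    print(round(area_left(35, 38.72, 3.17, round_z=True), 4))
    print(round(area_left(35, 38.72, 3.17), 4))

    # Example 3: P(35 <= X <= 40)
    print(round(area_between(35, 40, 38.72, 3.17, round_z=True), 4))
    print(round(area_between(35, 40, 38.72, 3.17), 4))
```

With round_z=True the results agree with the by-hand answers (0.2033, 0.1210, 0.5344). With the unrounded z-scores the areas come out slightly different (about 0.2023, 0.1203, and 0.5365), because Table 1 only lists z to two decimal places.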
Source: https://www.dummies.com/education/math/statistics/how-to-tell-a-z-distribution-from-a-t-distribution/

How to determine the critical value/t-scores for a t-Distribution?
Step 1: Calculate the degrees of freedom (df).
df = n − 1
Take note! The degrees of freedom depend on the number of parameters you are estimating. Thus, from a sample of size n you have n − 1 degrees of freedom if, as usually happens, you need to estimate the population mean through the sample mean.
Step 2: Look up the df on the left-hand side of the t-distribution table. Locate the column under your alpha level.

Example: If n = 15 and α = 0.05, then
df = 15 − 1 = 14
α/2 = 0.05/2 = 0.025
tα/2 → t0.025 = 2.145

Example: If n = 11 and confidence level = 99%, then
df = 11 − 1 = 10
α = 1 − CL = 1 − 0.99 = 0.01
α/2 = 0.01/2 = 0.005
tα/2 → t0.005 = 3.169

Take Note!
✦ The t-distribution is different for different degrees of freedom.
✦ The larger the sample size, the more the t-distribution looks like the normal distribution.
✦ The t-distribution is centered at 0 and is symmetric about 0.
✦ The area under the curve is 1. The area under the curve to the right of 0 equals the area under the curve to the left of 0, which equals 0.50.

Inferential Statistics
It is the process of generalizing information obtained from a sample to a population.
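Returning to the t critical values in the two examples (t0.025 = 2.145 for df = 14 and t0.005 = 3.169 for df = 10): they can be reproduced without a printed table. What follows is a rough sketch under stated assumptions, not the method used in the slides. It uses only the Python standard library, integrates the Student's t density numerically with Simpson's rule, and searches for the critical value by bisection. The function names t_pdf, t_upper_tail, and t_critical are invented for this illustration; in practice a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on the interval [0, t].

    Relies on symmetry about 0: the area to the right of 0 is 0.50.
    steps must be even for Simpson's rule.
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t


def t_critical(n, alpha, two_tailed=True):
    """Critical value for sample size n and significance level alpha.

    Step 1: df = n - 1. Step 2: find t whose upper-tail area is
    alpha/2 (two-tailed) or alpha (one-tailed), here by bisection
    rather than by a table lookup.
    """
    df = n - 1
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 100.0  # assumed search range; ample for these examples
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid  # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(round(t_critical(15, 0.05), 3))      # n = 15, alpha = 0.05
    print(round(t_critical(11, 1 - 0.99), 3))  # n = 11, 99% confidence
```

Both printed values should agree with the table values 2.145 and 3.169 to three decimal places.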
Two areas of inferential statistics:
(1) Estimation
(2) Hypothesis testing

ESTIMATION
Sample data are used to estimate the value of unknown parameters such as μ or σ.
Two Types of Estimation
1. Point estimation (single points that are used to infer parameters directly).
2. Interval estimation (also called a confidence interval for the parameter).

Confidence Interval
A confidence interval provides more information than a point estimate; it consists of an interval of numbers.
The level of confidence represents the expected proportion of intervals that will contain the parameter if a large number of different samples is obtained.
The level of confidence is denoted by (1 − α) × 100%.

Misconception about Interpreting Confidence Interval
“There is a 95% chance that the true population mean falls within the confidence interval.”
“The mean will fall within the confidence interval 95% of the time.”
A 95% confidence interval does not mean that there is a 95% probability that the interval contains the true value you are estimating.

What is HYPOTHESIS TESTING?
Hypothesis testing is a procedure, based on sample evidence and probability, used to test claims regarding a characteristic of one or more populations.

What is HYPOTHESIS?
A statement or claim regarding a characteristic of one or more populations.
A preconceived idea, assumed to be true, that has to be tested for its truth or falsity.

Take Note!
Because we use sample data to test hypotheses, we cannot state with 100% certainty that the statement is true; we can only determine whether the sample data support the statement or not.

Procedures for Testing Hypothesis
1. State the null and alternative hypotheses.
2. Set the level of significance