Statistics & Probability Reviewer (SIGMA) PDF

Summary

This document is a reviewer for Statistics and Probability, specifically designed for students in secondary school. It provides an overview of key concepts including discrete and continuous random variables. The document presents formulas, tables, and examples for understanding various statistical distributions and calculations.

Full Transcript


DEPARTMENT OF EDUCATION
ANGELO L. LOYOLA SENIOR HIGH SCHOOL
SOCIETY FOR INSPIRING GROWTH IN MATHEMATICS ACHIEVEMENT
STATISTICS & PROBABILITY
Comprehensive Reviewer

COMPILATION OF FORMULAS
Formulas covered: Discrete Random Variable Table, Frequency Distribution Table, z-score formula (sample and population), Degrees of Freedom, Test Statistic Formula, Sample Standard Deviation Formula.
List of variables (for z-score and t-test only): population mean (μ), sample mean (x̄), population standard deviation (σ), sample standard deviation (s), degrees of freedom (df), sample size (n), raw score (x), z-score (z).

LESSON 1: DISCRETE AND CONTINUOUS RANDOM VARIABLES
Random variable - a numerically valued function defined over a sample space.

Discrete Random Variable
Can take only a countable number of distinct values.
Distinct values are exact and can be represented by non-negative whole numbers.
Examples: Number of elephants in the room. Number of books in a library.

Continuous Random Variable
Can assume an infinite number of values in an interval between specific values.
Values can be represented not only by non-negative whole numbers but also by fractions and decimals.
Examples: Weight of a dumbbell. Height of a person.

Properties of a Discrete Random Variable
1. The probability of each value of a discrete random variable is between 0 and 1, inclusive: 0 ≤ P(x) ≤ 1
2. The sum of all probabilities is 1: Σ P(x) = 1

LESSON 2: DISCRETE RANDOM VARIABLE TABLE
The probabilities that a cashier at a grocery store processes 0, 1, 2, 3, 4, or 5 customers in a minute are 0.05, 0.15, 0.25, 0.3, 0.15, and 0.1, respectively. What is the average number of customers that the cashier processes per minute?
Step 1: Construct the probability distribution table and find the mean using the formula μ = Σ x · P(x). Here, μ = 0(0.05) + 1(0.15) + 2(0.25) + 3(0.3) + 4(0.15) + 5(0.1) = 2.65 customers per minute.
Step 2: Subtract the mean from each value of the random variable x.
Step 3: Square the results obtained in Step 2.
Step 4: Multiply the results obtained in Step 3 by the corresponding probability.
Step 5: Get the sum of the results obtained in Step 4.
The result is the value of the variance, σ² = Σ (x - μ)² · P(x).
Step 6: Get the square root of the variance to get the standard deviation.

LESSON 3: FREQUENCY DISTRIBUTION TABLE
Step 1: Construct the frequency distribution table and find the mean using the given formula.
Step 2: Subtract the mean from each score, then square the results.
Step 3: Multiply the results by the corresponding frequency. Add the results in the f(X - X̄)² column.
Step 4: Compute the variance using the given formula.
Step 5: To get the standard deviation, take the square root of the variance.

LESSON 4: COMPUTATION OF THE AREA UNDER THE CURVE
Normal Distribution - One of the most commonly observed continuous random variables has a bell-shaped probability distribution. It is known as a normal random variable, and its probability distribution is called the normal distribution or Gaussian distribution.

Properties of a Normal Distribution Curve
The normal distribution curve is bell-shaped.
The curve is symmetric about its mean.
The mean, median, and mode are equal.
The spread of the curve depends on the standard deviation of the distribution.
The tails of the curve flatten out indefinitely along the horizontal axis, always approaching the axis but never touching it.
The total area under the curve is equal to 1.

Standard Deviation - the measure of how spread out a normally distributed set of data is. It tells you how closely the values in a data set are gathered around the mean.

How the Normal Curve is Illustrated:
Sketch the curve with the mean µ = 25 and a standard deviation 𝜎 = 5.
Sketch the curve with the mean µ = 60 and a standard deviation 𝜎 = 10.

How to Use the Z-Table
For example, you want to find the area from the mean to the z-score of 1. First identify the z-score, which is 1, then find its corresponding area on the z-table. In this case, the area from the mean to the z-score of 1 is 0.3413.

REMEMBER
The area under a normal distribution curve is 1, and since the curve is symmetrical, half of the curve's area is 0.5.
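The z-table lookup described above can also be checked numerically. The sketch below is only an illustration, not part of the reviewer: the function names are made up for this example, and it relies on Python's standard math.erf to compute the area under the standard normal curve instead of reading a printed table.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_mean_to_z(z: float) -> float:
    """Area between the mean (z = 0) and z, as listed in a mean-to-z table."""
    return abs(standard_normal_cdf(z) - 0.5)


# Area from the mean to z = 1, matching the z-table value quoted above.
area = area_mean_to_z(1.0)
print(round(area, 4))  # 0.3413

# Half of the curve's area is 0.5, so the area to the right of z = 1
# is what remains of the right half.
print(round(0.5 - area, 4))  # 0.1587
```

Because the curve is symmetric, area_mean_to_z(-1.0) returns the same value as area_mean_to_z(1.0).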
Area Under the Normal Curve
A normal curve can be converted into a standard normal distribution by obtaining the z-value. A z-value is the signed distance between a selected value (x) and the mean (μ), divided by the standard deviation (𝜎). In terms of formula:
z = (x - μ) / 𝜎
Where:
z - z-value
x - the value of any particular observation or measurement
μ - the mean of the distribution
𝜎 - standard deviation of the distribution

Computation of the Area Under the Normal Curve
Example: Find the area to the right of 𝑧 = 2.23.
From the z-table, the area from the mean to 𝑧 = 2.23 is 0.4871, so the area to the right is 0.5 - 0.4871 = 0.0129.
Hence, the area to the right of 𝑧 = 2.23 is 0.0129.

LESSON 5: PERCENTILE
Percentile - a measure of relative standing. It is the percent of cases that are at or below a score. It tells you how a value compares to other values.

Steps in Finding the z-score with a Given Percentile
Find the 95th percentile of a normal curve.
Step 1: Express the percentile in decimal form. In this case, the 95th percentile is 0.9500. Since this is more than half of the area under the curve, subtract 0.5 (the area of the left half of the curve) from 0.9500, giving an area of 0.4500.
Step 2: Locate the z-value corresponding to the area 0.4500. If no z-value corresponds to exactly 0.4500, take the nearest area. If two areas are equally near, take the average of their two z-scores. Here, the nearest areas are 0.4495 and 0.4505.
Step 3: Find the z-values corresponding to 0.4495 and 0.4505. These are z = 1.64 and z = 1.65. Take the average of the two z-values. Thus, the 95th percentile is z = 1.645.

More Examples
Find the upper 2% of the normal curve.
Upper 2% = 0.0200, and 0.5 - 0.0200 = 0.4800.
The nearest area to 0.4800 is 0.4798, which is z = 2.05.

Find the 10th percentile of a normal curve.
10th percentile = 0.1000, and 0.5 - 0.1000 = 0.4000.
The nearest area to 0.4000 is 0.3997, which is z = 1.28.
The z-score is -1.28. It is negative because it is on the left side of the curve.

LESSON 6: SAMPLING DISTRIBUTION
Parameter - a measure that describes a population.
Usually denoted by Greek letters.
Statistic - a measure that describes a sample. Usually denoted by Roman letters.
Sampling Distribution of Sample Means - the probability distribution of the means of all possible samples of the same size n drawn from a population.
x̄ = Σx / n, where x̄ = sample mean and n = total number of observations.

Example: Consider the population consisting of the values 2, 3, and 5. List all the possible samples of size 2 that can be drawn from the population WITH replacement. Then, compute the mean x̄ for each sample. Lastly, find the mean of the sampling distribution of means and the mean of the population.
Samples and their means: (2, 2) → 2; (2, 3) → 2.5; (2, 5) → 3.5; (3, 2) → 2.5; (3, 3) → 3; (3, 5) → 4; (5, 2) → 3.5; (5, 3) → 4; (5, 5) → 5.
Mean of the sampling distribution of means = 30 / 9 ≈ 3.33. Mean of the population = (2 + 3 + 5) / 3 ≈ 3.33.
Therefore, the mean of the sampling distribution of means is equal to the mean of the population.

Example (Parameter): Consider the population consisting of the values (1, 3, 8). Find the population mean, variance, and standard deviation.
Mean: μ = (1 + 3 + 8) / 3 = 4
Variance: σ² = [(1 - 4)² + (3 - 4)² + (8 - 4)²] / 3 = 26 / 3 ≈ 8.67
Standard deviation: σ = √8.67 ≈ 2.94

Example (Sample): Find the mean, variance, and standard deviation of the sampling distribution of means with possible samples of size 2 WITHOUT replacement.
Samples and their means: (1, 3) → 2; (1, 8) → 4.5; (3, 8) → 5.5.
Mean: (2 + 4.5 + 5.5) / 3 = 4
Variance: [(2 - 4)² + (4.5 - 4)² + (5.5 - 4)²] / 3 = 6.5 / 3 ≈ 2.17
Standard deviation: √2.17 ≈ 1.47

The sampling distribution of the sample means taken with replacement from a population N with mean μ and variance σ² will approach a normal distribution as the sample size n becomes large, according to the Central Limit Theorem.
Note: Sampling WITH replacement and WITHOUT replacement give different sampling distributions.

LESSON 7: TYPES OF RANDOM SAMPLING
Probability sampling is a sampling method wherein individuals from the population are chosen randomly. There are 4 types of probability sampling, which are listed and defined in the table below.
Simple Random Sampling
Every member of the population has an equal chance of being selected. The sampling frame also includes the whole population.
Example: Selecting 50 students out of 500.

Systematic Sampling
Involves the whole population, but instead of randomly selecting, every member of the population is listed with a number and individuals are chosen at nᵗʰ intervals.
Example: Having all the students in a single line and selecting every 3rd person.

Stratified Sampling
Involves dividing the population into subpopulations or groups called strata based on a relevant characteristic. You then select people from each subgroup, either via random or systematic sampling.
Example: Selecting 10 students each from the 5 sections in a strand.

Cluster Sampling
Involves dividing the population into subgroups, similar to stratified sampling, but instead of randomly selecting individuals from each group, you randomly select entire subgroups altogether.
Example: Selecting 5 hospitals from the 20 hospitals in the state.

LESSON 8: POINT ESTIMATION AND PARAMETERS
Point estimation - the process of finding a single value, called a point estimate, from a random sample of the population to approximate a population parameter. The sample mean (x̄) is the point estimate of the population mean (μ), and the sample variance (s²) is the point estimate of the population variance (σ²).
Point estimator - the statistic used to obtain a point estimate. A good point estimate is one that is unbiased.

Identify if it is a good point estimate or not.
A teacher who wanted to get the average height of Grade 9 students in the school gathered the heights of the students in only one of the eight Grade 9 sections and got a mean height of 165 cm.
Answer: The point estimate is 165 cm, but it is not a good point estimate because the sampling used in obtaining the sample mean is biased.

Confidence interval - an interval estimate that defines a range of values expected to include the parameter being estimated, with a specific level of confidence.
Confidence level - the probability that the confidence interval contains the true population parameter.
α (alpha) - the probability that the confidence interval does not contain the true population parameter.
Critical value - the point that marks the boundary of the rejection region. This region does not contain the true population parameter.
Null hypothesis - claims that there is no effect in the population.
Margin of Error =

LESSON 9: T-DISTRIBUTION
Statistical analyses in some studies that cannot be done using the normal distribution can be done using the t-distribution.
It is used with small samples taken from a population that is approximately normal.
It uses the sample standard deviation, especially when the population variance is unknown.
It is used when n < 30.

PROPERTIES OF T-DISTRIBUTION
Bell-shaped and unimodal like the standard normal curve.
Symmetric about t = 0.
The variance is greater than 1.
It has more area in its tails than the standard normal curve.
Its shape depends on the sample size n.
As the sample size n becomes larger, the t-distribution gets closer and closer to the standard normal distribution.

FORMULA USED
t = (x̅ - μ) / (s / √n)
where:
x̅ = sample mean
μ = population mean
s = standard deviation of the sample
n = sample size

FINDING T CRITICAL VALUE AT T-TABLE
To find the value, the sample size n must first be converted to degrees of freedom df.
df = n - 1
where:
n = sample size

EXAMPLES:
1. A student researcher wants to determine whether the mean score in mathematics of the 25 students in Grade 8 Section Newton is significantly different from the school mean of 89. The mean and the standard deviation of the scores of the students in Section Newton are 95 and 15, respectively. Assume a 95% confidence level.
Solution:
Step 1. Find the degrees of freedom.
df = n - 1 = 25 - 1 = 24
Step 2. Find the critical value. Use the Table of t Critical Values. (Confidence level is 95%)
(1 - α) 100% = 95%
Look at 24 under the column headed df.
Move to the right along the row until reaching the column headed 0.05 area in two tails or 0.025 area in one tail. The critical value is 2.064.
1 - α = 0.95
α = 0.05
α/2 = 0.025
Step 3. Compute the test statistic t.
t = (x̅ - μ) / (s / √n) = (95 - 89) / (15 / √25) = 6 / 3 = 2
where:
x̅ = sample mean
μ = population mean
s = standard deviation of the sample
n = sample size
The computed value of t is equal to 2, which is smaller than the tabular value of 2.064.
Conclusion: The value of the test statistic, or computed t value, does not fall in the critical region. Therefore, the mean score of Grade 8 - Newton in Mathematics is not significantly different from the mean score of all the students taking up Grade 8 Mathematics.
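The three steps of this example can be retraced in a few lines of Python. This is a minimal sketch under stated assumptions: the function name t_statistic is made up for illustration, and the critical value 2.064 is typed in from the t-table lookup in Step 2 instead of being computed.

```python
import math


def t_statistic(sample_mean: float, population_mean: float,
                sample_sd: float, n: int) -> float:
    """One-sample t statistic: (x̄ - μ) / (s / √n)."""
    return (sample_mean - population_mean) / (sample_sd / math.sqrt(n))


n = 25                    # students in Section Newton
df = n - 1                # Step 1: degrees of freedom
t_critical = 2.064        # Step 2: t-table value for df = 24, 0.05 area in two tails
t = t_statistic(95, 89, 15, n)  # Step 3: test statistic

# The difference is significant only if |t| exceeds the critical value.
is_significant = abs(t) > t_critical
print(df, t, is_significant)  # 24 2.0 False
```

Since 2.0 is less than 2.064, the check returns False, which agrees with the conclusion that the computed t value does not fall in the critical region.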
