Advanced Statistics and Probabilities PDF
Helwan National University
Dr. M. Elgendy, Dr. S. Shatta
Summary
This document provides a lecture on advanced statistics and probabilities. It covers z-scores and their application to normal distributions, including how to calculate z-scores and how to interpret them. The notes also include examples and exercises.
Full Transcript
Advanced Statistics and Probabilities
Dr. M. Elgendy, Dr. S. Shatta
04 z-scores and the Standard Normal Distribution

Normal Distributions
The normal distribution is the most widely used distribution in statistics. It is also known as the "bell curve" or "Gaussian distribution," named after Carl Friedrich Gauss. There are many normal distributions; they can differ in their means and in their standard deviations. Figure 1 shows three normal distributions:
▪ The green (left-most) distribution has a mean (μ) = -3 and a standard deviation (σ) = 0.5.
▪ The red (middle) distribution has a mean (μ) = 0 and a standard deviation (σ) = 1.
▪ The black (right-most) distribution has a mean (μ) = 2 and a standard deviation (σ) = 3.
These, like all other normal distributions, are symmetric, with relatively more values at the center of the distribution and relatively few in the tails.

Figure 1. Normal distributions differing in mean and standard deviation.

✓ Overview of z-Scores
✓ Probability & Normal Distribution
✓ Distribution of Sample Means

OVERVIEW OF Z-SCORES
❑ Student A earned a score of 76 on an exam
✓ How many points were possible?
▪ 76 out of 80? Not bad!
▪ 76 out of 100? Not so great!
✓ How does a score of 76 compare to other students?
▪ Is 76 the lowest score in the class?
▪ Did anyone earn a score higher than 76?
z-Score
Standardized value that specifies the exact location of an X value within a distribution by describing its distance from the mean in standard deviation units.

Standard Deviation Unit
Standardized value; 1 SD unit = value of 1 SD before standardization.

Because z-scores are in units of standard deviations, 68% of the scores in a normal distribution fall between z = -1.0 and z = 1.0. We call this 68% (or any percentage we have based on our z-scores) the proportion of the area under the curve.

SCORE LOCATION
z-Scores describe the exact location of a score within a distribution
▪ Sign: Whether the score is above (+) or below (-) the mean
▪ Number: Distance between the score and the mean in standard deviation units
Example: z = +1.00
▪ Sign: positive (+), so the score is above the mean
▪ Number: 1.00 SD units from the mean
Example: z = -.50
▪ Sign: negative (-), so the score is below the mean
▪ Number: .50 SD units from the mean

How to calculate the standardized scores that make up a standard normal distribution
z-scores: A z-score is a standardized version of a raw score (X) that gives information about the relative location of that score within its distribution. The formula for converting a raw score into a z-score is
z = (X - μ) / σ
for values from a population and
z = (X - M) / s
for values from a sample.

Example: Population A has μ = 5 and σ = 1. Find the z-score for X = 3.
z = (3 - 5) / 1 = -2 / 1 = -2
Example: Sample B has M = 5 and s = 1. Find the z-score for X = 5.5.
z = (5.5 - 5) / 1 = .5 / 1 = +.5
Example: Suppose we earn a score of 68 on an exam where the average score is 54 and the standard deviation is 8. We simply convert our test score into a z-score:
z = (68 - 54) / 8 = 14 / 8 = 1.75
Our score is 1.75 standard deviations above the average, beyond our rough cutoff between close to and far from the mean. Suddenly our 68 is looking pretty good!

Transform z-Score to X value (raw score)
4 pieces of information:
▪ X = raw score
▪ μ = population/sample mean
▪ z = z-score
▪ σ = population/sample standard deviation

Setting the scale of a distribution
Z-scores can be converted to any “scale”.
The term scale means how far apart the scores are (their spread) and where they are located (their central tendency). Rescaling can be very useful if we don’t want to work with negative numbers or if we have a specific range we would like to present. The formulas for transforming z to X are
X = μ + zσ
for a population and
X = M + zs
for a sample.

Example: Person A from Sample Y has a z-score of -.75, with μ = 10 and σ = 2. Find X for z = -.75.
O X = 10 + (-.75)(2) = 8.5

RELATIONSHIPS
z-Scores establish relationships between the score, the mean, and the standard deviation
Example
O Population: μ = 65 and X = 59 corresponds to z = -2.00. Find the standard deviation:
O σ = (X - μ) / z = (59 - 65) / (-2) = 3
Example
O Population: σ = 4 and X = 33 corresponds to z = +1.50. Find the mean:
O σ * z = (4 * 1.5) = 6
O μ = X - σ * z = 33 - 6 = 27

Finally, z-scores are incredibly useful if we need to combine information from different measures that are on different scales.
Example: Suppose a set of employees takes a series of tests on things like job knowledge, personality, and leadership. We may want to combine these into a single score we can use to rate employees for development or promotion, but look what happens when we take the average of raw scores from different scales, as shown in Table 1. Because the job knowledge scores were so large and so similar to one another, they overpowered the other scores and removed almost all variability in the average. However, if we standardize these scores into z-scores, our averages retain more variability and it is easier to assess differences between employees, as shown in Table 2.
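The comparison between raw-score averages and z-score averages can be sketched in a few lines. This is only an illustration, not the lecture's own calculation: it uses the three scores per test that appear in the hint's sums, and it assumes that the first, second, and third value of each sum belong to the same three employees and that the standard deviations are sample standard deviations (n - 1 denominator). The names raw_scores and standardize are illustrative.

```python
from statistics import mean, stdev

# Scores per test, taken from the sums in the hint. Treating position i in each
# list as employee i + 1 is an assumption made for this sketch.
raw_scores = {
    "job_knowledge": [98, 96, 97],   # 0-100 scale
    "personality": [4.2, 3.1, 2.9],  # 1-5 scale
    "leadership": [1.1, 4.5, 3.6],   # 1-5 scale
}

def standardize(values):
    """Convert each value to z = (X - M) / s using the sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in values]

z_scores = {test: standardize(vals) for test, vals in raw_scores.items()}

n_employees = 3
# Average across the three tests for each employee, before and after standardizing.
raw_averages = [mean(vals[i] for vals in raw_scores.values()) for i in range(n_employees)]
z_averages = [mean(zs[i] for zs in z_scores.values()) for i in range(n_employees)]

for i in range(n_employees):
    print(f"Employee {i + 1}: raw average = {raw_averages[i]:.2f}, "
          f"z average = {z_averages[i]:.2f}")
```

Under these assumptions the raw averages differ by only about a tenth of a point, while the z-score averages are spread over roughly half a standard deviation, which is the effect the example describes.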
Hint: Calculate means and standard deviations
▪ Job Knowledge (0–100): Mean = (98 + 96 + 97) / 3 = 97; Standard Deviation (SD) = 1
▪ Personality (1–5): Mean = (4.2 + 3.1 + 2.9) / 3 = 3.4; Standard Deviation (SD) = 0.7
▪ Leadership (1–5): Mean = (1.1 + 4.5 + 3.6) / 3 = 3.0667 (approximately); Standard Deviation (SD) ≈ 1.76
Hint: Calculate z-scores, then calculate average z-scores

Notice that the formulas for transforming z to X are just simple rearrangements of the original formulas for calculating z from raw scores.

Example: Suppose we create a new measure of intelligence, and initial calibration finds that our scores have a mean of 40 and a standard deviation of 7. Three people who have scores of 52, 43, and 34 want to know how well they did on the measure. We can convert their raw scores into z-scores:
z = (52 - 40) / 7 = 1.71
z = (43 - 40) / 7 = 0.43
z = (34 - 40) / 7 = -0.86

The Standard Normal Distribution (also known as the Unit Normal Distribution) is the normal distribution that has a mean = 0 and a standard deviation = 1 (i.e., the red distribution in Figure 1).

Seven features of normal distributions are:
1. Normal distributions are symmetric around their mean.
2. The mean, median, and mode of a normal distribution are equal.
3. The area under the normal curve is equal to 1.0.
4. Normal distributions are denser in the center and less dense in the tails.
5. Normal distributions are defined by two parameters, the mean (μ) and the standard deviation (σ).
6. Approximately 68% of the area of a normal distribution is within one standard deviation of the mean.
7. Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.
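Features 6 and 7 can be checked numerically. The sketch below is one possible way to do so, not part of the lecture: it writes the standard normal cumulative proportion in terms of Python's math.erf, and the helper names standard_normal_cdf and area_within are illustrative.

```python
import math

def standard_normal_cdf(z):
    """Proportion of the standard normal distribution that lies below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_within(k):
    """Area between z = -k and z = +k, i.e. within k SDs of the mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)

within_one = area_within(1.0)          # feature 6: about 68%
within_two = area_within(2.0)          # feature 7: about 95%
below_mean = standard_normal_cdf(0.0)  # symmetry: half the area lies below the mean

print(f"Within 1 SD of the mean: {within_one:.4f}")
print(f"Within 2 SD of the mean: {within_two:.4f}")
print(f"Below the mean:          {below_mean:.4f}")
```

Because any normal distribution can be converted to z-scores, the same proportions apply whatever the values of μ and σ.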
DISTRIBUTION TRANSFORMATIONS
How-To
▪ Transform all X values into z-scores to obtain a z-score distribution
Advantage
▪ Possible to compare scores or individuals from different distributions
▪ Results are more generalizable
O All z-score distributions have the same mean (0) and standard deviation (1)

DISTRIBUTION TRANSFORMATIONS
▪ z-Score distributions include positive and negative numbers
▪ Standardize to a distribution with predetermined μ and σ to avoid negative values
Procedure
▪ Transform raw scores to z-scores
▪ Transform z-scores into new X values with the desired μ and σ values: X_new = μ_new + z σ_new
Example
Population distribution with μ = 57 and σ = 14. Transform the distribution to have μ = 50 and σ = 10.
Solution: Calculate new X values for raw scores of X = 64 and X = 43
O Transform raw scores to z-scores: z = (X - μ) / σ
  z = (64 - 57) / 14 = 7 / 14 = .50
  z = (43 - 57) / 14 = -14 / 14 = -1.00
O In the new distribution, z = .50 corresponds to a score 5 points above the mean (X = 55):
  X_new = μ_new + z σ_new = 50 + (.50)(10) = 55
O In the new distribution, z = -1.00 corresponds to a score 10 points below the mean (X = 40):
  X_new = μ_new + z σ_new = 50 + (-1.00)(10) = 40

PROBABILITY & NORMAL DISTRIBUTION
Example: p(X > 80) = ?
O Translate into a proportion question: Out of all possible adult heights, what proportion consists of values greater than 80”?
O The set of “all possible adult heights” is the population distribution
O We are interested in all heights greater than 80”, so we shade in the area of the graph to the right of where 80” falls on the distribution
Example (continued): p(X > 80) = ?
O Transform X = 80 to a z-score, using μ = 68 and σ = 6:
  z = (X - μ) / σ = (80 - 68) / 6 = 12 / 6 = 2.00
O Express the proportion we are trying to find in terms of the z-score: p(z > 2.00) = ?
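The proportion asked for here can also be computed directly rather than read from a table. The following is a minimal sketch under the example's values (μ = 68, σ = 6, X = 80); the function names z_score and upper_tail are illustrative, and the tail area is obtained from Python's math.erfc.

```python
import math

def z_score(x, mu, sigma):
    """Standardize a raw score: z = (X - mu) / sigma."""
    return (x - mu) / sigma

def upper_tail(z):
    """p(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = z_score(80, mu=68, sigma=6)  # heights example: X = 80 inches
p_taller = upper_tail(z)         # proportion of heights above 80 inches

print(f"z = {z:.2f}")
print(f"p(X > 80) = p(z > {z:.2f}) = {p_taller:.4f}")
```

The result is a little under 0.023, that is, roughly 2.3% of the distribution lies in the tail above z = 2.00.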
By Figure 6.4, p(X > 80) = p(z > +2.00) = 2.28%

UNIT NORMAL TABLE
▪ Body = Larger part of the distribution
▪ Tail = Smaller part of the distribution
▪ The distribution is symmetrical: proportions to the right of the mean are symmetrical to (read as “the same as”) those on the left side of the mean
▪ Proportions are always positive, even when z-scores are negative
▪ Identify proportions that correspond to z-scores, or z-scores that correspond to proportions

Unit Normal Table
▪ Shows relationships between z-score locations and proportions in a normal distribution
▪ If the proportion is known, use the table to identify the z-score
▪ Probability = Proportion

Example: What proportion of a normal distribution corresponds to z-scores < z = 1.00? What is the probability of selecting a z-score less than z = 1.00?
Answer: p(z < 1.00) = .8413 (or 84.13%)

Example: What proportion of a normal distribution corresponds to z-scores > z = -1.00? What is the probability of selecting a z-score greater than z = -1.00?
Answer: p(z > -1.00) = .8413 (or 84.13%)

Example: What proportion of a normal distribution corresponds to z-scores > z = 1.00? What is the probability of selecting a z-score value greater than z = 1.00?
Answer: p(z > 1.00) = .1587 (or 15.87%)

Example: What proportion of a normal distribution corresponds to z-scores < z = -1.00? What is the probability of selecting a z-score value less than z = -1.00?
Answer: p(z < -1.00) = .1587 (or 15.87%)

Example: What proportion of a normal distribution corresponds to positive z-scores < z = 1.00? What is the probability of selecting a positive z-score less than z = 1.00?
Answer: p(0 < z < 1.00) = .3413 (or 34.13%)

Example: What proportion of a normal distribution corresponds to negative z-scores > z = -1.00? What is the probability of selecting a negative z-score greater than z = -1.00?
Answer: p(-1.00 < z < 0) = .3413 (or 34.13%)

Example: What proportion of a normal distribution corresponds to z-scores within 1 standard deviation of the mean?
What is the probability of selecting a z-score greater than z = -1.00 and less than z = 1.00?
Answer: .3413 + .3413 = .6826, so p(-1.00 < z < 1.00) = .6826 (or 68.26%)

Example: What z-score separates the bottom 80% from the remainder of the distribution? What z-score separates the top 20% from the remainder of the distribution?
Answer: A proportion of 80% (or .8000) in the body corresponds to z = .84

Example: What z-score separates the middle 60% from the remainder of the distribution?
Answer: The middle 60% leaves 20% in each tail, so the boundaries are z = -.84 and z = +.84

Example: Assume a normal distribution with μ = 100 and σ = 15. What is the probability of randomly selecting an individual with an IQ score less than 130? p(X < 130) = ?
▪ Transform X = 130 to a z-score: z = (130 - 100) / 15 = 2.00
▪ Use the Unit Normal Table to convert the z-score to the corresponding percentage/proportion: p(z < 2.00) = .9772 (or 97.72%)

Example: Assume a normal distribution with μ = 58 and σ = 10 for the average speed of cars on a section of interstate highway. What proportion of cars traveled between 55 and 65 miles per hour? p(55 < X < 65) = ?
▪ Transform both scores: z = (55 - 58) / 10 = -.30 and z = (65 - 58) / 10 = .70
▪ Use the Unit Normal Table to convert the z-scores to corresponding proportions: p(-.30 < z < .70) = .1179 + .2580 = .3759

Example: Assume a normal distribution with μ = 58 and σ = 10 for the average speed of cars on a section of interstate highway. What proportion of cars traveled between 65 and 75 miles per hour? p(65 < X < 75) = ?
▪ Transform both scores: z = (65 - 58) / 10 = .70 and z = (75 - 58) / 10 = 1.70
▪ Use the Unit Normal Table to convert the z-scores to corresponding proportions: p(.70 < z < 1.70) = .4554 - .2580 = .1974

MCQ
Which are other names for the normal distribution? Select all that apply.
a) typical curve
b) Gaussian curve
c) Regular distribution
d) Galileo curve
e) Bell-shaped curve
f) Laplace's distribution

Select all of the statements that are true about normal distributions.
a) They are symmetric around their mean.
b) The mean, median, and mode are equal.
c) They are defined by their mean and skew.
d) The area under the normal curve is equal to 1.0.
e) They have high density in their tails.
f) They are discrete distributions.

A standard normal distribution has:
a) a mean of 1 and a standard deviation of 1
b) a mean of 0 and a standard deviation of 1
c) a mean larger than its standard deviation
d) all scores within one standard deviation of the mean

A z-score value = 1.0 tells us that this z-score is:
a) standard deviation=1, above the mean.
b) standard deviation=1, below the mean.
c) mean=1, above the standard deviation.
d) mean=1, below the standard deviation.

A number 1.5 standard deviations below the mean has a z score of:
a) 1.5
b) -1.5
c) 3
d) more information is needed
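The table-lookup examples on IQ scores, highway speeds, and percentile boundaries can be cross-checked in software. The sketch below is an assumption-laden illustration rather than the lecture's method: it relies on statistics.NormalDist from the Python standard library (Python 3.8 or later) in place of the Unit Normal Table, and the helper name proportion_between is illustrative.

```python
from statistics import NormalDist

def proportion_between(low, high, mu, sigma):
    """p(low < X < high) for a normal distribution with the given mu and sigma."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(high) - dist.cdf(low)

# IQ example: mu = 100, sigma = 15, p(X < 130)
p_iq_below_130 = NormalDist(mu=100, sigma=15).cdf(130)

# Highway speed examples: mu = 58, sigma = 10
p_55_to_65 = proportion_between(55, 65, mu=58, sigma=10)
p_65_to_75 = proportion_between(65, 75, mu=58, sigma=10)

# Going from a proportion back to a z-score on the standard normal distribution
standard = NormalDist()                  # mean 0, standard deviation 1
z_bottom_80 = standard.inv_cdf(0.80)     # separates the bottom 80% from the top 20%
z_middle_60 = (standard.inv_cdf(0.20), standard.inv_cdf(0.80))  # 20% in each tail

print(f"p(X < 130)      = {p_iq_below_130:.4f}")
print(f"p(55 < X < 65)  = {p_55_to_65:.4f}")
print(f"p(65 < X < 75)  = {p_65_to_75:.4f}")
print(f"z for bottom 80% = {z_bottom_80:.2f}")
print(f"z for middle 60% = {z_middle_60[0]:.2f} to {z_middle_60[1]:.2f}")
```

Software values may differ from table values in the last decimal place, because the table rounds z to two decimals before the lookup.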