Statistics Chapter 4.1 PDF
Uploaded by RomanticCedar
Central Philippine University
Summary
This document provides an overview of chapter 4.1 in a statistics textbook. It covers measures of central tendency, such as the arithmetic mean, median, and mode, for numerical data. The document also discusses the concept of a weighted mean.
Full Transcript
CHAPTER 4.1 Statistics
Copyright © Cengage Learning. All rights reserved.

Section 4.1 Measures of Central Tendency

The Arithmetic Mean

Statistics involves the collection, organization, summarization, presentation, and interpretation of data. The branch of statistics that involves the collection, organization, summarization, and presentation of data is called descriptive statistics. The branch that interprets and draws conclusions from the data is called inferential statistics.

One of the most basic statistical concepts involves finding measures of central tendency of a set of numerical data. We will consider three types of averages, known as the arithmetic mean, the median, and the mode. Each of these averages is a measure of central tendency for the numerical data.

In statistics it is often necessary to find the sum of a set of numbers. The traditional symbol used to indicate a summation is the Greek letter sigma, Σ. Thus the notation Σx, called summation notation, denotes the sum of all the numbers in a given set. We can define the mean using summation notation: the mean of n numbers is the sum of the numbers divided by n.

\[ \text{mean} = \frac{\sum x}{n} \]

Statisticians often collect data from small portions of a large group in order to determine information about the group. In such situations the entire group under consideration is known as the population, and any subset of the population is called a sample. It is traditional to denote the mean of a sample by x̄ (which is read as "x bar") and to denote the mean of a population by the Greek letter μ (lowercase mu).

Example 1 – Find a Mean
Six friends in a biology class of 20 students received test grades of 92, 84, 65, 76, 88, and 90. Find the mean of these test scores.

Solution: The 6 friends are a sample of the population of 20 students. Use x̄ to represent the mean.

\[ \bar{x} = \frac{\sum x}{n} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5 \]

The mean of these test scores is 82.5.
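The calculation in Example 1 can be checked with a short Python sketch. The function name `arithmetic_mean` and the variable names are illustrative choices, not part of the original slides; the standard-library `statistics.mean` is used only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Test grades of the six friends from Example 1 (a sample of the class).
scores = [92, 84, 65, 76, 88, 90]

x_bar = arithmetic_mean(scores)   # mean computed directly from the definition
x_bar_check = mean(scores)        # same quantity from the standard library

print(x_bar, x_bar_check)
```

Both values should agree with the 82.5 found by hand in Example 1.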
The Median

Another type of average is the median. Essentially, the median is the middle number, or the mean of the two middle numbers, in a list of numbers that have been arranged in numerical order from smallest to largest or largest to smallest. Any list of numbers that is arranged in numerical order from smallest to largest or largest to smallest is a ranked list.

Example 2 – Find a Median
Find the median of the data in the following lists.
a. 4, 8, 1, 14, 9, 21, 12
b. 46, 23, 92, 89, 77, 108

Solution:
a. The list 4, 8, 1, 14, 9, 21, 12 contains 7 numbers. The median of a list with an odd number of entries is found by ranking the numbers and finding the middle number. Ranking the numbers from smallest to largest gives
1, 4, 8, 9, 12, 14, 21
The middle number is 9. Thus 9 is the median.
b. The list 46, 23, 92, 89, 77, 108 contains 6 numbers. The median of a list of data with an even number of entries is found by ranking the numbers and computing the mean of the two middle numbers. Ranking the numbers from smallest to largest gives
23, 46, 77, 89, 92, 108
The two middle numbers are 77 and 89. The mean of 77 and 89 is (77 + 89) / 2 = 83. Thus 83 is the median of the data.

The Mode

A third type of average is the mode. The mode of a list of numbers is the number that occurs most frequently.

Example 3 – Find a Mode
Find the mode of the data in the following lists.
a. 18, 15, 21, 16, 15, 14, 15, 21
b. 2, 5, 8, 9, 11, 4, 7, 23

Solution:
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than the other numbers. Thus 15 is the mode.
b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. Because no number occurs more often than the others, there is no mode.

The Weighted Mean

A value called the weighted mean is often used when some data values are more important than others. For instance, many professors determine a student's course grade from the student's tests and the final examination.
Consider the situation in which a professor counts the final examination score as 2 test scores. To find the weighted mean of the student's scores, the professor first assigns a weight to each score. In this case the professor could assign each of the test scores a weight of 1 and the final exam score a weight of 2. A student with test scores of 65, 70, and 75 and a final examination score of 90 has a weighted mean of

\[ \frac{(65 \times 1) + (70 \times 1) + (75 \times 1) + (90 \times 2)}{5} = \frac{390}{5} = 78 \]

In general, the weighted mean of numbers x with assigned weights w is the sum of the products x · w divided by the sum of the weights.

\[ \text{weighted mean} = \frac{\sum (x \cdot w)}{\sum w} \]

Example 4 – Find a Weighted Mean
Table 13.1 shows Dillon's fall semester course grades. Use the weighted mean formula to find Dillon's GPA for the fall semester.

Table 13.1 Dillon's Grades, Fall Semester

Solution: The B is worth 3 points, with a weight of 4; the A is worth 4 points, with a weight of 3; the D is worth 1 point, with a weight of 3; and the C is worth 2 points, with a weight of 4. The sum of all the weights is 4 + 3 + 3 + 4, or 14.

\[ \text{weighted mean} = \frac{(3 \times 4) + (4 \times 3) + (1 \times 3) + (2 \times 4)}{14} = \frac{35}{14} = 2.5 \]

Dillon's GPA for the fall semester is 2.5.

CHAPTER 4.2 Statistics

Section 4.2 Measures of Dispersion

The Range

To measure the spread or dispersion of data, we must introduce statistical values known as the range and the standard deviation.

Example 1 – Find a Range
Range (R): The difference between the maximum and minimum values in a data set, i.e., R = MAX – MIN.
Example: Pulse rates of 15 male residents of a certain village
54 58 58 60 62 65 66 71 74 75 77 78 80 82 85
R = 85 – 54 = 31

The Standard Deviation

The range of a set of data is easy to compute, but it can be deceiving. The range is a measure that depends only on the two most extreme values, and as such it is very sensitive. A measure of dispersion that is less sensitive to extreme values is the standard deviation.
The standard deviation of a set of numerical data makes use of the amount by which each individual data value deviates from the mean. These deviations, represented by x – x̄, are positive when the data value x is greater than the mean and are negative when x is less than the mean. The sum of all the deviations is 0 for all sets of data.

Because the sum of all the deviations of the data values from the mean is always 0, we cannot use the sum of the deviations as a measure of dispersion for a set of data. Instead, the standard deviation uses the sum of the squares of the deviations.

Most statistical applications involve a sample rather than a population, which is the complete set of data values. Sample standard deviations are designated by the lowercase letter s. In those cases in which we do work with a population, we designate the standard deviation of the population by σ, which is the lowercase Greek letter sigma.

We can use the following procedure to calculate the standard deviation of n numbers.
1. Determine the mean of the n numbers.
2. For each number, calculate the deviation (difference) between the number and the mean.
3. Calculate the square of each deviation in Step 2, and find the sum of these squared deviations.
4. If the data is a population, divide the sum in Step 3 by n. If the data is a sample, divide the sum by n – 1.
5. Find the square root of the quotient in Step 4.

Example 2 – Find the Standard Deviation
The following numbers were obtained by sampling a population.
2, 4, 7, 12, 15
Find the standard deviation of the sample.

Solution:
Step 1: The mean of the numbers is

\[ \bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8 \]

Step 2: For each number, calculate the deviation between the number and the mean.

x     x – x̄
2     2 – 8 = –6
4     4 – 8 = –4
7     7 – 8 = –1
12    12 – 8 = 4
15    15 – 8 = 7

Step 3: Calculate the square of each of the deviations in Step 2, and find the sum of these squared deviations.

x – x̄    (x – x̄)²
–6        36
–4        16
–1        1
4         16
7         49
Sum       118

Step 4: Because we have a sample of n = 5 values, divide the sum 118 by n – 1, which is 4.

\[ \frac{118}{4} = 29.5 \]

Step 5: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the standard deviation is s = 5.43.

The Variance

A statistic known as the variance is also used as a measure of dispersion.
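As a rough sketch, the five steps used in Example 2 can be written out in Python; the quotient from Step 4 is the sample variance, and its square root is the sample standard deviation. The function names `sample_variance` and `sample_std_dev` are illustrative and do not come from the slides; `statistics.stdev` and `statistics.variance` from the standard library serve only as cross-checks.

```python
from math import sqrt
from statistics import stdev, variance


def sample_variance(values):
    """Steps 1-4: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n                          # Step 1: the mean
    deviations = [x - mean for x in values]         # Step 2: deviations
    squared_sum = sum(d ** 2 for d in deviations)   # Step 3: sum of squares
    return squared_sum / (n - 1)                    # Step 4: sample divisor


def sample_std_dev(values):
    """Step 5: square root of the quotient from Step 4."""
    return sqrt(sample_variance(values))


sample = [2, 4, 7, 12, 15]   # the sample from Example 2
s_squared = sample_variance(sample)
s = sample_std_dev(sample)

print(s_squared, round(s, 2))
print(variance(sample), stdev(sample))  # standard-library cross-check
```

For a population rather than a sample, the divisor in Step 4 would be n instead of n - 1.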
The variance for a given set of data is the square of the standard deviation of the data. The following chart shows the mathematical notations that are used to denote standard deviations and variances.

σ     standard deviation of a population
σ²    variance of a population
s     standard deviation of a sample
s²    variance of a sample

Example 5 – Find the Variance
Find the variance for the sample given earlier in Example 2.

Solution: The standard deviation which we found in Example 2 is s = √29.5. The variance is the square of the standard deviation. Thus the variance is

\[ s^2 = \left( \sqrt{29.5} \right)^2 = 29.5 \]

Standard Deviation and Variance

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \qquad s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

Comparing Standard Deviation

Figure: three data sets plotted on an axis from 11 to 21, each with mean 15.5. Data A: s = 3.338. Data B: s = 0.9258. Data C: s = 4.57.

Example: Team B, heights of five marathon players in inches
60", 60", 65", 70", 70"
Mean = 65", SD = 5.0"

Example: Team A, heights of five marathon players in inches
65", 65", 65", 65", 65"
Mean = 65", s = 0

Standard Deviation and Variance

Standard Deviation    Variance
1                     1
6                     ?
0                     0
?                     49
5                     25
8                     ?
10                    100

NORMAL DISTRIBUTION AND AREAS UNDER THE NORMAL CURVE

OBJECTIVES
Interpret graphs of normal probability distributions
Find areas under the standard normal curve

PROPERTIES OF NORMAL DISTRIBUTIONS

Normal distribution: A continuous probability distribution for a random variable, x. The most important continuous probability distribution in statistics. The graph of a normal distribution is called the normal curve.

1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and is symmetric about the mean.
3. The total area under the normal curve is equal to 1.
4. The normal curve approaches, but never touches, the x-axis as it extends farther and farther away from the mean.
5.
Between μ – σ and μ + σ (in the center of the curve), the graph curves downward. The graph curves upward to the left of μ – σ and to the right of μ + σ. The points at which the curve changes from curving upward to curving downward are called the inflection points.

Figure: normal curve with the x-axis marked from μ – 3σ to μ + 3σ in steps of σ.

MEANS AND STANDARD DEVIATIONS

A normal distribution can have any mean and any positive standard deviation. The mean gives the location of the line of symmetry. The standard deviation describes the spread of the data.

Figure: three normal curves, with μ = 3.5 and σ = 1.5, μ = 3.5 and σ = 0.7, and μ = 1.5 and σ = 0.7.

EXAMPLE: UNDERSTANDING MEAN AND STANDARD DEVIATION

1. Which normal curve has the greater mean?
Solution: Curve A has the greater mean. (The line of symmetry of curve A occurs at x = 15. The line of symmetry of curve B occurs at x = 12.)

2. Which curve has the greater standard deviation?
Solution: Curve B has the greater standard deviation. (Curve B is more spread out than curve A.)

EXAMPLE: INTERPRETING GRAPHS

The scaled test scores for the New York State Grade 8 Mathematics Test are normally distributed. The normal curve shown below represents this distribution. What is the mean test score? Estimate the standard deviation.

Solution: Because a normal curve is symmetric about the mean, you can estimate that μ ≈ 675. Because the inflection points are one standard deviation from the mean, you can estimate that σ ≈ 35.

THE STANDARD NORMAL DISTRIBUTION

Standard normal distribution: A normal distribution with a mean of 0 and a standard deviation of 1.
Figure: standard normal curve over the z-axis from –3 to 3, with total area = 1.

Any x-value can be transformed into a z-score by using the formula

\[ z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma} \]

If each data value of a normally distributed random variable x is transformed into a z-score, the result will be the standard normal distribution.

Figure: a normal distribution with mean μ and standard deviation σ on the x-axis, transformed into the standard normal distribution with mean 0 and standard deviation 1 on the z-axis.

Use the Standard Normal Table to find the cumulative area under the standard normal curve.

PROPERTIES OF THE STANDARD NORMAL DISTRIBUTION

1. The cumulative area is close to 0 for z-scores close to z = –3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.

EXAMPLE: USING THE STANDARD NORMAL TABLE

Find the cumulative area that corresponds to a z-score of 1.15.
Solution: Find 1.1 in the left-hand column. Move across the row to the column under 0.05. The area to the left of z = 1.15 is 0.8749.

Find the cumulative area that corresponds to a z-score of –0.24.
Solution: Find –0.2 in the left-hand column. Move across the row to the column under 0.04. The area to the left of z = –0.24 is 0.4052.

FINDING AREAS UNDER THE STANDARD NORMAL CURVE

1. Sketch the standard normal curve and shade the appropriate area under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that corresponds to z in the Standard Normal Table.
   1. Use the table to find the area for the z-score.
   2. The area to the left of z = 1.23 is 0.8907.
b.
To find the area to the right of z, use the Standard Normal Table to find the area that corresponds to z. Then subtract the area from 1.
   1. Use the table to find the area for the z-score.
   2. The area to the left of z = 1.23 is 0.8907.
   3. Subtract to find the area to the right of z = 1.23: 1 – 0.8907 = 0.1093.
c. To find the area between two z-scores, find the area corresponding to each z-score in the Standard Normal Table. Then subtract the smaller area from the larger area.
   1. Use the table to find the area for the z-scores.
   2. The area to the left of z = 1.23 is 0.8907.
   3. The area to the left of z = –0.75 is 0.2266.
   4. Subtract to find the area of the region between the two z-scores: 0.8907 – 0.2266 = 0.6641.

EXAMPLE: FINDING AREA UNDER THE STANDARD NORMAL CURVE

Find the area under the standard normal curve to the left of z = –0.99.
Solution: From the Standard Normal Table, the area is equal to 0.1611.

Find the area under the standard normal curve to the right of z = 1.06.
Solution: From the Standard Normal Table, the area to the left of z = 1.06 is 0.8554, so the area to the right is 1 – 0.8554 = 0.1446.

Find the area under the standard normal curve between z = –1.5 and z = 1.25.
Solution: From the Standard Normal Table, the area to the left of z = 1.25 is 0.8944 and the area to the left of z = –1.50 is 0.0668, so the area between them is 0.8944 – 0.0668 = 0.8276.
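The three cases for finding areas (left of z, right of z, and between two z-scores) can also be sketched in Python without a printed table. This is a minimal illustration, assuming the cumulative area is computed from the error function `math.erf` rather than looked up; the function names `z_score`, `area_left`, `area_right`, and `area_between` are illustrative and not from the slides.

```python
from math import erf, sqrt


def z_score(x, mean, std_dev):
    """Transform an x-value into a z-score: (value - mean) / standard deviation."""
    return (x - mean) / std_dev


def area_left(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def area_right(z):
    """Area to the right of z: subtract the cumulative area from 1."""
    return 1.0 - area_left(z)


def area_between(z_low, z_high):
    """Area between two z-scores: larger cumulative area minus the smaller."""
    return area_left(z_high) - area_left(z_low)


# The three worked examples above.
print(round(area_left(-0.99), 4))
print(round(area_right(1.06), 4))
print(round(area_between(-1.5, 1.25), 4))
```

Because the table rounds each cumulative area to four decimal places before subtracting, a result computed this way may differ from the table-based answer in the last decimal place.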