Lecture 4 - Statistics & Probability (1) (2) PDF
Document Details
Uploaded by ProudOtter
Summary
This lecture covers measures of center and variability in statistics, including mean, median, mode, range, variance, and standard deviation. It provides examples and calculations.
Full Transcript
Lecture (4)
Chapter (2) “Describing Data with Numerical Measures”

Describing data:
- Measures of center: Mean, Median, Mode
- Measures of variability: Range, Variance, Standard deviation

2.3 Measures of variability (measures of dispersion)

1) Range
2) Variance
3) Standard deviation

Range:
\[ \text{Range} = \max - \min \]

Example: Given the following data: 2, 5, 10, 1, 4 and 3. Find the range.
\[ \text{Range} = \max - \min = 10 - 1 = 9 \]

Variance and Standard Deviation:

Population variance ($\sigma^2$): Let $x_1, x_2, \dots, x_N$ be the observations in a population. Then the population variance is
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}, \qquad \text{where } \mu = \frac{\sum_{i=1}^{N} x_i}{N}. \]
- The population standard deviation is $\sigma = \sqrt{\sigma^2}$.

Sample variance ($S^2$): Let $x_1, x_2, \dots, x_n$ be the observations in a sample. Then the sample variance is
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \qquad \text{where } \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \]
or, equivalently,
\[ S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}. \]
Note that $\sum_{i=1}^{n} x_i^2 \neq \left( \sum_{i=1}^{n} x_i \right)^2$.
- The sample standard deviation is $S = \sqrt{S^2}$.

Example: Given the following data: 1, 14, 15, 9, 4, 3.

1) Calculate the sample variance and standard deviation.
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 14 + 15 + 9 + 4 + 3}{6} = \frac{46}{6} = 7.67 \]
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(1 - 7.67)^2 + \dots + (3 - 7.67)^2}{5} = 35.07 \]
\[ S = \sqrt{S^2} = \sqrt{35.07} = 5.92 \]
The mean is shown rounded to 7.67; the value 35.07 is obtained with the unrounded mean $46/6$.

2) Calculate the sample standard deviation using the second formula.
\[ \sum_{i=1}^{n} x_i^2 = 1^2 + 14^2 + 15^2 + 9^2 + 4^2 + 3^2 = 528 \]
\[ S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = \frac{528 - 6 \times (7.67)^2}{5} = 35.07 \]
\[ S = \sqrt{S^2} = \sqrt{35.07} = 5.92 \]

Example: The deviations of five observations around their mean are −2, 2, 4, 3, −7. Calculate the sample variance.
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(-2)^2 + 2^2 + 4^2 + 3^2 + (-7)^2}{5 - 1} = \frac{82}{4} = 20.5 \]

Example: Given the following data: 9, 9, 9, 9, 9, 9, 9, 9, 9, 9. Calculate the sample variance.
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{9 + \dots + 9}{10} = \frac{90}{10} = 9 \]
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(9 - 9)^2 + \dots + (9 - 9)^2}{10 - 1} = 0 \]
Note that if all the observations have the same value, then the sample variance ($S^2$) is always zero.
The larger $S^2$, the greater the variability: a larger variance means the observations deviate more from the mean.

Example: Given the following table:

              Sample 1    Sample 2
  $\bar{x}$   18          18
  $S^2$       9           25

Which sample is the best? Sample 1. When the means are equal, the sample with the smaller variance is the better one.

Example: Given the following information: $\sum_{i=1}^{50} x_i = 300$ and the sample variance $S^2 = 3$. Calculate $\sum_{i=1}^{n} x_i^2$.
\[ \bar{x} = \frac{\sum_{i=1}^{50} x_i}{n} = \frac{300}{50} = 6 \]
\[ S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} \;\Rightarrow\; 3 = \frac{\sum_{i=1}^{n} x_i^2 - 50 \times (6)^2}{50 - 1} \]
\[ \sum_{i=1}^{n} x_i^2 = 3 \times 49 + 1800 = 1947 \]

Homework:

Q1: You are given n = 10 measurements: 3, 5, 4, 6, 10, 5, 6, 9, 2, 8.
a. Calculate the sample mean.
b. Find the median.
c. Find the mode.

Q2: You are given n = 8 measurements: 4, 1, 3, 1, 3, 1, 2, 2.
a. Find the range.
b. Calculate $\bar{x}$.
c. Calculate $S^2$ and $S$.

Note: The standard deviation shares all the properties of the variance except the unit: the unit of the standard deviation is the same as the unit in which the data are measured. For this reason $S$ is preferred to $S^2$.

2.4 On the Practical Significance of the Standard Deviation

Tchebysheff’s Theorem (gives the minimum proportion of the data that lies within an interval)

Given a number $k$ greater than or equal to 1 and a set of $n$ measurements, at least $\left(1 - \frac{1}{k^2}\right)$ of the measurements will lie within $k$ standard deviations of their mean, that is, in the interval
\[ \bar{x} \pm kS. \]
Here $k$ is the number of standard deviations $S$ measured away from $\bar{x}$ on each side.

Example:
- Let k = 2. At least $1 - \frac{1}{4} = \frac{3}{4} = 75\%$ of the measurements will lie in the interval $(\bar{x} - 2S, \bar{x} + 2S)$.
- Let k = 3. At least $1 - \frac{1}{9} = \frac{8}{9} = 88.9\%$ of the measurements will lie in the interval $(\bar{x} - 3S, \bar{x} + 3S)$.
- Let k = 1. At least $1 - \frac{1}{1} = 0$ of the measurements will lie in the interval $(\bar{x} - S, \bar{x} + S)$. A lower bound of 0 means the theorem gives no useful information for k = 1; it does not mean that none of the measurements lie in the interval.

Example: Find the proportion of the measurements that fall within 2.5 standard deviations of the mean.
At least $1 - \frac{1}{2.5^2} = \frac{5.25}{6.25} = 84\%$ of the measurements will lie in the interval $(\bar{x} - 2.5S, \bar{x} + 2.5S)$.

Example: Given a set of 40 measurements that have mean 60 and variance 100.

1) Find the proportion of the measurements that lie in the interval [40, 80].
Since the variance is 100, $S = \sqrt{100} = 10$. Setting the lower endpoint equal to $\bar{x} - kS$:
\[ \bar{x} - kS = 40 \;\Rightarrow\; 60 - 10k = 40 \;\Rightarrow\; k = 2 \]
At least $1 - \frac{1}{2^2} = \frac{3}{4} = 75\%$ of the measurements will lie in the interval [40, 80].

2) Find the number of measurements that lie in the interval [40, 80].
At least $\frac{3}{4} \times 40 = 30$ measurements.

3) Find the proportion of the measurements that lie in the interval [30, 90].
\[ \bar{x} - kS = 30 \;\Rightarrow\; 60 - 10k = 30 \;\Rightarrow\; k = 3 \]
At least $1 - \frac{1}{9} = \frac{8}{9} = 88.9\%$ of the measurements will lie in the interval [30, 90].

Example: The mean is 75 and the standard deviation is 10. Find the interval in which at least $\frac{8}{9}$ of the measurements will lie.
\[ 1 - \frac{1}{k^2} = \frac{8}{9} \;\Rightarrow\; \frac{1}{k^2} = \frac{1}{9} \;\Rightarrow\; k^2 = 9 \;\Rightarrow\; k = \pm 3 \]
The negative value is discarded because $k$ measures a length, so $k = 3$.
\[ (\bar{x} - 3S, \bar{x} + 3S) = (75 - 3 \times 10, \; 75 + 3 \times 10) = (45, 105) \]

Empirical Rule

Given a distribution of measurements that is approximately mound-shaped:
1) The interval $(\mu \pm \sigma)$ contains approximately 68% of the measurements.
2) The interval $(\mu \pm 2\sigma)$ contains approximately 95% of the measurements.
3) The interval $(\mu \pm 3\sigma)$ contains approximately 99.7% of the measurements.

Comparison of the two rules:

                 Empirical Rule                         Tchebysheff’s Theorem
  Applies to     Mound-shaped distributions             Any distribution
  Values of k    Only k = 1, 2, 3                       Any value of k with k ≥ 1
  Result         Estimates the proportion of the data   Gives the minimum proportion for the interval

For mound-shaped data the Empirical Rule is more precise than Tchebysheff’s Theorem, because it gives a close approximation of the proportion rather than only a lower bound.

Example: Let k = 2.
- Empirical: Approximately 95% of the measurements will lie in the interval $(\mu \pm 2\sigma)$.
- Tchebysheff’s: At least $1 - \frac{1}{4} = \frac{3}{4} = 75\%$ of the measurements will lie in the interval $(\bar{x} - 2S, \bar{x} + 2S)$.

Example: The distribution of a measurement is mound-shaped with mean 40 and variance 81.

1) What proportion of the measurements lies within one standard deviation of the mean?
k = 1, so the interval contains approximately 68% of the measurements.

2) What proportion of the measurements lies within the interval (22, 58)?
Since the variance is 81, $S = \sqrt{81} = 9$.
\[ \bar{x} - kS = 22 \;\Rightarrow\; 40 - 9k = 22 \;\Rightarrow\; k = 2 \]
The interval contains approximately 95% of the measurements.

2.5 A Check on the Calculation of S

The approximate value of the standard deviation is
\[ S \approx \frac{R}{4}, \qquad \text{where } R = \max - \min. \]

Example: Given the following data: 4, 9, 15, 20, 13, 14, 8, 14, 17, 5.
\[ R = \max - \min = 20 - 4 = 16 \]

1) Calculate the approximate value of the standard deviation.
\[ S \approx \frac{R}{4} = \frac{16}{4} = 4 \]

2) Calculate the standard deviation.

  $x_i$    $x_i^2$
  4        16
  9        81
  15       225
  20       400
  13       169
  14       196
  8        64
  14       196
  17       289
  5        25
  Total    1661

\[ \bar{x} = \frac{\sum x_i}{n} = \frac{4 + 9 + 15 + 20 + 13 + 14 + 8 + 14 + 17 + 5}{10} = \frac{119}{10} = 11.9 \]
\[ S^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = \frac{1661 - 10 \times (11.9)^2}{10 - 1} = \frac{244.9}{9} = 27.21 \]
\[ S = \sqrt{S^2} = \sqrt{27.21} = 5.22 \]

3) Calculate the approximate value of the variance.
\[ S \approx \frac{R}{4} = \frac{16}{4} = 4 \;\Rightarrow\; S^2 \approx 4^2 = 16 \]
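The range-based check of section 2.5 and the bound from Tchebysheff’s Theorem in section 2.4 can both be tried on this data set. The following is a minimal Python sketch under stated assumptions, not part of the lecture: the helper names `sample_std`, `range_estimate`, `tchebysheff_bound` and `fraction_within` are hypothetical, and the second formula for $S^2$ is used for the computed value.

```python
import math


def sample_std(data):
    """Sample standard deviation via (sum of squares - n * mean^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    sum_of_squares = sum(x * x for x in data)
    return math.sqrt((sum_of_squares - n * mean ** 2) / (n - 1))


def range_estimate(data):
    """Rough check on S: the range R = max - min divided by 4."""
    return (max(data) - min(data)) / 4


def tchebysheff_bound(k):
    """Minimum proportion of measurements within k standard deviations (k >= 1)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed proportion of the data inside (mean - k*S, mean + k*S)."""
    mean = sum(data) / len(data)
    s = sample_std(data)
    inside = [x for x in data if mean - k * s <= x <= mean + k * s]
    return len(inside) / len(data)


data = [4, 9, 15, 20, 13, 14, 8, 14, 17, 5]

print("Range-based estimate of S:  ", range_estimate(data))
print("Computed S:                 ", round(sample_std(data), 2))
print("Tchebysheff bound for k = 2:", tchebysheff_bound(2))
print("Observed fraction within 2S:", fraction_within(data, 2))
```

The range-based value is only a rough check that the computed S is of the right order of magnitude, and the observed fraction should never fall below the Tchebysheff bound, whatever the shape of the data.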