Probability and Statistics for Engineers - Taibah University - STAT 301+305 - PDF

Summary

This document is a set of lecture slides from a probability and statistics course for engineers at Taibah University. It covers topics such as descriptive and inferential statistics, types of data (categorical and quantitative), discrete and continuous data, frequency distributions, and relationships between variables.

Full Transcript

Taibah University
Faculty of Science
Department of Mathematics
Probability and Statistics for Engineers, STAT 301+305

Lesson 1: Introduction

Definition of Statistics
Statistics is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.

The field of statistics is divided into two parts:
1. Descriptive statistics: describe data that have been collected. Commonly used descriptive statistics include frequency counts, ranges (high and low scores or values), means, modes, medians, and standard deviations.
2. Inferential statistics: generalize from samples to populations using probabilities. This includes performing hypothesis tests, determining relationships between variables, and making predictions.

Definitions:
Data: observations that have been collected, such as measurements, genders, or survey responses.
Variable: a characteristic or attribute that can take different values for different individuals.
Random variable: a variable whose values are determined by chance.
Population: the complete collection of all elements to be studied (scores, people, measurements, and so on).
Sample: a subgroup or subset of the population.
Parameter: a characteristic or measure obtained from a population.
Statistic: a characteristic or measure obtained from a sample.

The table below lists some parameters and the corresponding statistics:

Measure               Population    Sample
Size                  $N$           $n$
Mean                  $\mu$         $\bar{x}$
Variance              $\sigma^2$    $S^2$
Standard deviation    $\sigma$      $S$

Populations and Samples:

Population (some unknown parameters)    Sample = observations (we calculate some statistics)
Example: TU students (mean height)      Example: 20 students from TU (sample mean)

$N$ = population size, $n$ = sample size.
Let $x_1, x_2, \dots, x_N$ be the population values (in general, they are unknown).
Let $x_1, x_2, \dots, x_n$ be the sample values (these values are known).
Statistics obtained from the sample are used to estimate (approximate) the parameters of the population.

Lesson 2: Types of Data

Categorical Data
The objects being studied are grouped into categories based on some qualitative trait. The resulting data are merely labels or categories.

Examples of Categorical Data
- Eye color (blue, brown, hazel, green, etc.)
- Gender (male, female)
- Smoking status (smoker, non-smoker)
- Attitudes towards the death penalty (strongly disagree, disagree, neutral, agree, strongly agree)

Categorical data are classified as nominal or ordinal, and either kind can be binary or not binary.

Nominal Data
A type of categorical data in which objects fall into unordered categories.

Examples of Nominal Data
- Gender (male, female)
- Nationality (French, Japanese, Egyptian, Chinese, etc.)
- Smoking status (smoker, non-smoker)

Ordinal Data
A type of categorical data in which order is important.

Examples of Ordinal Data
- Class of degree (1st class, 2nd class, 3rd class, fail)
- Degree of illness (none, mild, moderate, acute, chronic)
- Opinion of students about stats classes (very unhappy, unhappy, neutral, happy, ecstatic)

Binary Data
A type of categorical data in which there are only two categories.
- Smoking status: smoker, non-smoker
- Attendance: present, absent
- Class of mark: pass, fail
- Status of student: undergraduate, postgraduate

Quantitative Data
The objects being studied are measured based on some quantitative trait. The resulting data are a set of numbers. Quantitative data can be classified as discrete or continuous.

Examples of Quantitative Data
- Pulse rate
- Height
- Age
- Exam marks
- Time to complete a statistics test
- Family size

Discrete Data
Data are discrete if the values (observations) can take only specific values, typically integers. There are gaps between the possible values, and fractions do not occur. Discrete data imply counting.
Continuous Data
Data are continuous if the values (observations) can take any value within a finite or infinite interval of real numbers. Fractions can occur. Continuous data imply measurement.

[Figure: two number lines. Discrete data: gaps between possible values (count), marked 0, 1, 2, ..., 7. Continuous data: no gaps between possible values (measure), running from 0 to 1000.]

Examples of Discrete Data
- Number of children in a family
- Number of students passing a stats exam
- Number of crimes reported to the police
- Number of cars sold in a day

Generally, discrete data are counts. We would not expect to find 2.2 children in a family, 88.5 students passing an exam, 127.2 crimes being reported to the police, or half a car being sold in one day.

Examples of Continuous Data
- Weight
- Height
- Time to run 500 metres
- Age

Generally, continuous data come from measurements (any value within an interval is possible with a fine enough measuring device).

Relationships between variables
Variables are either categorical (nominal or ordinal) or quantitative (discrete, obtained by counting, or continuous, obtained by measuring).

Interval and ratio variables

Interval:
- Numerical data
- Data can be ranked
- Data have equal intervals between data points
- There is no meaningful zero
- Ratios are meaningless

Ratio:
- Numerical data
- Data can be ranked
- Data have equal intervals between data points
- There is a true zero
- True ratios exist between the different units of measure

The difference between interval and ratio variables is usually not important for statistical analysis.

Lesson 3: Organization and Presentation of Data

Introduction
After the data have been collected, the main tasks a statistician must accomplish are the organization and presentation of the data. The organization must be done in a meaningful way, and the presentation should be such that an interested reader of the study can understand the distribution of the data.

Definitions:
Raw data: data collected in original form (before it has been organized).
Example: The following data is raw data.
Class: a quantitative or qualitative category in which the raw data is placed.
Classes must satisfy the following conditions:
1. There should usually be between 5 and 20 classes.
2. The classes must be mutually exclusive.
3. The classes must be exhaustive.

Frequency Distribution
The researcher organizes the raw data by using a frequency distribution. The frequency is the number of values in a specific class of data. A frequency distribution is the organization of raw data in table form, using classes and frequencies.

For the first data set, a frequency distribution is shown as follows:

Class    Tally            Frequency
1-3      ///// /          6
4-6      ///// ///// /    11
7-9      ////             4
10-12    /                1
13-15    ////             4
16-18    ////             4

Types of Frequency Distribution
There are three basic types of frequency distribution:
- Categorical
- Ungrouped
- Grouped

Categorical Frequency Distribution
The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal or ordinal data. For example, data such as political affiliation, religious affiliation, or major field of study would use a categorical frequency distribution.

Example: The blood types of different students:

Class    Tally          Frequency
A        /////          5
B        ///// //       7
O        ///// ////     9
AB       ////           4
Total                   25

Ungrouped Frequency Distribution
When the range of the data is small, the data must be grouped into classes that are not more than one unit in width.

Example
4 8 8 9 8 5 9 9 10 11 7 7 8 7 8 4 8 7 5 7 6 5 8 8 9

The range in the example is $R = \text{highest value} - \text{lowest value} = 11 - 4 = 7$.
Since the range is small, classes consisting of a single data value can be used.

Example (cont.) Counting each value in the list above gives:

Class    Tally          Frequency
4        //             2
5        ///            3
6        /              1
7        /////          5
8        ///// ///      8
9        ////           4
10       /              1
11       /              1

Grouped Frequency Distribution
When the range of the data is large, the data must be grouped into classes that are more than one unit in width. In this case we have additional conditions for the classes:
1. The class width should preferably be an odd number.
2. The classes must be equal in width.
3. The classes must be continuous for a continuous variable and discrete for a discrete one.

Example: Set up a frequency distribution for the following sample data:
1 2 6 7 12 13 2 6 9 5 18 7 3 15 15 4 17 1 14 5 4 16 4 5 8 6 5 18 5 2 9 11 12 1 9 2 10 11 4 10 9 18 8 8 4 14 7 3 2 6

Class    Tally                  Frequency
1-3      ///// /////            10
4-6      ///// ///// ////       14
7-9      ///// /////            10
10-12    ///// /                6
13-15    /////                  5
16-18    /////                  5

In this distribution, the values 1 and 3 of the first class are called "class limits": 1 is the "lower class limit" and 3 is the "upper class limit."

Cumulative Frequency
The cumulative frequency is the sum of the frequencies accumulated up to the upper boundary of a class in the distribution. Cumulative frequencies show how many values lie below a given upper class boundary.

Example of Cumulative Frequency Distribution

Class    Frequency    Cumulative frequency
1-4      6            6
5-8      2            8
9-12     5            13
13-16    3            16

Lesson 4: Graphical Presentation

Presenting categorical data

Bar Chart
A chart containing rectangles in which the length of each bar represents the count, amount, or percentage of responses in one category. There are gaps between the bars.

Example
The following table gives the distribution of students by faculty at a university:

Faculty      Students
Science      150
Medicine     100
Arts         250
Education    300
Economics    200
Total        1000

Pie Chart
A pie chart is a circle that is divided into sections according to the percentage of frequencies in each category of the distribution.

Steps:
1. Calculate the percentage contribution of each category by dividing the value of the category by the total and multiplying the quotient by 100.
2. Calculate the number of degrees by multiplying the percentage by 3.6 (a circle has 360 degrees).

Example
The following table gives the petroleum exports of a country, by category of importing country:

Category     Quantity    Percentage    Angle (degrees)
Arabic       2803        1.27%         5
Australia    42886       19.37%        70
America      11552       5.22%         19
Asia         158764      71.71%        258
Africa       5383        2.43%         9
Total        221388      100%          360

For example, for the Arabic category:
Percentage: $\frac{2803}{221388} \times 100\% = 1.27\%$
Angle: $1.27 \times 3.6 \approx 4.57 \approx 5$ degrees
Because each angle is rounded to a whole degree, the rounded angles listed add up to 361 rather than exactly 360.

Presenting quantitative data

Stem-and-Leaf Plots
The stem-and-leaf plot is an effective way to summarize data. A stem-and-leaf plot can also help you compare data.

Example 1
The heights of 11 fourth-grade badminton players are (in inches):
56 58 61 58 61 63 60 61 59 59 57

The ordered numbers are: 56, 57, 58, 58, 59, 59, 60, 61, 61, 61, 63
Each stem stands for the first (tens) digit of each number. Record the tens digits in order from least to greatest.

HEIGHT IN INCHES
Stem    Leaf
5       6, 7, 8, 8, 9, 9
6       0, 1, 1, 1, 3

Example 2
The stem represents the digit preceding the decimal point and the leaf corresponds to the decimal part of the number.

Dot Plot
This type of chart is used for numerical raw data. Each response is represented as a dot above a horizontal line that extends through the range of all values. If two or more response values are identical, the dots for these responses are stacked vertically above each other.

Dot Plot (Steps)
- Step 1: Label your axis and title your graph. Draw a horizontal line and label it with the variable.
- Step 2: Scale the axis based on the values of the variable.
- Step 3: Mark a dot above the number on the horizontal axis corresponding to each data value.

Example
The number of goals scored by each team in the first round of the California Southern Section Division V high school soccer playoffs is shown in the following table.
5 0 1 0 7 2 1
0 4 1 0 3 0 2
0 3 1 5 0 3 0
1 0 1 0 2 0 3

Histogram
A histogram is a bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies. It is a special bar chart for grouped numerical data: the frequencies or percentages of each group are represented as individual bars measured on the vertical Y-axis, and the variable is plotted on the horizontal X-axis. In a histogram, there are no gaps between adjacent bars, as there would be in a bar chart of categorical data.

Example
The following distribution gives the marks of 54 students. Draw a histogram for this data.

Mark       Students
0-4        8
4-8        12
8-12       20
12-16      6
16-20      8
Total      54

Example
Suppose the previous table is converted to:

Mark       Students
1-4        8
5-8        12
9-12       20
13-16      6
17-20      8
Total      54

Solution: First we must compute the class boundaries.

Class limits    Class boundaries    Students
1-4             0.5-4.5             8
5-8             4.5-8.5             12
9-12            8.5-12.5            20
13-16           12.5-16.5           6
17-20           16.5-20.5           8
Total                               54

Polygon
A frequency polygon uses line segments connected to points located directly above the class midpoint values. The heights of the points correspond to the class frequencies, and the line segments are extended to the right and left so that the graph begins and ends on the horizontal axis.

Example
Draw a polygon for the following data. To draw the polygon, we need to compute the class midpoints.

Lesson 5: Measures of Location (Central Tendency)

Measures of Central Tendency
The data (observations) often tend to be concentrated around the center of the data.
- Some measures of location are the mean, median and mode.
- These measures are considered as representatives (or typical values) of the data.
- They are designed to give a quantitative measure of where the center of the data is in the sample.

The Sample Mean
- Is the most common measure of central tendency.
- The sum of the values (positive, negative or zero) divided by the number of values.
- Is called the mean, sample mean, arithmetic mean, or average.
- If the list is a statistical population, then the mean of that population is called a population mean, denoted by $\mu$.
- If the list is a statistical sample, we call the resulting statistic a sample mean, denoted by $\bar{X}$.

The Sample Mean:
If $X_1, X_2, \dots, X_n$ are the sample values, then the sample mean is

$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$

Using summation:

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Example 1:
Suppose that the following sample represents the ages (in years) of a sample of 3 men: $X_1 = 30$, $X_2 = 35$, $X_3 = 27$. Then the sample mean is

$\bar{X} = \frac{30 + 35 + 27}{3} = \frac{92}{3} = 30.67$

The Sample Mean (Example 2):
For what value of $X$ will 8 and $X$ have the same sample mean as 27 and 5?
Solution: First, find the mean of 27 and 5:

$\frac{27 + 5}{2} = 16$

Now find the value of $X$, knowing that the sample mean of $X$ and 8 must be 16:

$\frac{X + 8}{2} = 16$

Multiply both sides by 2 and solve: $32 = X + 8$, so $X = 24$.

The Sample Mean (Example 3):
On his first 5 statistics tests, Omer received the following marks: 72, 86, 92, 63, and 77. What mark must Omer earn on his sixth test so that his average for all six tests will be 80?
Solution: Set up an equation to represent the situation.

$\frac{72 + 86 + 92 + 63 + 77 + X}{6} = 80$, so $X = 90$

Omer must get a 90 on the sixth test.

The Sample Mean (Advantages):
- Most popular measure in fields such as business, engineering and computer science.
- It is unique: there is only one answer.
- Useful when comparing sets of data.

The Sample Mean (Disadvantages):
- Affected by extreme values (outliers).
Example:
- The sample mean of 2, 3, 4 is 3.
- The sample mean of 2, 3, 40 is 15.
- The mean increased from 3 to 15 because 40 is an extreme value.
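The sample-mean examples above can be checked with a few lines of Python. This is a minimal sketch, assuming the standard library's `statistics.mean`; the helper name `required_score` is hypothetical and simply rearranges the equation from Example 3.

```python
from statistics import mean


def required_score(scores, target_mean):
    """Mark needed on one more test so the overall mean equals target_mean.

    Hypothetical helper: solves (sum(scores) + x) / (len(scores) + 1) = target_mean for x.
    """
    return target_mean * (len(scores) + 1) - sum(scores)


# Example 1: ages (in years) of three men
ages_mean = mean([30, 35, 27])  # 92 / 3

# Example 3: Omer's first five marks, target average of 80 over six tests
omer_sixth = required_score([72, 86, 92, 63, 77], 80)

# Effect of a single extreme value on the mean
mean_without_outlier = mean([2, 3, 4])
mean_with_outlier = mean([2, 3, 40])

print(round(ages_mean, 2), omer_sixth, mean_without_outlier, mean_with_outlier)
```

Replacing 4 by 40 changes only one observation, yet the mean moves from 3 to 15, which is the sensitivity to outliers listed under the disadvantages.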
The Sample Mean (Properties):
- The sum of the deviations of the observations from their mean is always zero:

$\sum_{i=1}^{n} (X_i - \bar{X}) = 0$

Example: Find the sum of the deviations of the values 3, 4, 6, 8, 14 from their mean.

$\bar{X} = \frac{3 + 4 + 6 + 8 + 14}{5} = \frac{35}{5} = 7$

$\sum_{i=1}^{n} (X_i - \bar{X}) = (3 - 7) + (4 - 7) + (6 - 7) + (8 - 7) + (14 - 7) = 0$

The Sample Median
- The purpose of the sample median is to reflect the central tendency of the sample in such a way that it is not influenced by extreme values or outliers.
- It is the value that divides the data into two equal halves, with half of the data being lower than the median and half higher than the median.

The Sample Median (Steps):
If $X_1, X_2, \dots, X_n$ are the sample values, then the sample median is computed as follows:
- Sort the values into ascending order.
- If we have an odd number ($n$) of values, the median is the middle value:

$Me = X_{\frac{n+1}{2}}$

- If we have an even number of values, the median is the sample mean of the two middle values:

$Me = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)$

The Sample Median (Example 1):
Compute the sample median of (12, 24, 19, 20, 7).
Solution: Sort the values into ascending order: 7, 12, 19, 20, 24. The number of values is 5, which is odd, so the sample median is

$Me = X_{\frac{n+1}{2}} = X_{\frac{5+1}{2}} = X_3 = 19$

The Sample Median (Example 2):
Compute the sample median of (12, 24, 19, 20, 7, 5).
Solution: Sort the values into ascending order: 5, 7, 12, 19, 20, 24. The number of values is 6, which is even, so the sample median is

$Me = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right) = \frac{1}{2}\left(X_{\frac{6}{2}} + X_{\frac{6}{2}+1}\right) = \frac{1}{2}(X_3 + X_4) = \frac{12 + 19}{2} = 15.5$

The Sample Median (Advantages):
- Extreme values do not affect the median as strongly as they affect the mean.
- It is unique: there is only one answer.
- Useful when comparing sets of data.

The Sample Median (Disadvantages):
- Not as popular as the sample mean.

The Sample Mode
- The value of a variable that occurs with the highest frequency.
- The value of a distribution for which the frequency is maximum.

The Sample Mode (Steps):
- Calculate the frequencies for all of the values in the data.
- The mode is the value (or values) with the highest frequency.

The Sample Mode (Examples):
- The sample mode of the list (1, 2, 2, 3, 3, 3, 4) is 3.
- The list (1, 2, 2, 3, 3, 5) has the two sample modes 2 and 3.
- The list (1, 6, 2, 7, 3, 5) has no sample mode.

The Sample Mode (Advantages):
- Extreme values do not affect the mode.
Example:
- The sample mode of the list (1, 2, 2, 3, 3, 3, 4) is 3.
- The sample mode of the list (1, 2, 2, 3, 3, 3, 4000) is 3.

The Sample Mode (Disadvantages):
- Not as popular as the mean and median.
- Not necessarily unique: there may be more than one answer.
- When no values repeat in the data set, every value has the same frequency, so the mode is not useful.
- When there is more than one mode, it is difficult to interpret and/or compare.

Quartiles
Quartiles divide the distribution into four groups, separated by $Q_1$, $Q_2$ and $Q_3$. Note that:
- $Q_1$ (also called the lower quartile) is the value such that 25% of the ranked data are smaller and 75% are larger.
- $Q_2$ is another name for the median.
- $Q_3$ (also called the upper quartile) is the value such that 75% of the ranked data are smaller and 25% are larger.

To find the quartiles, follow these steps:

Example 1 (even $n$): For the following data:
8 12 7 17 14 45 10 13 17 13 9 11
(a) Find the values of the three quartiles.

Example 2 (odd $n$): For the following data:
47 28 39 51 33 37 59 24 33
(a) Find the values of the three quartiles.

Note: Quartiles can be computed in different ways.

Lesson 6: Measures of Variability (Dispersion or Variation)

Measures of Dispersion
- The variation or dispersion in a set of data refers to how spread out the observations are from each other.
- The variation is small when the observations are close together. There is no variation if the observations are the same.
- Measures of dispersion are important for describing the spread of the data, or its variation around a central value; they express quantitatively the degree of variation or dispersion of the values.
- There are various methods that can be used to measure the dispersion of a dataset, each with its own set of advantages and disadvantages.

The Range
- The difference between the largest and smallest sample values.
- If $X_1, X_2, \dots, X_n$ are the values of the observations in a sample, then the range is given by:

$\text{Range}(X_1, X_2, \dots, X_n) = \max(X_1, X_2, \dots, X_n) - \min(X_1, X_2, \dots, X_n)$

Example: Find the range of 12, 24, 19, 20, 7.
Solution: $\text{Range} = 24 - 7 = 17$

- One of the simplest measures of variability to calculate.
- Depends only on the extreme values and provides no information about how the remaining data are distributed.

The Population Variance
If $X_1, X_2, \dots, X_N$ are the population values, then the population variance is:

$\sigma^2 = \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + \dots + (X_N - \mu)^2}{N}$

Using summation form:

$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$

where $\mu$ is the population mean.

The Sample Variance
If $X_1, X_2, \dots, X_n$ are the sample values, then the sample variance is:

$S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n - 1}$

Using summation form:

$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

where $\bar{X} = \sum_{i=1}^{n} X_i / n$ is the sample mean.

Note: $(n - 1)$ is called the degrees of freedom (df) associated with the sample variance $S^2$.

The Sample Standard Deviation
The standard deviation is another measure of variation. It is the square root of the variance:

$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

Example 1: Compute the sample variance and standard deviation of the following observations (ages in years): 10, 21, 33, 53, 54.
Solution n 5 X i X i 10  21  33  53  54 171 X  i 1  i 1    34.2 (year) n 5 5 5 n 5 ( X i  X )2 ( X i  34.2) 2 S  2 i 1  i 1 n 1 5 1  10  34.2   21  34.2   33  34.2   53  34.2   54  34.2  2 2 2 2 2 4 1506.8   376.7 4 The sample standard deviation is: S  S 2  376.7  19.41 The Sample Variance(another formula) n X 2 i 2  nX Another Formula for Calculating 𝑺𝟐 is S2  i 1 n 1 (It is simple and more accurate) For the previous Example, Xi 10 21 33 53 54 X i  171 X i2 100 441 1089 2809 2916 X i 2  7355 n X 2  nX 7355  534.2  2 2 i 1506.8 S  2 i 1    376.7 n 1 5 1 4 25 Interquartile range rule = IQR 26 Lesson 7 Grouped Data Calculation Mean – Grouped Data Example: The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean. Number f of order 10 – 12 4 13 – 15 12 16 – 18 20 19 – 21 14 n = 50 Solution: Number f x fx X is the midpoint of the of order class. It is calculated by 10 – 12 4 11 44 adding the class limits 13 – 15 12 14 168 and dividing by 2. 16 – 18 20 17 340 19 – 21 14 20 280 ∑𝑓𝑥 ∑𝑓𝑥 832 50 =n 832 = 𝑥ҧ = = = 𝑛 ∑𝑓 50 = 16.64 Median and Interquartile Range – Grouped Data Step 1: Construct the cumulative frequency distribution. Step 2: Decide the class that contain the median. Class Median is the first class with the value of cumulative frequency equal at least n/2. 
Step 3: Find the median by using the following formula:

$\text{Median} = L_m + \left(\frac{\frac{n}{2} - F}{f_m}\right) i$

where:
$n$ = the total frequency
$F$ = the cumulative frequency before the median class
$f_m$ = the frequency of the median class
$i$ = the class width
$L_m$ = the lower boundary of the median class

Example: Based on the grouped data below, find the median:

Time to travel to work    Frequency
1-10                      8
11-20                     14
21-30                     12
31-40                     9
41-50                     7

Solution: First, construct the cumulative frequency distribution.

Time to travel to work    Frequency    Cumulative frequency
1-10                      8            8
11-20                     14           22
21-30                     12           34
31-40                     9            43
41-50                     7            50

$\frac{n}{2} = \frac{50}{2} = 25$, so the median class is the 3rd class.
So $F = 22$, $f_m = 12$, $L_m = 20.5$ and $i = 10$. Therefore,

$\text{Median} = L_m + \left(\frac{\frac{n}{2} - F}{f_m}\right) i = 20.5 + \left(\frac{25 - 22}{12}\right) 10 = 23$

Thus, 25 persons take less than 23 minutes to travel to work and another 25 persons take more than 23 minutes to travel to work.

Quartiles
Using the same method of calculation as for the median, we get the equations for $Q_1$ and $Q_3$ as follows:

$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F}{f_{Q_1}}\right) i \qquad Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F}{f_{Q_3}}\right) i$

Example: Based on the grouped data below, find the interquartile range (IQR).

Time to travel to work    Frequency
1-10                      8
11-20                     14
21-30                     12
31-40                     9
41-50                     7

Solution:
1st Step: Construct the cumulative frequency distribution.

Time to travel to work    Frequency    Cumulative frequency
1-10                      8            8
11-20                     14           22
21-30                     12           34
31-40                     9            43
41-50                     7            50

2nd Step: Determine $Q_1$ and $Q_3$.

Class of $Q_1$: $\frac{n}{4} = \frac{50}{4} = 12.5$, so the class of $Q_1$ is the 2nd class. Therefore,

$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F}{f_{Q_1}}\right) i = 10.5 + \left(\frac{12.5 - 8}{14}\right) 10 = 13.7143$

Class of $Q_3$: $\frac{3n}{4} = \frac{150}{4} = 37.5$, so the class of $Q_3$ is the 4th class. Therefore,

$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F}{f_{Q_3}}\right) i = 30.5 + \left(\frac{37.5 - 34}{9}\right) 10 = 34.3889$

The interquartile range is given by $IQR = Q_3 - Q_1$. So,

$IQR = Q_3 - Q_1 = 34.3889 - 13.7143 = 20.6746$

Mode - Grouped Data
The mode is the value that has the highest frequency in a data set. For grouped data, the class mode (or modal class) is the class with the highest frequency.
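Before the mode formula, the median and quartile interpolation worked through above can be sketched in Python. This is an illustrative sketch only: the function name `grouped_quantile` is hypothetical, and it assumes integer class limits so that each boundary lies 0.5 outside the limits, as in the travel-time example.

```python
def grouped_quantile(class_limits, freqs, p):
    """Interpolate the p-th quantile (0 < p < 1) from a grouped frequency table.

    class_limits: list of (lower_limit, upper_limit) pairs with integer limits;
    boundaries are assumed to be lower - 0.5 and upper + 0.5.
    freqs: class frequencies in the same order.
    """
    n = sum(freqs)
    target = p * n        # n/2 for the median, n/4 and 3n/4 for the quartiles
    cum_before = 0        # F: cumulative frequency before the current class
    for (lower, upper), f in zip(class_limits, freqs):
        if cum_before + f >= target:  # first class whose cumulative frequency reaches the target
            lower_boundary = lower - 0.5
            width = (upper + 0.5) - lower_boundary
            return lower_boundary + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("p must lie strictly between 0 and 1")


# Travel-time data from the example above
limits = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
freqs = [8, 14, 12, 9, 7]

median = grouped_quantile(limits, freqs, 0.5)
q1 = grouped_quantile(limits, freqs, 0.25)
q3 = grouped_quantile(limits, freqs, 0.75)
iqr = q3 - q1
print(median, round(q1, 4), round(q3, 4), round(iqr, 4))
```

A single function covers all three cases because the median, $Q_1$ and $Q_3$ formulas differ only in the target count ($n/2$, $n/4$ or $3n/4$).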
To find the mode for grouped data, use the following formula:

$\text{Mode} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i$

where:
$i$ is the class width
$\Delta_1$ is the difference between the frequency of the class mode and the frequency of the class before the class mode
$\Delta_2$ is the difference between the frequency of the class mode and the frequency of the class after the class mode
$L_{mo}$ is the lower boundary of the class mode

Example: Based on the grouped data below, find the mode.

Time to travel to work    Frequency
1-10                      8
11-20                     14
21-30                     12
31-40                     9
41-50                     7

Solution: Based on the table, $L_{mo} = 10.5$, $\Delta_1 = 14 - 8 = 6$, $\Delta_2 = 14 - 12 = 2$ and $i = 10$, so

$\text{Mode} = 10.5 + \left(\frac{6}{6 + 2}\right) 10 = 18$

The mode can also be obtained from a histogram.
Step 1: Identify the modal class and the bar representing it.
Step 2: Draw two cross lines as shown in the diagram.
Step 3: Drop a perpendicular from the intersection of the two lines until it touches the horizontal axis.
Step 4: Read the mode from the horizontal axis.

Variance and Standard Deviation - Grouped Data

Population variance: $\sigma^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{N}}{N}$

Variance for sample data: $s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}$

Standard deviation:
Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$

Example: Find the variance and standard deviation for the following data:

No. of orders    f
10-12            4
13-15            12
16-18            20
19-21            14
Total            n = 50

Solution:

No. of orders    f         x     fx     fx^2
10-12            4         11    44     484
13-15            12        14    168    2352
16-18            20        17    340    5780
19-21            14        20    280    5600
Total            n = 50          832    14216

Variance: $s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{14216 - \frac{(832)^2}{50}}{50 - 1} = 7.5820$

Standard deviation: $s = \sqrt{7.5820} = 2.75$

Thus, the standard deviation of the number of orders received at the office of this mail-order company during the past 50 days is 2.75.

Lesson 8: Regression

Regression Analysis
- The idea behind linear regression is to build a model that describes the dependence of one variable (the response or dependent variable) on one or more other variables (the explanatory or independent variables).
- In regression analysis we analyze the relationship between two or more variables.
- The relationship between two or more variables can be linear or nonlinear.
- If the relationship involves only two variables, it is called simple regression.
- Simple linear regression is a linear regression between two variables.

Simple Linear Regression Model
A statistical technique that uses a straight-line relationship to predict a numerical dependent variable $Y$ from a single numerical independent variable $X$. The general form of a linear equation with one independent variable can be written as

$y = b_0 + b_1 x$

where $b_0$ and $b_1$ are constants (fixed numbers), $x$ is the independent variable, and $y$ is the dependent variable. The graph of a linear equation with one independent variable is a straight line.

Examples of linear equations with one independent variable are $y = 4 + 0.2x$, $y = -1.5 - 2x$, and $y = -3.4 + 1.8x$.

[Figure: graphs of these three linear equations.]

Intercept and slope
For a linear equation $y = b_0 + b_1 x$, the number $b_0$ is called the y-intercept and the number $b_1$ is called the slope. $b_0$ is the value of $y$ when $x = 0$, and $b_1$ is the change in $y$ per unit change in $x$. A positive slope means $y$ increases as $x$ increases. A negative slope means $y$ decreases as $x$ increases. The y-intercept and the slope are known as the regression coefficients.

[Figure: graphical interpretation of the slope $b_1$ and the intercept $b_0$.]

Least Squares Method
For plotted sets of $x$ and $y$ values, there are many possible straight lines, each with its own values of $b_0$ and $b_1$, that might seem to fit the data. The least-squares method finds the values of the y-intercept and the slope that make the sum of the squared differences between the actual values of the dependent variable $y$ and the predicted values of $y$ as small as possible.

Consider the problem of fitting a line to four data points, whose scatterplot is shown in the figure below. Many lines can "fit" those four data points. Two possibilities are shown in Figs. (a) and (b) on the next slide.
Least Squares Method
For example, as we have just demonstrated, Line A predicts a y-value of $\hat{y} = 3$ when $x = 2$. The actual y-value for $x = 2$ is $y = 2$. So, the error made in using Line A to predict the y-value of the data point (2, 2) is

$e = y - \hat{y} = 2 - 3 = -1$

An error $e$: In general, an error ($e$) is the signed vertical distance from the line to a data point.

The fourth column of Table (a) below shows the errors made by Line A for all four data points; the fourth column of Table (b) shows the same for Line B.

To decide which line, Line A or Line B, fits the data better, we first compute the sum of the squared errors, $\sum e_i^2$, in the final column of Table (a) and Table (b). The line having the smaller sum of squared errors, in this case Line B, is the one that fits the data better.

Among all lines, the least-squares criterion is that the line having the smallest sum of squared errors is the one that fits the data best. That is, the line that best fits a set of data points is the one having the smallest possible sum of squared errors.

Regression line: The line that best fits a set of data points according to the least-squares method.
Regression equation: The equation of the regression line.

Notation used in Regression
For a set of $n$ data points, the defining and computing formulas for $S_{xx}$, $S_{xy}$, and $S_{yy}$ are as follows:

$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$

$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$

$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$

Regression equation
The regression equation for a set of $n$ data points is $\hat{y} = b_0 + b_1 x$, where:

$b_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum x_i y_i - (\sum x_i)(\sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$

$b_0 = \frac{1}{n}\left(\sum y_i - b_1 \sum x_i\right) = \bar{y} - b_1 \bar{x}$

Regression equation (Example)
Age and Price of Orions: in the first two columns of the table below, we repeat our data on age and price for a sample of 11 Orions.
a) Determine the regression equation for the data.
b) Graph the regression equation and the data points.
c) Describe the apparent relationship between age and price of Orions.
d) Interpret the slope of the regression line in terms of prices for Orions.
e) Use the regression equation to predict the price of a 3-year-old Orion and a 4-year-old Orion.

Solution:
a) We first need to compute $b_1$ and $b_0$ by using their formulas. We did so by constructing a table of values for $x$ (age), $y$ (price), $xy$ and $x^2$, and their sums, in the following table. The slope of the regression line therefore is

$b_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum x_i y_i - (\sum x_i)(\sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n} = \frac{4732 - (58)(975)/11}{326 - (58)^2/11} = -20.26$

$b_0 = \frac{1}{n}\left(\sum y_i - b_1 \sum x_i\right) = \frac{1}{11}\left[975 - (-20.26)(58)\right] = 195.47$

So, the regression equation is $\hat{y} = 195.47 - 20.26x$.

b) To graph the regression equation, we need to substitute two different x-values into the regression equation to obtain two distinct points. Let's use the x-values 2 and 8. The corresponding y-values are

$\hat{y} = 195.47 - 20.26(2) = 154.95$
$\hat{y} = 195.47 - 20.26(8) = 33.39$

Therefore, the regression line goes through the two points (2, 154.95) and (8, 33.39). In the figure below, we plotted these two points with open dots. Drawing a line through the two open dots yields the regression line, the graph of the regression equation. This figure also shows the data points from the first two columns of the previous table.

c) Because the slope of the regression line is negative, price tends to decrease as age increases, which is no particular surprise.

d) Because $x$ represents age in years and $y$ represents price in hundreds of dollars, the slope of -20.26 indicates that Orions depreciate an estimated $2026 per year, at least in the 2- to 7-year-old range.

e) For a 3-year-old Orion, $x = 3$, and the regression equation yields the predicted price of

$\hat{y} = 195.47 - 20.26(3) = 134.69$

Similarly, the predicted price for a 4-year-old Orion is

$\hat{y} = 195.47 - 20.26(4) = 114.43$

Interpretation: The estimated price of a 3-year-old Orion is $13,469, and the estimated price of a 4-year-old Orion is $11,443.
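The arithmetic in parts (a) and (e) can be reproduced from the summary sums quoted in the solution ($n = 11$, $\sum x = 58$, $\sum y = 975$, $\sum xy = 4732$, $\sum x^2 = 326$). The sketch below is illustrative only: the function names are hypothetical, and the coefficients are rounded to two decimals before predicting, an assumption made so that the results match the hand calculation.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (b0, b1) for y-hat = b0 + b1*x using the computing formulas."""
    s_xy = sum_xy - sum_x * sum_y / n   # S_xy
    s_xx = sum_x2 - sum_x ** 2 / n      # S_xx
    b1 = s_xy / s_xx                    # slope
    b0 = (sum_y - b1 * sum_x) / n       # intercept, equal to y-bar - b1 * x-bar
    return b0, b1


b0, b1 = least_squares_from_sums(n=11, sum_x=58, sum_y=975, sum_xy=4732, sum_x2=326)

# Round to two decimals, as in the worked solution, before predicting.
b0_rounded, b1_rounded = round(b0, 2), round(b1, 2)


def predict_price(age_years):
    """Predicted price in hundreds of dollars for an Orion of the given age."""
    return b0_rounded + b1_rounded * age_years


print(b0_rounded, b1_rounded)
print(round(predict_price(3), 2), round(predict_price(4), 2))
```

Predicting with the unrounded coefficients would shift the results slightly in the second decimal place, which is why the rounding step is kept explicit.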
Linear Correlation

The linear correlation coefficient: The linear correlation coefficient, represented by the symbol $r$, is a descriptive measure of the strength and direction of the linear (straight-line) relationship between two variables. The values of this coefficient vary from -1, which indicates perfect negative correlation, to +1, which indicates perfect positive correlation.

The sign of the correlation coefficient $r$ is the same as the sign of the slope.
• If the slope is positive, $r$ is positive.
• If the slope is negative, $r$ is negative.

The linear correlation coefficient

For a set of $n$ data points, the linear correlation coefficient is defined by

$r = \frac{1}{n-1} \sum \frac{(x_i - \bar{x})(y_i - \bar{y})}{S_x S_y}$

where $S_x$ and $S_y$ denote the sample standard deviations of the x-values and y-values, respectively.

Using algebra, we can show that the linear correlation coefficient can be expressed as

$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

The computing formula for a linear correlation coefficient is

$r = \frac{\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/n}{\sqrt{\left[\sum x_i^2 - \left(\sum x_i\right)^2/n\right]\left[\sum y_i^2 - \left(\sum y_i\right)^2/n\right]}}$

The computing formula is almost always preferred for hand calculations, but the defining formula reveals the meaning and basic properties of the linear correlation coefficient.

[Figure: scatterplots showing various degrees of linear correlation, including no linear correlation, $r = 0$.]

Example
a) Compute the linear correlation coefficient for the data.
b) Interpret the result in terms of the relationship between the variables age and price of Orions.
c) Discuss the graphical implications of the value of $r$.

b) Interpretation: The linear correlation coefficient, $r = -0.924$, suggests a strong negative linear correlation between age and price of Orions. In particular, it indicates that as age increases, there is a strong tendency for price to decrease, which is not surprising.

c) Because the correlation coefficient, $r = -0.924$, is quite close to -1, the data points should be clustered closely about the regression line.

Correlation and causation

Two variables may have a high correlation without being causally related.
A correlation coefficient close to zero does not necessarily mean that X and Y are not related; $r$ measures only the linear relationship.

Two variables may be strongly correlated because they are both associated with other variables, called lurking variables, that cause changes in the two variables under consideration.

Lesson 9 Probability Concepts

Statistical Experiment, Sample Space, Events

Statistical Experiment:

An experiment is some procedure (or process) that we carry out and that results in an outcome.

A random experiment is an experiment whose exact outcome we do not know in advance, although we know the set of all possible outcomes. It is also called a statistical experiment.

The Sample Space:

Definition:
• The set of all possible outcomes of a statistical experiment is called the sample space and is represented by the symbol S.
• Each outcome (element or member) of the sample space S is called a sample point.

Example 1

The sample space of possible outcomes when a coin is tossed may be written S = {H, T}, where H and T correspond to "heads" and "tails," respectively.

Example 2

Consider the experiment of tossing a die. If we are interested in the number that shows on the top face, the sample space would be

S1 = {1, 2, 3, 4, 5, 6}

If we are interested only in whether the number is even or odd, the sample space is

S2 = {even, odd}

This example illustrates the fact that more than one sample space can be used to describe the outcomes of an experiment. In this case S1 provides more information than S2. It is desirable to use a sample space that gives the most information concerning the outcomes of the experiment.

Example 3

An experiment consists of flipping a coin and then flipping it a second time if a head occurs. If a tail occurs on the first flip, then a die is tossed once. To list the elements of the sample space, we construct the tree diagram, which gives

S = {HH, HT, T1, T2, T3, T4, T5, T6}.

Example 4

Sample spaces with a large or infinite number of sample points are best described by a statement or rule.
If the possible outcomes of an experiment are the set of cities in the world with a population over 1 million, our sample space is written

S = {x | x is a city with a population over 1 million},

which reads "S is the set of all x such that x is a city with a population over 1 million."

Events:

Definition
• An event A is a subset of the sample space S. That is, $A \subseteq S$.
• We say that an event A occurs if the outcome (the result) of the experiment is an element of A.
• $\emptyset \subseteq S$ is an event ($\emptyset$ is called the impossible event).
• $S \subseteq S$ is an event (S is called the sure event).

Example 1

Given the sample space S = { t | t > 0 }, where t is the life in years of a certain electronic component, the event A that the component fails before the end of the fifth year is the subset A = { t | 0 < t < 5 }.

Example 2

Experiment: Selecting a ball from a box containing 6 balls numbered 1, 2, 3, 4, 5, and 6 (or tossing a die). This experiment has 6 possible outcomes. The sample space is S = {1, 2, 3, 4, 5, 6}.

Consider the following events:
E1 = getting an even number = {2, 4, 6} $\subseteq S$
E2 = getting a number less than 4 = {1, 2, 3} $\subseteq S$
E3 = getting 1 or 3 = {1, 3} $\subseteq S$
E4 = getting an odd number = {1, 3, 5} $\subseteq S$
E5 = getting a negative number = { } = $\emptyset \subseteq S$
E6 = getting a number less than 10 = {1, 2, 3, 4, 5, 6} = S $\subseteq S$

Notation:
• n(S) = number of outcomes (elements) in S.
• n(E) = number of outcomes (elements) in the event E.

Example 3

Experiment: Selecting 3 items from a manufacturing process; each item is inspected and classified as defective (D) or non-defective (N). This experiment has 8 possible outcomes:

S = {DDD, DDN, DND, DNN, NDD, NDN, NND, NNN}
Consider the following events:
A = {at least 2 defectives} = {DDD, DDN, DND, NDD} $\subseteq S$
B = {at most one defective} = {DNN, NDN, NND, NNN} $\subseteq S$
C = {3 defectives} = {DDD} $\subseteq S$

Lesson 10 Operations on Events

Contents: Complement, Intersection, Mutually Exclusive, Union

Operation on Events (Complement)

Definition: The complement of an event A with respect to S is the subset of all elements of S that are not in A. We denote the complement of A by the symbol $A'$ or $A^c$.

$A^c = \{x \in S : x \notin A\}$

$A^c$ occurs if A does not.

Operation on Events (Example 1)

Let R be the event that a red card is selected from an ordinary deck of 52 playing cards, and let S be the entire deck. Then $R^c$ is the event that the card selected from the deck is not a red card but a black card.

Operation on Events (Example 2)

Consider the sample space S = {yellow, red, blue, green, black, white}. Let A = {yellow, red, black, white}. Then $A'$ = {blue, green}.

Operation on Events (Intersection)

Definition: Let A and B be two events defined on the sample space S. The intersection of two events A and B, denoted by the symbol $A \cap B$, is the event containing all elements that are common to A and B.

$A \cap B = AB = \{x \in S : x \in A \text{ and } x \in B\}$

$A \cap B$ consists of all points in both A and B.
$A \cap B$ occurs if both A and B occur together.

Example 1: Let the sample space S = {1, 2, 3, 4}, and events A = {1, 2, 3}, B = {2, 3, 4}; then $A \cap B$ = {2, 3}.

Example 2: Let M = {a, e, i, o, u} and N = {r, s, t}. Then $M \cap N = \emptyset$. M and N have no elements in common and therefore cannot both occur simultaneously.

Mutually Exclusive

Definition: Two events A and B are mutually exclusive (or disjoint) if and only if $A \cap B = \emptyset$; that is, A and B have no common elements (they do not occur together).
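The complement, intersection, and mutually exclusive definitions map directly onto set operations. The following is a minimal Python sketch of the examples above, reading the event A in the intersection example as {1, 2, 3}; the variable names and the `mutually_exclusive` helper are illustrative assumptions.

```python
# Complement: sample space and event from the colour example.
S = {"yellow", "red", "blue", "green", "black", "white"}
A = {"yellow", "red", "black", "white"}
A_complement = S - A          # elements of S that are not in A

# Intersection: events from the numeric example.
A1 = {1, 2, 3}
B1 = {2, 3, 4}
common = A1 & B1              # elements in both A1 and B1

# Events with no common elements.
M = {"a", "e", "i", "o", "u"}
N = {"r", "s", "t"}


def mutually_exclusive(e1, e2):
    """Return True when the two events share no outcomes (hypothetical helper)."""
    return e1.isdisjoint(e2)


print(sorted(A_complement))       # ['blue', 'green']
print(sorted(common))             # [2, 3]
print(mutually_exclusive(M, N))   # True
```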
[Figure: Venn diagrams. Left: $A \cap B \neq \emptyset$, A and B are not mutually exclusive. Right: $A \cap B = \emptyset$, A and B are mutually exclusive (disjoint).]

Operation on Events (Union)

Definition: The union of the two events A and B, denoted by the symbol $A \cup B$, is the event containing all the elements that belong to A or B or both.

$A \cup B = \{x \in S : x \in A \text{ or } x \in B\}$

$A \cup B$ consists of all outcomes in A or in B or in both A and B.
$A \cup B$ occurs if A occurs, or B occurs, or both A and B occur. That is, $A \cup B$ occurs if at least one of A and B occurs.

Examples

Let A = {a, b, c} and B = {b, c, d, e}. Then $A \cup B$ = {a, b, c, d, e}.

If M = {x | 3 < x < 9} and V = {y | 5 < y < 12}, then $M \cup V$ = {z | 3 < z < 12}.

Example: Let the sample space S = {1, 2, 3, 4, 5, 6, 7, 8} and let A and B be subsets of S such that A = {2, 4, 6, 8} and B = {6, 7, 8}. Find the following:
1- $A^c$ = {1, 3, 5, 7}
2- $B^c$ = {1, 2, 3, 4, 5}
3- $A \cap B$ = {6, 8}
4- $A \cup B$ = {2, 4, 6, 7, 8}
5- $A \cap B^c$ = {2, 4}
6- $A^c \cap B^c$ = {1, 3, 5}
7- $(A \cup B)^c$ = $\{2, 4, 6, 7, 8\}^c$ = {1, 3, 5}

Lesson 11 Counting Sample Points

Contents: Multiplication Rule, Permutations, Combinations

Counting Sample Points:

Many times, a person must know the number of all possible outcomes for a sequence of events. To determine this number, three rules can be used:
• Multiplication Rule
• Permutations
• Combinations

In many cases, we can compute the probability of an event by using these counting techniques.

Multiplication Rule: Theorem

If an operation can be performed in $n_1$ ways, and if for each of these ways a second operation can be performed in $n_2$ ways, then the two operations can be performed together in $n_1 n_2$ ways.

Multiplication Rule (Example 1):

How many sample points are there in the sample space when a pair of dice is thrown once?

Solution: The first die can land in any one of $n_1 = 6$ ways. For each of these 6 ways the second die can also land in $n_2 = 6$ ways. Therefore, the pair of dice can land in $n_1 n_2 = (6)(6) = 36$ possible ways.
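The count of 36 can be confirmed by listing the sample space explicitly. The following is a minimal Python sketch using the standard-library `itertools.product`; the variable names are illustrative assumptions.

```python
from itertools import product

faces = range(1, 7)  # the six faces of one die

# Every ordered pair (first die, second die) is one sample point.
sample_space = list(product(faces, repeat=2))
count = len(sample_space)

print(count)  # 36
```

Enumerating outcomes this way is practical only for small sample spaces; the multiplication rule gives the same count without listing anything.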
Multiplication Rule: Theorem

If an operation can be performed in $n_1$ ways, and if for each of these a second operation can be performed in $n_2$ ways, and for each of the first two a third operation can be performed in $n_3$ ways, and so forth, then the sequence of k operations can be performed in $n_1 n_2 \cdots n_k$ ways.

Multiplication Rule (Example 2):

Sam is going to assemble a computer by himself. He has the choice of ordering chips from two brands, a hard drive from four, memory from three, and an accessory bundle from five local stores. How many different ways can Sam order the parts?

Solution: Since $n_1 = 2$, $n_2 = 4$, $n_3 = 3$, and $n_4 = 5$, there are

$n_1 \times n_2 \times n_3 \times n_4 = 2 \times 4 \times 3 \times 5 = 120$

different ways to order the parts.

Multiplication Rule (Example 3):

How many even four-digit numbers can be formed from the digits 0, 1, 2, 5, 6, and 9 if each digit can be used only once?

Solution: Since the number must be even, we have only 3 choices (0, 2, 6) for the units position. However, for a four-digit number the thousands position cannot be 0. Hence, we split the units position into two cases: 0 or not 0.

Case 1: If the units position is 0 (i.e., $n_1 = 1$), we have
$n_2 = 5$ choices for the thousands position,
$n_3 = 4$ choices for the hundreds position,
$n_4 = 3$ choices for the tens position.
Therefore, in this case we have a total of $n_1 \times n_2 \times n_3 \times n_4 = 1 \times 5 \times 4 \times 3 = 60$ even four-digit numbers.

Case 2: If the units position is not 0 (i.e., $n_1 = 2$), we have
$n_2 = 4$ choices for the thousands position,
$n_3 = 4$ choices for the hundreds position,
$n_4 = 3$ choices for the tens position.
Therefore, in this case we have a total of $n_1 \times n_2 \times n_3 \times n_4 = 2 \times 4 \times 4 \times 3 = 96$ even four-digit numbers.

Since the two cases are mutually exclusive, the total number of even four-digit numbers is 60 + 96 = 156.

Permutations:

Definition: A permutation is an arrangement of all or part of a set of objects.

Consider the three letters a, b, and c.
The possible permutations are: abc, acb, bac, bca, cab, and cba. There are 6 distinct arrangements because there are $n_1 = 3$ choices for the first position, $n_2 = 2$ choices for the second position, and $n_3 = 1$ choice for the last position, giving a total of $n_1 n_2 n_3 = (3)(2)(1) = 6$ permutations.

Permutations (Factorial):

In general, n distinct objects can be arranged in $n(n-1)(n-2)\cdots(3)(2)(1)$ ways.

Definition: For any non-negative integer n, $n!$, called "n factorial," is defined as $n! = n(n-1)\cdots(2)(1)$, with the special case $0! = 1$.

Examples: $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$ and $9! = 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 362880$.

Theorem: The number of permutations of n objects is $n!$.

Permutations: Theorem

The number of permutations of n distinct objects taken r at a time is

${}_{n}P_{r} = \frac{n!}{(n-r)!}; \quad r = 0, 1, 2, \ldots, n$

Example 1: In how many ways can a president, vice president, secretary, and treasurer be selected from an organization containing 20 members?

Solution: Since the order is important, the solution is

${}_{20}P_{4} = \frac{20!}{(20-4)!} = 20 \times 19 \times 18 \times 17 = 116280$

Permutation (Example 2): In one year, three awards (research, teaching, and service) will be given for a class of 25 graduate students in a statistics department. If each student can receive at most one award, how many possible selections are there?

Solution: Since the awards are distinguishable, it is a permutation problem. The total number of sample points is

${}_{25}P_{3} = \frac{25!}{(25-3)!} = \frac{25!}{22!} = \frac{25 \times 24 \times 23 \times 22!}{22!} = 25 \times 24 \times 23 = 13800$

Permutations (Example 3): A president and a treasurer are to be chosen from a student club consisting of 50 people. How many different choices of officers are possible if
(a) there are no restrictions,
(b) A will serve only if he is president,
(c) B and C will serve together or not at all,
(d) D and E will not serve together?

Solution:

(a) The total number of choices of the officers if there are no restrictions is

${}_{50}P_{2} = \frac{50!}{(50-2)!} = 50 \times 49 = 2450$
(b) Since A will serve only if he is the president, we have two situations here:
(i) A is selected as the president, which yields 49 possible outcomes (A as president paired with each of the other 49 members as treasurer);
(ii) officers are selected from the remaining 49 people, which gives

${}_{49}P_{2} = \frac{49!}{(49-2)!} = 49 \times 48 = 2352$

choices. Therefore, the total number of choices is 49 + 2352 = 2401.

(c) The number of selections when B and C serve together is 2 (BC or CB). The number of selections when both B and C are not chosen is

${}_{48}P_{2} = \frac{48!}{(48-2)!} = 48 \times 47 = 2256$

Therefore, the total number of choices is 2 + 2256 = 2258.

(d) (i) The number of selections when D serves as an officer but not E is (2)(48) = 96: D can be president with any of 48 treasurers, or treasurer with any of 48 presidents.
(ii) The number of selections when E serves as an officer but not D is also (2)(48) = 96.
(iii) The number of selections when both D and E are not chosen is

${}_{48}P_{2} = \frac{48!}{(48-2)!} = 48 \times 47 = 2256$

Therefore, the total number of choices is 96 + 96 + 2256 = 2448.

OR: Since D and E can only serve together in 2 ways, the answer is 2450 - 2 = 2448.

Permutations: Theorem

The number of distinct permutations of n things of which $n_1$ are of one kind, $n_2$ of a second kind, ..., $n_k$ of a kth kind is

$\frac{n!}{n_1! \times n_2! \times \cdots \times n_k!}$

Example 4: How many different letter arrangements can be made from the letters in the word STATISTICS?

Solution: Here we have 10 letters in total: two letters (S and T) appear 3 times each, one letter (I) appears twice, and the letters A and C appear once each.

$\binom{10}{3, 3, 2, 1, 1} = \frac{10!}{3! \times 3! \times 2! \times 1! \times 1!} = 50400$

Example 5: In a college football training session, the defensive coordinator needs to have 10 players standing in a row. Among these 10 players, there are 1 freshman, 2 sophomores, 4 juniors, and 3 seniors, respectively.
How many different ways can they be arranged in a row if only their class level will be distinguished?

Solution: The total number of arrangements is

$\frac{10!}{1! \times 2! \times 4! \times 3!} = 12600$

Permutations: Theorem

The number of ways of partitioning a set of n objects into r cells with $n_1$ elements in the first cell, $n_2$ elements in the second, and so forth, is

$\binom{n}{n_1, n_2, n_3, \ldots, n_r} = \frac{n!}{n_1! \times n_2! \times \cdots \times n_r!}$

where $n_1 + n_2 + n_3 + \cdots + n_r = n$.

Example 6: In how many ways can 7 graduate students be assigned to one triple and two double hotel rooms during a conference?

$\binom{7}{3, 2, 2} = \frac{7!}{3! \times 2! \times 2!} = 210$

Combinations:

In many problems, we are interested in the number of ways of selecting r objects from n objects without regard to order. These selections are called combinations.

Theorem: The number of combinations of n distinct objects taken r at a time is denoted by $\binom{n}{r}$ and is given by

$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}; \quad r = 0, 1, 2, \ldots, n$

Combinations (Notes):
• $\binom{n}{r}$ is read as "n choose r" or "n combination r".
• $\binom{n}{n} = 1$, $\binom{n}{0} = 1$, $\binom{n}{1} = n$, and $\binom{n}{r} = \binom{n}{n-r}$.
• $\binom{n}{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}$; for example, $\binom{7}{3} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 35$.

Combinations (Example 7): A TV news director wishes to use 3 news stories on an evening show. If the director has a total of 8 stories to choose from, in how many possible ways can the program be set up?

Solution: Since he only wants to select 3 stories from 8, the solution is

$\binom{8}{3} = \frac{8!}{3!\,(8-3)!} = \frac{8 \times 7 \times 6}{3 \times 2 \times 1} = 56$

Combinations (Example 8): If we have 10 equal-priority operations and only 4 operating rooms are available, in how many ways can we choose the 4 patients to be operated on first?

Solution: n = 10, r = 4. The number of different ways of selecting 4 patients from 10 patients is

$\binom{10}{4} = \frac{10!}{4!\,(10-4)!} = \frac{10!}{4! \times 6!} = \frac{10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{(4 \times 3 \times 2 \times 1)(6 \times 5 \times 4 \times 3 \times 2 \times 1)} = 210$ (different ways)

OR

$\binom{10}{4} = \frac{10 \times 9 \times 8 \times 7}{4 \times 3 \times 2 \times 1} = 210$

Lesson 12 Probability of an Event

Probability of an Event

To every point (outcome) in the sample space S of an experiment, we assign a weight (or probability), ranging from 0 to 1, such that the sum of all weights (probabilities) equals 1. The weight (or probability) of an outcome measures its likelihood (chance) of occurrence. To find the probability of an event A, we sum all probabilities of the sample points in A. This sum is called the probability of the event A and is denoted by P(A).

Definition: The probability of an event A is the sum of the weights (probabilities) of all sample points in A. Therefore,
1) $0 \le P(A) \le 1$
2) $P(S) = 1$
3) $P(\emptyset) = 0$

Example 1

A balanced coin is tossed twice. What is the probability that at least one head occurs?

Solution: S = {HH, HT, TH, TT} and A = {at least one head occurs} = {HH, HT, TH}. Since the coin is balanced, the outcomes are equally likely; i.e., all outcomes have the same weight or probability.

Outcome   Weight (Probability)
HH        P(HH) = w
HT        P(HT) = w
TH        P(TH) = w
TT        P(TT) = w
Sum       4w = 1

$4w = 1 \Rightarrow w = 1/4 = 0.25$, so P(HH) = P(HT) = P(TH) = P(TT) = 0.25.

The probability that at least one head occurs is

P(A) = P({at least one head occurs}) = P({HH, HT, TH}) = P(HH) + P(HT) + P(TH) = 0.25 + 0.25 + 0.25 = 0.75

Probability of an Event: Theorem

If an experiment has $n(S) = N$ equally likely different outcomes, then the probability of the event A is

$P(A) = \frac{n(A)}{n(S)} = \frac{n(A)}{N} = \frac{\text{no. of outcomes in } A}{\text{no. of outcomes in } S}$

Probability of an Event (Example 2)

A mixture of candies consists of 6 mints, 4 toffees, and 3 chocolates. If a person makes a random selection of one of these candies, find the probability of getting: (a) a mint (b) a toffee or chocolate.
Solution

Define the following events:
M = {getting a mint}
T = {getting a toffee}
C = {getting a chocolate}

Experiment: selecting a candy at random from 13 candies.

n(S) = number of outcomes of the experiment of selecting a candy = number of different ways of selecting a candy from 13 candies = $\binom{13}{1} = 13$

The outcomes of the experiment are equally likely because the selection is made at random.

(a) M = {getting a mint}

n(M) = number of different ways of selecting a mint candy from 6 mint candies = $\binom{6}{1} = 6$

$P(M) = P(\{\text{getting a mint}\}) = \frac{n(M)}{n(S)} = \frac{6}{13}$

(b) $T \cup C$ = {getting a toffee or chocolate}

n(T) = number of different ways of selecting a toffee candy from 4 toffee candies = $\binom{4}{1} = 4$

n(C) = number of different ways of selecting a chocolate candy from 3 chocolate candies = $\binom{3}{1} = 3$

$n(T \cup C)$ = number of different ways of selecting a toffee or a chocolate candy
= number of different ways of selecting a toffee candy + number of different ways of selecting a chocolate candy = $\binom{4}{1} + \binom{3}{1} = 4 + 3 = 7$
= number of different ways of selecting a candy from 7 candies = $\binom{7}{1} = 7$

$P(T \cup C) = P(\{\text{getting a toffee or chocolate}\}) = \frac{n(T \cup C)}{n(S)} = \frac{7}{13}$

Probability of an Event (Example 3)

In a poker hand consisting of 5 cards, find the probability of holding 2 aces and 3 jacks.

A standard deck has 52 cards, with 13 cards in each of 4 suits (hearts, diamonds, spades, clubs). Each suit consists of 13 ranks: Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King.

Solution

Experiment: selecting 5 cards from 52 cards.

n(S) = number of outcomes of the experiment of selecting 5 cards from 52 cards = $\binom{52}{5} = \frac{52!}{5! \times 47!} = 2598960$

The outcomes of the experiment are equally likely because the selection is made at random.

Define the event A = {holding 2 aces and 3 jacks}.

n(A) = number of ways of selecting 2 aces and 3 jacks
= (number of ways of selecting 2 aces from 4 aces) × (number of ways of selecting 3 jacks from 4 jacks)
= $\binom{4}{2} \times \binom{4}{3} = \frac{4!}{2! \times 2!} \times \frac{4!}{3! \times 1!} = 6 \times 4 = 24$

$P(A) = P(\{\text{holding 2 aces and 3 jacks}\}) = \frac{n(A)}{n(S)} = \frac{24}{2598960} \approx 0.000009$

Lesson 13 Additive Rules

Additive Rules

Theorem: If A and B are any two events, then

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Corollary 1: If A and B are mutually exclusive (disjoint) events, then

$P(A \cup B) = P(A) + P(B)$

Corollary 2: If $A_1, A_2, \ldots, A_n$ are n mutually exclusive (disjoint) events, then

$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$, that is, $P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$

Corollary 3: If $A_1, A_2, \ldots, A_n$ is a partition of the sample space S, then

$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n) = P(S) = 1$

Example 1

Note (two-event problems):
• Total area = P(S) = 1.
• In Venn diagrams, consider the probability of an event A as the area of the region corresponding to the event A.

Additive Rules (Venn Diagram): with total area = P(S) = 1, the following identities can be read from the diagram:
• $P(A) = P(A \cap B) + P(A \cap B^c)$
• $P(A \cup B) = P(A) + P(A^c \cap B)$
• $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
• $P(A \cap B^c) = P(A) - P(A \cap B)$
• $P(A^c \cap B^c) = 1 - P(A \cup B)$

Example 2

The probability that Ali passes Mathematics is 2/3, and the probability that he passes English is 4/9. If the probability that he passes both courses is 1/4, what is the probability that he will:
(a) pass at least one course?
(b) pass Mathematics and fail English?
(c) fail both courses?

Solution

Define the events M = {Ali passes Mathematics} and E = {Ali passes English}. We know that P(M) = 2/3, P(E) = 4/9, and $P(M \cap E)$ = 1/4.
(a) Probability of passing at least one course is

$P(M \cup E) = P(M) + P(E) - P(M \cap E) = \frac{2}{3} + \frac{4}{9} - \frac{1}{4} = \frac{31}{36}$

(b) Probability of passing Mathematics and failing English is

$P(M \cap E^c) = P(M) - P(M \cap E) = \frac{2}{3} - \frac{1}{4} = \frac{5}{12}$

(c) Probability of failing both courses is

$P(M^c \cap E^c) = 1 - P(M \cup E) = 1 - \frac{31}{36} = \frac{5}{36}$

Additive Rules

Theorem: If A and $A^c$ are complementary events, then

$P(A) + P(A^c) = 1 \iff P(A^c) = 1 - P(A)$

Proof: Since $A \cup A^c = S$ and the sets A and $A^c$ are disjoint, then $1 = P(S) = P(A \cup A^c) = P(A) + P(A^c)$.

Example 3

If the probabilities that an automobile mechanic will service 3, 4, 5, 6, 7, or 8 or more cars on any given workday are, respectively, 0.12, 0.19, 0.28, 0.24, 0.10, and 0.07, what is the probability that he will service at least 5 cars on his next day at work?

Solution: Let E be the event that at least 5 cars are serviced. Now, $P(E) = 1 - P(E^c)$, where $E^c$ is the event that fewer than 5 cars are serviced. Since $P(E^c) = 0.12 + 0.19 = 0.31$, it follows from the last theorem that $P(E) = 1 - 0.31 = 0.69$.

Example 4

Suppose the manufacturer specifications of the length of a certain type of computer cable are 2000 ± 10 millimeters. In this industry, it is known that small cable is just as likely to be defective (not meeting specifications) as large cable. That is, the probability of randomly producing a cable with length exceeding 2010 millimeters is equal to the probability of producing a cable with length smaller than 1990 millimeters. The probability that the production procedure meets specifications is known to be 0.99.
(a) What is the probability that a cable selected randomly is too large?
(b) What is the probability that a randomly selected cable is larger than 1990 millimeters?

Solution

(a) Let M be the event that a cable meets specifications. Let S and L be the events that the cable is too small and too large, respectively. Then P(M) = 0.99 and

$P(S) = P(L) = \frac{1 - 0.99}{2} = 0.005$

(b) Let X denote the length of a randomly selected cable. Then we have P(1990 < X < 2010) = P(M) = 0.99. Since P(X > 2010) = P(L) = 0.005, then

P(X > 1990) = P(M) + P(L) = 0.995

Part (b) can also be solved by using the last theorem: P(X > 1990) + P(X < 1990) = 1. Thus, P(X > 1990) = 1 - P(S) = 1 - 0.005 = 0.995.

Lesson 14 Conditional Probability

Conditional Probability

The probability of an event B occurring when it is known that some event A has occurred is called a conditional probability and is denoted by P(B|A). The symbol P(B|A) is usually read "the probability that B occurs given that A occurs" or simply "the probability of B given A."

Definition 2.9: The conditional probability of B, given A, denoted by P(B|A), is defined by

$P(B|A) = \frac{P(A \cap B)}{P(A)}$, provided P(A) > 0.

Example 1

Suppose that our sample space S is the population of adults in a small town who have completed the requirements for a college degree. We shall categorize them according to gender and employment status. The data are given in the following table:

          Employed   Unemployed   Total
Male        460          40        500
Female      140         260        400
Total       600         300        900

One of these individuals is to be selected at random for a tour throughout the country to publicize the advantages of establishing new industries in the town. We shall be concerned with the following events:
M: a man is chosen,
E: the one chosen is employed.
What is P(M|E)?

Solution: Let n(A) denote the number of elements in any set A. Then we write

$P(M|E) = \frac{n(M \cap E)}{n(E)} = \frac{n(M \cap E)/n(S)}{n(E)/n(S)} = \frac{P(M \cap E)}{P(E)}$

where $P(M \cap E)$ and $P(E)$ are found from the original sample space S. Counting directly from the table gives $P(M|E) = 460/600 = 23/30$. To verify this result, note that

$P(E) = \frac{600}{900} = \frac{2}{3}, \quad P(M \cap E) = \frac{460}{900} = \frac{23}{45}$

Hence $P(M|E) = \frac{23/45}{2/3} = \frac{23}{30}$.

Example 2

The probability that a regularly scheduled flight departs on time is P(D) = 0.83; the probability that it arrives on time is P(A) = 0.82; and the probability that it departs and arrives on time is $P(D \cap A)$ = 0.78.
Find the probability that a plane:
(a) arrives on time given that it departed on time,
(b) departed on time given that it has arrived on time.

Solution: By Definition 2.9,

(a) $P(A|D) = \frac{P(D \cap A)}{P(D)} = \frac{0.78}{0.83} \approx 0.94$

(b) $P(D|A) = \frac{P(D \cap A)}{P(A)} = \frac{0.78}{0.82} \approx 0.95$

Independent Events

Although conditional probability allows for an alteration of the probability of an event in the light of additional information, it also enables us to understand better the very important concept of independence or, in the present context, independent events. Consider the situation where we have events A and B and P(A|B) = P(A). Here the occurrence of A is independent of the occurrence of B.

Definition 2.10: Two events A and B are independent if and only if

P(A|B) = P(A) and P(B|A) = P(B),

provided the conditional probabilities exist. Otherwise, A and B are dependent.

In practice:
• Independent trials: tosses of a coin, rolls of a die, draws with replacement.
• Dependent trials: cards dealt from a deck, draws without replacement.

Independent Events (Example)

Consider an experiment in which 2 cards are drawn in succession from an ordinary deck, with replacement. The events are defined as:
A: the first card is an ace,
B: the second card is a spade.
Find P(B|A) and P(A|B).

Solution: Since the first card is replaced, our sample space for both the first and second draws consists of 52 cards, containing 4 aces and 13 spades. Hence

$P(B|A) = \frac{13}{52} = \frac{1}{4}, \quad P(B) = \frac{13}{52} = \frac{1}{4}$, so $P(B|A) = P(B) = \frac{1}{4}$,

and

$P(A|B) = \frac{4}{52} = \frac{1}{13}, \quad P(A) = \frac{4}{52} = \frac{1}{13}$, so $P(A|B) = P(A) = \frac{1}{13}$.

Multiplicative Rules:

Multiplying the formula of Definition 2.9 by P(A), we obtain the following important multiplicative rule, which e
