Chapter 5 Probability and Statistics


Summary

This chapter covers probability and statistics concepts, including probability itself, types of events (independent, dependent, mutually exclusive, equally likely) and compound events. It works through examples involving dice throwing, card selection and ball picking, showing how to calculate the probability in each scenario.


CHAPTER HIGHLIGHTS
☞ Probability
☞ Statistics
☞ Some special continuous distributions
☞ Hypothesis testing

Probability

The word PROBABILITY is used, in a general sense, to indicate a vague possibility that something might happen. It is also used synonymously with chance.

Random Experiment
If the result of an experiment, conducted any number of times under essentially identical conditions, is not certain but is any one of several possible outcomes, the experiment is called a trial or a random experiment. Each of the outcomes is known as an event.

Examples:
1. Drawing 3 cards from a well-shuffled pack is a random experiment, while getting an Ace or a King is an event.
2. Throwing a fair die is a random experiment, while getting the score '2' or 'an odd number' is an event.

Mutually Exclusive Events  If the happening of any one of the events in a trial excludes or prevents the happening of all the others, then such events are said to be mutually exclusive.

Independent Events  Two events E1 and E2 are said to be independent if the occurrence of the event E2 is not affected by the occurrence or non-occurrence of the event E1.

Example: Two drawings of one ball each are made from a bag containing balls. Here we have two events: drawing a ball the first time (E1) and drawing a ball the second time (E2). If the ball of the first draw is replaced in the bag before the second draw is made, then the outcome of E2 does not depend on the outcome of E1; in this case E1 and E2 are independent events. If the ball of the first draw is not replaced in the bag before the second draw is made, then the outcome of E2 depends on the outcome of E1; in this case E1 and E2 are dependent events.

Compound Events  When two or more events are in relation with each other, they are known as compound events.

Example: When a die is thrown two times, the event of getting 3 in the first throw and 5 in the second throw is a compound event.
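The with-replacement versus without-replacement distinction above can be checked by exhaustive enumeration. A minimal Python sketch, assuming (since the text leaves the bag unspecified) a hypothetical bag of 3 red and 2 green balls:

```python
from fractions import Fraction
from itertools import product, permutations

# Hypothetical bag for illustration: 3 red, 2 green (not specified in the text).
bag = ["R", "R", "R", "G", "G"]

def p_second_red_given_first_red(outcomes):
    """P(second ball red | first ball red) over equally likely ordered outcomes."""
    first_red = [o for o in outcomes if o[0] == "R"]
    both_red = [o for o in first_red if o[1] == "R"]
    return Fraction(len(both_red), len(first_red))

# Ball replaced before the second draw: 25 ordered pairs, drawn independently.
with_replacement = list(product(bag, repeat=2))
# Ball kept out: 20 ordered pairs of distinct positions in the bag.
without_replacement = list(permutations(bag, 2))

print(p_second_red_given_first_red(with_replacement))     # 3/5, equal to the marginal: independent
print(p_second_red_given_first_red(without_replacement))  # 1/2, changed by the first draw: dependent
```

With replacement the conditional probability equals the unconditional 3/5, which is exactly the independence condition introduced later in the chapter.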
Example: The events of getting a head and of getting a tail when a fair coin is tossed are mutually exclusive.

Equally Likely Events  Two events are said to be equally likely when the chance of occurrence of one event is equal to that of the other.

Example: When a die is thrown, any number from 1 to 6 may be got. In this trial, getting any one of these outcomes is equally likely.

Definition of Probability
If an event E can happen in m ways and fail in k ways out of a total of n ways, each of them equally likely, then the probability of happening of E is m/(m + k) = m/n, where n = m + k.

In other words, if a random experiment is conducted n times and m of the outcomes are favourable to event E, then the probability of happening of E is P(E) = m/n. Since the event does not occur (n − m) times, the probability of non-occurrence of E is

P(E′) = (n − m)/n = 1 − m/n = 1 − P(E)

Therefore, P(E) + P(E′) = 1.

NOTES
1. The probability P(E) of the happening of an event E is known as the probability of success, and the probability P(E′) of the non-happening of the event is the probability of failure.
2. If P(E) = 1 the event is called a certain event, and if P(E) = 0 the event is called an impossible event.
3. Instead of saying that the chance of happening of an event is m/n, we can also say that the odds in favour of the event are m to (n − m), or the odds against the event are (n − m) to m.

Addition Theorem of Probability
If A and B are two events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

This result follows from the corresponding result in set theory: if n(X) represents the number of elements in set X, then n(X ∪ Y) = n(X) + n(Y) − n(X ∩ Y).

Example: If a die is rolled, what is the probability that the number that comes up is either even or prime?
A = the event of getting an even number = {2, 4, 6}
B = the event of getting a prime = {2, 3, 5}
A ∪ B = {2, 3, 4, 5, 6} and A ∩ B = {2}
P(A) = 3/6, P(B) = 3/6, P(A ∪ B) = 5/6 and P(A ∩ B) = 1/6. We can verify that P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

SOLVED EXAMPLES

Example 1
When a cubical die is rolled, find the probability of getting an even integer.

Solution
When a die is rolled, the number of possible outcomes is 6. The number of favourable outcomes (even integers) is 3.
The required probability = 3/6 = 1/2.

Example 2
If a card is drawn from a pack of cards, find the probability of getting a queen.

Solution
When a card is drawn, the number of possible outcomes is 52. The number of favourable outcomes (queen cards) is 4.
The required probability = 4/52 = 1/13.

Example 3
A bag contains 5 green balls and 4 red balls. If 3 balls are picked from it at random, find the odds against the three balls being red.

Solution
The total number of balls in the bag = 9. Three balls can be selected from 9 balls in 9C3 ways. Three red balls can be selected from the 4 red balls in 4C3 ways.
Probability of picking three red balls: P(E) = 4C3/9C3 = 4/84 = 1/21, so P(E′) = 20/21.
The odds against the three balls being red are P(E′) : P(E) = 20/21 : 1/21 = 20 : 1.

Example 4
When two dice are rolled together, find the probability of getting at least one 4.

Solution
Let E be the event that at least one die shows 4, and E′ the event that no die shows 4. The number of outcomes favourable to E′ is 5 × 5 = 25, so P(E′) = 25/36.
∴ P(E) = 1 − P(E′) = 1 − 25/36 = 11/36.

Example 5
When two dice are rolled together, find the probability that the total score on the two dice will be 8 or 9.

Solution
When two dice are rolled, the total number of outcomes = 6 × 6 = 36.
The favourable outcomes for a sum of 8 or 9 are {(2, 6), (6, 2), (3, 5), (5, 3), (4, 4), (3, 6), (6, 3), (4, 5), (5, 4)}, i.e., the total number of favourable outcomes = 9.
The required probability = 9/36 = 1/4.
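The dice examples above are small enough to verify by enumerating equally likely outcomes. A quick Python check of the addition-theorem example and of Examples 4 and 5, using exact fractions:

```python
from fractions import Fraction
from itertools import product

# Addition theorem on one die: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
A = {2, 4, 6}   # even
B = {2, 3, 5}   # prime
p = lambda event: Fraction(len(event), 6)
# Note: here | and & are set union and intersection, not conditional probability.
assert p(A | B) == p(A) + p(B) - p(A & B) == Fraction(5, 6)

# Examples 4 and 5: enumerate all 36 equally likely rolls of two dice.
rolls = list(product(range(1, 7), repeat=2))
p_at_least_one_4 = Fraction(sum(1 for r in rolls if 4 in r), len(rolls))
p_total_8_or_9 = Fraction(sum(1 for r in rolls if sum(r) in (8, 9)), len(rolls))
print(p_at_least_one_4)  # 11/36
print(p_total_8_or_9)    # 1/4
```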
Example 6
If two cards are drawn simultaneously from a pack of cards, what is the probability that both will be jacks or both will be queens?

Solution
Here the two events are mutually exclusive, so P(J ∪ Q) = P(J) + P(Q).
Probability of drawing two jacks: P(J) = 4C2/52C2
Probability of drawing two queens: P(Q) = 4C2/52C2
P(J ∪ Q) = P(J) + P(Q) = 4C2/52C2 + 4C2/52C2 = 2 × 4C2/52C2 = 2/221.

Example 7
When two cards are drawn from a pack of cards, find the probability that the two cards will be kings or blacks.

Solution
The probability of drawing two kings = 4C2/52C2
The probability of drawing two black cards = 26C2/52C2
The probability of drawing two black kings = 2C2/52C2
∴ The required probability = 4C2/52C2 + 26C2/52C2 − 2C2/52C2 = 55/221.

Conditional Probability
Let S be a finite sample space of a random experiment and let A, B be events such that P(A) > 0 and P(B) > 0. If it is known that the event B has occurred and, in light of this, we wish to compute the probability of A, we mean the conditional probability of A given B. The occurrence of event B reduces the sample space to B (the new sample space), and the favourable cases are now A ∩ B (the new favourable set).

Notation  The conditional probability of A given B is denoted by P(A|B).

∴ P(A|B) = n(A ∩ B)/n(B) = [n(A ∩ B)/n(S)] / [n(B)/n(S)] = P(A ∩ B)/P(B).

NOTES
1. This definition is also valid for infinite sample spaces.
2. The conditional probability of B given A is denoted by P(B|A), and P(B|A) = P(A ∩ B)/P(A).

Multiplication Theorem
Let A and B be two events of a certain random experiment such that A occurs only when B has already occurred. Then, for the conditional event A|B, the total possible outcomes are the outcomes favourable to the event B, and its favourable outcomes are the outcomes favourable to both A and B.

So, P(A|B) = n(A ∩ B)/n(B) = [n(A ∩ B)/n(S)] × [n(S)/n(B)] = P(A ∩ B) × 1/P(B)

That is, P(A|B) · P(B) = P(A ∩ B).

This is called the multiplication theorem on probability.

Example 8
A letter is selected at random from the set of English alphabets and it is found to be a vowel. What is the probability that it is 'e'?

Solution
Let A be the event that the letter selected is 'e' and B be the event that the letter is a vowel.
Then A ∩ B = {e} and B = {a, e, i, o, u}.
So, P(A|B) = P(A ∩ B)/P(B) = (1/26)/(5/26) = 1/5.

Independent Events  In a random experiment, if A and B are events such that P(A) > 0 and P(B) > 0, and if P(A|B) = P(A) or P(B|A) = P(B) (the conditional probability equals the unconditional probability), then we say A and B are independent events. If A and B are independent, P(A ∩ B) = P(A) P(B).

Example 9
Two coins are tossed one after the other. Let A be the event of getting a tail on the second coin and B be the event of getting a head on the first coin. Find P(A|B).

Solution
Sample space = {HH, HT, TH, TT}, A = {HT, TT}, B = {HH, HT} and A ∩ B = {HT}.
∴ P(A) = 2/4 = 1/2, and P(A|B) = P(A ∩ B)/P(B) = (1/4)/(1/2) = 1/2.
Thus P(A|B) = P(A).
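Examples 8 and 9 can both be verified mechanically with the counting form of the conditional-probability formula, P(A|B) = n(A ∩ B)/n(B). A short sketch:

```python
from fractions import Fraction
from itertools import product

def cond(event, given, space):
    """P(event | given) = n(event ∩ given) / n(given), for equally likely outcomes."""
    return Fraction(len(event & given & space), len(given & space))

# Example 8: a letter known to be a vowel; probability that it is 'e'.
alphabet = set("abcdefghijklmnopqrstuvwxyz")
vowels = {"a", "e", "i", "o", "u"}
print(cond({"e"}, vowels, alphabet))  # 1/5

# Example 9: A = tail on the 2nd coin, B = head on the 1st coin.
space = set(product("HT", repeat=2))
A = {s for s in space if s[1] == "T"}
B = {s for s in space if s[0] == "H"}
# P(A|B) equals the unconditional P(A), so A and B are independent.
print(cond(A, B, space), Fraction(len(A), len(space)))  # 1/2 1/2
```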
Logically, too, we understand that the occurrence or non-occurrence of a head on the first coin does not affect the occurrence of a tail on the second coin.

Bayes' Rule
Suppose A1, A2, …, An are n mutually exclusive and exhaustive events such that P(Ai) ≠ 0. Then for i = 1, 2, 3, …, n,

P(Ai|A) = P(Ai) P(A|Ai) / Σk=1..n P(Ak) P(A|Ak)

where A is an arbitrary event of S.

Example 10
Akshay speaks the truth in 45% of the cases. In a rainy season, on each day there is a 75% chance of raining. On a certain day in the rainy season, Akshay tells his mother that it is raining outside. What is the probability that it is actually raining?

Solution
Let E denote the event that it is raining and A denote the event that Akshay tells his mother that it is raining outside.
Then P(E) = 3/4, P(E′) = 1/4, P(A|E) = 45/100 = 9/20 and P(A|E′) = 11/20.
By Bayes' rule, we have
P(E|A) = P(E) P(A|E) / [P(E) P(A|E) + P(E′) P(A|E′)]
= (3/4 × 9/20) / (3/4 × 9/20 + 1/4 × 11/20) = 27/38.

Advanced Probability

Random Variable
A random variable is a real-valued function defined over the sample space (discrete or continuous).

A discrete random variable takes values that are finite or countable. For example, when we consider the experiment of tossing 3 coins, the number of heads can be treated as a discrete random variable X; X takes 0, 1, 2 and 3 as possible values.

A continuous random variable takes values in the form of intervals. Also, in the case of a continuous random variable, P(X = c) = 0, where c is a specified point. Heights and weights of people, areas of land held by individuals, etc., are examples of continuous random variables.

Probability Mass Function (PMF)
If X is a discrete random variable which can take the values x1, x2, … and p(xi) denotes the probability that X takes the value xi, then p(x) is called the probability mass function (pmf) of X: p(xi) = P(X = xi). The values that X can take and the corresponding probabilities determine the probability distribution of X. We also have
1. p(x) ≥ 0;
2. Σ p(x) = 1.

Probability Density Function (PDF)
If X is a continuous random variable, then a function f(x), x ∈ I (an interval), is called a probability density function if probability statements are made as P(x ∈ I) = ∫I f(x) dx. We also have
1. f(x) ≥ 0;
2. ∫ from −∞ to ∞ of f(x) dx = 1.

The probability P(X ≤ x) is called the cumulative distribution function (CDF) of X and is denoted by F(x). It is a point function, defined for both discrete and continuous random variables. The properties of the distribution function F(x) are:
1. F(x) ≥ 0
2. F(x) is non-decreasing, i.e., for x > y, F(x) ≥ F(y)
3. F(x) is right continuous
4. F(−∞) = 0 and F(+∞) = 1
5. P(a < x ≤ b) = F(b) − F(a)
For a continuous random variable:
6. P{x < X ≤ x + dx} = F(x + dx) − F(x) = f(x) dx, where dx is very small
7. f(x) = d[F(x)]/dx, where
   (a) f(x) ≥ 0 for all x ∈ R
   (b) ∫R f(x) dx = 1

Mathematical Expectation [E(X)]
Mathematical expectation is the weighted mean of the values of a variable. If X is a random variable which can assume any one of the values x1, x2, …, xn with respective probabilities p1, p2, …, pn, then the mathematical expectation of X is given by
E(X) = p1x1 + p2x2 + … + pnxn
For a continuous random variable, E(X) = ∫ from −∞ to ∞ of x f(x) dx, where f(x) is the PDF of X.

Some Special Discrete Distributions

Discrete Uniform Distribution
A discrete random variable defined for values of x from 1 to n is said to have a uniform distribution if its probability mass function is given by
p(x) = 1/n for x = 1, 2, 3, …, n; 0 otherwise

The cumulative distribution function F(x) of the discrete uniform random variable X is given by
F(x) = 0 for x < 1; x/n for 1 ≤ x ≤ n; 1 for x > n

Mean of X: µ = (n + 1)/2
Variance of X: σ² = (n² − 1)/12

Binomial Distribution
An experiment consists of n independent trials, each of which results in either 'success' with probability p or 'failure' with probability q (q = 1 − p). The probability distribution of the random variable X representing the number of successes is called a binomial distribution, with probability mass function
p(x) = b(x; n, p) = nCx p^x q^(n−x), x = 0, 1, 2, …, n

Example: Hitting a target in 5 trials. Here the random variable X represents the number of hits among the trials, i.e., x = 0, 1, 2, 3, 4 or 5. We have a set of n = 5 trials, and each trial either hits the target ('success', probability p) or misses ('failure', probability q), independently of the others. This is an example of a binomial distribution.

Properties of Binomial Distribution
1. E(X) = np (mean)
2. V(X) = E(X²) − (E(X))² = npq (variance); mean > variance
3. SD(X) = √(npq)
4. The mode of a binomial distribution lies between (n + 1)p − 1 and (n + 1)p.
5. If X1 ~ b(n1, p) and X2 ~ b(n2, p), and X1 and X2 are independent, then X1 + X2 ~ b(n1 + n2, p), where b(n, p) denotes the binomial pmf.

Poisson Distribution
A random variable X is said to follow a Poisson distribution with parameter λ, λ > 0, if it assumes only non-negative values and its probability mass function is given by
p(x) = p(x; λ) = e^(−λ) λ^x / x! for x = 0, 1, 2, …; 0 otherwise

In a binomial distribution, if n is large and p is small such that np approaches a fixed constant, say λ, the distribution approaches the Poisson distribution (the limiting case of the binomial distribution).
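The Bayes' rule computation in Example 10 can be re-checked exactly with rational arithmetic. A minimal sketch:

```python
from fractions import Fraction

# Priors and likelihoods from Example 10.
p_rain = Fraction(3, 4)                 # P(E): chance of rain on a given day
p_dry = 1 - p_rain                      # P(E')
p_report_given_rain = Fraction(9, 20)   # P(A|E): Akshay truthfully reports rain
p_report_given_dry = Fraction(11, 20)   # P(A|E'): he lies and reports rain anyway

# Bayes' rule: P(E|A) = P(E)P(A|E) / [P(E)P(A|E) + P(E')P(A|E')]
posterior = (p_rain * p_report_given_rain) / (
    p_rain * p_report_given_rain + p_dry * p_report_given_dry)
print(posterior)  # 27/38
```

Note how the 75% prior for rain is pulled down only slightly: even an unreliable reporter shifts the probability, but not by much when the prior is strong.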
Properties of Poisson Distribution
1. E(X) = Σx x · e^(−λ) λ^x / x! = λ
2. V(X) = E(X²) − (E(X))² = λ, and SD(X) = √λ
   ∴ Mean = λ = Variance
3. The mode of a Poisson distribution lies between λ − 1 and λ.
4. If X1 ~ P(λ1) and X2 ~ P(λ2), and X1, X2 are independent, then X1 + X2 ~ P(λ1 + λ2).

Some Special Continuous Distributions

Continuous Uniform Distribution or Rectangular Distribution
A continuous random variable X defined on [a, b] is said to have a uniform distribution if its probability density function is given by
f(x) = 1/(b − a) for x ∈ [a, b]; 0 otherwise

The cumulative distribution function of the continuous uniform random variable X is given by
F(x) = 0 if x ≤ a; (x − a)/(b − a) if a < x < b; 1 if x ≥ b

Mean of X: µ = (a + b)/2
Variance of X: σ² = (b − a)²/12

Normal Distribution
A continuous random variable X is said to have a normal distribution with parameters µ and σ² if its probability density function is
f(x) = [1/(σ√(2π))] e^(−(x − µ)²/(2σ²)), −∞ < x < ∞, −∞ < µ < ∞, σ > 0

It is denoted as X ~ N(µ, σ²). The graph of the normal density is a symmetric bell-shaped curve centred at µ, with points of inflection at µ − σ and µ + σ.

Properties of Normal Distribution
1. The function is symmetrical about the value µ.
2. It has a maximum at x = µ.
3. The area under the curve within the interval (µ ± σ) is 68%; that is, P(µ − σ ≤ X ≤ µ + σ) = 0.68.
4. A fairly large number of samples taken from a 'normal' population will have average, median and mode nearly the same, and within the limits average ± 2 × SD there will be 95% of the values.
5. E(X) = ∫ from −∞ to ∞ of x f(x) dx = µ
6. V(X) = σ²; SD(X) = σ
7. For a normal distribution, Mean = Median = Mode.
8. All odd-order moments about the mean vanish for a normal distribution; that is, µ(2n+1) = 0 for n = 0, 1, 2, …
9. If X1 ~ N(µ1, σ1²) and X2 ~ N(µ2, σ2²), with X1 and X2 independent, then X1 + X2 ~ N(µ1 + µ2, σ1² + σ2²). Also, X1 − X2 ~ N(µ1 − µ2, σ1² + σ2²).
10. If µ = 0 and σ² = 1, we call it the standard normal distribution. Standardization is obtained by the transformation z = (x − µ)/σ. Also, (X − µ)/σ ~ N(0, 1).

Exponential Distribution
A continuous random variable X is said to have an exponential distribution if its probability density function f(x) is given by
f(x) = λe^(−λx) for x > 0; 0 otherwise
Here λ is the parameter of the exponential distribution and λ > 0.
The cumulative distribution function F(x) of an exponential distribution with parameter λ is
F(x) = 1 − e^(−λx) if x > 0; 0 otherwise
Mean: µ = 1/λ
Variance: σ² = 1/λ²

Example 11
An unbiased die is thrown at random. What is the expectation of the number on it?

Solution
Let X denote the number on the die, which can take the values 1, 2, 3, 4, 5 or 6, each with probability 1/6:

X:        1    2    3    4    5    6
P(X = x): 1/6  1/6  1/6  1/6  1/6  1/6

E(X) = Σx x P(X = x) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6
= (1 + 2 + 3 + 4 + 5 + 6)/6 = (6 × 7)/(6 × 2) = 7/2 = 3.5.

Example 12
In a city, 5 accidents take place in a span of 25 days. Assuming that the number of accidents follows the Poisson distribution, what is the probability that there will be 3 or more accidents in a day? (Given e^(−0.2) = 0.8187)

Solution
Average number of accidents per day = 5/25 = 0.2; ∴ λ = 0.2.
Probability of 3 or more accidents per day
= 1 − P(2 or fewer accidents)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 − [e^(−0.2) + 0.2e^(−0.2) + 0.02e^(−0.2)]
= 1 − e^(−0.2) × 1.22 = 1 − 0.998814 = 0.001186.

Example 13
What is the area under the normal curve to the left of z = −1.54 (given that the area between 0 and −1.54 is 0.4382)?

Solution
Required area = 0.5 − 0.4382 = 0.0618.

Example 14
A family consists of five children. If the random variable X represents the number of boys in that family, then
1. find the expected value E(X) of X;
2. find the variance of X.

Solution
This situation can be modelled as a binomial distribution with n = 5 and p = 1/2:
X ~ b(5, 1/2); E(X) = np = 5 × 1/2 = 2.5
V(X) = npq = 5 × 1/2 × 1/2 = 1.25.

Example 15
Ram and Shyam play a game in which their chances of winning are in the ratio 2 : 3. Find Shyam's chance of winning at least 3 games out of five games played.

Solution
P(Shyam wins) = 3/5; P(Shyam loses) = 2/5.
Let X denote the number of games won by Shyam.
P(Shyam wins at least 3 games) = P(X ≥ 3)
= Σx=3..5 5Cx (3/5)^x (2/5)^(5−x) = Σx=3..5 5Cx 3^x 2^(5−x) / 5^5
= (3³/5⁵)[5C3 × 2² + 5C4 × 3 × 2 + 1 × 3² × 1]
= 27 × 79/3125 = 0.68.

Example 16
The PDF of a random variable X is
f(x) = (1/10) e^(−x/10) for x > 0; 0 otherwise
What is P(X ≤ 10)? (Given e^(−1) = 0.3679)

Solution
P(X ≤ 10) = ∫ from 0 to 10 of f(x) dx = ∫ from 0 to 10 of (1/10) e^(−x/10) dx
= [−e^(−x/10)] evaluated from 0 to 10 = 1 − e^(−1) = 0.6321.
Joint Distribution of Random Variables

Joint Probability Mass Function
Let X and Y be two discrete random variables on the same sample space S, with the range space of X as RX = {x1, x2, …, xm} and the range space of Y as RY = {y1, y2, …, yn}, and with PX(x) and PY(y) as the probability mass functions of X and Y. Then the joint probability mass function PXY(x, y) of the two-dimensional random variable (X, Y) on the range space RX × RY is defined as

PXY(xi, yj) = P(X = xi, Y = yj) for (xi, yj) ∈ RX × RY; 0 otherwise

This joint probability mass function can be represented in the form of a table as follows:

X \ Y            | y1          | y2          | … | yn          | Σj PXY(xi, yj)
x1               | PXY(x1, y1) | PXY(x1, y2) | … | PXY(x1, yn) | PX(x1)
x2               | PXY(x2, y1) | PXY(x2, y2) | … | PXY(x2, yn) | PX(x2)
…                | …           | …           | … | …           | …
xm               | PXY(xm, y1) | PXY(xm, y2) | … | PXY(xm, yn) | PX(xm)
Σi PXY(xi, yj)   | PY(y1)      | PY(y2)      | … | PY(yn)      |

From the above table it can be easily observed that the marginal probability mass functions of X and Y, namely PX(x) and PY(y), can be obtained from the joint probability mass function PXY(x, y) as
PX(xi) = Σj=1..n PXY(xi, yj) for i = 1, 2, …, m
and
PY(yj) = Σi=1..m PXY(xi, yj) for j = 1, 2, 3, …, n

Also, PXY(xi, yj) ≥ 0 for all i, j, and Σi=1..m Σj=1..n PXY(xi, yj) = 1.

The cumulative joint distribution function of the two-dimensional random variable (X, Y) is given by FXY(x, y) = P(X ≤ x, Y ≤ y).

Joint Probability Density Function
Let X and Y be two continuous random variables on the same sample space S, with fX(x) and fY(y) as their probability density functions. Then a function fXY(x, y) is called the joint probability density function of the two-dimensional random variable (X, Y) if the probability that the point (x, y) will lie in the infinitesimal rectangular region of area dx dy is fXY(x, y) dx dy; that is,
P(x − dx/2 ≤ X ≤ x + dx/2, y − dy/2 ≤ Y ≤ y + dy/2) = fXY(x, y) dx dy
with ∫∫ over the whole plane of fXY(x, y) dx dy = 1.

The marginal probability density functions fX(x) and fY(y) of the two continuous random variables X and Y are given by
fX(x) = ∫ from −∞ to ∞ of fXY(x, y) dy and fY(y) = ∫ from −∞ to ∞ of fXY(x, y) dx

The cumulative joint distribution function FXY(x, y) of the two-dimensional random variable (X, Y) (where X and Y are any two continuous random variables defined on the same sample space) is given by
FXY(x, y) = ∫ from −∞ to x ∫ from −∞ to y of fXY(u, v) dv du

Conditional Probability Functions of Random Variables
Let X and Y be two discrete (continuous) random variables defined on the same sample space with joint probability mass (density) function fXY(x, y). Then:
1. The conditional probability mass (density) function fX(x|y) of X, given Y = y, is defined as fX(x|y) = fXY(x, y)/fY(y), where fY(y) ≠ 0.
2. The conditional probability mass (density) function fY(y|x) of Y, given X = x, is defined as fY(y|x) = fXY(x, y)/fX(x), where fX(x) ≠ 0.

Independent Random Variables
Two discrete (continuous) random variables X and Y defined on the same sample space with joint probability mass (density) function PXY(x, y) are said to be independent if and only if
PXY(x, y) = PX(x) PY(y)
where PX(x) and PY(y) are the marginal probability mass (density) functions of the random variables X and Y respectively.

NOTE
If the random variables X and Y are independent, then
PXY(a ≤ X ≤ b, c ≤ Y ≤ d) = PX(a ≤ X ≤ b) PY(c ≤ Y ≤ d)

Statistics

Statistics is basically the study of numeric data. It includes methods of collection, classification, presentation, analysis and inference of data. Data as such is qualitative or quantitative in nature. If one speaks of honesty, beauty, colour, etc., the data is qualitative, while height, weight, distance, marks, etc., are quantitative.

The present course aims to systematically study statistics of quantitative data. Quantitative data can be divided into three categories:
1. Individual series
2. Discrete series
3. Continuous series

Individual Series
Examples:
1. Heights of 8 students (in feet): 5.0, 4.9, 4.5, 5.1, 5.3, 4.8, 5.1, 5.3
2. Weights of 10 students (in kg): 46, 48, 52, 53.4, 47, 56.8, 52, 59, 55, 52

Discrete Series
Example: x = number of children in a family, f = number of families; total number of families = 50.

x: 0   1   2   3   4
f: 8   10  19  8   5

Continuous Series
Example: Total number of students = 50

Class Interval (CI): 0-10  10-20  20-30  30-40  40-50
Frequency (f):       8     12     13     10     7

In order to analyze and get insight into the data, some mathematical constants are devised. These constants concisely describe any given series of data.
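The discrete and continuous series above can be summarized directly in code. A sketch computing the frequency-weighted mean of each (using the Σfixi/Σfi and Σfimi/Σfi formulas developed in the arithmetic-mean subsection below, with class mid-values standing in for the x-values of the continuous series):

```python
from fractions import Fraction

# Discrete series from the text: x = number of children, f = number of families.
xs = [0, 1, 2, 3, 4]
fs = [8, 10, 19, 8, 5]
mean_discrete = Fraction(sum(x * f for x, f in zip(xs, fs)), sum(fs))
print(mean_discrete, float(mean_discrete))  # 46/25 1.84

# Continuous series from the text: mid-values of the classes 0-10, 10-20, ..., 40-50.
mids = [5, 15, 25, 35, 45]
freqs = [8, 12, 13, 10, 7]
mean_continuous = Fraction(sum(m * f for m, f in zip(mids, freqs)), sum(freqs))
print(mean_continuous, float(mean_continuous))  # 121/5 24.2
```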
Basically, we deal with two kinds of these constants:
1. Averages or measures of central tendency
2. Measures of spread or dispersion

Measures of Central Tendencies  These tell us how the data is clustered or concentrated; they give the central idea of the data. The measures are:
1. Arithmetic mean (or mean)
2. Geometric mean
3. Harmonic mean
4. Median
5. Mode
The first three are mathematical averages and the last two are averages of position.

Measures of Dispersion  It is possible that two sets of data have the same central value, yet differ in spread, so there is a need to study the spread of the data. The measures we deal with are:
1. Range
2. Quartile deviation (or semi-interquartile range)
3. Mean deviation
4. Standard deviation (including variance)

The formulae for each of the above measures are listed for each of the series in what follows.

Measures of Central Tendencies

Arithmetic Mean (AM or x̄)
1. Individual series:
   x̄ = (x1 + x2 + … + xn)/n = Σxi/n
2. Discrete series:
   x̄ = (f1x1 + f2x2 + … + fnxn)/(f1 + f2 + … + fn) = Σfixi/Σfi
   where x1, x2, …, xn are n distinct values with frequencies f1, f2, …, fn respectively.
3. Continuous series:
   x̄ = (f1m1 + f2m2 + … + fnmn)/(f1 + f2 + … + fn) = Σfimi/Σfi
   where f1, f2, …, fn are the frequencies of the classes whose mid-values are m1, m2, …, mn respectively.

Some Important Results Based on AM
1. The algebraic sum of deviations taken about the mean is zero.
2. Its value is based on all items.
3. The mean of the first n natural numbers is (n + 1)/2.
4. The arithmetic mean of two numbers a and b is (a + b)/2.
5. If b is the AM of a and c, then a, b, c are in arithmetic progression.

Combined Mean  If x̄1 and x̄2 are the arithmetic means of two series with n1 and n2 observations respectively, the combined mean is
x̄c = (n1x̄1 + n2x̄2)/(n1 + n2)

Median
If, for a value, the total frequency above (or below) it is half of the overall total frequency, the value is termed the median. The median is the middle-most item.

Individual Series  If x1, x2, …, xn are arranged in ascending order of magnitude, then the median is the size of the ((n + 1)/2)th item.

Some Results Based on Median
1. The median does not take all the items into consideration.
2. The sum of absolute deviations taken about the median is least.
3. The median is the abscissa of the point of intersection of the cumulative frequency curves.
4. The median is the best-suited measure for open-end classes.

Mode  The most frequently found item is called the mode. Being so, it is easy and straightforward to find for individual and discrete series.

Empirical Formula
For a moderately symmetrical distribution, Mode = 3 Median − 2 Mean.
For a symmetric distribution, Mode = Mean = Median.
This formula is to be applied in the absence of sufficient data. Given any two of the mean, median and mode, the third can be found.

Measures of Dispersion

Range
The range of a distribution is the difference between the greatest and the least values observed.

Some Important Results Based on Range
1. Range is a crude measure of dispersion, as it is based only on the values of the extreme observations.
2. It is also very easy to calculate.
3. It does not depend on the frequency of items.

Quartile Deviation (QD)
QD = (Q3 − Q1)/2

Individual Series  The numbers are first arranged in ascending or descending order; then we find the quartiles Q1 and Q3 as
Q1 → size of the [(n + 1)/4]th item
Q3 → size of the [3(n + 1)/4]th item
The first quartile (or lower quartile) Q1 is that value of the variable such that one-quarter of the observations lie below it. The third quartile Q3 is that value of the variable such that three-quarters of the observations lie below it.

Mean Deviation (MD)
Mean deviation is defined as the arithmetic mean of the absolute deviations from an origin A, which may be the mean, the median or the mode.

Individual Series
MD = (|x1 − A| + |x2 − A| + … + |xn − A|)/n
where x1, x2, …, xn are the n observations and A is the mean, median or mode.

Some Results Based on MD
1. Mean deviation depends on all items.
2. By default, mean deviation is computed about the mean.
3. Mean deviation about the median is the least.
4. The mean deviation of two numbers a and b is |a − b|/2.

Standard Deviation (SD)
Standard deviation is the root mean squared deviation about the mean.

Individual Series
SD (σ) = √{[(x1 − x̄)² + (x2 − x̄)² + … + (xn − x̄)²]/n}
where x1, x2, …, xn are n observations with mean x̄.
Alternatively, σ = √{Σxi²/n − (Σxi/n)²} is a useful formula for computational purposes.

Some Results Based on SD
1. The square of the standard deviation is termed the variance.
2. The root mean square deviation is least when taken about the mean; that is, SD is the least root mean square deviation.
3. If each item is increased by a fixed constant, the SD does not alter; i.e., SD is independent of change of origin.
4. Standard deviation depends on each and every data item.
5. For a discrete series of the form a, a + d, a + 2d, … (an AP), the standard deviation is SD = d√((n² − 1)/12), where n is the number of terms in the series.

Co-efficient of Variation (CV)
Co-efficient of variation is defined as CV = (SD/AM) × 100.
This is a relative measure, which helps in measuring consistency: the smaller the co-efficient of variation, the greater the consistency.

Example 17
For the individual series 8, 11, 14, 17, 20, 23, 26, 29, compute the mean, median and mode.

Solution
Mean: x̄ = Σxi/n = (8 + 11 + … + 29)/8 = 18.5
Median: As the numbers are in ascending order and 17 and 20 are the middle terms,
Median = (17 + 20)/2 = 37/2 = 18.5
Mode: As no term can be regarded as 'most often found', the mode is not defined. However, using the empirical formula,
Mode = 3 Median − 2 Mean = 3(18.5) − 2(18.5) = 18.5.

Example 18
The arithmetic mean of 8, 14, x, 20 and 24 is 16; find x.

Solution
x̄ = (8 + 14 + x + 20 + 24)/5 = 16
⇒ x = 80 − 66 = 14.

Example 19
Calculate the standard deviation of the first five prime numbers.

Solution
The given set of observations is {2, 3, 5, 7, 11}.
Σx²/n = 208/5 and Σx/n = 28/5
∴ SD = √{Σx²/n − (Σx/n)²} = √{208/5 − (28/5)²} = 3.2.
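Examples 17 and 19 can be reproduced with the standard library's `statistics` module, whose `pstdev` implements exactly the population SD (division by n) used in the text:

```python
import statistics as st

# Example 17: mean and median of the individual series.
data = [8, 11, 14, 17, 20, 23, 26, 29]
assert st.mean(data) == 18.5
assert st.median(data) == 18.5

# Example 19: population SD of the first five primes via the shortcut formula.
primes = [2, 3, 5, 7, 11]
n = len(primes)
sd = (sum(x * x for x in primes) / n - (sum(primes) / n) ** 2) ** 0.5
assert abs(sd - 3.2) < 1e-9
assert abs(st.pstdev(primes) - sd) < 1e-9   # matches the library's population SD

# Coefficient of variation of the same data: CV = SD/AM × 100.
cv = st.pstdev(primes) / st.mean(primes) * 100
print(round(cv, 2))  # ≈ 57.14
```

Note that `st.stdev` (sample SD, division by n − 1) would give a different answer; the chapter's individual-series formula divides by n.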
All the GATE applicants—Population Σx 28 = GATE applicants form a city—Sample n 5 2. Cars manufactured by Tata Motors—Population 2 Σx 2  Σx  Nano cars manufactured by Tata Motors—Sample \ SD = − n  n  3. All possible outcomes of 10 roles of a die—Population 12 possible outcomes of 10 roles of a die—Sample 2 208  28  4. Number of units of electricity consumed by the = − = 3.2. 5  5  residents of a colony in a city—Population Number of units of electricity consumed by the Example 19 residents of 5 houses of that colony—Sample In a series of observations, co-efficient of variation is 25 and 5. Diameters of screws produced by a company— mean is 50. Find the variance. Population Solution Diameters of screws produced on one machine of that company—Sample Co-efficient of variation: CV = SD × 100 x CV Sampling ⇒ SD = ⋅x 100 The process of drawing samples from the population is 25 called sampling. = 50 × = 12.5 100 Random Sampling A sampling in which each member of Variance = (12.5)2 = 156.25. the population has the same chance of being included in the sample is called random sampling. Hypothesis Testing Simple Sampling A random sampling in which the chance Introduction of being included in the sample for different members of the In probability theory, we set up mathematical models of pro- population is independent of whether included or not in the cesses and systems that are affected by ‘chance’. In statis- previous trails is called simple sampling. tics, we check these models against the reality, to determine Large and Small Samples If the size of the sample is great- whether they are faithful and accurate enough for practical er than or equal to 30 (i.e., n ≥ 30), then the sample is called purposes. The process of checking models is called statisti- a large sample. Otherwise it is called a small sample. cal inference. Methods of statistical inference are based on drawing Parameter A statistical measure or constant of the popula- samples (or sampling). 
One of the most important methods tion is called a parameter. of statistical inference is ‘Hypothesis Testing’. Examples: Some Basic Definitions 1. Population mean (denoted by m) Population 2. Population standard deviation (denoted by s) Population is the set of individuals or objects, animate or Statistic A statistical measure or constant of the sample inanimate, actual or hypothetical under study. drawn from the population is called a statistic. (statistic- Size of the Population The number of individuals or ob- singular, statistics-plural) jects or observations in the population. Examples: The size of the population is denoted by N. N can be finite or infinite (i.e., population can be finite or 1. Sample mean (denoted by x ) infinite) 2. Sample standard deviation (denoted by s) Chapter 05.indd 98 5/31/2017 10:55:43 AM Chapter 5 Probability and Statistics | 2.99 NOTE Null Hypothesis and Alternative Hypothesis In general, the population parameters are not known and Null Hypothesis A statistical hypothesis which is to be their estimates given by the corresponding sample statis- actually tested for acceptance or rejection is called a null tics are used. hypothesis. (According to RA Fisher, Null hypothesis is the hypoth- Sampling Distribution Consider samples of size n drawn esis which is tested for possible rejection under the assump- from a given population. Compute some statistic S, say tion that it is true) mean ( x ) or variance (s2) for each of the samples. The val- ues of the statistics can be given in the form of a frequency Null hypothesis is denoted by H0 table. The frequency table so formed is known as a sampling Alternative Hypothesis Any hypothesis other than the null distribution of the statistic. hypothesis is called an alternative hypothesis. Example: Consider the set of numbers {1, 2, 3, 4, 5, 6} as Alternative hypothesis is denoted by H1 population. 
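To make the idea of a sampling distribution concrete, here is a small simulation sketch (illustrative only — it is not the text's worked example, and the sample count of 1000 is an assumption): draw repeated random samples of size 3 from the population {1, 2, 3, 4, 5, 6}, compute the statistic (the sample mean) for each, and tally the values into a frequency table.

```python
# Sketch: building a sampling distribution of the mean empirically.
import random
from collections import Counter
from statistics import mean

population = [1, 2, 3, 4, 5, 6]
mu = mean(population)            # population mean (a parameter): 3.5

random.seed(0)                   # fixed seed so the sketch is repeatable
# Draw 1000 simple random samples of size n = 3 (without replacement)
# and compute the statistic x̄ for each one.
sample_means = [mean(random.sample(population, 3)) for _ in range(1000)]

# Frequency table of the statistic = its (empirical) sampling distribution.
distribution = Counter(round(m, 2) for m in sample_means)
```

The 15-sample example that follows in the text performs the same tallying by hand.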
Consider the following 15 samples, each of size 3, drawn from the above population:

(1, 2, 3), (3, 5, 5), (2, 4, 6), (5, 5, 5), (1, 2, 6)
(1, 3, 5), (6, 6, 6), (4, 4, 5), (2, 3, 4), (1, 1, 4)
(2, 5, 5), (2, 2, 5), (3, 4, 6), (2, 4, 5), (4, 5, 6)

Then the sampling distribution of means for these samples is

Sample Mean (x̄) | 2 | 3 | 3.67 | 4 | 4.33 | 5 | 6
Frequency       | 2 | 4 | 1    | 2 | 3    | 2 | 1

Standard Error The standard deviation of the sampling distribution of a statistic is called the standard error (SE) of that statistic. The standard deviation of the sampling distribution of means is called the standard error of means, whereas the standard deviation of the sampling distribution of variances is called the standard error of variances.

Precision: The reciprocal of the standard error is called precision.

NOTE
If the sample size n is large (i.e., n ≥ 30), then the sampling distribution of a statistic is approximately normal, irrespective of whether or not the population distribution is normal.

Let θ be a population parameter and θ0 be the specified value of θ. Then we define the null and alternative hypotheses as follows.

Null hypothesis H0 : θ = θ0
Alternative hypothesis H1 : θ ≠ θ0 (two-tailed alternative)
(OR) H1 : θ > θ0 (right-tailed alternative)
(OR) H1 : θ < θ0 (left-tailed alternative)

Type I and Type II Errors

Type I Error Rejecting the null hypothesis (H0) when it should be accepted is called a type I error.

Type II Error Accepting the null hypothesis (H0) when it should be rejected is called a type II error.

            | Accept H0        | Reject H0
H0 is true  | Correct decision | Type I error
H0 is false | Type II error    | Correct decision

Level of Significance
The probability level below which we reject the null hypothesis is called the level of significance.
(OR)
The probability of committing a type I error is known as the level of significance.
The level of significance is denoted by 'α'. It is customary to fix α before sample information is collected.
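The interpretation of α as the probability of a type I error can be checked by simulation. The sketch below assumes a normal population with H0 actually true (μ = μ0 = 50, σ = 5 — all values hypothetical) and applies a two-tailed test using the large-sample critical value 1.96 for α = 0.05; the observed rejection rate should come out near 0.05.

```python
# Sketch: alpha as the long-run rate of type I errors (assumed setup,
# not from the text). H0: mu = 50 is TRUE in every trial, so each
# rejection below is a type I error.
import random

random.seed(42)
mu0, sigma, n, trials = 50.0, 5.0, 36, 4000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)   # large-sample test statistic
    if abs(z) > 1.96:                       # falls in a critical region
        rejections += 1

type_i_rate = rejections / trials           # close to alpha = 0.05
```

Fixing α in advance, as the text recommends, amounts to deciding up front what type I error rate one is willing to tolerate.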
In most cases, we choose α as 0.05 or 0.01: α = 0.05 is used for moderate precision and α = 0.01 is used for high precision.

The level of significance can also be expressed as a percentage. α = 5% means there are 5 chances in 100 that the null hypothesis H0 is rejected when it is true, or, equivalently, one is 95% confident that a right decision has been made.

The probability of committing a type II error is denoted by β,
∴ β = P(accept H0 when H0 is false)
The level of significance (α) is also known as the size of the test.
1 − β is known as the power of the test.

Testing of Hypothesis
We have some information about a characteristic of the population, which may or may not be true. This information is called a statistical hypothesis, or briefly a hypothesis. We wish to know whether this information can be accepted or is to be rejected. We choose a random sample and obtain information about this characteristic. Based on this information, the process that decides whether the hypothesis is to be accepted or rejected is called testing of hypothesis. In brief, the test of hypothesis, or the test of significance, is a procedure to determine whether observed samples differ significantly from expected results.

Critical Region and Critical Value
Consider the area under the probability curve of the sampling distribution of the test statistic, which follows some known distribution. The area under the probability curve is divided into two regions, namely the region of rejection, where the null hypothesis is rejected, and the region of acceptance, where the null hypothesis is accepted.

Critical Region (or) the Region of Rejection (or) the Significant Region
The region under the probability curve of the sampling distribution of the test statistic where the null hypothesis (H0) is rejected is called the critical region. The area of the critical region is equal to the level of significance α.

Critical Value (or) Significant Value
The value of the test statistic (for a given level of significance α) which separates the area under the probability curve into the critical and non-critical regions.

One Tailed and Two Tailed Tests
1. Right one-tailed test: If the alternative hypothesis H1 is of the greater-than type (for example, H1 : μ > μ0 or H1 : σ1² > σ2²), then the entire critical region of area α lies on the right tail of the probability curve of the test statistic.
2. Left one-tailed test: If the alternative hypothesis H1 is of the less-than type (for example, H1 : μ < μ0), then the entire critical region of area α lies on the left tail of the probability curve of the test statistic.
3. Two-tailed test: If the alternative hypothesis H1 is of the not-equal-to type (for example, H1 : μ ≠ μ0 or H1 : σ1² ≠ σ2²), then the critical region lies on both sides (right and left tails) of the probability curve of the test statistic S*, such that a critical region of area α/2 lies on the right tail and a critical region of area α/2 lies on the left tail, as shown in the figure.

[Figure: Two-tailed test — acceptance region of area 1 − α in the centre, with critical regions of area α/2 on each tail, separated by the critical values ±S*α/2.]

In this case, the test of hypothesis is known as a two-tailed test.

Procedure for Test of Hypothesis
Step 1: Formulate the null hypothesis H0.
Step 2: Formulate the alternative hypothesis H1.
Step 3: Choose th
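As an illustration of this procedure for the two-tailed case, here is a minimal sketch of a large-sample z-test of H0 : μ = μ0 against H1 : μ ≠ μ0. All the numbers are assumed for illustration; 1.96 is the standard normal critical value for α = 0.05.

```python
# Sketch: a two-tailed large-sample z-test (hypothetical values throughout).
import math

def two_tailed_z_test(xbar, mu0, sigma, n, z_crit=1.96):
    """Return the test statistic and whether H0: mu = mu0 is rejected."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return z, abs(z) > z_crit    # reject when z falls in either alpha/2 tail

# Assumed sample: n = 36, sample mean 52, population SD 5, testing H0: mu = 50.
z, reject = two_tailed_z_test(xbar=52, mu0=50, sigma=5, n=36)
```

Here z = 2.4 exceeds the critical value 1.96, so the statistic falls in the right-tail critical region and H0 is rejected at the 5% level of significance.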
