Document Details

Uploaded by UnparalleledAzurite3345

University of Alexandria


Summary

This chapter introduces special probability distributions, starting with the binomial distribution. It details important properties and provides examples. The chapter also discusses the uniform and exponential distributions.

Full Transcript


Chapter 4 SPECIAL PROBABILITY DISTRIBUTIONS

4.1 Introduction

In chapters 1 and 2 we studied probability distributions in general. In this chapter we study some commonly occurring probability distributions and investigate their basic properties. The results of this chapter will be of considerable use in theoretical as well as practical applications. We begin with some discrete distributions and follow with some continuous distributions.

4.2 Discrete Distributions

In this section we study some well-known discrete distributions and describe their important properties.

I- The Binomial Distribution

Consider a sequence of n independent Bernoulli trials, each resulting in one of two possible events, "success" or "failure". Let the probability p = P(success occurs at any given trial) remain constant from trial to trial. If the r.v. X denotes the total number of successes in the n trials, then X is said to have the binomial distribution.

Definition 4.1
The r.v. X is said to have the binomial distribution, and is referred to as a binomial r.v. with parameters n and p, iff its p.m.f. is given by

    b(x; n, p) = f(x) = \binom{n}{x} p^x q^{n-x}   for x = 0, 1, 2, ..., n,   (4.1)

where 0 ≤ p ≤ 1 and q = 1 - p.

The term "parameters" is used quite generally to refer to a set of constants in a distribution whose values may vary from one application to another. The notation b(x; n, p), which we have used instead of f(x), reflects the dependence on the parameters n and p.

The general properties of a p.m.f. are satisfied by equation (4.1), since 0 ≤ p ≤ 1 and

    \sum_{x=0}^{n} b(x; n, p) = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = (p + q)^n = 1.

The name "binomial distribution" derives from the preceding summation: the values of b(x; n, p) for x = 0, 1, 2, ..., n are the successive terms of the binomial expansion of (q + p)^n.
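As a quick numerical check of (4.1), the following minimal sketch (the function name is my own, not from the text) evaluates the binomial p.m.f. and verifies that its values are the terms of (q + p)^n and therefore sum to 1:

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * q^(n - x), the p.m.f. of equation (4.1)."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

# The p.m.f. values for x = 0, 1, ..., n are the successive terms of the
# binomial expansion of (q + p)^n, so they must sum to 1.
n, p = 7, 0.6
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(round(total, 10))  # 1.0
```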
A short notation to designate that X has the binomial distribution with parameters n and p is X ~ b(n, p), or alternatively X ~ BIN(n, p).

Example 4.1
A company produces computer chips of which 40% are non-defective. If 7 chips are chosen at random, what is the probability that: (a) exactly 3, (b) at least 5, (c) at most 5 chips will be defective?

Solution
Let X be the number of defective chips. Then P(one chip will be defective) = p = 0.6, so q = 1 - p = 0.4, and the number of chips is n = 7.

(a) P(exactly 3 chips will be defective) = P(X = 3) = \binom{7}{3} (0.6)^3 (0.4)^4 = 0.194.

(b) P(at least 5 chips will be defective) = P(X = 5) + P(X = 6) + P(X = 7)
    = \binom{7}{5} (0.6)^5 (0.4)^2 + \binom{7}{6} (0.6)^6 (0.4) + \binom{7}{7} (0.6)^7 (0.4)^0
    = 0.261 + 0.131 + 0.028 = 0.420.

(c) P(at most 5 chips will be defective) = P(X = 0) + P(X = 1) + ... + P(X = 5)
    = 1 - P(X = 6) - P(X = 7) = 1 - 0.131 - 0.028 = 0.841.

Example 4.2
How many rolls of a fair die are needed for the probability of getting at least one 6 to be ≥ 0.5?

Solution
Suppose the fair die is rolled n times. The probability of obtaining no 6 is (5/6)^n, and the probability of obtaining at least one 6 is 1 - (5/6)^n. The required number of rolls is the smallest integer n such that

    1 - (5/6)^n ≥ 1/2,

or n log(5/6) ≤ -log 2. Therefore,

    n ≥ log 2 / log 1.2 ≈ 3.8,

so the smallest such integer is n = 4.

Let us now find formulas for the moment generating function, the mean and the variance of the binomial distribution.

Theorem 4.1
The M.G.F. of the binomial distribution is given by

    M_X(t) = (p e^t + q)^n,   (4.2)

and the mean and the variance are given by μ = np and σ² = npq.

Proof
By definition, we have

    M_X(t) = E[e^{tX}] = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x q^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (p e^t)^x q^{n-x} = (p e^t + q)^n.

Differentiating M_X(t) with respect to t and then putting t = 0, we get

    μ = E[X] = M'_X(0) = [n p e^t (p e^t + q)^{n-1}]_{t=0} = np.

If we differentiate M_X(t) twice with respect to t and then put t = 0, we get

    E[X²] = M''_X(0) = [n p e^t (p e^t + q)^{n-1} + n(n-1) p² e^{2t} (p e^t + q)^{n-2}]_{t=0} = np + n(n-1)p².

Hence,

    σ² = var(X) = E[X²] - (E[X])² = np + n(n-1)p² - n²p² = np(1 - p) = npq.

II- The Negative Binomial Distribution

We again consider a sequence of independent Bernoulli trials with probability of success p = P(E). In the case of the binomial distribution, the number of trials was a fixed number n, and the variable of interest was the number of successes. Suppose instead we denote by X the number of trials required to obtain the kth success (i.e. the kth success is to occur on the Xth trial). Then there must be k - 1 successes in the first x - 1 trials, and the probability of this is

    b(k - 1; x - 1, p) = \binom{x-1}{k-1} p^{k-1} q^{x-k}.

The probability of a success on the xth trial is p, and the probability that the kth success occurs on the xth trial is therefore

    p · b(k - 1; x - 1, p) = \binom{x-1}{k-1} p^k q^{x-k}.

Definition 4.2
The r.v. X is said to have the negative binomial distribution, and is referred to as a negative binomial r.v. with parameters k and p, iff its p.m.f. is given by

    b*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k}   for x = k, k+1, k+2, ...,   (4.3)

where 0 ≤ p ≤ 1 and q = 1 - p.

A special notation which designates that X has the negative binomial distribution (4.3) is X ~ NBIN(k, p).

The name "negative binomial distribution" derives from the fact that the values of b*(x; k, p) for x = k, k+1, k+2, ... are the successive terms of the binomial expansion of p^k (1 - q)^{-k}, i.e.

    \sum_{x=k}^{∞} b*(x; k, p) = p^k \sum_{x=k}^{∞} \binom{x-1}{k-1} q^{x-k} = p^k (1 - q)^{-k} = 1.

In the literature of statistics, the negative binomial distribution is also referred to as the Pascal distribution.
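The p.m.f. (4.3) can likewise be checked numerically. This sketch (the function name is my own) evaluates b*(x; k, p) and confirms that truncating the infinite sum over x = k, k+1, ... gets arbitrarily close to 1:

```python
from math import comb

def neg_binomial_pmf(x, k, p):
    """b*(x; k, p) = C(x-1, k-1) * p^k * q^(x-k): kth success on trial x."""
    q = 1.0 - p
    return comb(x - 1, k - 1) * p**k * q**(x - k)

# The terms for x = k, k+1, ... sum to p^k * (1 - q)^(-k) = 1; the tail
# beyond x = 200 is negligible for these parameters.
k, p = 3, 0.4
partial = sum(neg_binomial_pmf(x, k, p) for x in range(k, 200))
print(round(partial, 6))  # 1.0
```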
There is a relationship between the binomial and the negative binomial distributions, given by the following identity:

    b*(x; k, p) = (k/x) · b(k; x, p).

Example 4.3
If the probability is 0.4 that a child exposed to a certain contagious disease will catch it, what is the probability that the tenth child exposed to the disease will be the third to catch it?

Solution
Substituting x = 10, k = 3 and p = 0.4 into the formula (4.3) for the negative binomial distribution, we get

    b*(10; 3, 0.4) = \binom{9}{2} (0.4)^3 (0.6)^7 = 0.0645.

Example 4.4
Team A plays team B in a seven-game world series; that is, the series is over when either team wins four games. For each game, P(A wins) = 0.6, and the games are assumed independent. What is the probability that the series will end in exactly six games?

Solution
For team A we have x = 6, k = 4 and p = 0.6 in (4.3), so

    P(A wins series in 6) = b*(6; 4, 0.6) = \binom{5}{3} (0.6)^4 (0.4)^2 = 0.20736,
    P(B wins series in 6) = b*(6; 4, 0.4) = \binom{5}{3} (0.4)^4 (0.6)^2 = 0.09216,
    P(series goes 6 games) = 0.20736 + 0.09216 = 0.29952.

The mean and the variance of the negative binomial distribution may be obtained by proceeding as in the proof of Theorem 4.1.

Theorem 4.2
The M.G.F., the mean and the variance of the negative binomial distribution are given by

    M_X(t) = ( p e^t / (1 - q e^t) )^k,   μ = k/p   and   σ² = kq/p².

Proof
By definition, we have

    M_X(t) = E[e^{tX}] = \sum_{x=k}^{∞} e^{tx} \binom{x-1}{k-1} p^k q^{x-k}
           = p^k e^{kt} \sum_{i=0}^{∞} \binom{i+k-1}{k-1} (q e^t)^i
           = (p e^t)^k (1 - q e^t)^{-k}
           = ( p e^t / (1 - q e^t) )^k.

Differentiating M_X(t) with respect to t,

    M'_X(t) = k ( p e^t / (1 - q e^t) )^{k-1} · [ (1 - q e^t) p e^t - p e^t (-q e^t) ] / (1 - q e^t)^2 = k M_X(t) / (1 - q e^t),

and then putting t = 0, we get

    μ = E[X] = M'_X(0) = k/p.

If we differentiate M_X(t) twice with respect to t, we get

    M''_X(t) = k [ (1 - q e^t) M'_X(t) + q e^t M_X(t) ] / (1 - q e^t)^2,

and putting t = 0,

    E[X²] = M''_X(0) = k(k + q)/p².

Hence,

    σ² = var(X) = E[X²] - (E[X])² = k(k + q)/p² - k²/p² = kq/p².

III- The Geometric Distribution

Since the negative binomial distribution with k = 1 has many important applications, it is given a special name: the geometric distribution. If we denote by X the number of trials required to obtain the first success, then X is said to have the geometric distribution.

Definition 4.3
The r.v. X is said to have the geometric distribution, and is referred to as a geometric r.v. with parameter p, iff its p.m.f. is given by

    g(x; p) = p q^{x-1}   for x = 1, 2, 3, ...,   (4.4)

where 0 ≤ p ≤ 1 and q = 1 - p.

A special notation which designates that X has the geometric distribution (4.4) is X ~ GEOM(p).

The general properties of a p.m.f. are satisfied by (4.4), since 0 < p < 1 and

    \sum_{x=1}^{∞} g(x; p) = p \sum_{x=1}^{∞} q^{x-1} = p (1 + q + q² + ...) = p · 1/(1 - q) = p/p = 1.

The name "geometric distribution" derives from its relation to the geometric series used in the preceding formula.

Example 4.5
If the probability is 0.75 that an applicant for a driver's license will pass the road test on any given try, what is the probability that this applicant will finally pass the test on the fourth try?

Solution
Substituting x = 4 and p = 0.75 into the formula for the geometric distribution, we get

    g(4; 0.75) = 0.75 (0.25)^3 = 0.0117.

Of course, this result is based on the assumption that the trials are all independent.

Theorem 4.3
The M.G.F., the mean and the variance of the geometric distribution are given by

    M_X(t) = p e^t / (1 - q e^t),   μ = 1/p   and   σ² = q/p².

It also follows from the properties of the geometric series that the CDF of X is

    G(x; p) = \sum_{j=1}^{x} g(j; p) = p \sum_{j=1}^{x} q^{j-1} = 1 - q^x,   x = 1, 2, 3, ...,

from which we obtain

    P(X > x) = q^x,   x = 1, 2, 3, ....
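The geometric p.m.f. and its tail formula P(X > x) = q^x can be checked with a short sketch (function names are my own illustration): the tail probability obtained by summing the p.m.f. beyond x agrees with q^x.

```python
def geometric_pmf(x, p):
    """g(x; p) = p * q^(x - 1), equation (4.4): first success on trial x."""
    q = 1.0 - p
    return p * q ** (x - 1)

def geometric_tail(x, p):
    """P(X > x) = q^x, from the CDF G(x; p) = 1 - q^x."""
    return (1.0 - p) ** x

# Example 4.5: probability of first passing the road test on the fourth try.
print(round(geometric_pmf(4, 0.75), 4))  # 0.0117

# The closed-form tail agrees with summing the p.m.f. beyond x.
p, x = 0.75, 4
tail_by_sum = sum(geometric_pmf(j, p) for j in range(x + 1, 200))
print(abs(tail_by_sum - geometric_tail(x, p)) < 1e-12)  # True
```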
IV- The Hypergeometric Distribution

In chapter 2 we used sampling with and without replacement to illustrate the multiplication rules for independent and dependent events. To obtain a formula analogous to that of the binomial distribution which applies to sampling without replacement, in which case the trials are not independent, consider a collection of a finite number of items, say N, of which M are of type 1 and the remaining N - M are of type 2. Suppose n items are drawn at random without replacement, and denote by X the number of items of type 1 that are drawn. Then the distribution of X is called the hypergeometric distribution.

Definition 4.4
The r.v. X is said to have the hypergeometric distribution, and is referred to as a hypergeometric r.v., iff its p.m.f. is given by

    h(x; n, N, M) = \binom{M}{x} \binom{N-M}{n-x} / \binom{N}{n}   for x = 0, 1, 2, ..., n, with x ≤ M and n - x ≤ N - M.   (4.5)

Note that the possible values of X in (4.5), in general, are max(0, n - N + M) ≤ x ≤ min(n, M).

A special notation which designates that X has the hypergeometric distribution (4.5) is X ~ HYP(n, N, M).

Using the identity

    \sum_{r=0}^{k} \binom{m}{r} \binom{n}{k-r} = \binom{m+n}{k},

it can be verified that the probabilities (4.5) sum to 1. The hypergeometric distribution is important in applications such as deciding whether to accept a lot of manufactured items.

Example 4.6
A box contains 100 microchips, 80 good and 20 defective. The number of defectives in the box is unknown to a purchaser, who decides to select 10 microchips at random without replacement and to consider the microchips in the box acceptable if the 10 items selected include no more than 3 defectives. Find the probability of accepting this lot.
Solution
Let X be the number of defectives selected. Then X has the hypergeometric distribution with n = 10, N = 100 and M = 20, and the probability of the lot being acceptable is

    P[X ≤ 3] = \sum_{x=0}^{3} \binom{20}{x} \binom{80}{10-x} / \binom{100}{10} ≈ 0.890.

It can be shown, by a straightforward but rather tedious derivation, that the mean and the variance of the hypergeometric distribution are given by

    μ = nM/N   and   σ² = nM(N - M)(N - n) / (N²(N - 1)).

V- The Poisson Distribution

Definition 4.5
The r.v. X is said to have the Poisson distribution with parameter λ > 0 iff its p.m.f. is given by

    f(x; λ) = e^{-λ} λ^x / x!   for x = 0, 1, 2, ....   (4.6)

The distribution is named after the French mathematician Siméon Poisson (1781-1840). A special notation which designates that X has the Poisson distribution (4.6) is X ~ POIS(λ).

The general properties of a p.m.f. are satisfied by (4.6), since λ > 0 implies f(x; λ) ≥ 0 and

    \sum_{x=0}^{∞} f(x; λ) = e^{-λ} \sum_{x=0}^{∞} λ^x / x! = e^{-λ} · e^{λ} = 1.

Now we shall show that the Poisson p.m.f. (4.6) is the limiting form of the binomial p.m.f. when n → ∞ and p → 0 while np remains constant.

Theorem 4.4
If X ~ BIN(n, p), then for each value x = 0, 1, 2, ..., as n → ∞ and p → 0 with np = λ remaining constant,

    b(x; n, p) → f(x; λ).

Proof
Letting np = λ, and hence p = λ/n, we can write

    b(x; n, p) = n! / (x!(n-x)!) · (λ/n)^x (1 - λ/n)^{n-x}
               = [ n(n-1)(n-2)...(n-x+1) / n^x ] · (λ^x / x!) · (1 - λ/n)^{n-x}.

Then, if we divide one of the x factors of n in n^x into each factor of the product n(n-1)(n-2)...(n-x+1), we obtain

    b(x; n, p) = (1 - 1/n)(1 - 2/n)...(1 - (x-1)/n) · (λ^x / x!) · [ (1 - λ/n)^{-n/λ} ]^{-λ} (1 - λ/n)^{-x}.

Finally, if we let n → ∞ while x and λ remain fixed, we find that

    (1 - 1/n)(1 - 2/n)...(1 - (x-1)/n) → 1,   (1 - λ/n)^{-x} → 1,   (1 - λ/n)^{-n/λ} → e,

so that (1 - λ/n)^n → e^{-λ}, and hence the limiting distribution becomes the Poisson p.m.f. f(x; λ) given by (4.6). Thus, in the limit when n → ∞, p → 0 and np = λ remains constant, the number of successes is a r.v. having a Poisson distribution with parameter λ.

In general, the Poisson distribution will provide a good approximation to binomial probabilities when n ≥ 20 and p ≤ 0.05; when n ≥ 100 and np ≤ 10, the approximation will generally be excellent.
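The limiting behaviour in Theorem 4.4 can be seen numerically. The sketch below (my own illustration, with parameters chosen to satisfy the rule of thumb above) compares binomial probabilities with their Poisson approximation for n = 100 and p = 0.02, so that λ = np = 2:

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """b(x; n, p) from equation (4.1)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """f(x; lambda) = e^(-lambda) * lambda^x / x!, equation (4.6)."""
    return exp(-lam) * lam**x / factorial(x)

# With n = 100 and p = 0.02 (lambda = np = 2), the two p.m.f.s agree to
# within a few thousandths at every x.
n, p = 100, 0.02
lam = n * p
for x in range(5):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```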
