Probability - Chapter 2 PDF
Summary
This document introduces the theory of probability, detailing its origins in gambling and its expansion to various fields like actuarial science and physics. Basic definitions like experiments, trials, outcomes, sample spaces, and events are clearly and concisely defined.
Chapter 2 PROBABILITY

2.1 Introduction

The theory of probability had its origin in gambling and games of chance in the mid-eighteenth century. The theory thus developed for "heads or tails" or "red or black" soon found applications in situations where the outcomes were "boy or girl", "life or death", or "pass or fail", and scholars began to apply probability theory to actuarial problems and some aspects of the social sciences. Later, probability and statistics were introduced into physics by L. Boltzmann, J. Gibbs, and J. Maxwell, and in the twentieth century they have found applications in all phases of human endeavor which in some way involve an element of uncertainty or risk. The names connected most prominently with the growth of probability and mathematical statistics in the first half of the twentieth century are those of A. N. Kolmogorov, R. A. Fisher, J. Neyman, E. S. Pearson, and A. Wald.

The concept of probability is frequently encountered in everyday communication. For example, we may hear a physician say that a patient has a 50-50 chance of surviving a certain operation. Another physician may say that he is 95 percent certain that a patient has a particular disease. Thus we are accustomed to measuring the probability of the occurrence of some event by a number between zero and one. The more likely the event, the closer the number is to one; the more unlikely the event, the closer the number is to zero. An event that cannot occur has a probability of zero, and an event that is certain to occur has a probability of one.

2.2 Basic Definitions

We begin our development of probability by defining some basic key words.

(A) Experiment
The term experiment refers to the process of obtaining an observed result of some phenomenon, and a performance of an experiment is called a trial of the experiment. An observed result on a trial of the experiment is called an outcome.
This terminology is rather general; our interest will be in situations, called random experiments, where there is uncertainty about which outcome will occur when the experiment is performed. By a random experiment we will mean any procedure for which:
1- all possible outcomes can be completely defined in advance;
2- the procedure can be repeated, theoretically, any number of times under identical conditions;
3- any performance of the experiment results in an outcome that is not known with certainty in advance.

For example, suppose a coin is tossed. Assuming that the coin does not land on its edge, there are two possible outcomes of the experiment: heads (denoted by H) and tails (denoted by T). On any performance of this experiment one does not know what the outcome will be, and the coin can be tossed as many times as desired.

(B) Sample Space
The sample space, denoted by S, associated with a random experiment is the set of all possible outcomes of that experiment. Note that one and only one of the possible outcomes will occur on any given trial of the experiment.

Example 2.1
An experiment consists of tossing two coins, and the observed face of each coin is of interest. The set of all possible outcomes may be represented by the sample space
$$S = \{ HH, HT, TH, TT \}$$
An alternate way of representing such a sample space is to list all possible ordered pairs of the numbers 1 and 0,
$$S = \{ (1,1), (1,0), (0,1), (0,0) \}$$
where, for example, (1,0) indicates that the first coin landed heads up and the second coin landed tails up. If we are interested in the total number of heads obtained from the two coins, an appropriate sample space could then be written as
$$S = \{ 0, 1, 2 \}$$
Thus, different sample spaces may be appropriate for the same experiment, depending on the characteristic of interest.
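The three sample spaces of Example 2.1 can be enumerated mechanically. The following is a minimal Python sketch, not part of the original text; the function name `coin_sample_space` is a hypothetical helper introduced only for illustration.

```python
from itertools import product

def coin_sample_space(n):
    """Return all sequences of n coin tosses as strings such as 'HT'."""
    return ["".join(seq) for seq in product("HT", repeat=n)]

# Sample space for two coins, listing the face of each coin.
S = coin_sample_space(2)
print(S)  # ['HH', 'HT', 'TH', 'TT']

# The same outcomes coded as ordered pairs, with 1 for heads and 0 for tails.
S_pairs = [tuple(1 if face == "H" else 0 for face in outcome) for outcome in S]
print(S_pairs)  # [(1, 1), (1, 0), (0, 1), (0, 0)]

# If only the total number of heads matters, a coarser sample space suffices.
S_heads = sorted({outcome.count("H") for outcome in S})
print(S_heads)  # [0, 1, 2]
```

The same experiment yields all three listings; only the characteristic recorded for each outcome changes.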
In tossing a coin three times (or tossing 3 coins), the sample space consists of 8 outcomes,
$$S = \{ HHH, HHT, HTH, THH, HTT, THT, TTH, TTT \}$$
Note that, in tossing a coin n times, the sample space consists of $2^n$ outcomes.

Example 2.2
If a coin is tossed repeatedly until a head occurs, then the natural sample space is $S = \{ H, TH, TTH, \ldots \}$. If one is interested in the number of tosses required to obtain a head, then a possible sample space for this experiment would be the set of all positive integers, $S^* = \{ 1, 2, 3, \ldots \}$, and the outcomes would correspond directly to the number of tosses required to obtain the first head. It will be shown in the next chapter that an outcome corresponding to a sequence of tosses in which a head is never obtained need not be included in the sample space.

Example 2.3
A light bulb is placed in service and the time of operation until it burns out is measured. At least conceptually, the sample space for this experiment can be taken to be the set of nonnegative real numbers,
$$S = \{ t : 0 \le t < \infty \}$$
Note that if the actual failure time could be measured only to the nearest hour, then the sample space for the actual observed failure time would be the set of nonnegative integers, $S^* = \{ 0, 1, 2, 3, \ldots \}$. Even though $S^*$ may be the observable sample space, one might prefer to describe the properties and behavior of light bulbs in terms of the conceptual sample space S. In cases of this type, the discreteness imposed by measurement limitations is sufficiently negligible that it can be ignored, and both the measured response and the conceptual response can be discussed relative to the conceptual sample space S.

A sample space S is said to be finite if it consists of a finite number of outcomes, say $S = \{ e_1, e_2, \ldots, e_N \}$, and it is said to be countably infinite if its outcomes can be put into a one-to-one correspondence with the positive integers, say $S = \{ e_1, e_2, \ldots \}$.
Definition
If a sample space S is either finite or countably infinite, then it is called a discrete sample space.

A set that is either finite or countably infinite is also said to be countable. This is the case in the first two examples. It is also true for the last example when failure times are recorded to the nearest hour, but not for the conceptual sample space. Since the conceptual space involves outcomes that may assume any value in some interval of real numbers (i.e., the set of nonnegative real numbers), it could be termed a continuous sample space, and it provides an example where a discrete sample space is not an appropriate model.

If the sample space consists of a continuum, such as all the points of a line segment or all the points in a plane (i.e. an uncountably infinite number of outcomes), it is said to be continuous. Continuous sample spaces arise in practice whenever the outcomes of experiments are measurements of physical properties, such as temperature, speed, pressure, or length, that are measured on continuous scales.

(C) Event
An event A is a collection of some of the possible outcomes of the random experiment. In the language of set theory, A is a subset of the sample space S, i.e. $A \subseteq S$. In particular, we may regard each individual outcome as an event, since each outcome, taken as a single-element set, is a subset of S. (These are sometimes called elementary events.) Also, the null set $\phi$ (i.e. the set which consists of no outcomes) and the complete sample space S are each special cases of subsets of S, and thus both $\phi$ and S may be regarded as events. $\phi$ is known as the impossible event and S as the certain event.

For example, in tossing a die, we might consider the following as typical events:
A1 : Score is 1, i.e. {1}
A2 : Score is even, i.e. {2,4,6}
A3 : Score is less than 5, i.e. {1,2,3,4}.

(D) Occurrence of an Event
We say that an event A has occurred if the outcome of the experiment belongs to the set A.
For example, if the outcome in throwing a die was a score of 6, then, with the events described above, A2 occurred, but A1 did not occur and A3 did not occur.

(E) Union of Events
Given two events A and B, the union of A and B, written $A \cup B$, is defined as the set of outcomes which belong to either A or B or both. Thus, the event $A \cup B$ occurs whenever A or B occurs or both occur.

(F) Intersection of Events
The intersection of A and B, written $A \cap B$, is defined as the set of outcomes which belong to both A and B. The event $A \cap B$ occurs when both A and B occur.

(G) Mutually Exclusive Events
Two events A and B are called mutually exclusive if A and B cannot both occur at the same time, i.e. if $A \cap B = \phi$, so that A and B are disjoint.

(H) Complementary Event
Given an event A, the complementary event, $A'$ (or $\bar{A}$), is defined as the subset of S consisting of the outcomes which do not belong to A. Thus the event $A'$ occurs whenever A does not occur, and vice versa.

(I) Equally Likely Events
If no outcome of a random experiment is any more likely to occur than any other, the outcomes are said to be equally likely.

2.3 Classical Probability

Until fairly recently, probability was thought of by statisticians and mathematicians only as an objective phenomenon derived from objective processes. The concept of objective probability may be categorized further under the headings of:
1- classical, or a priori, probability, and
2- the relative frequency, or a posteriori, concept of probability.

We may define probability in the classical sense as follows: if a random experiment can result in N mutually exclusive and equally likely outcomes, n(A) of which correspond to the occurrence of some event A, then the probability that the event A will occur, denoted by P(A), is defined as the ratio n(A)/N; symbolically,
$$P(A) = \frac{n(A)}{N} = \frac{\text{number of outcomes belonging to } A}{\text{total number of outcomes belonging to } S} \qquad (2.1)$$

The relative frequency approach to probability depends on the repeatability of some process and the ability to count the number of repetitions, as well as the number of times that some event of interest occurs. We may define the probability of observing some characteristic, A, of an event as follows: if some process is repeated a large number of times, N, and if some resulting event with the characteristic A occurs n times, the relative frequency of occurrence of A, n/N, will be approximately equal to the probability of A.

For instance, consider a fair die. If our experiment consists of throwing a fair (unbiased) die, then there are 6 equally likely outcomes. We decide that the outcomes are equally likely by reasoning that if the die is fair and a face (a number) is turned up at random, there is no reason to expect any face to be more likely to be turned up than any other. On any throw of a die, just one face and no other will be turned up, so we conclude that the outcomes are mutually exclusive.

Example 2.4
Throw an unbiased coin three times and observe the sequence of heads and tails. Here the sample space is the collection of all possible sequences,
$$S = \{ HHH, HHT, HTH, THH, HTT, THT, TTH, TTT \}$$
Since the outcomes are equally likely and mutually exclusive, the probability of each outcome is 1/8. Let A be the event that two or more heads appear consecutively, and B the event that all the tosses are the same. Then
$$A = \{ HHH, HHT, THH \} \quad \text{and} \quad B = \{ HHH, TTT \}$$
Therefore
$$P(A) = \frac{n(A)}{N} = \frac{3}{8}, \qquad P(B) = \frac{n(B)}{N} = \frac{2}{8} = \frac{1}{4},$$
$$P(A \cap B) = \frac{n(A \cap B)}{N} = P(\{ HHH \}) = \frac{1}{8},$$
and
$$P(A \cup B) = \frac{n(A \cup B)}{N} = P(\{ HHH, HHT, THH, TTT \}) = \frac{4}{8} = \frac{1}{2}$$

Example 2.5
There are 15 balls, numbered from 1 to 15, in a bag. If a person selects one at random, what is the probability that the number printed on the ball will be
i- a prime number greater than 5;
ii- an odd number less than 9.
Solution
Let $A_1$ = { a prime number greater than 5 } and $A_2$ = { an odd number less than 9 }. Then
$$A_1 = \{ 7, 11, 13 \}, \quad A_2 = \{ 1, 3, 5, 7 \}, \quad N = 15, \quad n(A_1) = 3, \quad n(A_2) = 4$$
$$P(A_1) = \frac{n(A_1)}{N} = \frac{3}{15} = \frac{1}{5} \quad \text{and} \quad P(A_2) = \frac{n(A_2)}{N} = \frac{4}{15}$$
Exercise: find $P(A_1 \cap A_2)$ and $P(A_1 \cup A_2)$.

Example 2.6
A fair die is tossed twice. What is the probability that the sum of the upturned faces is 9?

Solution
The sample space for this experiment is
$$S = \{ (i, j) : i = 1, 2, \ldots, 6 ;\ j = 1, 2, \ldots, 6 \}$$
Since the die is fair (unbiased), each of the 36 possible outcomes is equally likely to occur. If A represents the event that the sum of the upturned faces is 9, then
$$A = \{ (3,6), (4,5), (5,4), (6,3) \}$$
hence P(A) = 4/36 = 1/9.

2.4 Axiomatic Approach to Probability

For a given experiment with sample space S, a real-valued set function P that associates a real value P(A) with each event A is called a probability function (or a probability measure), and P(A) is called the probability of A, if the following properties are satisfied:
Axiom I. $P(A) \ge 0$ for every event $A \subseteq S$.
Axiom II. $P(S) = 1$.
Axiom III. If $A_1, A_2, A_3, \ldots$ is a finite or infinite sequence of mutually exclusive events of S, then
$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$$
These axioms agree with our intuitive concept of probability, and these few axioms are sufficient to allow a mathematical structure to be developed. Note that if A and B are two mutually exclusive events of S, then
$$P(A \cup B) = P(A) + P(B)$$

Example 2.7
Three horses A, B and C are in a race. A is twice as likely to win as B, and B is twice as likely to win as C. What are their respective probabilities of winning, i.e. P(A), P(B) and P(C)?

Solution
Let P(C) = p. Since B is twice as likely to win as C, P(B) = 2p, and since A is twice as likely to win as B, P(A) = 4p. Now the sum of the probabilities must be one (by Axiom II), hence P(A) + P(B) + P(C) = 1, i.e.
4p + 2p + p = 1. Therefore p = 1/7, and accordingly P(A) = 4p = 4/7, P(B) = 2p = 2/7 and P(C) = p = 1/7.

Probability in Discrete Spaces

If A is an event in a discrete sample space S, then P(A) equals the sum of the probabilities of the individual outcomes comprising A. In particular, suppose the sample space is given by $S = \{ e_1, e_2, e_3, \ldots \}$ and that to each elementary event $\{e_i\}$ we assign a real number $p_i$, so that $P(\{e_i\}) = p_i$, such that
$$p_i \ge 0 \ \text{for all } i, \quad \text{and} \quad \sum_i p_i = 1 \qquad (2.2)$$
Then the probability of an event A is given by
$$P(A) = \sum_{i \,:\, e_i \in A} p_i$$
It is understood, with this notation, that the summation is taken over all indices i such that $e_i$ is an outcome in A.

Example 2.8
A die is loaded in such a way that each odd number is twice as likely to occur as each even number. Find P(G), where G is the event that a number greater than 3 occurs on a single roll of the die.

Solution
The sample space is S = {1, 2, 3, 4, 5, 6}. Hence, if we assign probability p to each even number and probability 2p to each odd number, we find that
$$2p + p + 2p + p + 2p + p = 9p = 1$$
in accordance with Axiom II. It follows that p = 1/9 and
$$P(G) = \frac{1}{9} + \frac{2}{9} + \frac{1}{9} = \frac{4}{9}$$

If a sample space is countably infinite, probabilities will have to be assigned to the individual outcomes by means of a mathematical rule, preferably by means of a formula or equation.

Example 2.9
If, for a given experiment, $e_1, e_2, e_3, \ldots$ is an infinite sequence of outcomes, verify that
$$P(\{e_i\}) = \left( \frac{1}{2} \right)^i \quad \text{for } i = 1, 2, 3, \ldots$$
is, indeed, a probability measure.

Solution
Since the probabilities are all positive, it remains to be shown that P(S) = 1.
Getting
$$P(S) = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$$
and making use of the formula for the sum of the terms of an infinite geometric progression, we find that
$$P(S) = \frac{1/2}{1 - 1/2} = 1$$
If $e_i$ is the event that a person flipping a balanced coin will get a tail for the first time on the ith flip of the coin, then the appropriate probability is
$$P(\{e_i\}) = \left( \frac{1}{2} \right)^i \quad \text{for } i = 1, 2, 3, \ldots$$
Thus, the probability that the first tail will come on the third, fourth, or fifth flip of the coin is
$$\left( \frac{1}{2} \right)^3 + \left( \frac{1}{2} \right)^4 + \left( \frac{1}{2} \right)^5 = \frac{7}{32}$$
and the probability that the first tail will come on an odd-numbered flip of the coin is
$$\left( \frac{1}{2} \right)^1 + \left( \frac{1}{2} \right)^3 + \left( \frac{1}{2} \right)^5 + \cdots = \frac{1/2}{1 - 1/4} = \frac{2}{3}$$
Here again we made use of the formula for the sum of the terms of an infinite geometric progression.

2.5 Some Rules of Probability

Based on the three axioms of probability, we can derive many other rules which have important applications. Among them, the next four theorems are immediate consequences of the axioms.

Theorem 2.1
For any sample space S, $P(\phi) = 0$.

Proof
Since S and $\phi$ are mutually exclusive and $S \cup \phi = S$, in accordance with the definition of the empty set $\phi$, it follows that
$$P(S) = P(S \cup \phi) = P(S) + P(\phi) \quad \text{(by Axiom III)}$$
and, hence, that $P(\phi) = 0$.

Theorem 2.2
If A and $A'$ are complementary events in a sample space S, then
$$P(A') = 1 - P(A)$$

Proof
In the second and third steps of the proof that follows, we make use of the definition of a complement, according to which A and $A'$ are mutually exclusive and $A \cup A' = S$. Thus, we write
$$1 = P(S) \quad \text{(by Axiom II)}$$
$$= P(A \cup A')$$
$$= P(A) + P(A') \quad \text{(by Axiom III)}$$
and it follows that $P(A') = 1 - P(A)$.

Theorem 2.3
If A and B are events in S and $A \subseteq B$, then $P(A) \le P(B)$.

Proof
Since $A \subseteq B$, we can write
$$B = A \cup (A' \cap B)$$
as can easily be verified by means of a Venn diagram. Then, since A and $A' \cap B$ are mutually exclusive, we get
$$P(B) = P(A) + P(A' \cap B) \quad \text{(by Axiom III)}$$
$$\ge P(A) \quad \text{(by Axiom I)}$$
In words, this theorem states that if A is a subset of B, then P(A) cannot be greater than P(B).
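The first three theorems can be checked numerically on a small discrete space. The following minimal Python sketch reuses the loaded die of Example 2.8; the helper `prob` and the event `A = {5, 6}` are assumptions chosen only for illustration and do not appear in the original text.

```python
from fractions import Fraction

# Loaded die of Example 2.8: each odd face has probability 2/9, each even face 1/9.
p = {face: Fraction(2, 9) if face % 2 else Fraction(1, 9) for face in range(1, 7)}

def prob(event):
    """P(event) as the sum of the probabilities of its individual outcomes."""
    return sum((p[e] for e in event), Fraction(0))

S = set(p)       # sample space {1, ..., 6}
G = {4, 5, 6}    # a number greater than 3
A = {5, 6}       # a subset of G, chosen only for illustration

print(prob(S))                      # 1    (Axiom II)
print(prob(set()))                  # 0    (Theorem 2.1)
print(prob(G))                      # 4/9  (Example 2.8)
print(prob(S - G) == 1 - prob(G))   # True (Theorem 2.2)
print(prob(A) <= prob(G))           # True (Theorem 2.3)
```

Exact fractions are used instead of floating-point numbers so that the equalities hold without rounding error.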
Theorem 2.4
For any event A in S, $0 \le P(A) \le 1$.

Proof
Using Theorem 2.3 and the fact that $\phi \subseteq A \subseteq S$ for any event A in S, we have
$$P(\phi) \le P(A) \le P(S)$$
Then $P(\phi) = 0$ and $P(S) = 1$ lead to the result that $0 \le P(A) \le 1$.

Theorem 2.5
If A and B are any two events in S, then the probability of occurrence of A and non-occurrence of B is given by
$$P(A \cap B') = P(A \setminus B) = P(A) - P(A \cap B)$$

Proof
The approach is to express the event A as a union of mutually exclusive events. From set properties we have
$$A = (A \cap B') \cup (A \cap B)$$
The events $A \cap B'$ and $A \cap B$ are mutually exclusive, so that
$$P(A) = P(A \cap B') + P(A \cap B)$$
Hence the result.

Theorem 2.6
If A and B are any two events in a sample space S, then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

Proof
Expressing the event $A \cup B$ as a union of mutually exclusive events,
$$A \cup B = (A \cap B') \cup B$$
The events $A \cap B'$ and B are mutually exclusive, so that
$$P(A \cup B) = P(A \cap B') + P(B) = [P(A) - P(A \cap B)] + P(B) = P(A) + P(B) - P(A \cap B)$$

By repeatedly using the formula of Theorem 2.6, we can generalize this addition rule so that it applies to any number of events. For instance, for three events we get
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

Example 2.10
The probability that Jaillan Gabr passes mathematics is 0.75, and the probability that she passes English is 0.85. If the probability of passing at least one course is 0.9, what is the probability that Jaillan will pass both courses?

Solution
If M is the event "passing mathematics" and E the event "passing English", then by the addition rule we have
$$P(M \cup E) = P(M) + P(E) - P(M \cap E)$$
$$0.9 = 0.75 + 0.85 - P(M \cap E)$$
from which the probability that Jaillan will pass both courses is $P(M \cap E) = 0.7$.

Questions: What is the probability that Jaillan will fail both courses? What is the probability that Jaillan will pass one and only one course?

2.6 Conditional Probability

A major objective of probability modeling is to determine how likely it is that an event A will occur when a certain experiment is performed.
However, there are numerous cases in which the probability assigned to A will be affected by knowledge of the occurrence or nonoccurrence of another event B. In such cases we will use the terminology "conditional probability of A given B", and the notation $P(A/B)$ will be used to distinguish between this new concept and the ordinary probability P(A).

As an illustration, consider the event B of getting a 4 when a fair die is tossed. Based on the sample space S = {1,2,3,4,5,6}, the probability of B occurring is 1/6. Now suppose that it is known that the toss of the die resulted in a number greater than 3. We are now dealing with a reduced sample space A = {4,5,6}, which is a subset of S. Relative to the sample space A, we find that the probability that B occurs is 1/3. This example illustrates that events may have different probabilities when considered relative to different sample spaces, and we would write
$$P(B/A) = \frac{1}{3},$$
which can also be written
$$P(B/A) = \frac{1}{3} = \frac{1/6}{3/6} = \frac{P(A \cap B)}{P(A)}$$
where $P(A \cap B)$ and $P(A)$ are found from the original sample space S. In other words, a conditional probability relative to a subspace A of S may be calculated directly from the sample space S itself.

Definition
If A and B are any two events in S and P(A) > 0, the conditional probability of B given A is
$$P(B/A) = \frac{P(A \cap B)}{P(A)} \qquad (2.3)$$

Example 2.11
A box contains 100 microchips, some of which were produced by factory 1 and the rest by factory 2. Some of the microchips are defective and some are good (non-defective). An experiment consists of choosing one microchip at random from the box and testing whether it is good or defective. Let A be the event "obtaining a defective microchip" and B the event "the microchip was produced by factory 1"; consequently, $A'$ is the event "obtaining a good microchip" and $B'$ is the event "the microchip was produced by factory 2".
            B     B'    Totals
A          15      5      20
A'         45     35      80
Totals     60     40     100

Table 2.1 Numbers of defective and nondefective microchips from two different factories

Table 2.1 gives the number of microchips in each category. The probability of obtaining a defective microchip is
$$P(A) = \frac{n(A)}{n(S)} = \frac{20}{100} = 0.20$$
Now suppose that each microchip has a number stamped on it that identifies which factory produced it. Thus, prior to testing whether it is defective, it can be determined whether B has occurred (produced by factory 1) or $B'$ has occurred (produced by factory 2). Knowing which factory produced the microchip affects the likelihood that a defective microchip is selected, and the use of conditional probability is appropriate. For example, if the event B has occurred, then the only microchips we should consider are those in the first column of Table 2.1, and the total number is n(B) = 60. Furthermore, the only defective chips to consider are those in both the first column and the first row, and the total number is $n(A \cap B) = 15$. Thus, the conditional probability of A given B is
$$P(A/B) = \frac{n(A \cap B)}{n(B)} = \frac{15}{60} = 0.25$$
Notice that if we divide both the numerator and the denominator by n(S) = 100, we can express the conditional probability in terms of ordinary unconditional probabilities,
$$P(A/B) = \frac{n(A \cap B)/n(S)}{n(B)/n(S)} = \frac{P(A \cap B)}{P(B)}$$

Example 2.12
The probability that a student, selected at random from a certain college, will pass economics is 0.8, and the probability that the student will pass both economics and religion is 0.5. What is the probability that the student will pass religion if it is known that he or she has passed economics?
Solution
If E is the event "passing economics" and R the event "passing religion", then
$$P(R/E) = \frac{P(E \cap R)}{P(E)} = \frac{0.5}{0.8} = \frac{5}{8} = 0.625$$

Multiplication Rule

If we multiply the expressions on both sides of (2.3) by P(A), we obtain the following multiplication rule.

Theorem 2.7
If A and B are two events in S with P(A) > 0 and P(B) > 0, then
$$P(A \cap B) = P(A) \, P(B/A) = P(B) \, P(A/B) \qquad (2.4)$$

Example 2.13
If we randomly pick two television tubes in succession from a shipment of 240 television tubes of which 15 are defective, what is the probability that they will both be defective?

Solution
If we assume equal probabilities for each selection (which is what we mean by "randomly" picking the tubes), the probability that the first tube will be defective is 15/240, and the probability that the second tube will be defective, given that the first tube is defective, is 14/239. Thus, the probability that both tubes will be defective is
$$\frac{15}{240} \cdot \frac{14}{239} = \frac{7}{1912}$$
This assumes that we are sampling without replacement, namely, that the first tube is not replaced before the second tube is selected.

Theorem 2.7 can easily be generalized so that it applies to more than two events; for instance, for three events we have the following.

Corollary
If A, B, and C are any three events in a sample space S such that $P(A \cap B) > 0$, then
$$P(A \cap B \cap C) = P(A) \cdot P(B/A) \cdot P(C / A \cap B)$$

Proof
Writing $A \cap B \cap C$ as $(A \cap B) \cap C$ and using the formula of Theorem 2.7 twice, we get
$$P(A \cap B \cap C) = P[(A \cap B) \cap C] = P(A \cap B) \cdot P(C / A \cap B) = P(A) \cdot P(B/A) \cdot P(C / A \cap B)$$

Example 2.14
A box of fuses contains 20 fuses, of which 5 are defective. If 3 of the fuses are selected at random and removed from the box in succession without replacement, what is the probability that all three fuses are defective?
Solution
If $D_1$ is the event that the first fuse is defective, $D_2$ is the event that the second fuse is defective, and $D_3$ is the event that the third fuse is defective, then
$$P(D_1) = \frac{5}{20}, \quad P(D_2 / D_1) = \frac{4}{19}, \quad P(D_3 / D_2 \cap D_1) = \frac{3}{18}$$
and substitution into the formula yields
$$P(D_1 \cap D_2 \cap D_3) = \frac{5}{20} \cdot \frac{4}{19} \cdot \frac{3}{18} = \frac{1}{114}$$
Further generalization of Theorem 2.7 and its corollary to k events is straightforward, and the resulting formula can be proved by mathematical induction.

2.7 Total Probability and Bayes' Theorem

There are many situations where the outcome of an experiment depends on what happens in various intermediate stages. Suppose, in general, that the intermediate stage permits k different alternatives (whose occurrence is denoted by $B_1, B_2, \ldots, B_k$). The events $B_1, B_2, \ldots, B_k$, in this case, constitute a partition of the sample space S (i.e. the B's are pairwise mutually exclusive and their union equals S). To find the probability of occurrence of an event A that can occur with one of the B's, the following theorem is required, sometimes called the law or the rule of total probability.

Theorem 2.8 : Total Probability
If the events $B_1, B_2, \ldots, B_k$ constitute a partition of the sample space S and $P(B_i) > 0$ for i = 1, 2, ..., k, then for any event A in S
$$P(A) = \sum_{i=1}^{k} P(B_i) \cdot P(A / B_i) \qquad (2.5)$$

Proof
Since the events $B_1, B_2, \ldots, B_k$ are exhaustive (i.e. their union equals S), then
$$A = A \cap S = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_k)$$
Since the events $B_1, B_2, \ldots, B_k$ are mutually exclusive, the events $A \cap B_1, A \cap B_2, \ldots, A \cap B_k$ are also mutually exclusive. It follows that
$$P(A) = \sum_{i=1}^{k} P(A \cap B_i)$$
and the theorem results from applying (2.4) to each term in this summation.

Example 2.15
The members of a consulting firm rent cars from three rental agencies: 60% from agency 1, 30% from agency 2, and 10% from agency 3.
If 9% of the cars from agency 1 need a tune-up, 20% of the cars from agency 2 need a tune-up, and 6% of the cars from agency 3 need a tune-up, what is the probability that a rental car delivered to the firm will need a tune-up?

Solution
If A is the event that the car needs a tune-up, and $B_1$, $B_2$, and $B_3$ are the events that the car comes from rental agencies 1, 2, or 3, we have $P(B_1) = 0.60$, $P(B_2) = 0.30$, $P(B_3) = 0.10$, and $P(A/B_1) = 0.09$, $P(A/B_2) = 0.20$, $P(A/B_3) = 0.06$. Substituting these values into the formula of Theorem 2.8, we get
$$P(A) = (0.60)(0.09) + (0.30)(0.20) + (0.10)(0.06) = 0.12$$
Thus, 12% of all the rental cars delivered to this firm will need a tune-up.

Tree diagram

It is sometimes helpful to illustrate this result with a tree diagram, given in Figure 2.1. The probability associated with branch $B_i$ is $P(B_i)$, and the probability associated with each branch labeled A is a conditional probability $P(A/B_i)$, which may be different depending on which branch $B_i$ it follows. In order for A to occur, it must occur jointly with one and only one of the events $B_i$.

[Figure 2.1: Tree diagram. Branches $B_1, B_2, \ldots, B_k$ leave the root, and each is followed by a branch labeled A.]

With reference to the preceding example, suppose that we are interested in the following question: if a rental car delivered to the consulting firm needs a tune-up, what is the probability that it came from rental agency 2? To answer questions of this kind, we need the following theorem, called Bayes' theorem.

Theorem 2.9 : Bayes' Formula
If $B_1, B_2, \ldots, B_k$ constitute a partition of the sample space S and $P(B_i) > 0$ for i = 1, 2, ..., k, then for any event A in S such that P(A) > 0,
$$P(B_r / A) = \frac{P(B_r) \cdot P(A / B_r)}{\sum_{i=1}^{k} P(B_i) \cdot P(A / B_i)} \quad \text{for } r = 1, 2, \ldots, k \qquad (2.6)$$
In words, the probability that event A was reached via the rth branch of the tree diagram of Figure 2.1, given that it was reached via one of its k branches, is the ratio of the probability associated with the rth branch to the sum of the probabilities associated with all k branches of the tree.
Proof
Writing
$$P(B_r / A) = \frac{P(A \cap B_r)}{P(A)}$$
in accordance with the definition of conditional probability, we have only to substitute $P(B_r) \cdot P(A / B_r)$ for $P(A \cap B_r)$ and the formula (2.5) for $P(A)$.

Example 2.16
With reference to Example 2.15, if a rental car delivered to the consulting firm needs a tune-up, what is the probability that it came from rental agency 2?

Solution
Substituting the probabilities given in Example 2.15 into the formula of Theorem 2.9, we get
$$P(B_2 / A) = \frac{(0.30)(0.20)}{(0.60)(0.09) + (0.30)(0.20) + (0.10)(0.06)} = \frac{0.060}{0.120} = 0.5$$
Observe that although only 30% of the cars delivered to the firm come from agency 2, 50% of those requiring a tune-up come from that agency.

Example 2.17
We are given three similar boxes of microchips as follows: box $B_1$ contains 20 microchips of which 5 are defective, box $B_2$ contains 35 microchips of which 7 are defective, and box $B_3$ contains 40 microchips of which 5 are defective. A box is selected at random, and then a microchip is selected at random from that box. If the microchip obtained is defective, find the probability that it came from box 2.

Solution
Define the events $B_1$, $B_2$ and $B_3$ to be, respectively, choosing "box 1", "box 2" and "box 3", and let A be the event "obtaining a defective microchip". Thus, we have
$$P(B_1) = P(B_2) = P(B_3) = 1/3$$

[Tree diagram: first stage Box, second stage Microchip. Each of $B_1$, $B_2$, $B_3$ is reached with probability 1/3. From $B_1$ the branches are A with probability 5/20 and $A'$ with probability 15/20; from $B_2$, A with 7/35 and $A'$ with 28/35; from $B_3$, A with 5/40 and $A'$ with 35/40.]

Therefore
$$P(A) = \frac{1}{3} \cdot \frac{5}{20} + \frac{1}{3} \cdot \frac{7}{35} + \frac{1}{3} \cdot \frac{5}{40} = \frac{23}{120}$$
Now, applying Bayes' formula (2.6) with k = 3 and r = 2, the probability that the microchip came from box 2, given that it is defective, is
$$P(B_2 / A) = \frac{P(B_2) \, P(A / B_2)}{P(A)} = \frac{\frac{1}{3} \cdot \frac{7}{35}}{\frac{23}{120}} = \frac{8}{23} \approx 0.348$$
The tree diagram, given above, describes this process and gives the probability of each branch of the tree.
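The computations of Examples 2.15 to 2.17 follow one pattern, which can be sketched in Python as below. This is an illustrative sketch rather than part of the original text; the function names `total_probability` and `bayes` are hypothetical, and exact fractions are used so the results match the worked answers.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum over i of P(B_i) * P(A/B_i) for a partition B_1, ..., B_k."""
    return sum(pb * pa for pb, pa in zip(priors, likelihoods))

def bayes(priors, likelihoods, r):
    """P(B_r/A) for the r-th event of the partition (r counted from 1)."""
    return priors[r - 1] * likelihoods[r - 1] / total_probability(priors, likelihoods)

# Examples 2.15 and 2.16: rental agencies and tune-ups.
agency_priors = [Fraction(60, 100), Fraction(30, 100), Fraction(10, 100)]
tune_up = [Fraction(9, 100), Fraction(20, 100), Fraction(6, 100)]
print(total_probability(agency_priors, tune_up))  # 3/25, i.e. 0.12
print(bayes(agency_priors, tune_up, 2))           # 1/2

# Example 2.17: three boxes of microchips, each chosen with probability 1/3.
box_priors = [Fraction(1, 3)] * 3
defective = [Fraction(5, 20), Fraction(7, 35), Fraction(5, 40)]
print(total_probability(box_priors, defective))   # 23/120
print(bayes(box_priors, defective, 2))            # 8/23
```

The denominator of Bayes' formula is exactly the total probability P(A), which is why `bayes` simply calls `total_probability`.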
2.8 Independent Events

Informally speaking, two events A and B are said to be independent if the occurrence or nonoccurrence of either one does not affect the probability of the occurrence of the other. Symbolically, two events A and B are independent if
$$P(B/A) = P(B) \quad \text{and} \quad P(A/B) = P(A)$$
and it can be shown that either of these equalities implies the other when both of the conditional probabilities exist, namely, when neither P(A) nor P(B) equals zero. Now, if we substitute P(B) for P(B/A) in formula (2.4), we get
$$P(A \cap B) = P(A) \, P(B/A) = P(A) \, P(B)$$
and this is used as the formal definition of independence.

Definition
Two events A and B are independent if and only if
$$P(A \cap B) = P(A) \, P(B)$$
Otherwise, A and B are dependent.

It can be shown that if A and B are events such that P(A) > 0 and P(B) > 0, then A and B are independent if and only if either of the following holds:
$$P(B/A) = P(B) \quad \text{or} \quad P(A/B) = P(A)$$

Example 2.18
A fair coin is tossed three times. If A is the event that a head occurs on each of the first two tosses, B is the event that a tail occurs on the third toss, and C is the event that exactly two tails occur in the three tosses, show that
a- events A and B are independent;
b- events B and C are dependent.

Solution
Since the coin is fair, the eight possible outcomes, HHH, HHT, HTH, THH, HTT, THT, TTH, and TTT, are equally likely, and
A = {HHH, HHT}
B = {HHT, HTT, THT, TTT}
C = {HTT, THT, TTH}
$A \cap B$ = {HHT}
$B \cap C$ = {HTT, THT}
The assumption that the eight possible outcomes are all equiprobable yields P(A) = 1/4, P(B) = 1/2, P(C) = 3/8, $P(A \cap B) = 1/8$ and $P(B \cap C) = 1/4$.
a- Since $P(A) \cdot P(B) = \frac{1}{4} \cdot \frac{1}{2} = \frac{1}{8} = P(A \cap B)$, the events A and B are independent.
b- Since $P(B) \cdot P(C) = \frac{1}{2} \cdot \frac{3}{8} = \frac{3}{16} \ne P(B \cap C)$, the events B and C are not independent.

In connection with the definition of independence given above, it can be shown that if A and B are independent, then so are A and $B'$, $A'$ and B, and $A'$ and $B'$.
For instance:

Theorem 2.10
If A and B are independent, then A and $B'$ are also independent.

Proof
Since $A = (A \cap B) \cup (A \cap B')$, the events $A \cap B$ and $A \cap B'$ are mutually exclusive, and A and B are independent by assumption, we have
$$P(A) = P[(A \cap B) \cup (A \cap B')] = P(A \cap B) + P(A \cap B') = P(A) \cdot P(B) + P(A \cap B')$$
It follows that
$$P(A \cap B') = P(A) - P(A) \cdot P(B) = P(A) \cdot [1 - P(B)] = P(A) \cdot P(B')$$
and hence that A and $B'$ are independent.

The reader is asked to show that if A and B are independent, then $A'$ and B are independent and so are $A'$ and $B'$, and that if A and B are dependent, then A and $B'$ are dependent.

To extend the concept of independence to more than two events, let us make the following definition.

Definition
Events $A_1, A_2, \ldots, A_k$ are independent if and only if the probability of the intersection of any 2, 3, ..., or k of these events equals the product of their respective probabilities.

For three events A, B, and C, for example, independence requires that
$$P(A \cap B) = P(A) \cdot P(B)$$
$$P(A \cap C) = P(A) \cdot P(C)$$
$$P(B \cap C) = P(B) \cdot P(C)$$
and
$$P(A \cap B \cap C) = P(A) \cdot P(B) \cdot P(C)$$
It is of interest to note that three or more events can be pairwise independent without being independent.

A common example of dependent events occurs in connection with repeated sampling without replacement. In Example 2.14 we considered the results of drawing fuses in succession. Suppose that, instead, the first fuse is tested and then replaced in the box before the second draw is made, and the second fuse is replaced in the box before the third draw is made. This type of sampling is referred to as sampling with replacement, and it would be reasonable to assume that the draws are independent trials. In this case, writing A, B, and C for the events that the first, second, and third fuses drawn are defective, the probability that all three fuses are defective is
$$P(A \cap B \cap C) = P(A) \cdot P(B) \cdot P(C) = \frac{5}{20} \cdot \frac{5}{20} \cdot \frac{5}{20} = \frac{1}{64}$$
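The independence checks of Example 2.18 and the two fuse-sampling schemes can be reproduced with a short Python sketch. It is offered only as an illustration under the equally-likely assumption of the text; the helper names `prob` and `independent` are hypothetical and not part of the original.

```python
from fractions import Fraction
from itertools import product

# Eight equally likely outcomes of three tosses of a fair coin (Example 2.18).
S = ["".join(seq) for seq in product("HT", repeat=3)]

def prob(event):
    """Classical probability n(event)/N on the equally likely space S."""
    return Fraction(len(event), len(S))

def independent(E, F):
    """Check the defining relation P(E and F) = P(E) * P(F)."""
    return prob(E & F) == prob(E) * prob(F)

A = {s for s in S if s[:2] == "HH"}        # heads on each of the first two tosses
B = {s for s in S if s[2] == "T"}          # tail on the third toss
C = {s for s in S if s.count("T") == 2}    # exactly two tails

print(independent(A, B))  # True
print(independent(B, C))  # False

# Three defective fuses drawn from a box of 20 fuses, 5 of them defective.
without_replacement = Fraction(5, 20) * Fraction(4, 19) * Fraction(3, 18)
with_replacement = Fraction(5, 20) ** 3
print(without_replacement, with_replacement)  # 1/114 1/64
```

Without replacement, each draw changes the composition of the box, so the factors differ; with replacement the factors are identical, which is what independence of the draws amounts to here.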