
ELEC2600_Lecture_Part_I.pdf


Full Transcript


ELEC 2600 Probability and Random Processes in Engineering, Fall 2024 Semester

Teaching Team
Instructor: Prof. Ling PAN, Office: 2447, Email: [email protected]
Teaching Associate: MY CHANG, Office: 2395, Email: [email protected]
Teaching Assistants: Haoran HE ([email protected]), Sijia LI ([email protected]), Hang ZHAO ([email protected])

Lecture 1: Course Introduction

Elec 2600: Lecture 1
❑ Course Details
❑ Models in Engineering
o Their role
o Types of models
❑ Relative Frequency
❑ Applications of Probability
❑ An Interesting Problem – Winning an iPad

Course Details
❑ Course resources, all available through Canvas:
o Description
o Schedule
o Syllabus
o Lecture notes
o Video links
o Grading policy
o Contact information
❑ Textbook: Probability, Statistics and Random Processes for Electrical Engineering, Alberto Leon-Garcia, Addison Wesley, 3rd ed., 2009.

Course Details
❑ Course Structure
o Part I: Basic probability theory (~2-3 weeks)
o Part II: Single random variables (~3-4 weeks)
o Part III: Multiple random variables (~5 weeks)
o Part IV: Stochastic processes (~3 weeks)
❑ Objectives of the course
o Basic concepts of probability theory required to understand probability models used in engineering.
o Basic techniques required to develop probability models.

ELEC 2600, Part I: Basic Probability Theory
Lecture 1: Course Introduction
Lecture 2: Build a Probability Model
Lecture 3: Conditional Probability & Independence
Lecture 4: Sequential Experiments
Out-of-Class Reading: Counting Methods

Modelling
❑ In engineering, as in daily life, we are faced with choices.
❑ Our decisions are based on our belief – our model – of how things will behave.
❑ A model is an approximate representation of a physical situation that attempts to explain observed behavior.
❑ Good models:
o are simple and understandable
o predict all relevant aspects of the situation
❑ Models can help us avoid costly (in time and money) experimentation.

Mathematical and Computer Simulation Models
❑ Mathematical model
o A set of assumptions about how a system or physical process works.
o Assumptions are stated in the form of mathematical relations between the important parameters and variables.
o By solving these relations, we can predict the outcome of an experiment, e.g. V = IR.
o The simpler the model, the easier it is to solve, understand and analyze.
o However, simple models may not accurately describe the actual system of interest.
❑ Computer simulation model
o Assumptions are stated in the form of a computer program.
o By running the program, we can predict the outcome of an experiment, e.g. SPICE.
o Computer models can represent systems in greater detail and more accurately predict performance, but are generally more difficult to analyze precisely.

Deterministic versus Probabilistic Models
❑ Deterministic Model
o The conditions of the experiment determine the exact outcome.
o If I do A, then B will happen.
o Example: Ohm's Law (V = IR), Kirchhoff's current and voltage laws.
o In practice, there will always be random deviations if you repeat the same experiment twice, but a deterministic model is OK if these are small.
❑ Probabilistic Model
o The conditions of the experiment determine the probabilities of the different outcomes.
o If I do A, then B will probably happen, but it is possible that C might happen instead.
o Examples: tossing a coin, rolling a die.
o In many cases, probabilistic models are used because we cannot model all the relevant variables required by a deterministic model.

More on Probabilistic Models
❑ Allow designers to model very complex systems.
❑ Nearly all practical systems cannot be described exactly, and have some sort of randomness.
❑ Systems can be designed and optimized "off-line", without physical implementation – saving time and money.
❑ A complex system may involve MANY models!

Other Examples
Search engines, face verification systems, digital cameras, fingerprint scanners, radar systems.

Example: Ball Picking from an Urn
❑ An urn contains 3 balls, labeled 1, 2 and 3.
❑ A ball is selected, its number recorded (the outcome), and the ball returned.
❑ This process is repeated for n (e.g. 100) trials.
o RANDOM experiment: the outcome cannot be exactly determined!

Relative Frequency
❑ Let N1(n), N2(n) and N3(n) be the number of times that we pick balls 1, 2 and 3 in n trials.
❑ Define the relative frequency of the outcome k as f_k(n) = N_k(n)/n.
❑ This experiment exhibits statistical regularity: as n increases, the relative frequency approaches a constant value,
lim_{n→∞} f_k(n) = p_k,
where p_k indicates the probability of outcome k. This provides a key connection between measurement of physical quantities and probability models!

Properties of the Relative Frequency (1)
❑ Since 0 ≤ N_k(n) ≤ n,
0 ≤ f_k(n) = N_k(n)/n ≤ 1.
The relative frequencies are between zero and one.
❑ Let K be the number of possible outcomes. Since Σ_{k=1}^{K} N_k(n) = n,
Σ_{k=1}^{K} f_k(n) = Σ_{k=1}^{K} N_k(n)/n = 1.
The relative frequencies sum to one.

Properties of the Relative Frequency (2)
❑ Events are groupings of the outcomes of an experiment (sets):
o ONE = "the ball picked is labeled 1" = {1}
o NOT_THREE = "the ball picked is not labeled 3" = {1 or 2}
o ODD = "the ball picked is labelled with an odd number" = {1 or 3}
❑ We can often derive the relative frequency of one event from the relative frequencies of other events. Example: since N_ODD(n) = N1(n) + N3(n),
f_ODD(n) = N_ODD(n)/n = (N1(n) + N3(n))/n = f1(n) + f3(n).
❑ More generally, if C = {A or B} and A and B cannot occur simultaneously (mutually exclusive), then
f_C(n) = f_A(n) + f_B(n).
The above three properties coincide with the AXIOMS of probability (discussed next lecture).
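To make statistical regularity concrete, here is a minimal Monte Carlo sketch in Python (my addition, not part of the original slides): it simulates draws from the three-ball urn and shows the relative frequencies settling near p_k = 1/3 while always summing to one.

```python
import random
from collections import Counter

def relative_frequencies(n, outcomes=(1, 2, 3), seed=0):
    """Simulate n draws (with replacement) from the urn and
    return the relative frequency f_k(n) of each outcome k."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(outcomes) for _ in range(n))
    return {k: counts[k] / n for k in outcomes}

for n in (10, 100, 10_000, 1_000_000):
    f = relative_frequencies(n)
    # f_k(n) approaches p_k = 1/3 as n grows; the f_k always sum to 1.
    print(n, {k: round(v, 4) for k, v in f.items()},
          "sum =", round(sum(f.values()), 4))
```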
Applications of Probability

Machine Learning and AI
[Figure: a taxonomy of machine learning, including reinforcement learning (RL).]
Reference: https://whiteswami.wordpress.com/machine-learning/

What is AI?
The science of making machines that:
o Think like people
o Think rationally
o Act like people
o Act rationally

Rational Decisions
❑ We'll use the term rational in a very specific, technical way:
▪ Rational: maximally achieving pre-defined goals
▪ Rationality only concerns what decisions are made (not the thought process behind them)
▪ Goals are expressed in terms of the utility of outcomes
▪ Being rational means maximizing your expected utility

Maximize Your Expected Utility
Examples: NLP – ChatGPT; CV – Stable Diffusion; RL – Robotics.

Recommender Systems
❑ Beneficial to both service providers and users.
❑ Reduce transaction costs of finding and selecting items in an online shopping environment.
❑ Improve the decision-making process and quality.
❑ Enhance revenues; e.g., 35% of Amazon.com's revenue is generated by its recommendation engine.
❑ Three types of recommendation engines:
o Collaborative filtering
o Content-based filtering
o Hybrid recommendation systems

Recommender Systems
❑ Collaborative Filtering (Amazon): from users' behaviors, activities and preferences, compute the Pearson correlation coefficients between users, group users by their similarities, and recommend items that were positively rated by users from the same group.
❑ Content-Based Filtering (LIBRA): collect features extracted from items that the user has evaluated positively; a Naïve Bayes classifier, given sets of features, computes the probability that an item belongs to each class and categorizes the item into the most probable class; recommend a ranked list of items.
❑ Hybrid recommendation systems: combine collaborative filtering and content-based filtering (Netflix).
References:
1. https://www.sciencedirect.com/science/article/pii/S1110866515000341
2. https://www.youtube.com/watch?v=BKCAkHn8jqA
3. https://www.marutitech.com/recommendation-engine-benefits/

Medical Diagnosis
❑ An efficient and accurate medical diagnosis can reduce healthcare costs.
❑ A medical diagnosis system trains itself on patients' data, such as symptoms, blood pressure, etc.
❑ The system can learn the probability distribution of a disease based on patients' data.
❑ For a potential patient, the system computes the probability of having a disease based on that patient's data.
❑ Example diagnosis systems include nephritis (腎炎), heart disease, diabetes, etc.
References:
1. https://arxiv.org/pdf/1803.10019.pdf
2. http://www.zsf.jcu.cz/jab_old/11_2/havel.pdf

More Examples
❑ Casino strategy: bet $1; bet $2 if you lose the first bet, $4 if you lose the second one, and so on.
o What is the winning/losing probability?
❑ Communications over unreliable channels: bit "0" and bit "1" may be interpreted incorrectly, with some error probability.
❑ Prediction of signals: use the available portion of a signal to predict its future, or to analyze and synthesize signals.

More Examples
❑ System reliability: failure probability, average time to failure.
❑ Resource-sharing systems: banks, telephones, etc.
❑ Internet systems: simple client-server, peer-to-peer.
❑ Many, many more...
❑ Indeed, probability theory is at the heart of many technologies we rely on daily.
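For the casino ("martingale") strategy above, a short sketch (my addition, assuming a fair even-money bet and a bankroll that covers at most m doublings): you win $1 unless you lose m times in a row, which is rare but very costly.

```python
# Martingale doubling strategy: start at $1, double after each loss,
# stop at the first win (net gain $1) or after m straight losses.
p_win = 0.5   # probability of winning one even-money bet (assumed fair)
m = 10        # maximum number of bets the bankroll can cover

p_ruin = (1 - p_win) ** m       # probability of losing all m bets in a row
total_loss = 2 ** m - 1         # 1 + 2 + 4 + ... + 2^(m-1)
expected_gain = 1 * (1 - p_ruin) - total_loss * p_ruin

print(f"P[lose streak of {m}] = {p_ruin:.6f}, loss if it happens = ${total_loss}")
print(f"Expected net gain = ${expected_gain:.4f}")  # exactly 0 for a fair game
```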
An Interesting Problem – Winning an iPad

Something to think about (i):
❑ Suppose exactly 1 out of 3 boxes contains an iPad, and the remaining two boxes contain a homework.

Something to think about (ii):
❑ You pick one randomly.

Something to think about (iii):
❑ I take away one of the boxes which you did NOT select, and I open a box containing the homework.

Something to think about (iv):
❑ Will you change your choice? Why / why not?

Something to think about (v):
❑ Case 1: You choose the iPad (probability 1/3). I open one of the boxes with homework. You lose after changing.
❑ Case 2: You choose homework A (probability 1/3). I must open the box with homework B. You win after changing.
❑ Case 3: You choose homework B (probability 1/3). I must open the box with homework A. You win after changing.
❑ Result comparison:
o Original win rate: 1/3
o Win rate after changing: 2/3
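A minimal simulation of this Monty Hall-style game (my addition, not from the slides) confirms the 1/3 vs 2/3 split:

```python
import random

def play(switch, rng):
    """One round: 3 boxes, one prize. The host opens a non-chosen, non-prize box."""
    prize = rng.randrange(3)
    choice = rng.randrange(3)
    opened = next(b for b in range(3) if b != choice and b != prize)
    if switch:
        # Move to the only box that is neither the original pick nor the opened one.
        choice = next(b for b in range(3) if b != choice and b != opened)
    return choice == prize

rng = random.Random(2600)
n = 100_000
print("stay  :", sum(play(False, rng) for _ in range(n)) / n)  # ≈ 1/3
print("switch:", sum(play(True, rng) for _ in range(n)) / n)   # ≈ 2/3
```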
Summary
❑ This lecture we have discussed:
o Overall course details
o The role and types of probability models in engineering
o The concept of relative frequency
o Example: winning an iPad
❑ Next Lecture: Random Experiments and Probability Axioms (more)!

ELEC 2600, Fall 2024 Semester. Lecture 2: Build a Probability Model

Lecture 2: Build a Probability Model
❑ Specifying Random Experiments
o Sample spaces and events
❑ Set Operations (Review)
❑ The Three Axioms of Probability
o Corollaries
❑ Probability Laws for Assigning Probabilities

Specifying Random Experiments
❑ A random experiment is specified by stating an experimental procedure and one or more measurements or observations.
❑ The sample space S is the set of all possible outcomes.
o An outcome is a result that cannot be decomposed into other results. Note that the sample space depends on the observations (compare Experiments 1 and 2, or Experiments 3 and 4).
o S is a discrete sample space if the number of outcomes is countable (maps to the positive integers).
o S is a continuous sample space if S is not countable.
❑ An event specifies certain conditions for an outcome. It can be represented by a subset, A, of the sample space S.
o An event A occurs if the outcome is a member of A.
o The certain event, S, consists of all outcomes.
o The null event, ∅, contains no outcomes.
o An elementary event contains only one outcome.
(Figure from CS188)

Discrete Sample Spaces
❑ Experiment 1: Select a ball from an urn containing balls labeled 1 to 50. Note the number on the ball.
S1 = {1, 2, 3, 4, ..., 49, 50}
A1 = "an even-numbered ball is selected" = {2, 4, 6, ..., 48, 50}
❑ Experiment 2: Select a ball from an urn containing balls labeled 1 to 50. Note the number in the tens place.
S2 = {0, 1, 2, 3, 4, 5}
A2 = "the number is more than 2" = {3, 4, 5}
❑ Experiment 3a: Toss a coin in the air and note which side faces up when it lands. The two sides of a coin are often called "heads" (H) and "tails" (T).
S3a = {H, T}
A3a = "the coin lands with heads up" = {H}
❑ Experiment 3b: Toss a coin three times. Note the sequence of heads/tails.
S3b = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}
A3b = "the three tosses give the same outcome" = {HHH, TTT}

Discrete Sample Spaces
❑ Experiment 4: Toss a coin three times and note the number of heads.
S4 = {0, 1, 2, 3}
A4 = "the number of heads equals the number of tails" = ∅
❑ Experiment 5: A block of information is transmitted repeatedly over a noisy channel until an error-free block arrives at the receiver. Count the number of transmissions required.
S5 = {1, 2, 3, 4, ...}
A5 = "fewer than 10 transmissions are required" = {1, 2, ..., 8, 9}

Continuous Sample Spaces
❑ Experiment 6: Pick a number at random between zero and one.
S6 = {x : 0 ≤ x ≤ 1} = [0, 1]
A6 = "the number is greater than 0.5" = {x : 0.5 < x ≤ 1} = (0.5, 1]
❑ Experiment 7: Measure the time between two typhoons.
S7 = {t : t ≥ 0} = [0, ∞)
A7 = "less than 2 months elapse" = {t : 0 ≤ t < 2} = [0, 2)

Set Theory
❑ We use set operations to construct more complex events.
❑ Interesting articles on the history of set theory:
http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Beginnings_of_set_theory.html
http://www.absoluteastronomy.com/topics/Georg_Cantor
❑ The Father of Set Theory: Georg Cantor (1845-1918)

Set Operations / Set Relationships
❑ Set operations:
o Union A∪B: "A, B, or both occur"
o Intersection A∩B: "both A and B occur"
o Complement Aᶜ (or Ā): "A does not occur"
❑ Set relationships:
o Equality: A = B if they contain the same outcomes
o Mutual exclusivity: A∩B = ∅, "if A occurs, then B does not"
o Subset or containment: A ⊂ B, "if A occurs, then B occurs"

Example
Let A = {x | 2 ≤ x ≤ 5}, B = {x | 4 ≤ x ≤ 8}, C = {x | 7 ≤ x ≤ 10}.
A∪B = {x | 2 ≤ x ≤ 8}
A∪C = {x | 2 ≤ x ≤ 5 or 7 ≤ x ≤ 10}
A∩B = {x | 4 ≤ x ≤ 5}
A∩C = ∅, so A and C are mutually exclusive; A and B are not mutually exclusive.

Useful Properties of Set Operations
❑ Commutative Properties: A∪B = B∪A and A∩B = B∩A
❑ Associative Properties: A∪(B∪C) = (A∪B)∪C and A∩(B∩C) = (A∩B)∩C
❑ Distributive Properties: A∪(B∩C) = (A∪B)∩(A∪C) and A∩(B∪C) = (A∩B)∪(A∩C)
❑ De Morgan's Rules: (A∪B)ᶜ = Aᶜ∩Bᶜ and (A∩B)ᶜ = Aᶜ∪Bᶜ

Set Operations: Examples
Using A, B and C above, the distributive property A∩(B∪C) = (A∩B)∪(A∩C) can be checked on the number line: B∪C = {x | 4 ≤ x ≤ 10}, so both sides equal {x | 4 ≤ x ≤ 5}.
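These identities are easy to sanity-check with Python's built-in set type; a small sketch (my addition) using a finite analogue of the example above:

```python
U = set(range(2, 11))                  # a small universe {2, ..., 10}
A = {x for x in U if 2 <= x <= 5}
B = {x for x in U if 4 <= x <= 8}
C = {x for x in U if 7 <= x <= 10}

def comp(S):
    """Complement relative to the universe U."""
    return U - S

# De Morgan's rules
assert comp(A | B) == comp(A) & comp(B)
assert comp(A & B) == comp(A) | comp(B)

# Distributive properties
assert A | (B & C) == (A | B) & (A | C)
assert A & (B | C) == (A & B) | (A & C)

# Mutual exclusivity: A and C share no outcomes
assert A & C == set()
print("all set identities verified")
```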
Axiomatic Approach to Probability
❑ Assume a random experiment with sample space S. A probability law is a rule that assigns to each event A a number P[A] that satisfies:
o Axiom I: 0 ≤ P[A]
o Axiom II: P[S] = 1
o Axiom III: if A∩B = ∅, then P[A∪B] = P[A] + P[B]
❑ We occasionally need a more general form of Axiom III: if A1, A2, A3, ... is a sequence of events such that Ai∩Aj = ∅ for all i ≠ j, then
P[⋃_{k=1}^{∞} A_k] = Σ_{k=1}^{∞} P[A_k].

Mass Analogy
❑ Probability has attributes similar to physical mass, and the axioms can be interpreted using this analogy.
❑ Axiom I (0 ≤ P[A]): the probability (mass) of any event (object) is non-negative.
❑ Axiom II (P[S] = 1): the total probability (mass) is always equal to one.
❑ Axiom III (if A∩B = ∅, then P[A∪B] = P[A] + P[B]): the total probability (mass) of two disjoint events (objects) is equal to the sum of the individual probabilities (masses).

Corollary 1: P[Aᶜ] = 1 − P[A]
Proof: Since A∩Aᶜ = ∅, Axiom III gives P[A∪Aᶜ] = P[A] + P[Aᶜ]. Since A∪Aᶜ = S, Axiom II gives P[A∪Aᶜ] = P[S] = 1. Combining these equations, P[A] + P[Aᶜ] = 1, so P[Aᶜ] = 1 − P[A].

Corollary 2: P[A] ≤ 1
Proof: By Corollary 1, P[A] = 1 − P[Aᶜ]. By Axiom I, P[Aᶜ] ≥ 0. Therefore P[A] ≤ 1.

Corollary 3: P[∅] = 0
Proof: Let A = S, so that Aᶜ = ∅, in Corollary 1: P[∅] = 1 − P[S] = 0.

Corollary 4: If A1, A2, ..., An are pairwise mutually exclusive, then P[⋃_{k=1}^{n} A_k] = Σ_{k=1}^{n} P[A_k] for n ≥ 2.
Proof: By mathematical induction. By Axiom III, the statement is true for n = 2. Assume the statement is true for some n ≥ 2, and prove it for n+1:
P[⋃_{k=1}^{n+1} A_k] = P[(⋃_{k=1}^{n} A_k) ∪ A_{n+1}] = P[⋃_{k=1}^{n} A_k] + P[A_{n+1}]
by Axiom III, since the distributive law and pairwise mutual exclusion give (⋃_{k=1}^{n} A_k) ∩ A_{n+1} = ∅. By the induction assumption, this equals Σ_{k=1}^{n} P[A_k] + P[A_{n+1}] = Σ_{k=1}^{n+1} P[A_k].

Corollary 5: For any events A and B, P[A∪B] = P[A] + P[B] − P[A∩B]
Proof: Decompose A∪B into the three disjoint pieces A∩Bᶜ, A∩B and B∩Aᶜ. By Axiom III,
P[A] = P[A∩Bᶜ] + P[A∩B] and P[B] = P[B∩Aᶜ] + P[A∩B].
Thus P[A] + P[B] = P[A∩Bᶜ] + P[A∩B] + P[B∩Aᶜ] + P[A∩B] = P[A∪B] + P[A∩B] (by Corollary 4), which rearranges to the result.

Corollary 6: If A ⊂ B, then P[A] ≤ P[B]
Proof: B can be split into two mutually exclusive parts: B = A ∪ (Aᶜ∩B). By Axiom III, P[B] = P[A] + P[Aᶜ∩B]. By Axiom I, P[Aᶜ∩B] ≥ 0. Thus P[A] ≤ P[B].

Probability Laws
❑ The axioms and corollaries provide a set of rules for computing the probabilities of events in terms of other events (mathematical manipulation).
❑ Note that any assignment of probabilities to events that satisfies the three axioms can be manipulated in this way. It is up to the engineer to determine a "reasonable" or "accurate" assignment of probabilities.
❑ Assigning probabilities to some sub-class of events gives you enough information to determine the probability of any event (by manipulating the axioms and corollaries).

Finite Discrete Sample Spaces
❑ In experiments with a finite discrete sample space, a probability law can be specified by giving the probabilities of the elementary events (events containing only one outcome).
❑ One commonly used probability law is the case of equally likely (equiprobable) outcomes:
o Assume the sample space has n elements: S = {a1, a2, ..., an}.
o Assign each elementary event probability 1/n: P[{a_i}] = 1/n for all i.
❑ In this case, the probability of any event that contains k outcomes is k/n. Thus the probability of an event can be computed by counting the number of outcomes in it. For example, the probability of rolling an even number with a fair die is P[{2, 4, 6}] = 3/6.
❑ Equiprobable outcomes are often said to be chosen "at random."
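For finite equiprobable spaces, the counting rule P[A] = |A|/|S| is one line of code. A small sketch (my addition, using exact fractions); it also previews Example 2.9 below:

```python
from fractions import Fraction

def prob(event, sample_space):
    """P[A] = |A| / |S| when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

die = set(range(1, 7))
print(prob({2, 4, 6}, die))   # 1/2: probability of rolling an even number

# The urn of Example 2.9 below: balls numbered 0..9
urn = set(range(10))
A = {x for x in urn if x % 2 == 1}             # odd
B = {x for x in urn if x % 3 == 0 and x > 0}   # multiple of 3: {3, 6, 9}
print(prob(A | B, urn))       # 3/5, consistent with Corollary 5
```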
Example 2.9 (page 49)
❑ An urn contains 10 identical balls numbered 0, 1, 2, ..., 9. A random experiment involves selecting a ball at random from the urn and noting the number on the ball.
❑ Consider the following events:
o A = "ball number selected is odd"
o B = "ball number selected is a multiple of 3"
o C = "ball number selected is less than 5"
Q: What is P[A∪B]?
❑ Since
o A = {1, 3, 5, 7, 9}; P[A] = 5/10
o B = {3, 6, 9}; P[B] = 3/10
o C = {0, 1, 2, 3, 4}; P[C] = 5/10
❑ By Corollary 5,
P[A∪B] = P[A] + P[B] − P[A∩B] = 5/10 + 3/10 − 2/10 = 6/10,
since A∩B = {3, 9} gives P[A∩B] = 2/10.

Example 2.10 (page 50)
❑ Suppose a coin is tossed three times and we observe the sequence of heads and tails:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
If we assume equiprobable outcomes, then P["2 heads in 3 tosses"] = P[{HHT, HTH, THH}] = 3/8.
❑ Suppose a coin is tossed three times and we observe the number of heads:
S = {0, 1, 2, 3}
If we assume equiprobable outcomes, then P["2 heads in 3 tosses"] = P[{2}] = 1/4.
❑ Both are mathematically valid, but one model is more realistic...

Continuous Sample Spaces
❑ For continuous sample spaces, there are an uncountably infinite number of outcomes.
❑ For the real line, the events of interest are intervals and their complements, unions and intersections:
A = [c, d] = {X : c ≤ X ≤ d}
❑ Examples:
o Queuing time for a data packet (network)
o Received signal power on your mobile phone / WiFi
o Number of minutes a student is early / late (?) to class
o What about: the number of students that show up to class?
o Can you think of more examples?

Equiprobable Outcomes on the Real Line
❑ Consider an experiment where the outcome is real-valued between a and b: S = [a, b].
❑ A probability law that captures the idea of equiprobable outcomes is one where the probability of an event that is a subinterval of S is equal to the length of the subinterval divided by the length of S, i.e.
if A = [c, d], then P[A] = (d − c)/(b − a).
❑ It is easy to verify that the three axioms are satisfied.
❑ Note that the probability of any particular outcome is zero:
P[{x0}] = P[[x0, x0]] = (x0 − x0)/(b − a) = 0.

Next Lecture: Conditional Probability & Independence

ELEC 2600, Lecture 3: Conditional Probability & Independence

Elec 2600: Lecture 3
❑ Conditional Probability
o Properties
o Total Probability Theorem
o Bayes' Rule
❑ Independence
❑ Example: Medical Testing
Thus far, we have looked at the probability of events occurring "individually", without regard to any other event. However, if we KNOW that a particular event occurs, how does this change the probabilities of other events?

Pizza Time
❑ You are blindfolded and choose a slice of pizza. What is the probability that it contains mushrooms?
❑ Suppose your friend tells you that your slice contains pepperoni. What is the probability that it has mushrooms?
(Journal of Statistics Education v.6, n.1 (1998))

Conditional Probability
❑ The conditional probability of event A given an event B is defined as
P[A|B] = P[A∩B]/P[B],
where we assume that P[B] > 0.
❑ The conditional probability expresses how knowledge that an event B has occurred alters the probability that an event A occurs.
o P[A] is the a priori probability (before we know B occurs).
o P[A|B] is the a posteriori probability (after we know B occurs).
o If P[A|B] > P[A], our belief that A has occurred has increased; if P[A|C] < P[A], our belief that A has occurred has decreased.

Relative Frequency Interpretation
❑ If we interpret probability as relative frequency, then P[A|B] is the relative frequency of the event A∩B in experiments where B occurred.
❑ Suppose the experiment is performed n times, the event B occurs n_B times, and the event A∩B occurs n_{A∩B} times. The relative frequency of interest is
n_{A∩B}/n_B = (n_{A∩B}/n)/(n_B/n) → P[A∩B]/P[B] as n → ∞.
❑ If B is known to have occurred, then A can occur only if A∩B occurs.

Example
❑ Prior information: an integer between 1 and 10 is chosen at random. Suppose we are interested in two events:
o A2 = "the integer is 2"
o A8 = "the integer is 8"
❑ Since the integer is chosen at random, our prior probabilities are P[A2] = 0.1 and P[A8] = 0.1.
❑ Suppose we learn that the integer chosen is greater than 5; how does that change our probabilities?
❑ Let B = "the number is greater than 5" = {6, 7, 8, 9, 10}. Then
P[A2|B] = P[A2∩B]/P[B] = P[∅]/P[{6,7,8,9,10}] = 0/0.5 = 0
P[A8|B] = P[A8∩B]/P[B] = P[{8}]/P[{6,7,8,9,10}] = 0.1/0.5 = 0.2

Example
An urn contains two black balls numbered 1 and 2, and three red balls numbered 3 through 5. One ball is selected at random.
❑ Find the probability that the ball is numbered 2.
❑ Find the probability that the ball is numbered 2, given that the ball is black.
❑ Find the probability that the ball is numbered 2, given that the ball is red.
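The slide leaves these three as exercises; a brute-force enumeration sketch (my addition) gives 1/5, 1/2 and 0 respectively:

```python
from fractions import Fraction

# Outcomes: (number, color) for the five equiprobable balls
balls = [(1, "black"), (2, "black"), (3, "red"), (4, "red"), (5, "red")]

def P(event, given=lambda b: True):
    """Conditional probability by counting equiprobable outcomes."""
    restricted = [b for b in balls if given(b)]
    return Fraction(sum(event(b) for b in restricted), len(restricted))

is_two = lambda b: b[0] == 2
print(P(is_two))                                    # 1/5
print(P(is_two, given=lambda b: b[1] == "black"))   # 1/2
print(P(is_two, given=lambda b: b[1] == "red"))     # 0
```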
Properties of Conditional Probability
1. If B ⊂ A, then P[A|B] = 1.
Proof: B ⊂ A implies A∩B = B, so P[A|B] = P[A∩B]/P[B] = P[B]/P[B] = 1.
2. If A ⊂ B, then P[A|B] ≥ P[A].
Proof: A ⊂ B implies A∩B = A, so P[A|B] = P[A∩B]/P[B] = P[A]/P[B] ≥ P[A], since P[B] ≤ 1.
3. For fixed B, P[A|B] is a probability measure.
To prove property 3, we need to show that P[A|B] satisfies the three axioms for probability measures.

Proof of Property 3 of Conditional Probability
Axiom I: P[A|B] ≥ 0.
Proof: P[A∩B] ≥ 0 and P[B] > 0, so P[A|B] = P[A∩B]/P[B] ≥ 0.
Axiom II: P[S|B] = 1.
Proof: S∩B = B, so P[S|B] = P[S∩B]/P[B] = P[B]/P[B] = 1.
Axiom III: If A1 and A2 are mutually exclusive, then P[A1∪A2|B] = P[A1|B] + P[A2|B].
Proof: P[A1∪A2|B] = P[(A1∪A2)∩B]/P[B] = P[A1B ∪ A2B]/P[B]. Since A1 and A2 are mutually exclusive, so are A1B and A2B; therefore
P[A1∪A2|B] = P[A1B]/P[B] + P[A2B]/P[B] = P[A1|B] + P[A2|B].

Why Use Conditional Probability?
❑ To simplify the computation of desired probabilities:
o Joint probability: Example 2.25
o Total probability: Example 2.27
❑ To change our model based on observations:
o Example of Bayes' theorem
Computing Joint Probability from Conditional Probability
❑ Start with the definition of the conditional probability: P[A|B] = P[A∩B]/P[B].
❑ Multiplying both sides by P[B]: P[A∩B] = P[A|B] × P[B].
❑ This is known as the product rule.
❑ Similarly (switching the roles of A and B), it can be shown that P[A∩B] = P[B|A] × P[A].

Example 2.25
An urn contains two black balls and three red balls. Two balls are selected at random without replacement. Find the probability that both balls are black.
Solution: Let B1 = "first ball black" and B2 = "second ball black". By the product rule,
P[B1∩B2] = P[B2|B1] × P[B1] = (1/4) × (2/5) = 1/10.

Partitioning the Sample Space
❑ Definition: B1, B2, B3, ..., Bn form a partition of the sample space S if both of the following are true:
o B1, B2, B3, ..., Bn are mutually exclusive;
o their union equals S, i.e. ⋃_{i=1}^{n} B_i = S.

Total Probability Theorem
If B1, B2, B3, ..., Bn partition the sample space, then for any event A,
P[A] = Σ_{i=1}^{n} P[A|B_i] P[B_i].
Proof: A = A∩S = A ∩ ⋃_{i=1}^{n} B_i = ⋃_{i=1}^{n} AB_i, where AB1, AB2, ..., ABn are mutually exclusive. By Corollary 4,
P[A] = Σ_{i=1}^{n} P[AB_i] = Σ_{i=1}^{n} P[A|B_i] P[B_i].
❑ This theorem lets us split the problem of computing the probability of A into smaller (easier) sub-problems!

Example 2.27
An urn contains two black balls and three red balls. Two balls are selected at random without replacement. What is the probability of the event R2 = "the second ball is red"?
Solution: Let B1 = "the first ball is black", R1 = "the first ball is red", and R2 = "the second ball is red". Since B1 and R1 partition the sample space,
P[R2] = P[R2|B1] P[B1] + P[R2|R1] P[R1] = (3/4)(2/5) + (2/4)(3/5) = 12/20 = 3/5.

Bayes' Rule
Let B1, B2, B3, ..., Bn be a partition of the sample space S. For any event A,
P[B_j|A] = P[AB_j]/P[A] = P[A|B_j] P[B_j] / Σ_{k=1}^{n} P[A|B_k] P[B_k].
❑ Bayes' rule lets us switch which event is the conditioning event. This simplifies the calculation when the P[A|B_k] are easy to compute.

Example of Bayes' Theorem
An urn contains two black balls and three red balls. Two balls are selected at random without replacement. What is the probability that we picked a black ball first, given that we picked a red ball second, P[B1|R2]?
Solution:
P[B1|R2] = P[R2|B1] P[B1] / (P[R2|B1] P[B1] + P[R2|R1] P[R1]) = (3/4)(2/5) / ((3/4)(2/5) + (2/4)(3/5)) = (3/10)/(3/5) = 1/2.
Since P[B1|R2] = 1/2 > 2/5 = P[B1], knowing we picked a red ball second increases our belief that we picked a black ball first.
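A quick Monte Carlo sketch (my addition) that checks both Example 2.27 and this Bayes computation by sampling two draws without replacement:

```python
import random

rng = random.Random(42)
n = 200_000
second_red = 0
first_black_and_second_red = 0

for _ in range(n):
    urn = ["b", "b", "r", "r", "r"]
    rng.shuffle(urn)                  # a shuffle gives draws without replacement
    first, second = urn[0], urn[1]
    if second == "r":
        second_red += 1
        if first == "b":
            first_black_and_second_red += 1

print("P[R2]    ≈", second_red / n)                           # ≈ 3/5
print("P[B1|R2] ≈", first_black_and_second_red / second_red)  # ≈ 1/2
```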
Independence of Events
❑ Two events A and B are independent if P[AB] = P[A] × P[B].
❑ An event A is independent of event B if knowledge that B occurs does not alter the probability of event A, and vice versa:
P[A|B] = P[A∩B]/P[B] = P[A]P[B]/P[B] = P[A]
P[B|A] = P[A∩B]/P[A] = P[A]P[B]/P[A] = P[B]
❑ In other words, the a posteriori probability equals the a priori probability.
(In the Venn-diagram picture, the probability of each event is equal to its area.)

Example 2.31
❑ A ball is selected from an urn containing two black balls, numbered 1 and 2, and two white balls, numbered 3 and 4. The sample space is:
S = {(1,b), (2,b), (3,w), (4,w)}
Let events A, B and C be defined as follows:
A = {(1,b), (2,b)} — "a black ball is selected"
B = {(2,b), (4,w)} — "an even-numbered ball is selected"
C = {(3,w), (4,w)} — "the number on the ball selected is greater than 2"
❑ Are A and B independent? Are A and C independent?

Example 2.31 – Solution
❑ A, B and C have equal probabilities: P[A] = P[B] = P[C] = 1/2.
❑ A and B are independent: P[A∩B] = P[{(2,b)}] = 1/4 = P[A]P[B].
o Another way to look at it: the proportion of outcomes in S that leads to A is equal to the proportion of outcomes in B that leads to A:
P[A|B] = P[A∩B]/P[B] = P[{(2,b)}]/P[{(2,b), (4,w)}] = (1/4)/(2/4) = 1/2 = P[A].
❑ A and C are dependent (not independent): P[A∩C] = 0 ≠ 1/4 = P[A]P[C].
o Intuition: since A and C are mutually exclusive, knowledge that A occurs rules out the possibility that C occurs: P[A|C] = 0 ≠ P[A].

Example (Continuous Sample Space)
❑ Suppose two numbers X and Y are chosen at random on [0,1]. In this case, the probability of any event is equal to its area in the (X,Y) plane.
❑ Consider the following events:
A = {X > 0.5}, B = {Y > 0.5}, C = {X > Y}.
❑ The probability of each event is equal to its area; all three events have the same probability, 0.5.

Example (cont.)
❑ A and B are independent: P[A|B] = P[AB]/P[B] = (1/4)/(1/2) = 1/2 = P[A].
❑ A and C are dependent: P[A|C] = P[AC]/P[C] = (3/8)/(1/2) = 3/4 ≠ P[A].
❑ B and C are dependent: P[B|C] = P[BC]/P[C] = (1/8)/(1/2) = 1/4 ≠ P[B].

Independence of 3 or More Events
❑ Events A1, A2 and A3 are independent if and only if:
o they are pairwise independent: P[A1A2] = P[A1] × P[A2], P[A2A3] = P[A2] × P[A3], P[A1A3] = P[A1] × P[A3];
o and P[A1A2A3] = P[A1] × P[A2] × P[A3].
❑ Events A1, A2, ..., An are independent if and only if:
o any group of k < n of them are independent;
o and P[A1A2...An] = P[A1] × P[A2] × ... × P[An].

Example
Toss two six-sided dice. Are the following events independent?
A = "first die shows 3", B = "sum of dice equals 7", C = "second die shows 2".
Solution: Note that P[A] = P[B] = P[C] = 1/6. Test pairwise independence:
P[AB] = P[{(3,4)}] = 1/36 = P[A] × P[B]
P[AC] = P[{(3,2)}] = 1/36 = P[A] × P[C]
P[BC] = P[{(5,2)}] = 1/36 = P[B] × P[C]
But P[ABC] = 0 ≠ P[A] × P[B] × P[C]. Thus the events are not independent.
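An exhaustive check over the 36 equiprobable outcomes (a sketch I have added, not from the slides) confirms pairwise independence but not mutual independence:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # 36 equiprobable dice outcomes

def P(event):
    return Fraction(sum(event(d1, d2) for d1, d2 in S), len(S))

def P_and(*events):
    return P(lambda d1, d2: all(e(d1, d2) for e in events))

A = lambda d1, d2: d1 == 3          # first die shows 3
B = lambda d1, d2: d1 + d2 == 7     # sum equals 7
C = lambda d1, d2: d2 == 2          # second die shows 2

print(P_and(A, B) == P(A) * P(B))   # True
print(P_and(A, C) == P(A) * P(C))   # True
print(P_and(B, C) == P(B) * P(C))   # True: pairwise independent
print(P_and(A, B, C), P(A) * P(B) * P(C))  # 0 vs 1/216: not mutually independent
```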
Example 2.35 (System Reliability)
❑ A system consists of a controller and three peripheral units. The system is said to be "up" if the controller and at least two of the peripherals are functioning. Find the probability that the system is up, assuming that the components fail independently.
Define the following events:
A = "the controller is functioning"
Bi = "peripheral i is functioning"
Assume the failure probabilities are P[Aᶜ] = p and P[Biᶜ] = a. This implies that P[A] = 1−p and P[Bi] = 1−a.

Example 2.35 (cont.)
Let F = "at least two peripheral units are functioning" = "two units are functioning or three units are functioning"
= (B1∩B2∩B3ᶜ) ∪ (B1∩B2ᶜ∩B3) ∪ (B1ᶜ∩B2∩B3) ∪ (B1∩B2∩B3),
a union of mutually exclusive events, each an intersection of independent events. Thus,
P["system up"] = P[A∩F] = P[A]P[F] = (1−p){3(1−a)²a + (1−a)³}.
If a = 10% and p = 20%, then P[F] = 97.2% and P["system up"] = 77.8%.

Medical Testing
❑ Tests are commonly used to determine whether or not a person has a disease.
❑ There are two measures commonly used to quantify test accuracy, sensitivity and specificity:
Sensitivity = P[test+ | has disease] (true positive rate)
Specificity = P[test− | no disease] (true negative rate)
Accuracy = (TP + TN)/(TP + TN + FP + FN)

The Usual Case
Predicted versus actual condition:
o TP (true positive): hit
o FP (false positive): false alarm, overestimation
o FN (false negative): miss, underestimation
o TN (true negative): correct rejection

Example: Probability of a Positive Test Result
❑ Suppose that 1% of the population has a disease.
❑ Suppose there is a test for the disease which has:
o Sensitivity = P[test+ | has disease] = 99%
o Specificity = P[test− | no disease] = 99%
❑ If a random person is tested, what is P[test+]?
Solution: With D = "has disease" (P[D] = 1%) and ND = "no disease", and using P[test+|ND] = 1 − P[test−|ND] = 0.01,
P[test+] = P[test+|D] × P[D] + P[test+|ND] × P[ND] = 0.99 × 0.01 + 0.01 × 0.99 = 0.0099 + 0.0099 = 0.0198 ≈ 2%.
What if the sensitivity increases — does P[test+] increase or decrease? What if the specificity increases — does P[test+] increase or decrease?

Example: Probability of Disease Given a Positive Test
❑ Suppose that 1% of the population has a disease.
❑ Suppose there is a test for the disease which has:
o Sensitivity = P[test+ | has disease] = 99%
o Specificity = P[test− | no disease] = 99%
❑ If a random person tests positive, what is P[D|test+]?
Solution: By Bayes' rule,
P[D|test+] = P[test+|D] P[D] / (P[test+|D] P[D] + P[test+|ND] P[ND]) = (0.99 × 0.01)/0.0198 = 0.0099/0.0198 = 50%.
What if the sensitivity increases — does P[D|test+] increase or decrease? What if the specificity increases?

Tradeoff Between Sensitivity and Specificity
❑ Usually tests can be adjusted to increase sensitivity or specificity.
❑ However, if one goes up, the other usually goes down.
❑ This can be visualized through a curve called the receiver operating characteristic (ROC), which runs from a test that always returns a negative result, past a perfect test, to a test that always returns a positive result.

Example: Choosing Between Two Tests
❑ Suppose that 1% of the population has a disease.
❑ Suppose you could choose from two tests:
o Test 1: Sensitivity = 99%, Specificity = 97%
o Test 2: Sensitivity = 97%, Specificity = 99%
❑ Which test has the higher P[D|test+], or are they both the same?
Solution:
P[D|test1+] = P[test1+|D] P[D] / (P[test1+|D] P[D] + P[test1+|ND] P[ND]) = (0.99 × 0.01)/(0.99 × 0.01 + 0.03 × 0.99) = 25%
P[D|test2+] = P[test2+|D] P[D] / (P[test2+|D] P[D] + P[test2+|ND] P[ND]) = (0.97 × 0.01)/(0.97 × 0.01 + 0.01 × 0.99) = 49.5%
From the public health perspective, what is the disadvantage of choosing test 2?
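All of these posterior calculations reduce to one helper function; a minimal sketch (my addition, with the numbers taken from the slides):

```python
def posterior(prevalence, sensitivity, specificity):
    """P[disease | test+] via Bayes' rule and the total probability theorem."""
    p_pos_given_d = sensitivity
    p_pos_given_nd = 1 - specificity
    p_pos = p_pos_given_d * prevalence + p_pos_given_nd * (1 - prevalence)
    return p_pos_given_d * prevalence / p_pos

print(posterior(0.01, 0.99, 0.99))  # 0.50   (the 50% result above)
print(posterior(0.01, 0.99, 0.97))  # 0.25   (test 1)
print(posterior(0.01, 0.97, 0.99))  # ~0.495 (test 2)
```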
Next Lecture: Sequential Experiments

ELEC 2600, Fall 2024 Semester. Lecture 4: Sequential Experiments

Elec 2600: Lecture 4
❑ Sequential Experiments
o Bernoulli trials
o Binomial probability law
o *Multinomial probability law
o Geometric probability law
o Sequences of dependent experiments
❑ Example: Bean Machine Game!

Sequential Experiments
❑ Experiments that involve repetitions or multiple participants can often be viewed as a sequence of sub-experiments.
❑ The sub-experiments can be identical or non-identical, dependent or independent. The individual sample spaces can be identical or non-identical.
❑ If the sub-experiments are identical, the individual sample spaces are identical, but not vice versa.
❑ Examples:
o Tossing a coin n times: repetition; independent, identical sub-experiments; identical individual sample spaces.
o Checking the number of students who are sleeping in class now: multiple participants; independent(?); identical individual sample spaces but non-identical sub-experiments, since Alice and Bob sleep with different probabilities.

Sample Spaces Formed by Cartesian Products
❑ When an experiment consists of performing sub-experiments E1, E2, ..., En, the outcome is an n-tuple s = (s1, s2, ..., sn).
❑ The sample space can be denoted as the Cartesian product of the individual sample spaces: S1 × S2 × ... × Sn.
o Example: Toss a coin two times. The sample space is {H, T} × {H, T} = {HH, HT, TH, TT}.
o Example: Pick a random number X from [0, 1] and a random number Y from [0.2, 0.8]. The sample space is [0, 1] × [0.2, 0.8].

Sequences of Independent Sub-Experiments
❑ In many cases, the sub-experiments can be assumed to be independent, e.g.:
o a sequence of rolls of a die
o a sequence of coin flips
o a sequence of selections from a large collection of resistors
❑ In this case, we can compute the probability of any event by exploiting independence.
❑ Let A1, A2, ..., An be events such that Ak concerns only the outcome of the kth sub-experiment. If the sub-experiments are independent, the events Ak are independent and we have
P[A1∩A2∩...∩An] = P[A1]P[A2]...P[An]. (Vice versa?)

Bernoulli Trials (named after Jacob Bernoulli)
❑ A Bernoulli trial is an experiment that is performed once, with two possible outcomes: SUCCESS, with probability p, and FAILURE, with probability q = 1 − p.
❑ A Bernoulli trial is the simplest experiment:
o Binary sample space: {success, failure}, {1, 0}, {head, tail}.
❑ Examples:
o Tossing a coin
o Scoring above the mean on a test
o Receiving a message in the next second
o Typhoon signal 8 on Wednesday
Example 2.37
❑ Suppose that a coin is tossed three times. Assuming the outcomes of the tosses are independent and the probability of heads is always p, the probability of each sequence of heads and tails is:
P[{HHH}] = P[{H}]P[{H}]P[{H}] = p³
P[{HHT}] = P[{H}]P[{H}]P[{T}] = p²(1−p)
P[{HTH}] = P[{H}]P[{T}]P[{H}] = p²(1−p)
P[{THH}] = P[{T}]P[{H}]P[{H}] = p²(1−p)
P[{HTT}] = P[{H}]P[{T}]P[{T}] = p(1−p)²
P[{THT}] = P[{T}]P[{H}]P[{T}] = p(1−p)²
P[{TTH}] = P[{T}]P[{T}]P[{H}] = p(1−p)²
P[{TTT}] = P[{T}]P[{T}]P[{T}] = (1−p)³
❑ Note that the outcomes are equiprobable only if the coin is fair: P[{H}] = P[{T}] ⇔ p = 1−p ⇔ p = 1/2.

Example 2.37 (cont.)
❑ Let k be the number of heads in three trials. From the probabilities above:
P[k = 0] = P[{TTT}] = (1−p)³
P[k = 1] = P[{HTT, THT, TTH}] = 3p(1−p)²
P[k = 2] = P[{HHT, HTH, THH}] = 3p²(1−p)
P[k = 3] = P[{HHH}] = p³

The previous examples suggest that the probability of getting k successes (or failures) after performing multiple Bernoulli trials follows a set pattern. What is this pattern for an ARBITRARY number of trials and arbitrary k?

Binomial Probability Law
❑ If we conduct n independent Bernoulli trials, the number of successes follows a binomial probability law: the probability of k successes in n trials is
p_n(k) = C(n,k) p^k (1−p)^{n−k}, where C(n,k) = n!/(k!(n−k)!).
❑ Proof: Each particular sequence of events with k successes (and n−k failures) has probability p^k(1−p)^{n−k}. For example, the sequence of 2 successes and 3 failures, T F T F F (T = success, F = failure), has probability p × (1−p) × p × (1−p) × (1−p) = p²(1−p)³. The number of sequences of n trials with k successes is
C(n,k) = n!/(k!(n−k)!) = n(n−1)...(n−k+1) / (k(k−1)...(2)(1)).
Thus the probability is p_n(k) = C(n,k) p^k (1−p)^{n−k}. (See the notes on counting methods.)

Example 2.38
❑ Suppose that a coin is tossed n = 3 times. Assuming the outcomes of the tosses are independent and the probability of heads is always p, find the probability of observing k heads.
❑ Solution:
P[k = 0] = p₃(0) = (3!/(0!3!)) p⁰(1−p)³ = (1−p)³
P[k = 1] = p₃(1) = (3!/(1!2!)) p¹(1−p)² = 3p(1−p)²
P[k = 2] = p₃(2) = (3!/(2!1!)) p²(1−p)¹ = 3p²(1−p)
P[k = 3] = p₃(3) = (3!/(3!0!)) p³(1−p)⁰ = p³
These are consistent with the results of Example 2.37.

Binomial Theorem
❑ Since the number of successes in n trials can range from 0 to n,
Σ_{k=0}^{n} p_n(k) = Σ_{k=0}^{n} C(n,k) p^k (1−p)^{n−k} = 1.
❑ Proof: This can be proven using the binomial theorem: (a+b)^n = Σ_{k=0}^{n} C(n,k) a^k b^{n−k}. Letting a = p and b = 1−p,
(p + 1 − p)^n = Σ_{k=0}^{n} C(n,k) p^k (1−p)^{n−k}, i.e. 1 = Σ_{k=0}^{n} p_n(k).
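A compact sketch of the binomial law (my addition, standard library only); it reproduces Example 2.38 for a fair coin and checks that the pmf sums to one:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P[k successes in n independent Bernoulli(p) trials]."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.5
for k in range(4):
    print(k, binomial_pmf(k, 3, p))   # 0.125, 0.375, 0.375, 0.125

# Binomial theorem: the probabilities sum to one for any n and p.
assert abs(sum(binomial_pmf(k, 10, 0.3) for k in range(11)) - 1) < 1e-12
```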
Example 2.40
A communication system transmits binary information over a channel that introduces random bit errors with probability P[error] = 0.1 per bit. To reduce the effect of these errors, the transmitter transmits each bit 3 times and the receiver takes a majority vote to decide which bit was transmitted. (Figure: the bit 1 is sent as 111; the channel might deliver 001, and the majority-vote receiver would then output 0, an error.) Find the probability that a bit is received incorrectly.
Solution: A bit is incorrect if two or three errors occur during transmission:
P[incorrect] = p₃(2) + p₃(3) = C(3,2)(0.1)²(0.9) + C(3,3)(0.1)³ = 0.028.

What about if there are more than 2 possible events for each sub-experiment?

The Multinomial Probability Law
❑ The binomial probability law can be generalized.
❑ Suppose that an experiment consists of n independent repetitions of a sub-experiment.
❑ Let B1, B2, B3, ..., BM partition the sample space S of the sub-experiment, and let P[B_j] = p_j.
❑ Since the events are mutually exclusive, p1 + p2 + ... + pM = 1.
❑ Let k_j be the number of times that event B_j occurs.
❑ The probability of the vector (k1, k2, k3, ..., kM), which specifies the number of times each event occurs, is given by the multinomial probability law:
P[(k1, k2, ..., kM)] = (n!/(k1! k2! ... kM!)) p1^{k1} p2^{k2} ... pM^{kM}.

Example 2.41
A dart is thrown nine times at a target consisting of three areas. Each throw has a probability of 0.2, 0.3 and 0.5 of landing in areas 1, 2 and 3. Find the probability that the dart lands three times in each area.
P[(3,3,3)] = (9!/(3!3!3!)) (0.2)³(0.3)³(0.5)³ = 0.04536.

Example 2.19
Suppose that we have 12 balls, each of which can fall randomly and independently into one of 12 cells. Note that each cell might contain more than one ball. What is the probability that each cell contains exactly one ball?
Solution: Here each ball represents the repetition of an experiment which has 12 equally probable outcomes (the cells). Thus n = 12 and p1 = p2 = ... = p12 = 1/12. The outcome that each cell contains exactly one ball is specified by k1 = k2 = ... = k12 = 1. Using the multinomial probability law,
P[(1,1,...,1)] = (12!/(1!1!...1!)) (1/12)¹²= 12!/12¹² ≈ 5.37 × 10⁻⁵.

What about if, instead of worrying about how often something happens, we are interested in how long something takes to happen?

The Geometric Probability Law (i)
❑ Assume that we conduct an independent sequence of Bernoulli trials.
❑ Let M be the number of trials until (and including) the first success.
❑ The probability that M = m is given by the geometric probability law:
p_M(m) = P[A1ᶜ A2ᶜ ... A_{m−1}ᶜ A_m] = p(1−p)^{m−1}, m = 1, 2, ...
o A_k: the event that the kth trial results in a success.
o A_kᶜ: the complement of A_k.
o For example, the trial sequence 0 0 0 0 0 0 0 1 has M = 8.

The Geometric Probability Law (ii)
❑ But...
❑ Let N be the number of trials before the first success.
❑ The probability that N = n is given by
p(n) = P[A1ᶜ A2ᶜ ... A_nᶜ A_{n+1}] = p(1−p)^n, n = 0, 1, 2, ...
This is also called the geometric probability law! Note that N = M − 1; the sequence 0 0 0 0 0 0 0 1 has N = 7.
❑ Be careful which version you are dealing with!!
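A small sketch (my addition) that contrasts the two conventions and estimates both pmfs by simulation:

```python
import random

def trials_until_first_success(p, rng):
    """M: number of trials up to and including the first success."""
    m = 1
    while rng.random() >= p:   # rng.random() < p counts as a success
        m += 1
    return m

rng = random.Random(7)
p, n = 0.3, 100_000
samples = [trials_until_first_success(p, rng) for _ in range(n)]

# M-convention: p_M(2) = p(1-p);  N-convention: p_N(2) = p(1-p)^2, N = M - 1.
print(sum(m == 2 for m in samples) / n, p * (1 - p))
print(sum(m - 1 == 2 for m in samples) / n, p * (1 - p)**2)
```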
Properties of the Geometric Probability Law
❑ The probabilities sum to one:
Σ_{m=1}^{∞} p(m) = Σ_{m=1}^{∞} p(1−p)^{m−1} = p Σ_{m'=0}^{∞} (1−p)^{m'} (change of variables: m' = m − 1)
= p/(1 − (1−p)) = 1,
using the geometric series Σ_{k=0}^{∞} r^k = 1/(1−r) for |r| < 1.
❑ The probability that more than K trials are required until the first success is
P[M > K] = Σ_{m=K+1}^{∞} p(1−p)^{m−1} = p(1−p)^K Σ_{m'=0}^{∞} (1−p)^{m'} (change of variables: m' = m − (K+1))
= p(1−p)^K / (1 − (1−p)) = (1−p)^K.

Example
Computer A transmits a message to computer B. If B detects an error in the message, it asks A to retransmit. If the probability of an error is 0.1, what is the probability the message needs to be sent more than twice?
Solution: The probability of a successful transmission is p = 0.9. Since the probability that more than K trials are required until the first success is P[M > K] = (1−p)^K, the probability that more than 2 transmissions are required is
P[M > 2] = (1−p)² = 0.1² = 0.01.

So far, the outcomes of all sub-experiments have been independent. What if they are NOT independent?

Sequences of Dependent Experiments
❑ In some cases, the outcome of a sub-experiment in a sequence may depend upon the outcome of a previous sub-experiment.
❑ Example 2.44: We have two urns, labeled 0 and 1, each containing balls also labeled 0 and 1. To start, we pick one of the urns at random (flip a coin) and pick a ball from it at random. The label of the ball is used to determine the urn which is picked from in the next sub-experiment, and so on...
❑ The sample space consists of sequences of 0's and 1's.
❑ Any outcome corresponds to a path through a "trellis" diagram, whose nodes at each stage are the two urns.

Computing Probabilities
❑ With sequences of dependent experiments, we rely heavily upon conditional probabilities.
❑ Let s_n be the outcome of the nth sub-experiment. Since P[A∩B] = P[A|B] × P[B] (always true),
P[s0∩s1∩s2] = P[s2|s1∩s0] × P[s1∩s0] = P[s2|s1∩s0] × P[s1|s0] × P[s0].
❑ In this case, the urn we pick from depends only upon the last sub-experiment. This implies that P[s2|s1∩s0] = P[s2|s1].
❑ More generally, P[s_n|s_{n−1}∩...∩s0] = P[s_n|s_{n−1}]; if this is true for all n, we have a Markov chain.
❑ Thus, P[s0∩s1∩s2] = P[s2|s1] × P[s1|s0] × P[s0].

Example 2.45
Find the probability of the urn sequence 0011 for the urn experiment introduced in the previous example.
Solution: We can label the branches of the trellis diagram with the conditional probabilities. Using conditional probability, we obtain
P = P[1|1] × P[1|0] × P[0|0] × P[0] = (5/6) × (1/3) × (2/3) × (1/2) = 5/54.
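A sketch (my addition) that encodes these branch probabilities as a Markov transition table and verifies P[0011] = 5/54. The values P[0|0] = 2/3, P[1|0] = 1/3 and P[1|1] = 5/6 are read off Example 2.45; P[0|1] = 1/6 is an assumption inferred from the rows having to sum to one.

```python
from fractions import Fraction as F

# trans[current][next] = P[next urn | current urn]
trans = {0: {0: F(2, 3), 1: F(1, 3)},
         1: {0: F(1, 6), 1: F(5, 6)}}
start = {0: F(1, 2), 1: F(1, 2)}   # initial urn chosen by a fair coin flip

def sequence_prob(seq):
    """P[s0 s1 ... sn] for a Markov chain: P[s0] * prod of P[s_k | s_{k-1}]."""
    p = start[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p

print(sequence_prob([0, 0, 1, 1]))   # 5/54
```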
Trellis-Coded Modulation
Trellis modulation was invented by Gottfried Ungerboeck, working for IBM, in the 1970s. The improvement: sharing a floppy disk via a BBS could be done in just a few minutes, instead of an hour. (How lucky we are now!)

Summary
❑ Bernoulli trials
❑ Binomial probability law
❑ *Multinomial probability law
❑ Geometric probability law
❑ Sequences of dependent experiments

Bean Machine (Quincunx or Galton Board)
❑ The bean machine, also known as the quincunx or Galton board, is a device invented by Sir Francis Galton. The machine consists of a vertical board with interleaved rows of pins. Balls are dropped from the top, and bounce left and right as they hit the pins.
❑ The game: for every $10, you get 6 balls. If you can hit the four bins, you win $100! What is the chance of winning?
❑ A simpler question: for a Galton board with seven rows of pins, what is the probability that the ball will fall into the bin located second from the left? Assumption: the ball bounces to each side with equal probability.
❑ Hints:
o How many paths are there?
o What is the probability of each path?
o Are different paths independent?
o Are different bounces independent?
❑ Other conclusions:
o What are the probabilities for all bins?
o What is the shape of the distribution curve?
http://www.mathsisfun.com/data/quincunx.html

Next Lecture
Part II: Single Random Variables
Lecture 5: Discrete Random Variables
Lecture 6: Expected Value and Moments
Lecture 7: Important Discrete Random Variables
Lecture 8: Continuous Random Variables
Lecture 9: Expectation of Continuous Random Variables
Lecture 10: Conditional PMF/CDF/PDF
Lecture 11: Function of a Random Variable
