Probability and Statistics Course Outline (2024/2025)

Summary

This document is a course outline for Probability and Statistics, part of a computer science program at Tanta University in Egypt for the 2024/2025 academic year. It details course aims, content, learning methods, and assessment. Topics include probability axioms, random variables, random sampling, point estimation, hypothesis testing, and statistical methods. This course syllabus is suitable for first-year computer science students.

Full Transcript

Probability and Statistics (1st year)
By Dr. Hanaa Abd Elhady, Lecturer of Computer Science
Faculty of Computers and Informatics, Tanta University
2024-2025

1. Aims

Course Title: Probability and Statistics
Course Code: BS116
Academic Year: 2024/2025
Course Type: Obligatory
Coordinator: Dr. Hanaa Abdelhady
Other Staff:
Level: Level 1
Semester: Semester 1
Pre-Requisite: ------
Course Delivery: Credit 3h; Lecture 14 x 2h lectures; Practical 14 x 1h practical
Parent Department: ----------
Date of Approval: 2, 2024

Upon completion of this course, the student will be able to:
1. Know the axioms of probability.
2. Explain the details of discrete and continuous random variables.
3. Know random sampling and its distributions.
4. Show methods of point estimation.
5. Show testing of hypotheses.
6. Know chi-square tests and correlation and regression.

A. Knowledge and understanding:
A1. Understand the axioms of probability, conditional probability, and Bayes' theorem.
A2. Know the details of discrete and continuous random variables and correlation and regression.
A3. Recognize methods of point estimation.
A4. Understand testing of hypotheses and chi-square tests.

B. Intellectual skills:
B1. Apply methods of point estimation and the axioms of probability.
B2. Classify random variables as discrete or continuous.
B3. Examine testing of hypotheses and chi-square tests.

C. Professional and practical skills:
C1. Use the axioms of probability, conditional probability, chi-square tests, testing of hypotheses, and Bayes' theorem.
C2. Choose the appropriate methods of point estimation.

D. General and transferable skills:
D1. Use probability theory to solve problems.
D2. Follow methods of point estimation and testing of hypotheses.

3. Content

Week 1: Introduction to probability
Week 2: Conditional probability and Bayes' theorem
Week 3: Discrete random variables
Week 4: Continuous random variables
Week 5: Introduction to random sampling
Week 6: The application of the central limit theorem
Week 7: Mid-term exam
Week 8: Methods of point estimation
Week 9: Confidence interval for a population mean
Week 10: Introduction to hypotheses testing
Week 11: Hypotheses testing: the difference between two population means
Week 12: Chi-square tests
Week 13: The least squares method
Week 14: The correlation analysis

4. Teaching and Learning Methods

- Lectures.
- Library and net search.
- Assignments and problem solving.
- Laboratory sessions.

5. Student Assessment

Each entry lists the assessment method, skills assessed*, assessment length, schedule, and proportion.
- Written final examination: KU, I; 2-hour examination; 16th week; 50%
- Mid-term examination: KU, I; 1-hour examination; 7th week; 25%
- Oral assessment: KU, I; assessment session; 14th week; 10%
- Practical exam: KU, T; continuous assessment; 15th week; 15%
*KU: Knowledge and Understanding, I: Intellectual, P: Professional, T: Transferable

6. List of references

Course notes:
- There are lecture notes prepared in the form of a book authorized by the department.
Essential books:
- F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester, A Modern Introduction to Probability and Statistics.

7. Facilities required for teaching and learning

Projectors: video, overhead and slide. Computer presentations and writing boards. Library.

Course Coordinator: Dr. Hanaa Essa (Name in Arabic: د. هناء عيسى)
Head of Department: -----
Signature:
Date: 2/2025 2/2024

Course Contents - Course ILOs Matrix

Course Code / Course Title: BS116 Probability and Statistics
Columns (course outcomes, ILOs): Knowledge and Understanding A1 A2 A3 A4; Intellectual B1 B2 B3; Practical C1 C2; Transferable D1 D2. The check marks of each row are listed in the order they appear; their column positions are not preserved in this transcript.
- Introduction to probability: √ √ √ √
- Conditional probability and Bayes' theorem: √ √ √
- Discrete random variables: √ √ √
- Continuous random variables: √ √
- Introduction to random sampling:
- The application of the central limit theorem:
- Mid-term exam:
- Methods of point estimation: √ √ √
- Confidence interval for a population mean: √
- Introduction to hypotheses testing: √ √ √ √
- Hypotheses testing: the difference between two population means: √ √ √ √
- Chi-square tests: √ √ √
- The least squares method: √
- The correlation analysis: √

Learning and Teaching Methods

Columns (course outcomes, ILOs): Knowledge and Understanding A1 A2 A3 A4; Intellectual Skills B1 B2 B3; Professional and Practical Skills C1 C2; General and Transferable Skills D1 D2.
- Lecture: √ √ √ √ √ √ √ √ √ √ √
- Discussion (brainstorming): √ √ √ √
- Self-learning (essay): √ √ √ √ √
- Practice: √ √ √ √ √

Assessment Methods

Columns (course outcomes, ILOs): Knowledge and Understanding A1 A2 A3 A4; Intellectual Skills B1 B2 B3; Professional and Practical Skills C1 C2; General and Transferable Skills D1 D2.
- Written examination: √ √ √ √ √ √ √ √ √ √ √
- Oral assessment: √ √ √ √
- Mid-term examination: √ √ √ √ √
- Semester work: √ √ √ √ √
- Applied exam: √ √ √ √ √ √ √ √ √

Course coordinator: Dr.
Hanaa Essa
Head of Department: ------

Contents

Chapter 1 Introduction to Probability
1.1 Introduction
1.2 The Axioms of Probability
1.3 Finite Sample Spaces
1.4 Conditional Probability
1.5 Bayes' Theorem

Chapter 2 Random Variables
2.1 Introduction
2.2 Discrete Random Variables
2.3 Mean and Variance for Discrete RVs
2.4 Continuous Random Variables
2.5 Mean and Variance for Continuous RVs

Chapter 3 Random Sampling and Its Distributions
3.1 Random sampling
3.2 Sampling distribution
3.2.1 Distribution of the sample mean
3.3 The central limit theorem
3.4 Distribution of the difference between two sample means (non-normal population)
3.5 Distribution of the difference between two sample means (normal population)

Chapter 4 Estimation
4.1 Introduction
4.2 Point estimation
4.3 Methods of point estimation
4.3.1 Method of moments
4.3.2 Method of maximum likelihood
4.4 Confidence interval for a population mean
4.4.1 The case when σ1 and σ2 are known
4.4.2 The case when σ1 and σ2 are unknown
4.5 Confidence interval for difference between two population means
4.5.1 The case when σ1 and σ2 are known
4.5.2 The case when σ1 and σ2 are unknown

Chapter 5 Testing Hypotheses
5.1 Introduction
5.2 Hypotheses testing: a single population mean
5.3 Hypotheses testing: the difference between two population means

Chapter 6 Chi-Square Tests
6.1 Introduction
6.2 Tests of independence
6.3 Test of goodness-of-fit

Chapter 7 Simple Linear Regression and Correlation Analysis
7.1 Introduction
7.2 The least squares method
7.3 The correlation analysis

Chapter 1 PROBABILITY

1.1 INTRODUCTION

In earlier classes, we studied probability as a measure of the uncertainty of events in a random experiment. We discussed the axiomatic approach formulated by the Russian mathematician A.N. Kolmogorov (1903-1987) and treated probability as a function of the outcomes of the experiment. We also established the equivalence between the axiomatic theory and the classical theory of probability in the case of equally likely outcomes. On the basis of this relationship, we obtained probabilities of events associated with discrete sample spaces. We also studied the addition rule of probability.

In this chapter, we shall discuss the important concept of the conditional probability of an event given that another event has occurred, which will be helpful in understanding Bayes' theorem, the multiplication rule of probability and the independence of events. We shall also learn the important concept of a random variable and its probability distribution, as well as the mean and variance of a probability distribution. In the last section of the chapter, we shall study an important discrete probability distribution called the binomial distribution. Throughout this chapter, we shall take up experiments having equally likely outcomes, unless stated otherwise.

Probability theory is a mathematical theory used to describe and analyze situations where randomness or uncertainty is present. Any specific such situation will be referred to as a random experiment. We use the term "experiment" in a wide sense here; it could mean an actual physical experiment such as flipping a coin or rolling a die, but it could also be a situation where we simply observe something, such as the price of a stock at a given time, the amount of rain in Houston in September, or the number of spam emails we receive in a day.
After the experiment is over, we call the result an outcome. For any given experiment, there is a set of possible outcomes, and we state the following definition.

The set of all possible outcomes in a random experiment is called the sample space, denoted S. A subset of S, A ⊆ S, is called an event.

Here are some examples of random experiments and their associated sample spaces.

Example 1
Roll a die and observe the number. Here we can get the numbers 1 through 6, and hence the sample space is
S = {1, 2, 3, 4, 5, 6}

Example 2
Flip a coin twice and observe the sequence of heads and tails. With H denoting heads and T denoting tails, one possible outcome is HT, which means that we get heads in the first flip and tails in the second. Arguing like this, there are four possible outcomes and the sample space is
S = {HH, HT, TH, TT}

1.2 THE AXIOMS OF PROBABILITY

Suppose we have a sample space S. If S is discrete, all subsets correspond to events and conversely; if S is non-discrete, only special subsets (called measurable) correspond to events. To each event A in the class C of events, we associate a real number P(A). Then P is called a probability function, and P(A) the probability of the event A, if the following axioms are satisfied.

Axiom 1. For every event A in the class C, P(A) ≥ 0
Axiom 2. For the sure or certain event S in the class C, P(S) = 1
Axiom 3. For any number of mutually exclusive events A1, A2, …, in the class C,
P(A1 ∪ A2 ∪ … ) = P(A1) + P(A2) + …
In particular, for two mutually exclusive events A1 and A2,
P(A1 ∪ A2) = P(A1) + P(A2)

SOME IMPORTANT THEOREMS ON PROBABILITY

From the above axioms we can now prove various theorems on probability that are important in further work.
Theorem 1: If A1 ⊂ A2, then P(A1) ≤ P(A2) and P(A2 − A1) = P(A2) − P(A1), where A2 − A1 = A2 ∩ A1', A1' being the complement of the event A1.

Theorem 2: For every event A, 0 ≤ P(A) ≤ 1, i.e., a probability is between 0 and 1.

Theorem 3: For ∅, the empty set, P(∅) = 0, i.e., the impossible event has probability zero.

Theorem 4: If A' is the complement of A, then P(A') = 1 − P(A)

Theorem 5: If A = A1 ∪ A2 ∪ … ∪ An, where A1, A2, …, An are mutually exclusive events, then
P(A) = P(A1) + P(A2) + … + P(An)

Theorem 6: If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
More generally, if A1, A2, A3 are any three events, then
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A2 ∩ A3) − P(A3 ∩ A1) + P(A1 ∩ A2 ∩ A3).
Generalizations to n events can also be made.

Theorem 7: For any events A and B,
P(A) = P(A ∩ B) + P(A ∩ B')

Proposition
Let A, B, and C be events. Then
(a) (Distributive Laws)
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
(b) (De Morgan's Laws)
(A ∪ B)' = A' ∩ B'
(A ∩ B)' = A' ∪ B'

1.3 FINITE SAMPLE SPACES

The results in the previous section hold for an arbitrary sample space S. In this section we will assume that S is finite, S = {s1,..., sn}, say. In this case, we can always define the probability measure by assigning probabilities to the individual outcomes.

Proposition
Suppose that p1,..., pn are numbers such that
(a) pk ≥ 0, k = 1,..., n
(b) ∑ pk = 1 (sum over k = 1,..., n)
and for any event A ⊆ S, define
P(A) = ∑ pk (sum over all k with sk ∈ A)
Then P is a probability measure.

Hence, when dealing with finite sample spaces, we do not need to explicitly give the probability of every event, only the probability of each outcome. We refer to the numbers p1,..., pn as a probability distribution on S.
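
The proposition on finite sample spaces can be illustrated with a short Python sketch. It is not part of the course notes: the helper name `prob` and the choice of a fair die as the sample space are assumptions made only for illustration. The sketch stores a probability distribution as a dictionary, checks conditions (a) and (b), and computes P(A) by summing the probabilities of the outcomes in A.

```python
from fractions import Fraction


def prob(event, dist):
    """Return P(A) as the sum of p_k over the outcomes s_k in A (hypothetical helper)."""
    return sum((dist[s] for s in event), Fraction(0))


# Assumed example: a fair die, so every outcome gets probability 1/6.
die = {s: Fraction(1, 6) for s in range(1, 7)}

# Conditions (a) and (b) of the proposition.
assert all(p >= 0 for p in die.values())
assert sum(die.values()) == 1

even = {2, 4, 6}
low = {1, 2}
p_even = prob(even, die)
p_union = prob(even | low, die)
```

With exact fractions, one can also confirm Theorem 6 on this example by comparing `p_union` with `prob(even, die) + prob(low, die) - prob(even & low, die)`.
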

Example 3
Consider the experiment of flipping a fair coin twice and counting the number of heads. We can take the sample space
S = {HH, HT, TH, TT}
and let p1 = ... = p4 = 1/4. Alternatively, since all we are interested in is the number of heads and this can be 0, 1, or 2, we can use the sample space
S = {0, 1, 2}
and let p0 = 1/4, p1 = 1/2, p2 = 1/4.

Of particular interest is the case when all outcomes are equally likely. If S has n equally likely outcomes, then p1 = p2 = · · · = pn = 1/n, which is called a uniform distribution on S. The formula for the probability of an event A now simplifies to
P(A) = ∑ 1/n (sum over all k with sk ∈ A) = #A / n
where #A denotes the number of elements in A. This formula is often referred to as the classical definition of probability, since historically this was the first context in which probabilities were studied. The outcomes in the event A can be described as favorable to A and we get the following formulation.

Corollary
In a finite sample space with uniform probability distribution,
P(A) = (number of favorable outcomes) / (number of possible outcomes)

Example 4
Roll a fair die 3 times. What is the probability that all numbers are the same?
The sample space is the set of the 216 ordered triples (i, j, k), and since the die is fair, these are all equally probable and we have a uniform probability distribution. The event of interest is
A = {(1, 1, 1), (2, 2, 2),..., (6, 6, 6)}
which has six outcomes and probability
P(A) = 6/216 = 1/36

Example 5
Consider a randomly chosen family with three children. What is the probability that they have exactly one daughter?
There are eight possible sequences of boys and girls (in order of birth), and we get the sample space
S = {bbb, bbg, bgb, bgg, gbb, gbg, ggb, ggg}
where, for example, bbg means that the oldest child is a boy, the middle child a boy, and the youngest child a girl.
If we assume that all outcomes are equally likely, we get a uniform probability distribution on S, and since there are three outcomes with one girl, we get
P(one daughter) = 3/8

1.4 CONDITIONAL PROBABILITY

Up to now, we have discussed methods of finding the probability of events. If we have two events from the same sample space, does information about the occurrence of one of the events affect the probability of the other event? Let us try to answer this question by taking up a random experiment in which the outcomes are equally likely to occur.

Consider the experiment of tossing three fair coins. The sample space of the experiment is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Since the coins are fair, we can assign the probability 1/8 to each sample point. Let E be the event 'at least two heads appear' and F be the event 'first coin shows tail'. Then
E = {HHH, HHT, HTH, THH}
and F = {THH, THT, TTH, TTT}
Therefore
P(E) = P({HHH}) + P({HHT}) + P({HTH}) + P({THH}) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2
and
P(F) = P({THH}) + P({THT}) + P({TTH}) + P({TTT}) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2
Also E ∩ F = {THH}, with
P(E ∩ F) = P({THH}) = 1/8

Now, suppose we are given that the first coin shows tail, i.e. F occurs. What is then the probability of occurrence of E? With the information that F has occurred, we are sure that the cases in which the first coin does not result in a tail should not be considered while finding the probability of E. This information reduces our sample space from the set S to its subset F for the event E. In other words, the additional information really amounts to telling us that the situation may be considered as that of a new random experiment for which the sample space consists only of those outcomes which are favorable to the occurrence of the event F.
Now, the sample point of F which is favorable to event E is THH. Thus,
Probability of E considering F as the sample space = 1/4,
or
Probability of E given that the event F has occurred = 1/4
This probability of the event E is called the conditional probability of E given that F has already occurred, and is denoted by P(E|F). Thus
P(E|F) = 1/4

Note that the elements of F which favor the event E are the common elements of E and F, i.e. the sample points of E ∩ F. Thus, we can also write the conditional probability of E given that F has occurred as
P(E|F) = (number of elementary events favorable to E ∩ F) / (number of elementary events favorable to F) = n(E ∩ F) / n(F)
Dividing the numerator and the denominator by the total number of elementary events of the sample space, n(S), we see that P(E|F) can also be written as
P(E|F) = [n(E ∩ F) / n(S)] / [n(F) / n(S)] = P(E ∩ F) / P(F)   (1)
Note that (1) is valid only when P(F) ≠ 0, i.e., F ≠ φ (Why?)

Thus, we can define the conditional probability as follows:

Definition 1
If E and F are two events associated with the same sample space of a random experiment, the conditional probability of the event E given that F has occurred, i.e. P(E|F), is given by
P(E|F) = P(E ∩ F) / P(F), provided P(F) ≠ 0

PROPERTIES OF CONDITIONAL PROBABILITY

Let E and F be events of a sample space S of an experiment. Then we have:

Property 1
P(S|F) = P(F|F) = 1
We know that
P(S|F) = P(S ∩ F) / P(F) = P(F) / P(F) = 1
Also
P(F|F) = P(F ∩ F) / P(F) = P(F) / P(F) = 1
Thus
P(S|F) = P(F|F) = 1

Property 2
If A and B are any two events of a sample space S and F is an event of S such that P(F) ≠ 0, then
P((A ∪ B)|F) = P(A|F) + P(B|F) − P((A ∩ B)|F)
In particular, if A and B are disjoint events, then
P((A ∪ B)|F) = P(A|F) + P(B|F)
We have
P((A ∪ B)|F) = P[(A ∪ B) ∩ F] / P(F)
= P[(A ∩ F) ∪ (B ∩ F)] / P(F) (by the distributive law of union of sets over intersection)
= [P(A ∩ F) + P(B ∩ F) − P(A ∩ B ∩ F)] / P(F)
= P(A ∩ F) / P(F) + P(B ∩ F) / P(F) − P[(A ∩ B) ∩ F] / P(F)
= P(A|F) + P(B|F) − P((A ∩ B)|F)
When A and B are disjoint events, then
P((A ∩ B)|F) = 0
⇒ P((A ∪ B)|F) = P(A|F) + P(B|F)

Property 3
P(E′|F) = 1 − P(E|F)
From Property 1, we know that P(S|F) = 1
⇒ P((E ∪ E′)|F) = 1, since S = E ∪ E′
⇒ P(E|F) + P(E′|F) = 1, since E and E′ are disjoint events
Thus, P(E′|F) = 1 − P(E|F)

Let us now take up some examples.

Example 8
Given the values of P(A), P(B) and P(A ∩ B) (the numerical values are missing from this transcript), evaluate P(A|B).
Solution
We have
P(A|B) = P(A ∩ B) / P(B)

Example 9
A family has two children. What is the probability that both the children are boys given that at least one of them is a boy?
Solution
Let b stand for boy and g for girl. The sample space of the experiment is
S = {(b, b), (g, b), (b, g), (g, g)}
Let E and F denote the following events:
E: 'both the children are boys'
F: 'at least one of the children is a boy'
Then E = {(b, b)} and F = {(b, b), (g, b), (b, g)}
Now E ∩ F = {(b, b)}
Thus P(F) = 3/4 and P(E ∩ F) = 1/4
Therefore
P(E|F) = P(E ∩ F) / P(F) = (1/4) / (3/4) = 1/3

Example 10
Ten cards numbered 1 to 10 are placed in a box, mixed up thoroughly and then one card is drawn randomly.
If it is known that the number on the drawn card is more than 3, what is the probability that it is an even number?
Solution
Let A be the event 'the number on the card drawn is even' and B be the event 'the number on the card drawn is greater than 3'. We have to find P(A|B).
Now, the sample space of the experiment is
S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Then A = {2, 4, 6, 8, 10}, B = {4, 5, 6, 7, 8, 9, 10}
and A ∩ B = {4, 6, 8, 10}
Also P(A) = 5/10, P(B) = 7/10 and P(A ∩ B) = 4/10
Then
P(A|B) = P(A ∩ B) / P(B) = (4/10) / (7/10) = 4/7

Example 11
In a school, there are 1000 students, out of which 430 are girls. It is known that 10% of the 430 girls study in Class XII. What is the probability that a student chosen randomly studies in Class XII given that the chosen student is a girl?
Solution
Let E denote the event that a student chosen randomly studies in Class XII and F be the event that the randomly chosen student is a girl. We have to find P(E|F).
Now P(F) = 430/1000 = 0.43 and P(E ∩ F) = 43/1000 = 0.043
Then
P(E|F) = P(E ∩ F) / P(F) = 0.043 / 0.43 = 0.1

Example 12
A die is thrown three times. Events A and B are defined as below:
A: 4 on the third throw
B: 6 on the first and 5 on the second throw
Find the probability of A given that B has already occurred.
Solution
The sample space has 216 outcomes.
Now A = {(1,1,4), (1,2,4),..., (1,6,4), (2,1,4), (2,2,4),..., (2,6,4), (3,1,4), (3,2,4),..., (3,6,4), (4,1,4), (4,2,4),..., (4,6,4), (5,1,4), (5,2,4),..., (5,6,4), (6,1,4), (6,2,4),..., (6,6,4)}
B = {(6,5,1), (6,5,2), (6,5,3), (6,5,4), (6,5,5), (6,5,6)}
and A ∩ B = {(6,5,4)}
Now P(B) = 6/216 and P(A ∩ B) = 1/216
Then
P(A|B) = P(A ∩ B) / P(B) = (1/216) / (6/216) = 1/6

Example 13
A die is thrown twice and the sum of the numbers appearing is observed to be 6. What is the conditional probability that the number 4 has appeared at least once?
Solution
Let E be the event that 'number 4 appears at least once' and F be the event that 'the sum of the numbers appearing is 6'.
Then E = {(4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (1,4), (2,4), (3,4), (5,4), (6,4)}
and F = {(1,5), (2,4), (3,3), (4,2), (5,1)}
We have P(E) = 11/36 and P(F) = 5/36
Also E ∩ F = {(2,4), (4,2)}
Therefore P(E ∩ F) = 2/36
Hence, the required probability is
P(E|F) = P(E ∩ F) / P(F) = (2/36) / (5/36) = 2/5

For the conditional probability discussed above, we have considered the elementary events of the experiment to be equally likely, and the corresponding definition of the probability of an event was used. However, the same definition can also be used in the general case where the elementary events of the sample space are not equally likely, the probabilities P(E ∩ F) and P(F) being calculated accordingly. Let us take up the following example.

Example 14
Consider the experiment of tossing a coin. If the coin shows head, toss it again, but if it shows tail, then throw a die. Find the conditional probability of the event that 'the die shows a number greater than 4' given that 'there is at least one tail'.
Solution
The outcomes of the experiment can be represented in a diagrammatic manner called the 'tree diagram'. The sample space of the experiment may be described as
S = {(H,H), (H,T), (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)}
where (H, H) denotes that both tosses result in a head and (T, i) denotes that the first toss results in a tail and the number i appears on the die, for i = 1,2,3,4,5,6.
Thus, the probabilities assigned to the 8 elementary events (H, H), (H, T), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6) are 1/4, 1/4, 1/12, 1/12, 1/12, 1/12, 1/12, 1/12 respectively, which is clear from the tree diagram.
Let F be the event that 'there is at least one tail' and E be the event 'the die shows a number greater than 4'. Then
F = {(H,T), (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)}
E = {(T,5), (T,6)} and E ∩ F = {(T,5), (T,6)}
Now
P(F) = P({(H,T)}) + P({(T,1)}) + P({(T,2)}) + P({(T,3)}) + P({(T,4)}) + P({(T,5)}) + P({(T,6)}) = 1/4 + 6 × 1/12 = 3/4
and
P(E ∩ F) = P({(T,5)}) + P({(T,6)}) = 1/12 + 1/12 = 1/6
Hence, the required probability is
P(E|F) = P(E ∩ F) / P(F) = (1/6) / (3/4) = 2/9

THE MULTIPLICATION RULE

Let E and F be two events associated with a sample space S. Clearly, the set E ∩ F denotes the event that both E and F have occurred. In other words, E ∩ F denotes the simultaneous occurrence of the events E and F. The event E ∩ F is also written as EF.

Very often we need to find the probability of the event EF. For example, in the experiment of drawing two cards one after the other, we may be interested in finding the probability of the event 'a king and a queen'. The probability of the event EF is obtained by using the conditional probability, as shown below.

We know that the conditional probability of event E given that F has occurred is denoted by P(E|F) and is given by
P(E|F) = P(E ∩ F) / P(F), P(F) ≠ 0
From this result, we can write
P(E ∩ F) = P(F) P(E|F)   (1)
Also, we know that
P(F|E) = P(F ∩ E) / P(E), P(E) ≠ 0
or
P(F|E) = P(E ∩ F) / P(E) (since E ∩ F = F ∩ E)
Thus,
P(E ∩ F) = P(E) P(F|E)   (2)
Combining (1) and (2), we find that
P(E ∩ F) = P(E) P(F|E) = P(F) P(E|F), provided P(E) ≠ 0 and P(F) ≠ 0.
The above result is known as the multiplication rule of probability.

Let us now take up an example.

Example 15
An urn contains 10 black and 5 white balls.
Two balls are drawn from the urn one after the other without replacement. What is the probability that both drawn balls are black?
Solution
Let E and F denote respectively the events that the first and second ball drawn are black. We have to find P(E ∩ F) or P(EF).
Now P(E) = P(black ball in first draw) = 10/15
Given that the first ball drawn is black, i.e., event E has occurred, there are now 9 black balls and 5 white balls left in the urn. Therefore, the probability that the second ball drawn is black, given that the ball in the first draw is black, is nothing but the conditional probability of F given that E has occurred, i.e.
P(F|E) = 9/14
By the multiplication rule of probability, we have
P(E ∩ F) = P(E) P(F|E) = 10/15 × 9/14 = 3/7

MULTIPLICATION RULE OF PROBABILITY FOR MORE THAN TWO EVENTS

If E, F and G are three events of a sample space, we have
P(E ∩ F ∩ G) = P(E) P(F|E) P(G|(E ∩ F)) = P(E) P(F|E) P(G|EF)
Similarly, the multiplication rule of probability can be extended to four or more events. The following example illustrates the extension of the multiplication rule of probability to three events.

Example 16
Three cards are drawn successively, without replacement, from a pack of 52 well-shuffled cards. What is the probability that the first two cards are kings and the third card drawn is an ace?
Solution
Let K denote the event that the card drawn is a king and A be the event that the card drawn is an ace. Clearly, we have to find P(KKA).
Now P(K) = 4/52
Also, P(K|K) is the probability of a second king with the condition that one king has already been drawn. Now there are three kings in (52 − 1) = 51 cards. Therefore
P(K|K) = 3/51
Lastly, P(A|KK) is the probability of the third drawn card being an ace, with the condition that two kings have already been drawn.
Now there are four aces in the remaining 50 cards. Therefore
P(A|KK) = 4/50
By the multiplication law of probability, we have
P(KKA) = P(K) P(K|K) P(A|KK) = 4/52 × 3/51 × 4/50 = 2/5525

1.5 BAYES' THEOREM

Consider that there are two bags, I and II. Bag I contains 2 white and 3 red balls and Bag II contains 4 white and 5 red balls. One ball is drawn at random from one of the bags. We can find the probability of selecting any of the bags (i.e. 1/2) or the probability of drawing a ball of a particular color (say white) from a particular bag (say Bag I). In other words, we can find the probability that the ball drawn is of a particular color, if we are given the bag from which the ball is drawn. But can we find the probability that the ball drawn is from a particular bag (say Bag II), if the color of the ball drawn is given? Here, we have to find the reverse probability that Bag II was selected, given an event that occurred after the selection. The mathematician Thomas Bayes solved the problem of finding such reverse probabilities by using conditional probability. The formula developed by him is known as 'Bayes' theorem', which was published posthumously in 1763. Before stating and proving Bayes' theorem, let us first take up a definition and some preliminary results.

1.5.1 PARTITION OF A SAMPLE SPACE

A set of events E1, E2,..., En is said to represent a partition of the sample space S if
(a) Ei ∩ Ej = φ, i ≠ j, i, j = 1, 2, 3,..., n
(b) E1 ∪ E2 ∪... ∪ En = S and
(c) P(Ei) > 0 for all i = 1, 2,..., n.
In other words, the events E1, E2,..., En represent a partition of the sample space S if they are pairwise disjoint, exhaustive and have nonzero probabilities.
As an example, we see that any nonempty event E and its complement E′ form a partition of the sample space S, since they satisfy E ∩ E′ = φ and E ∪ E′ = S.
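
For a finite sample space, the three conditions in the definition of a partition can be checked mechanically. The sketch below is only an illustration under stated assumptions: the helper name `is_partition` is hypothetical, and the sample space is taken to be a fair die with the event E = 'an even number' and its complement.

```python
from fractions import Fraction
from itertools import combinations


def is_partition(events, space, dist):
    """Check conditions (a)-(c) on a finite sample space (hypothetical helper)."""
    # (a) pairwise disjoint: no two events share an outcome
    disjoint = all(not (a & b) for a, b in combinations(events, 2))
    # (b) exhaustive: the union of the events is the whole sample space
    exhaustive = set().union(*events) == set(space)
    # (c) every event has nonzero probability
    positive = all(sum(dist[s] for s in e) > 0 for e in events)
    return disjoint and exhaustive and positive


# Assumed example: a fair die.
space = {1, 2, 3, 4, 5, 6}
dist = {s: Fraction(1, 6) for s in space}

E = {2, 4, 6}
E_complement = space - E

ok = is_partition([E, E_complement], space, dist)
not_ok = is_partition([{1, 2, 3}, {3, 4, 5, 6}], space, dist)  # the outcome 3 is shared
```

Here `ok` is true because E and its complement satisfy all three conditions, while `not_ok` is false because the two events overlap.
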
From the Venn diagram in Fig 13.3, one can easily observe that if E and F are any two events associated with a sample space S, then the set {E ∩ F′, E ∩ F, E′ ∩ F, E′ ∩ F′} is a partition of the sample space S. It may be mentioned that the partition of a sample space is not unique. There can be several partitions of the same sample space.

We shall now prove a theorem known as the theorem of total probability.

1.5.2 THEOREM OF TOTAL PROBABILITY

Let {E1, E2,..., En} be a partition of the sample space S, and suppose that each of the events E1, E2,..., En has nonzero probability of occurrence. Let A be any event associated with S. Then
P(A) = P(E1) P(A|E1) + P(E2) P(A|E2) +... + P(En) P(A|En)
= ∑ P(Ej) P(A|Ej) (sum over j = 1,..., n)

Proof
Given that E1, E2,..., En is a partition of the sample space S. Therefore,
S = E1 ∪ E2 ∪... ∪ En   (1)
and Ei ∩ Ej = φ, i ≠ j, i, j = 1, 2,..., n
Now, we know that for any event A,
A = A ∩ S
= A ∩ (E1 ∪ E2 ∪... ∪ En)
= (A ∩ E1) ∪ (A ∩ E2) ∪... ∪ (A ∩ En)
Also, A ∩ Ei and A ∩ Ej are respectively subsets of Ei and Ej. We know that Ei and Ej are disjoint for i ≠ j; therefore, A ∩ Ei and A ∩ Ej are also disjoint for all i ≠ j, i, j = 1, 2,..., n.
Thus,
P(A) = P[(A ∩ E1) ∪ (A ∩ E2) ∪... ∪ (A ∩ En)]
= P(A ∩ E1) + P(A ∩ E2) +... + P(A ∩ En)
Now, by the multiplication rule of probability, we have
P(A ∩ Ei) = P(Ei) P(A|Ei), as P(Ei) ≠ 0 ∀ i = 1, 2,..., n
Therefore,
P(A) = P(E1) P(A|E1) + P(E2) P(A|E2) +... + P(En) P(A|En)
or
P(A) = ∑ P(Ej) P(A|Ej) (sum over j = 1,..., n).

Example 17
A person has undertaken a construction job. The probabilities are 0.65 that there will be a strike, 0.80 that the construction job will be completed on time if there is no strike, and 0.32 that the construction job will be completed on time if there is a strike.
Determine the probability that the construction job will be completed on time.

Solution
Let A be the event that the construction job will be completed on time, and B be the event that there will be a strike. We have to find P(A). We have
P(B) = 0.65, P(no strike) = P(B′) = 1 − P(B) = 1 − 0.65 = 0.35,
P(A|B) = 0.32, P(A|B′) = 0.80.
Since the events B and B′ form a partition of the sample space S, by the theorem of total probability we have
P(A) = P(B) P(A|B) + P(B′) P(A|B′) = 0.65 × 0.32 + 0.35 × 0.80 = 0.208 + 0.28 = 0.488.
Thus, the probability that the construction job will be completed on time is 0.488.

We shall now state and prove Bayes' theorem.

Bayes' Theorem
If E1, E2, ..., En are n nonempty events which constitute a partition of the sample space S, i.e. E1, E2, ..., En are pairwise disjoint and E1 ∪ E2 ∪ ... ∪ En = S, and A is any event of nonzero probability, then
P(Ei|A) = P(Ei) P(A|Ei) / Σ_{j=1}^{n} P(Ej) P(A|Ej) for any i = 1, 2, 3, ..., n.

Proof
By the formula of conditional probability, we know that
P(Ei|A) = P(A ∩ Ei) / P(A)
= P(Ei) P(A|Ei) / P(A)   (by the multiplication rule of probability)
= P(Ei) P(A|Ei) / Σ_{j=1}^{n} P(Ej) P(A|Ej)   (by the theorem of total probability).

Remark
The following terminology is generally used when Bayes' theorem is applied. The events E1, E2, ..., En are called hypotheses. The probability P(Ei) is called the a priori probability of the hypothesis Ei. The conditional probability P(Ei|A) is called the a posteriori probability of the hypothesis Ei. Bayes' theorem is also called the formula for the probability of "causes". Since the Ei's form a partition of the sample space S, one and only one of the events Ei occurs (i.e. one of the events Ei must occur and only one can occur). Hence, the above formula gives us the probability of a particular Ei (i.e. a "cause"), given that the event A has occurred.
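The theorem of total probability and Bayes' theorem translate directly into two short functions. The sketch below is only an illustration: the function names are hypothetical, and the posterior probability of a strike given on-time completion is an extra quantity computed from the data of Example 17, not a result stated in the notes.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum over j of P(Ej) * P(A|Ej)."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def posterior(priors, likelihoods, i):
    """Bayes' theorem: P(Ei|A) = P(Ei) P(A|Ei) / P(A)."""
    return priors[i] * likelihoods[i] / total_probability(priors, likelihoods)

# Data of Example 17: E1 = strike (B), E2 = no strike (B').
priors = [Fraction("0.65"), Fraction("0.35")]        # P(B), P(B')
likelihoods = [Fraction("0.32"), Fraction("0.80")]   # P(A|B), P(A|B')

p_on_time = total_probability(priors, likelihoods)
p_strike_given_on_time = posterior(priors, likelihoods, 0)

print(float(p_on_time))                          # 0.488
print(p_strike_given_on_time)                    # 26/61
print(round(float(p_strike_given_on_time), 3))   # 0.426
```

Exact fractions are used so that the printed values are not affected by floating-point rounding.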
Bayes' theorem has applications in a variety of situations, a few of which are illustrated in the following examples.

Example 18
Bag I contains 3 red and 4 black balls, while Bag II contains 5 red and 6 black balls. One ball is drawn at random from one of the bags and it is found to be red. Find the probability that it was drawn from Bag II.

Solution
Let E1 be the event of choosing Bag I, E2 the event of choosing Bag II, and A the event of drawing a red ball. Then
P(E1) = P(E2) = 1/2.
Also
P(A|E1) = P(drawing a red ball from Bag I) = 3/7
and
P(A|E2) = P(drawing a red ball from Bag II) = 5/11.
Now, the probability that the ball was drawn from Bag II, given that it is red, is P(E2|A). By Bayes' theorem, we have
P(E2|A) = P(E2) P(A|E2) / [P(E1) P(A|E1) + P(E2) P(A|E2)] = (1/2 × 5/11) / (1/2 × 3/7 + 1/2 × 5/11) = 35/68.

Example 19
Given three identical boxes I, II and III, each containing two coins. In box I, both coins are gold coins; in box II, both are silver coins; and in box III, there is one gold and one silver coin. A person chooses a box at random and takes out a coin. If the coin is gold, what is the probability that the other coin in the box is also gold?

Solution
Let E1, E2 and E3 be the events that boxes I, II and III are chosen, respectively. Then
P(E1) = P(E2) = P(E3) = 1/3.
Also, let A be the event that the coin drawn is gold. Then
P(A|E1) = P(a gold coin from box I) = 2/2 = 1,
P(A|E2) = P(a gold coin from box II) = 0,
P(A|E3) = P(a gold coin from box III) = 1/2.
Now, the probability that the other coin in the box is gold equals the probability that the gold coin was drawn from box I, which is P(E1|A). By Bayes' theorem, we know that
P(E1|A) = P(E1) P(A|E1) / [P(E1) P(A|E1) + P(E2) P(A|E2) + P(E3) P(A|E3)] = (1/3 × 1) / (1/3 × 1 + 1/3 × 0 + 1/3 × 1/2) = 2/3.

Example 20
Suppose that the reliability of an HIV test is specified as follows: Of people who have HIV, 90% of the tests detect the disease but 10% go undetected.
Of people free of HIV, 99% of the tests are judged HIV-negative, but 1% are diagnosed as HIV-positive. From a large population of which only 0.1% have HIV, one person is selected at random and given the HIV test, and the pathologist reports him/her as HIV-positive. What is the probability that the person actually has HIV?

Solution
Let E denote the event that the person selected actually has HIV, and A the event that the person's HIV test is diagnosed as positive. We need to find P(E|A). Also, E′ denotes the event that the person selected does not have HIV. Clearly, {E, E′} is a partition of the sample space of all people in the population. We are given that
P(E) = 0.1% = 0.1/100 = 0.001,
P(E′) = 1 − P(E) = 0.999,
P(A|E) = P(person tested as HIV-positive given that he/she actually has HIV) = 90% = 90/100 = 0.9,
and
P(A|E′) = P(person tested as HIV-positive given that he/she does not have HIV) = 1% = 1/100 = 0.01.
Now, by Bayes' theorem,
P(E|A) = P(E) P(A|E) / [P(E) P(A|E) + P(E′) P(A|E′)] = (0.001 × 0.9) / (0.001 × 0.9 + 0.999 × 0.01) = 90/1089 = 0.083 approx.
Thus, the probability that a person selected at random actually has HIV, given that he/she tested HIV-positive, is 0.083.

Chapter 2
Random Variables

2.1 INTRODUCTION

We have already learnt about random experiments and the formation of sample spaces. In most of these experiments, we were interested not only in the particular outcome that occurs but rather in some number associated with that outcome, as shown in the following examples/experiments.
(i) In tossing two dice, we may be interested in the sum of the numbers on the two dice.
(ii) In tossing a coin 50 times, we may want the number of heads obtained.
(iii) In the experiment of taking out four articles (one after the other) at random from a lot of 20 articles in which 6 are defective, we want to know the number of defectives in the sample of four and not the particular sequence of defective and non-defective articles.
In all of the above experiments, we have a rule which assigns to each outcome of the experiment a single real number. This single real number may vary with different outcomes of the experiment. Hence, it is a variable. Also, its value depends upon the outcome of a random experiment and, hence, it is called a random variable.

A random variable is usually denoted by X. If you recall the definition of a function, you will realize that the random variable X is, strictly speaking, a function whose domain is the set of outcomes (or sample space) of a random experiment. A random variable can take any real value; therefore, its co-domain is the set of real numbers. Hence, a random variable can be defined as follows:

Definition
A random variable is a real-valued function whose domain is the sample space of a random experiment.

For example, let us consider the experiment of tossing a coin two times in succession. The sample space of the experiment is S = {HH, HT, TH, TT}. If X denotes the number of heads obtained, then X is a random variable and for each outcome its value is as given below:
X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.
More than one random variable can be defined on the same sample space. For example, let Y denote the number of heads minus the number of tails for each outcome of the above sample space S. Then
Y(HH) = 2, Y(HT) = 0, Y(TH) = 0, Y(TT) = −2.
Thus, X and Y are two different random variables defined on the same sample space S.
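Since a random variable is simply a function on the sample space, the two-toss example can be written out almost literally in code. This is a minimal sketch; representing outcomes as strings such as "HT" is a choice made here for illustration only.

```python
from itertools import product

# Sample space for two tosses of a coin: HH, HT, TH, TT.
S = ["".join(t) for t in product("HT", repeat=2)]

def X(outcome):
    """Number of heads in the outcome."""
    return outcome.count("H")

def Y(outcome):
    """Number of heads minus number of tails in the outcome."""
    return outcome.count("H") - outcome.count("T")

print({w: X(w) for w in S})   # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print({w: Y(w) for w in S})   # {'HH': 2, 'HT': 0, 'TH': 0, 'TT': -2}
```

Both X and Y take the same four outcomes as input, which is exactly what "defined on the same sample space" means.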
Example 1
A person plays a game of tossing a coin thrice. For each head, he is given Rs 2 by the organizer of the game, and for each tail, he has to give Rs 1.50 to the organizer. Let X denote the amount gained or lost by the person. Show that X is a random variable and exhibit it as a function on the sample space of the experiment.

Solution
X is a number whose values are defined on the outcomes of a random experiment. Therefore, X is a random variable. Now, the sample space of the experiment is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
Then
X(HHH) = Rs (2 × 3) = Rs 6,
X(HHT) = X(HTH) = X(THH) = Rs (2 × 2 − 1 × 1.50) = Rs 2.50,
X(HTT) = X(THT) = X(TTH) = Rs (1 × 2 − 2 × 1.50) = −Re 1,
and
X(TTT) = −Rs (3 × 1.50) = −Rs 4.50,
where the minus sign shows a loss to the player. Thus, for each element of the sample space, X takes a unique value; hence, X is a function on the sample space whose range is {−1, 2.50, −4.50, 6}.

Example 2
A bag contains 2 white balls and 1 red ball. One ball is drawn at random and then put back in the bag after noting its color. The process is then repeated. If X denotes the number of red balls recorded in the two draws, describe X.

Solution
Let the balls in the bag be denoted by w1, w2, r. Then the sample space is
S = {w1 w1, w1 w2, w2 w2, w2 w1, w1 r, w2 r, r w1, r w2, r r}.
Now, for ω ∈ S,
X(ω) = number of red balls.
Therefore
X({w1 w1}) = X({w1 w2}) = X({w2 w2}) = X({w2 w1}) = 0,
X({w1 r}) = X({w2 r}) = X({r w1}) = X({r w2}) = 1,
and X({r r}) = 2.
Thus, X is a random variable which can take the values 0, 1 or 2.

2.2 DISCRETE RANDOM VARIABLES

We distinguish primarily between random variables that have a countable range and those that have an uncountable range. Let us examine the first case and start with a definition.
Definition
If the range of X is countable, then X is called a discrete random variable.

For a discrete random variable X, we are interested in computing probabilities of the type P(X = xk) for various values of xk in the range of X. As we vary xk, the probability P(X = xk) changes, so it is natural to view P(X = xk) as a function of xk. We now formally define and name this function.

Definition
Let X be a discrete random variable with range {x1, x2, ...} (finite or countably infinite). The function
p(xk) = P(X = xk), k = 1, 2, ...
is called the probability mass function (pmf) of X.

Sometimes we also use the notation pX for the pmf, when we need to stress which random variable is meant. When we represent a pmf as a bar chart, the height of a bar equals the probability of the corresponding value on the x axis. Since X cannot take on values other than x1, x2, ..., we can imagine bars of height 0 at all other values and could thus view the pmf as a function on all of R (the real numbers) if we wish. The numbers x1, x2, ... do not have to be integers, but they often are.

Example 3
Let X be the number of daughters in a family with three children. There are eight possible sequences of boys and girls (in order of birth), and we get the sample space
S = {bbb, bbg, bgb, bgg, gbb, gbg, ggb, ggg}
where, for example, bbg means that the oldest child is a boy, the middle child a boy, and the youngest child a girl. The range of X is {0, 1, 2, 3} and, taking the eight outcomes to be equally likely, the values of the pmf are
p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8.
The pmf is illustrated in Fig 2.1.

Fig. 2.1 Bar chart for the number of daughters

By the properties of probability measures, we have the following proposition.
Proposition
A function p is a possible pmf of a discrete random variable on the range {x1, x2, ...} if and only if
(a) p(xk) ≥ 0 for k = 1, 2, ...
(b) Σ_{k} p(xk) = 1, where the sum is taken over all k = 1, 2, ...
If the range of X is finite, the sum in (b) is finite.

So far we have considered events of the type {X = k}, the event that X equals a particular value k. We could also look at events of the type {X ≤ k}, the event that X is less than or equal to k. For example, when we roll a die we might ask for the probability that we get at most 1, at most 2, and so on. This leads to another function, to be defined next.

Definition
Let X be any random variable. The function
F(x) = P(X ≤ x), x ∈ R
is called the (cumulative) distribution function (cdf) of X.

Example 4
In a family with three children, let X be the number of daughters. The range of X is {0, 1, 2, 3}, so let us start by computing F(k) for these values.

Solution
We get
F(0) = P(X ≤ 0) = p(0) = 1/8
since the only way to be less than or equal to 0 is to be equal to 0. For k = 1, we first note that being less than or equal to 1 means being equal to 0 or 1. In terms of events, we have
{X ≤ 1} = {X = 0} ∪ {X = 1}
and since the events are disjoint, we get
F(1) = P(X ≤ 1) = P(X = 0) + P(X = 1) = p(0) + p(1) = 1/8 + 3/8 = 1/2.
Continuing like this, we also get F(2) = 7/8 and F(3) = 1. Now we have the values of F at the points in the range of X. What about other values? Let us, for example, consider the point 0.5. Noting that X ≤ 0.5 means that X ≤ 0, we get
F(0.5) = F(0) = 1/8
and we realize that F(x) = F(0) for all points x ∈ [0, 1). Similarly, all points x ∈ [1, 2) have F(x) = F(1) and all points x ∈ [2, 3) have F(x) = F(2). Finally, since X cannot take on any negative values, we have F(x) = 0 for x < 0, and since X is always at most 3, we have F(x) = 1 for x ≥ 3.
This gives the final form of F, which is illustrated in Fig 2.2.

Fig. 2.2 Graph of the cdf of the number of daughters

The graph in Fig 2.2 is typical for the cdf of a discrete random variable. It jumps at each value that X can assume, and the size of the jump is the probability of that value. Between points in the range of X, the cdf is constant. Note how it is always nondecreasing and how it ranges from 0 to 1. Also note in which way the points are filled where F jumps, which can be expressed by saying that F is right-continuous. Some of these observed properties turn out to be true for any cdf, not only for that of a discrete random variable. In the next section we shall return to this.

2.3 MEAN AND VARIANCE FOR DISCRETE RVS

In many problems, it is desirable to describe some feature of the random variable by means of a single number that can be computed from its probability distribution. A few such numbers are the mean, median and mode. In this section, we shall discuss the mean only. The mean is a measure of location or central tendency in the sense that it roughly locates a middle or average value of the random variable.

Definition
Let X be a random variable whose possible values x1, x2, x3, ..., xn occur with probabilities p1, p2, p3, ..., pn, respectively. The mean of X, denoted by μ, is the number
μ = Σ_{i=1}^{n} xi pi,
i.e. the mean of X is the weighted average of the possible values of X, each value being weighted by the probability with which it occurs.
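The pmf, the cdf and the mean can all be computed from the sample space of the three-children example. The sketch below assumes that the eight birth orders are equally likely; the variable and function names are illustrative, and the mean of the number of daughters is an added illustration of the definition rather than a result taken from the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: the eight sequences bbb, bbg, ..., ggg are equally likely.
S = ["".join(t) for t in product("bg", repeat=3)]

# pmf: p(k) = P(X = k), where X counts the daughters in an outcome.
counts = Counter(w.count("g") for w in S)
pmf = {k: Fraction(c, len(S)) for k, c in sorted(counts.items())}

def cdf(x):
    """F(x) = P(X <= x): add up the pmf over all values not exceeding x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

# Mean: weighted average of the values, weighted by their probabilities.
mean = sum(k * p for k, p in pmf.items())

print(cdf(-1), cdf(0.5), cdf(1), cdf(2), cdf(3))   # 0 1/8 1/2 7/8 1
print(mean)                                         # 3/2
```

Evaluating cdf at a point such as 0.5 gives the same value as at 0, which reflects the step shape of the cdf between points in the range of X.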
