D&D_Unit 1_Slide Deck PDF
PES University
Arvind L N
Summary
This document is a slide deck titled "Data and Decisions" that introduces concepts of probability and sampling distributions. The deck includes various types of sampling methods, probability distributions, and mathematical expectation, with examples and diagrams.
Full Transcript
DATA AND DECISIONS
UNIT 1: Probability and Sampling Distributions
Arvind L N
Department of Management Studies

Session 01

Topics Covered
i. Discrete and continuous; discrete probability distributions
ii. Continuous probability distributions; normal probability distribution
iii. Descriptive statistics and inferential statistics
iv. Measurement scales, data collection, data visualization
v. Sampling distributions
vi. Sampling methods: simple random sampling, stratified random sampling, cluster sampling, systematic sampling, convenience sampling, judgment sampling; two case studies

Experiment
Consider the experiment of tossing a coin two times.

Sample Space
An associated sample space is S = {HH, HT, TH, TT}. Now suppose that we are interested in those outcomes corresponding to the occurrence of exactly one head.

Elements
HT and TH are the only elements of S corresponding to this occurrence (event). These two elements form the set E = {HT, TH}.

Description of events and the corresponding subsets of S
Number of tails is exactly 2: A = {TT}
Number of tails is at least one: B = {HT, TH, TT}
Number of heads is at most one: C = {HT, TH, TT}
Second toss is not a head: D = {HT, TT}
Number of tails is at most two: S = {HH, HT, TH, TT}
Number of tails is more than two: φ

Random Experiments
Experiments where:
a. we are not able to ascertain or control the value of certain variables,
b. due to which results will vary from one performance of the experiment to the next,
c. even though most of the conditions are the same.
These experiments are described as random.

Example 1.1 If we toss a coin, the result of the experiment is that it will either come up "tails," symbolized by T (or 0), or "heads," symbolized by H (or 1), i.e., one of the elements of the set {H, T} (or {0, 1}).

Example 1.2 If we toss a die, the result of the experiment is that it will come up with one of the numbers in the set {1, 2, 3, 4, 5, 6}.

Example 1.3 If we toss a coin twice, there are four results possible, as indicated by {HH, HT, TH, TT}, i.e., both heads, heads on first and tails on second, etc.

Example 1.4 If we are making bolts with a machine, the result of the experiment is that some may be defective. Thus when a bolt is made, it will be a member of the set {defective, non-defective}.

Example 1.5 If an experiment consists of measuring "lifetimes" of electric light bulbs produced by a company, then the result of the experiment is a time t in hours that lies in some interval, say 0 ≤ t ≤ 4000, where we assume that no bulb lasts more than 4000 hours.

Sample Spaces
A set S that consists of all possible outcomes of a random experiment is called a sample space, and each outcome is called a sample point. Often there will be more than one sample space that can describe outcomes of an experiment, but there is usually only one that will provide the most information.

Example 1.6 If we toss a die, one sample space, or set of all possible outcomes, is given by {1, 2, 3, 4, 5, 6} while another is {odd, even}. It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.
Note: It is often useful to portray a sample space graphically. In such cases it is desirable to use numbers in place of letters whenever possible.

Example 1.7 If we toss a coin twice and use 0 to represent tails and 1 to represent heads, the sample space (see Example 1.3) can be portrayed by points as in Fig. 1-1 where, for example, (0, 1) represents tails on first toss and heads on second toss, i.e., TH.

If a sample space has a finite number of points, as in Example 1.7, it is called a finite sample space. If it has as many points as there are natural numbers 1, 2, 3, ..., it is called a countably infinite sample space. If it has as many points as there are in some interval on the x axis, such as 0 ≤ x ≤ 1, it is called an uncountably infinite sample space.

A sample space that is finite or countably infinite is often called a discrete sample space, while one that is uncountably infinite is called a non-discrete (or continuous) sample space.

Events
An event is a subset A of the sample space S, i.e., it is a set of possible outcomes. If the outcome of an experiment is an element of A, we say that the event A has occurred. An event consisting of a single point of S is often called a simple or elementary event.

Example 1.8 If we toss a coin twice, the event that only one head comes up is the subset of the sample space that consists of points (0, 1) and (1, 0), as indicated in Fig. 1-2.
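The sample space of Example 1.7 and the event of Example 1.8 can be sketched in a few lines of Python. This is only an illustration: the variable names are hypothetical, and the 0/1 coding for tails/heads follows Example 1.7.

```python
from itertools import product

# 0 = tails, 1 = heads, following the convention of Example 1.7.
# The sample space for two tosses is every ordered pair of 0s and 1s.
sample_space = set(product((0, 1), repeat=2))

# An event is a subset of the sample space; here, "exactly one head".
exactly_one_head = {point for point in sample_space if sum(point) == 1}

print(sorted(sample_space))      # all four sample points
print(sorted(exactly_one_head))  # the points (0, 1) and (1, 0)
```

Representing events as Python sets is a convenient choice here because the union, intersection, and difference of events map directly onto the set operators `|`, `&`, and `-`.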
As particular events, we have S itself, which is the sure or certain event since an element of S must occur, and the empty set φ (phi), which is called the impossible event because an element of φ cannot occur.

By using set operations on events in S, we can obtain other events in S. For example, if A and B are events, then:
1. A ∪ B is the event "either A or B or both." A ∪ B is called the union of A and B.
2. A ∩ B is the event "both A and B." A ∩ B is called the intersection of A and B.
3. A' is the event "not A." A' is called the complement of A.
4. A - B = A ∩ B' is the event "A but not B." In particular, A' = S - A.

If the sets corresponding to events A and B are disjoint, i.e., A ∩ B = φ, we often say that the events are mutually exclusive. This means that they cannot both occur. We say that a collection of events A1, A2, ..., An is mutually exclusive if every pair in the collection is mutually exclusive.

Example 1.9 Referring to the experiment of tossing a coin twice, let A be the event "at least one head occurs" and B the event "the second toss results in a tail." Then A = {HT, TH, HH}, B = {HT, TT}, and so we have
A ∪ B = {HT, TH, HH, TT} = S
A ∩ B = {HT}
A' = {TT}
A - B = {TH, HH}

The Concept of Probability
In any random experiment there is always uncertainty as to whether a particular event will or will not occur. As a measure of the chance, or probability, with which we can expect the event to occur, it is convenient to assign a number between 0 and 1. If we are sure or certain that the event will occur, we say that its probability is 100% or 1, but if we are sure that the event will not occur, we say that its probability is zero.
If, for example, the probability is 1/4, we would say that there is a 25% chance it will occur and a 75% chance that it will not occur. Equivalently, we can say that the odds against its occurrence are 75% to 25%, or 3 to 1.

There are two important procedures by means of which we can estimate the probability of an event.

1. Classical Approach. If an event can occur in h different ways out of a total number of n possible ways, all of which are equally likely, then the probability of the event is h / n.

Example 1.10 Suppose we want to know the probability that a head will turn up in a single toss of a coin. There are two equally likely ways in which the coin can come up, namely heads and tails (assuming it does not roll away or stand on its edge), and of these two ways a head can arise in only one way, so we reason that the required probability is 1 / 2. In arriving at this, we assume that the coin is fair, i.e., not loaded in any way.

2. Frequency Approach. If after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is h / n. This is also called the empirical probability of the event.

Example 1.11 If we toss a coin 1,000 times and find that it comes up heads 532 times, we estimate the probability of a head coming up to be 532 / 1000 = 0.532.

Both the classical and frequency approaches have serious drawbacks, the first because the words "equally likely" are vague and the second because the "large number" involved is vague. Because of these difficulties, mathematicians have been led to an axiomatic approach to probability.
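The frequency approach of Example 1.11 can be tried out with a small simulation. The sketch below assumes a fair coin and uses Python's pseudo-random generator in place of physical tosses; the function name and the seed are illustrative choices, not part of the original slides.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return h / n, the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The estimate h / n tends to settle near 1/2 as n grows,
# but for small n it can be noticeably off.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

The run also illustrates the drawback noted above: nothing in the procedure says how large n must be before the estimate can be trusted.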
The Axioms of Probability
0 ≤ P(A) ≤ 1
P(S) = 1
P(A ∪ B) = P(A) + P(B) if A ∩ B = φ
Generalised: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Example 1.12 A single die is tossed once. Find the probability of a 2 or 5 turning up.
The sample space is S = {1, 2, 3, 4, 5, 6}. If we assign equal probabilities to the sample points, i.e., if we assume that the die is fair, then
P(1) = P(2) = ... = P(6) = 1/6
The event that either 2 or 5 turns up is indicated by 2 ∪ 5. Therefore,
P(2 ∪ 5) = P(2) + P(5) = 1/6 + 1/6 = 1/3

Example 1.13

Types of events
Events can be classified into various types on the basis of the elements they have.

1. Impossible and Sure Events
The empty set φ and the sample space S describe events. In fact φ is called an impossible event and S, i.e., the whole sample space, is called the sure event. To understand these, let us consider the experiment of rolling a die. The associated sample space is S = {1, 2, 3, 4, 5, 6}.

Let E be the event "the number that appears on the die is a multiple of 7". Can you write the subset associated with the event E? No outcome in S is a multiple of 7, so E = φ is an impossible event.

Now let us take up another event F, "the number that turns up is odd or even". Clearly F = {1, 2, 3, 4, 5, 6} = S, i.e., all outcomes of the experiment ensure the occurrence of the event F. Thus, the event F = S is a sure event.

2. Simple Event
If an event E has only one sample point of a sample space, it is called a simple (or elementary) event. In a sample space containing n distinct elements, there are exactly n simple events.
For example, in the experiment of tossing two coins, a sample space is S = {HH, HT, TH, TT}. There are four simple events corresponding to this sample space. These are E1 = {HH}, E2 = {HT}, E3 = {TH} and E4 = {TT}.

3. Compound Event
If an event has more than one sample point, it is called a compound event. For example, in the experiment of "tossing a coin thrice" the events
E: "Exactly one head appeared"
F: "At least one head appeared"
G: "At most one head appeared"
are all compound events. The subsets of S associated with these events are
E = {HTT, THT, TTH}
F = {HTT, THT, TTH, HHT, HTH, THH, HHH}
G = {TTT, THT, HTT, TTH}
Each of the above subsets contains more than one sample point, hence they are all compound events.

Algebra of events

Complementary Event
For every event A, there corresponds another event A′ called the complementary event to A. It is also called the event "not A". For example, take the experiment of tossing three coins. An associated sample space is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Let A = {HTH, HHT, THH} be the event "only one tail appears". Thus the complementary event "not A" to the event A is
A′ = {HHH, HTT, THT, TTH, TTT}

The Event "A or B"
Recall that the union of two sets A and B, denoted by A ∪ B, contains all those elements which are either in A or in B or in both. When the sets A and B are two events associated with a sample space, then A ∪ B is the event "either A or B or both". This event A ∪ B is also called "A or B". Therefore
Event "A or B" = A ∪ B

The Event "A and B"
We know that the intersection of two sets A ∩ B is the set of those elements which are common to both A and B, i.e., which belong to both "A and B".
If A and B are two events, then the set A ∩ B denotes the event "A and B". For example, in the experiment of throwing a die twice, let A be the event "score on the first throw is six" and B the event "sum of two scores is at least 11". Then
A = {(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
B = {(5,6), (6,5), (6,6)}
so A ∩ B = {(6,5), (6,6)}

The Event "A but not B"
We know that A - B is the set of all those elements which are in A but not in B. Therefore, the set A - B denotes the event "A but not B". We know that A - B = A ∩ B′.

Example 1 Consider the experiment of rolling a die. Let A be the event "getting a prime number" and B the event "getting an odd number". Write the sets representing the events (i) A or B, (ii) A and B, (iii) A but not B, (iv) "not A".
Solution Here S = {1, 2, 3, 4, 5, 6}, A = {2, 3, 5} and B = {1, 3, 5}. Obviously
(i) "A or B" = A ∪ B = {1, 2, 3, 5}
(ii) "A and B" = A ∩ B = {3, 5}
(iii) "A but not B" = A - B = {2}
(iv) "not A" = A′ = {1, 4, 6}

Question
A card is drawn at random from an ordinary deck of 52 playing cards. Find the probability that it is (a) an ace, (b) a jack of hearts, (c) a three of clubs or a six of diamonds, (d) a heart, (e) any suit except hearts, (f) a ten or a spade, (g) neither a four nor a club.

Solution
Let us use for brevity H, S, D, C to indicate heart, spade, diamond, club, respectively, and 1, 2, ..., 13 for ace, two, ..., king. Then 3 ∩ H means three of hearts, while 3 ∪ H means three or heart. Each card is equally likely, so, for example, P(6 ∩ C) = 1/52.

A card is drawn at random from an ordinary deck of 52 playing cards.
Find the probability that it is
(a) an ace: P(1) = 4/52 = 1/13
(b) a jack of hearts: P(11 ∩ H) = 1/52
(c) a three of clubs or a six of diamonds: P((3 ∩ C) ∪ (6 ∩ D)) = 1/52 + 1/52 = 1/26
(d) a heart: P(H) = 13/52 = 1/4
(e) any suit except hearts: P(H′) = 1 - P(H) = 3/4
(f) a ten or a spade: P(10 ∪ S) = P(10) + P(S) - P(10 ∩ S) = 4/52 + 13/52 - 1/52 = 4/13
(g) neither a four nor a club: P(4′ ∩ C′) = 1 - P(4 ∪ C) = 1 - (4/52 + 13/52 - 1/52) = 9/13

Question
A ball is drawn at random from a box containing 6 red balls, 4 white balls, and 5 blue balls. Determine the probability that it is (a) red, (b) white, (c) blue, (d) not red, (e) red or white.

Question
A fair die is tossed twice. Find the probability of getting a 4, 5, or 6 on the first toss and a 1, 2, 3, or 4 on the second toss.
Let A1 be the event "4, 5, or 6 on first toss," and A2 be the event "1, 2, 3, or 4 on second toss." Then we are looking for P(A1 and A2), which is P(A1 ∩ A2).
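One way to evaluate P(A1 ∩ A2) for the two-toss question is to enumerate all 36 equally likely outcomes and count. The sketch below does this with exact fractions; the helper name `prob` and the variable names are illustrative, and the equal-likelihood assumption comes from the die being fair.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first toss, second toss) of a fair die: 36 outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# A1: 4, 5, or 6 on the first toss; A2: 1, 2, 3, or 4 on the second toss.
a1 = {o for o in outcomes if o[0] in (4, 5, 6)}
a2 = {o for o in outcomes if o[1] in (1, 2, 3, 4)}

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(outcomes))

p_both = prob(a1 & a2)
print(p_both)               # 1/3
print(prob(a1) * prob(a2))  # 1/3, since the two tosses are independent
```

The count gives 12 favourable outcomes out of 36, which matches the product P(A1) P(A2) = (3/6)(4/6).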
Question
Find the probability of not getting a 7 or 11 total on either of two tosses of a pair of fair dice.

Question
Box I contains 3 red and 2 blue marbles while Box II contains 2 red and 8 blue marbles. A fair coin is tossed. If the coin turns up heads, a marble is chosen from Box I; if it turns up tails, a marble is chosen from Box II. Find the probability that a red marble is chosen.
Let R denote the event "a red marble is chosen" while I and II denote the events that Box I and Box II are chosen, respectively. Since a red marble can result by choosing either Box I or II,
P(R) = P(I) P(R|I) + P(II) P(R|II) = (1/2)(3/5) + (1/2)(2/10) = 2/5

Question
Suppose in the previous problem, the one who tosses the coin does not reveal whether it has turned up heads or tails (so that the box from which a marble was chosen is not revealed) but does reveal that a red marble was chosen. What is the probability that Box I was chosen (i.e., the coin turned up heads)?
Let us use the same terminology, i.e., A = R, A1 = I, A2 = II. By Bayes' theorem,
P(I|R) = P(I) P(R|I) / [P(I) P(R|I) + P(II) P(R|II)] = (3/10) / (2/5) = 3/4

Question
Suppose 75% of the students in a university live on campus, and 80% of the students living off campus and 50% of the students living on campus own a vehicle. What is the probability that a student owning a vehicle lives on campus?
Solution
Here we have two mutually exclusive and exhaustive states of nature A1 and A2, denoting a student living "on" and "off" campus respectively, with P(A1) = 0.75 (living on campus) and P(A2) = 0.25 (living off campus). Let B be the event of a student owning a vehicle. Then it is given that P(B|A1) = 0.5 and P(B|A2) = 0.8, and we are to find P(A1|B).

Bayes' theorem
P(A1|B) = P(A1) P(B|A1) / [P(A1) P(B|A1) + P(A2) P(B|A2)]
The denominator is P(B): the probability of B occurring together with A1 plus the probability of B occurring together with A2.
P(A1|B) = (0.75 × 0.5) / (0.75 × 0.5 + 0.25 × 0.8) = 0.375 / 0.575 ≈ 0.652

Random Variables
Suppose that to each point of a sample space we assign a number. We then have a function defined on the sample space. This function is called a random variable (or stochastic variable) or, more precisely, a random function (stochastic function). It is usually denoted by a capital letter such as X or Y. In general, a random variable has some specified physical, geometrical, or other significance.

Suppose that a coin is tossed twice so that the sample space is S = {HH, HT, TH, TT}. Let X represent the number of heads that can come up. With each sample point we can associate a number for X as shown in Table 2-1. Thus, for example, in the case of HH (i.e., 2 heads), X = 2 while for TH (1 head), X = 1. It follows that X is a random variable.

Table 2-1
Sample point: HH, HT, TH, TT
X: 2, 1, 1, 0

A random variable that takes on a finite or countably infinite number of values is called a discrete random variable, while one which takes on a non-countably infinite number of values is called a non-discrete (or continuous) random variable.
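The idea of a random variable as a function on the sample space can be made concrete with a short Python sketch. It uses the two-toss sample space from the slide above; the function name `X`, the dictionary names, and the assumption of a fair coin (equally likely sample points) are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Sample space for two tosses of a coin.
sample_space = ["HH", "HT", "TH", "TT"]

def X(point):
    """Random variable: the number of heads in a sample point."""
    return point.count("H")

# X assigns a number to every sample point, as in Table 2-1.
values = {point: X(point) for point in sample_space}
print(values)

# Assuming a fair coin, each sample point has probability 1/4, so the
# probability of each value of X is (number of points mapped to it) / 4.
counts = Counter(values.values())
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}
print(pmf)
```

Because X takes only the three values 0, 1, and 2, it is a discrete random variable in the sense defined above.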
Example
Find the probability function corresponding to the random variable X (representing the number of heads).
X = number of heads. The capital X denotes the random variable, and the lowercase x denotes a particular value that it can take.
Assuming that the coin is fair, we have:
P(HH) = 1/4, P(HT) = 1/4, P(TH) = 1/4, P(TT) = 1/4
P(X = 0) = P(TT) = 1/4
P(X = 1) = P(HT ∪ TH) = 1/4 + 1/4 = 1/2
P(X = 2) = P(HH) = 1/4
The probability function is thus given as:
x: 0, 1, 2
f(x): 1/4, 1/2, 1/4

Distribution Functions for Random Variables
The cumulative distribution function, or briefly the distribution function, for a random variable X is defined by
F(x) = P(X ≤ x)
where x is any real number, i.e., -∞ < x < ∞.

Let us recall the values from the previous question:
Two coins are tossed. S = {HH, HT, TH, TT}, X = number of heads, x = 0, 1, 2.
P(x=0) = 1/4
P(x=1) = 2/4 or 1/2
P(x=2) = 1/4
P(x=3) = 0
P(x=-1) = 0
Probability function P(x≤0) = P(x