Probability I Past Paper PDF
Summary
This document appears to be lecture notes on probability for a university statistics programme at Bowen University, Nigeria. It covers probability distribution functions, probability mass and density functions, and several discrete and continuous probability distributions, including the Bernoulli, binomial, hypergeometric, Poisson, and normal distributions. The content has no questions attached, however, so it is not a past exam paper.
BOWEN UNIVERSITY, IWO, NIGERIA
COLLEGE OF AGRICULTURE, ENGINEERING AND SCIENCES
STATISTICS PROGRAMME
STA 112 PROBABILITY I
Credits: 3
Daniel AKINBORO

Probability Distribution Function
Let X be a random variable with probability density function f(x). The probability distribution function of X, denoted F(x), is defined by
F(x) = P(X ≤ x), for all real x
     = Σ_{y ≤ x} f(y)  when X is discrete.

Properties of the Probability Distribution Function
(i) F is non-decreasing; that is, if a < b, then F(a) ≤ F(b).
(ii) lim_{b→∞} F(b) = 1 and lim_{b→−∞} F(b) = 0.
(iii) F is right continuous; that is, F(b+) = F(b).

Probability Mass Function and Probability Density Function
Probability Mass Function (PMF): The PMF describes the probability distribution of a discrete random variable. It gives the probability that the discrete random variable takes on a specific value.
Probability Density Function (PDF): The PDF describes the probability distribution of a continuous random variable. It gives the probability density of the continuous random variable at a specific point.
Note: "pdf" is also used to refer to the probability distribution of a discrete random variable. The CDF is the cumulative pdf, denoted F(x).

Probability Mass Function (PMF)
In this course we will look at the basic probability distributions of a discrete random variable and a single probability distribution of a continuous random variable. These include:
Bernoulli distribution
Binomial distribution
Hypergeometric distribution
Poisson distribution
Normal distribution (a continuous probability distribution)

Bernoulli Random Variables
A random trial or experiment in which the outcome can be classified in one of two mutually exclusive and exhaustive ways, usually called success and failure, is called a Bernoulli trial. The random variable associated with a Bernoulli trial is called a Bernoulli random variable (X). Let X = 0 if the outcome is a failure and X = 1 if the outcome is a success. That is, any variable assuming only two values is called a Bernoulli random variable.
Suppose that we toss a coin once. Let the probability of it landing heads be p (p = 1/2 if the coin is fair) and let X denote the outcome of the toss. There are two possible outcomes, heads or tails. These two outcomes are mutually exclusive and exhaustive, and we may associate them with the values 1 and 0 of the random variable X: X = 1 when a head appears and X = 0 when a tail appears.

Bernoulli Random Variables (cont.)
P(X = 1) = p, P(X = 0) = 1 − p.
The pdf of X is
X      0       1
f(x)   1 − p   p
or, in functional form,
f(x) = p^x (1 − p)^(1 − x), x = 0, 1; f(x) = 0 elsewhere.
f(x) as defined above is called the Bernoulli probability density function, and any variable X having this f(x) as its probability density function is called a Bernoulli random variable and is said to have the Bernoulli distribution.

Bernoulli Random Variables (cont.)
Expectation, Variance and Standard Deviation of a Bernoulli Random Variable
The probability density function of a Bernoulli variable X is defined by
P(X = x) = p^x (1 − p)^(1 − x); x = 0, 1; 0 ≤ p ≤ 1.
The mean or expected value of X is
E(X) = p.
The variance and standard deviation (SD) of X are
Var(X) = p(1 − p)
SD(X) = √Var(X) = √(p(1 − p)).
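As a quick check of these formulas, here is a minimal Python sketch (not part of the original notes; the function name simulate_bernoulli_sum and the sample sizes are illustrative assumptions). It simulates independent Bernoulli(p) trials, compares the empirical mean and variance with p and p(1 − p), and previews the next section by summing the trials into a count of successes.

import random

def simulate_bernoulli_sum(n_trials, p, n_reps=50_000, seed=1):
    """Simulate S_n = X_1 + ... + X_n for independent Bernoulli(p) trials."""
    rng = random.Random(seed)
    return [sum(1 if rng.random() < p else 0 for _ in range(n_trials))
            for _ in range(n_reps)]

p = 0.8
# Theoretical Bernoulli moments: E(X) = p and Var(X) = p(1 - p)
print("E(X) =", p, " Var(X) =", p * (1 - p))

# Empirical check that S_10 has mean np and variance np(1 - p)
sums = simulate_bernoulli_sum(10, p)
mean = sum(sums) / len(sums)
var = sum((s - mean) ** 2 for s in sums) / len(sums)
print("simulated mean of S_10:", round(mean, 3), "vs np =", 10 * p)
print("simulated variance of S_10:", round(var, 3), "vs np(1 - p) =", round(10 * p * (1 - p), 3))

With p = 0.8 and 10 trials, the simulated values should land close to 8 and 1.6, matching the binomial formulas derived in the next section.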
The Binomial Random Variable
This is one of the most important random variables in statistics and the most important discrete random variable. Consider n independent repetitions of a Bernoulli trial, and let X_i, i = 1, …, n be the Bernoulli random variables associated with the trials. The random variables X_1, X_2, …, X_n are independent Bernoulli random variables. Assume the probability of success is p and of failure is 1 − p, so that P(X_i = 1) = p. Then the sum S_n = X_1 + X_2 + … + X_n is the number of successes in n Bernoulli trials. That is, S_n is a counting variable, counting the number of successes in n repeated trials. This random variable S_n is called the binomial random variable. The possible values of S_n are 0, 1, 2, …, n.

The Binomial Random Variable (cont.)
Definition: A discrete random variable X denoting the total number of successes in n trials is said to have the binomial distribution if
P(X = x) = C(n, x) p^x (1 − p)^(n − x); x = 0, 1, …, n; 0 ≤ p ≤ 1.
The conditions under which the binomial distribution arises are:
(i) The number of trials is fixed.
(ii) There are only two possible outcomes, 'success' or 'failure', at each trial.
(iii) The trials are independent.
(iv) The probability p of success at each trial is constant.
(v) The variable is the total number of successes in n trials.

The Binomial Random Variable (cont.)
Expectation, Variance and Standard Deviation of a Binomial Random Variable
The probability density function of a binomial variable X is defined by
P(X = x) = C(n, x) p^x (1 − p)^(n − x); x = 0, 1, …, n; 0 ≤ p ≤ 1.
The mean or expected value of X is
E(X) = np.
The variance and standard deviation (SD) of X are
Var(X) = np(1 − p)
SD(X) = √Var(X) = √(np(1 − p)).

The Binomial Random Variable (cont.)
Example: A soldier fires 10 shots independently at a target. If he has probability 0.8 of hitting the target on any given shot, find the probability that he hits the target (i) once, (ii) at least 9 times, (iii) at most two times.
Let X denote the number of times he hits the target. Then X is a binomial variable with n = 10 and p = 0.8. Recall,
P(X = x) = C(n, x) p^x (1 − p)^(n − x); x = 0, 1, …, n; 0 ≤ p ≤ 1.
(i) P(X = 1) = C(10, 1) (0.8)^1 (1 − 0.8)^(10 − 1) = 0.000004096.
(ii) P(he hits the target at least 9 times) = P(X ≥ 9) = P(X = 9) + P(X = 10) = C(10, 9) (0.8)^9 (0.2)^1 + C(10, 10) (0.8)^10 (0.2)^0 = 0.3758.
(iii) P(at most twice) = P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) ≈ 0.0000779.

The Binomial Random Variable (cont.)
Assignment
(1) Suppose that a certain type of electric bulb has probability 0.3 of functioning for more than 800 hours. Out of 50 bulbs, what is the probability that fewer than 3 will function for more than 800 hours?
(2) A fair die is rolled four times. Find the probability of getting 2 sixes.

The Hypergeometric Random Variable
Suppose a population consists of N items, k of which are successes, and a random sample drawn from that population consists of n items, x of which are successes. Then the hypergeometric probability is
P(X = x; N, n, k) = C(k, x) C(N − k, n − x) / C(N, n), x = 0, 1, 2, …, min(n, k).
Any random variable having its probability density function given by the above is called a hypergeometric random variable and is said to have the hypergeometric distribution.

The Hypergeometric Random Variable (cont.)
Example: Suppose we select 5 cards from an ordinary deck of playing cards. What is the probability of obtaining 2 or fewer hearts?
Solution: This is a hypergeometric experiment since we know the following:
N = 52, since there are 52 cards in a deck.
k = 13, since there are 13 hearts in a deck.
n = 5, since we randomly select 5 cards from the deck.
x = 0 to 2, since our selection includes 0, 1, or 2 hearts.
P(X ≤ 2; 52, 5, 13) = P(X = 0; 52, 5, 13) + P(X = 1; 52, 5, 13) + P(X = 2; 52, 5, 13) = 0.9072.
The probability of randomly selecting at most 2 hearts is 0.9072.
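Both worked examples above can be verified numerically. The sketch below (not part of the original notes; the helper names binom_pmf and hypergeom_pmf are ours) uses only the standard-library function math.comb to evaluate the binomial and hypergeometric formulas.

from math import comb

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) p^x (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def hypergeom_pmf(x, N, n, k):
    # P(X = x; N, n, k) = C(k, x) C(N - k, n - x) / C(N, n)
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Soldier example: n = 10 shots, p = 0.8
print(binom_pmf(1, 10, 0.8))                           # (i)   about 0.000004096
print(binom_pmf(9, 10, 0.8) + binom_pmf(10, 10, 0.8))  # (ii)  about 0.3758
print(sum(binom_pmf(x, 10, 0.8) for x in range(3)))    # (iii) about 0.0000779

# Card example: N = 52 cards, k = 13 hearts, n = 5 cards drawn
print(sum(hypergeom_pmf(x, 52, 5, 13) for x in range(3)))  # about 0.9072

The same helpers can be applied to the assignments above, for example sum(binom_pmf(x, 50, 0.3) for x in range(3)) for the electric bulb question.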
The Hypergeometric Random Variable (cont.)
Assignment
Suppose we randomly select 5 cards without replacement from an ordinary deck of playing cards. What is the probability of getting exactly 2 red cards (i.e., hearts or diamonds)?

The Hypergeometric Random Variable (cont.)
Expectation, Variance and Standard Deviation of a Hypergeometric Random Variable
The probability density function of a hypergeometric variable X is defined by
P(X = x; N, n, k) = C(k, x) C(N − k, n − x) / C(N, n), x = 0, 1, 2, …, min(n, k).
The mean or expected value of X is
E(X) = nk/N.
The variance and standard deviation (SD) of X are
Var(X) = nk(N − k)(N − n) / (N²(N − 1))
SD(X) = √Var(X) = √(nk(N − k)(N − n) / (N²(N − 1))).

The Poisson Random Variable
Definition: Rare Event. An event is said to be rare if the probability p of observing the event is very small.
Consider n repeated Bernoulli trials, where n is very large and p is very small, and let X be the number of successes in n trials. Then
P(X = x) = C(n, x) p^x (1 − p)^(n − x).
Setting λ = np and letting n → ∞ with λ fixed,
lim_{n→∞} C(n, x) p^x (1 − p)^(n − x) = λ^x e^(−λ) / x!.
This gives an approximation to the binomial distribution with λ = np when n is large and p is small, where e = 2.71828… is the base of natural logarithms.

The Poisson Random Variable (cont.)
Expectation, Variance and Standard Deviation of a Poisson Random Variable
The probability density function of a Poisson variable X is defined by
P(X = x) = λ^x e^(−λ) / x!, x = 0, 1, 2, …
The mean or expected value of X is
E(X) = λ.
The variance and standard deviation (SD) of X are
Var(X) = λ
SD(X) = √Var(X) = √λ.

The Poisson Random Variable (cont.)
Example: Suppose a rare disease occurs in 2 percent of a large population. A random sample of 10,000 people is chosen from this population and tested for the disease. Calculate the probability that at least two people have the rare disease.
Solution: For a binomial distribution with parameters n (number of trials) and p (probability of success), if n is large and p is small, it can be approximated by a Poisson distribution with parameter λ = np. Here n = 10,000 and p = 0.02, so λ = 10,000 × 0.02 = 200.
P(X ≥ 2) = 1 − P(X < 2) = 1 − [P(X = 0) + P(X = 1)] = 1 − e^(−λ)(1 + λ), which is effectively 1 when λ = 200.

The Normal Random Variable
A continuous random variable X with mean μ and variance σ² is said to have the normal distribution, written N(μ, σ²). If X is N(μ, σ²), the standardized variable Z = (X − μ)/σ has the standard normal distribution N(0, 1), whose distribution function is denoted Φ(z) and is read from the standard normal table.

The Normal Random Variable (cont.)
Example: Suppose X is N(5, 16). Find (i) P(X < 6), (ii) P(3 < X < 7), (iii) P{|X − 5| > 2}. (A numerical check of this example and of the Poisson example appears at the end of these notes.)
Solution: Recall, z = (X − μ)/σ, with μ = 5 and σ = 4.
(i) P(X < 6) = P(Z < (6 − 5)/4) = P(Z < 0.25) = Φ(0.25) = 0.5987.
(ii) P(3 < X < 7) = P((3 − 5)/4 < Z < (7 − 5)/4) = P(−0.5 < Z < 0.5) = Φ(0.5) − Φ(−0.5) = 0.6915 − 0.3085 = 0.3830.
(iii) P{|X − 5| > 2} = P(X − 5 > 2 or X − 5 < −2) = P(X > 7 or X < 3) = P(X > 7) + P(X < 3) = P(Z > 0.5) + P(Z < −0.5) = [1 − Φ(0.5)] + Φ(−0.5) = 0.3085 + 0.3085 = 0.6170.
Notation: When we write X is N(μ, σ²), we mean that X has a normal probability distribution with mean μ and variance σ².

Exploratory Data Analysis (EDA)
This methodology involves analyzing and visualizing data to understand its underlying structure, patterns, and relationships. EDA techniques such as histograms, scatter plots, and box plots help identify outliers, trends, and potential variables for inclusion in a model.
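The following is a minimal Python sketch (not part of the original notes; the helper names poisson_pmf and phi are ours) that reproduces the Poisson tail probability and the N(5, 16) example numerically, using the standard library's erf function for the standard normal CDF.

from math import exp, erf, sqrt, factorial

def poisson_pmf(x, lam):
    # P(X = x) = lam^x e^(-lam) / x!
    return lam**x * exp(-lam) / factorial(x)

def phi(z):
    # Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + erf(z / sqrt(2)))

# Poisson approximation to the rare-disease example: lambda = np = 200
lam = 10_000 * 0.02
print(1 - poisson_pmf(0, lam) - poisson_pmf(1, lam))  # P(X >= 2), effectively 1

# Normal example: X ~ N(5, 16), so mu = 5 and sigma = 4
mu, sigma = 5, 4
print(phi((6 - mu) / sigma))                                # (i)   P(X < 6)       about 0.5987
print(phi((7 - mu) / sigma) - phi((3 - mu) / sigma))        # (ii)  P(3 < X < 7)   about 0.3829
print((1 - phi((7 - mu) / sigma)) + phi((3 - mu) / sigma))  # (iii) P(|X - 5| > 2) about 0.6171

Standard normal tables round Φ(0.25) to 0.5987 and Φ(0.5) to 0.6915, so small differences in the last decimal place relative to the worked example are expected.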