Summary

This document provides an overview of probability distributions, including observed, binomial, and Poisson distributions, for the course STF1093 Statistics for Biology I. It presents the concepts together with worked examples.

Full Transcript

STF1093 Statistics for Biology I
LU2 Probability Distributions
(1) Observed and Standard Distribution.
(2) Binomial Distribution.
(3) Poisson Distribution.

Introduction

General types of distributions:
– Observed Distribution: derived from data collected.
– Standard Distributions: generated mathematically or theoretically.
– Discrete Distributions: values are whole numbers, isolated along the number line (Binomial Distribution & Poisson Distribution).
– Continuous Distributions: values are continuous numbers along the number line (Normal Distribution & t-Distribution).

Observed Distribution (with reference to LU1)

– Derived from data collected; the distribution is peculiar to that one situation.
– The numbers collected are all measurements of a variable, which varies from one observation to the next.
– To make sense of the data collected, first sort the data into an ordered array, then classify the data using a frequency table.
– Use a frequency histogram for greater visual impact (i.e. descriptive statistics).
– When analytical objectives are required, a probability histogram is developed:

$P(\text{number lies in class } x) = \dfrac{\text{frequency of class } x}{\text{total frequency}}$

– The probability for a range covering several classes can be calculated by adding the probabilities of the individual classes.
– A graph of cumulative frequency is commonly known as an ogive.
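The class-probability rule and the cumulative totals behind an ogive can be sketched in a few lines of Python. This is only a minimal sketch, assuming the class frequencies of the line-length data from Example 1 of this unit; the variable and function names are illustrative and do not come from the original notes.

```python
from itertools import accumulate

# Class frequencies for the line-length data (Example 1 of this unit).
classes = ["0-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69"]
freq = [1, 2, 4, 10, 16, 6, 1]
total = sum(freq)

# P(number lies in class x) = frequency of class x / total frequency
prob = [f / total for f in freq]


def range_probability(first, last):
    """Probability of a run of classes: add the individual class probabilities."""
    i, j = classes.index(first), classes.index(last)
    return sum(prob[i:j + 1])


# Cumulative frequencies for a "less than" ogive (running total from the left)
less_than = list(accumulate(freq))
# Cumulative frequencies for a "more than" ogive (running total from the right)
more_than = list(accumulate(reversed(freq)))[::-1]

for label, p in zip(classes, prob):
    print(f"{label:>6}  P = {p:.3f}")
print("P(length 20-49) =", round(range_probability("20-29", "40-49"), 3))
print("less than:", less_than)
print("more than:", more_than)
```

Dividing each cumulative frequency by the total gives the cumulative probabilities that would be plotted on the corresponding ogive.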
Two kinds of ogive are used: a "less than" ogive and a "more than" ogive.

Example 1: Estimate of length of line (in mm)

Length (mm)   Frequency   %       Probability
0-9           1           2.5     0.025
10-19         2           5       0.05
20-29         4           10      0.1
30-39         10          25      0.25
40-49         16          40      0.4
50-59         6           15      0.15
60-69         1           2.5     0.025
Total         40          100     1.000

a) P(length 20-29) = 0.1
b) P(length 20-49) = 0.1 + 0.25 + 0.4 = 0.75

Example 2: Cumulative "less than"

Length (X)   Cumulative frequency   %       Probability P(length ≤ X)
≤ 9          1                      2.5     0.025
≤ 19         3                      7.5     0.075
≤ 29         7                      17.5    0.175
≤ 39         17                     42.5    0.425
≤ 49         33                     82.5    0.825
≤ 59         39                     97.5    0.975
≤ 69         40                     100.0   1.000

Example 3: Cumulative "more than"

Length (X)   Cumulative frequency   %       Probability P(length ≥ X)
≥ 0          40                     100.0   1.000
≥ 10         39                     97.5    0.975
≥ 20         37                     92.5    0.925
≥ 30         33                     82.5    0.825
≥ 40         23                     57.5    0.575
≥ 50         7                      17.5    0.175
≥ 60         1                      2.5     0.025

Standard Distributions

A standard distribution implies that the situation closely resembles a theoretical situation for which the distribution has been constructed mathematically.

Standard distributions can be divided into:
– Discrete Distributions: values are whole numbers, isolated along the number line (Binomial Distribution & Poisson Distribution).
– Continuous Distributions: values are continuous numbers along the number line (Normal Distribution & t-Distribution).

One of the earliest standard distributions to be developed was the binomial distribution. The most used and best known standard distribution is probably the normal distribution.

Binomial Distribution

– Constructed mathematically.
– The distribution is discrete and the values of the variable are distinct, giving rise to stepped shapes.
– The shape can vary from right-skewed, through symmetrical, to left-skewed, depending upon the situation.
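How the shape depends on the situation can be seen numerically. The sketch below is an illustration added here, not part of the original notes: it assumes the standard binomial formula P(r) = nCr p^r (1 - p)^(n - r), takes n = 6, and uses three values of p (0.2, 0.5, 0.8) chosen purely for illustration. The helper name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r successes in n independent trials), each with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n = 6  # illustrative number of trials (an assumption)
for p in (0.2, 0.5, 0.8):  # illustrative proportions (assumptions)
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    row = "  ".join(f"{q:.4f}" for q in probs)
    print(f"p = {p}: {row}")
```

With p = 0.2 the larger probabilities sit at small r (right-skewed), with p = 0.5 the values are symmetrical about r = 3, and with p = 0.8 the p = 0.2 pattern is mirrored (left-skewed). The p = 0.2 row starts with 0.2621, 0.3932 and 0.2458, the values worked out by hand in Example 4.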
[Figure: binomial distribution shapes: right skewed, symmetric, left skewed]

A binomial situation can be recognized by the following characteristics:
1) The existence of an independent trial (of an experiment), whose outcome is defined in terms of:
   Type 1 elements (success), with probability p (p being the proportion of Type 1 elements in the population)
   Type 2 elements (failure), with probability (1 - p) (in some books, denoted as q)
2) Identical trials are repeated a number of times (n), yielding a number of successes (r).

Actual situations when the binomial is used:
– Inspection schemes, e.g. defective / non-defective, tree with / without flowers
– Opinion polls, e.g. agreement / disagreement
– Selling, e.g. a sale / no sale

Binomial probabilities are calculated as:

$P(r \text{ of Type 1 in sample}) = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$

where
n = sample size
p = proportion of Type 1 in population
(1 - p) = proportion of Type 2 in population
${}^{n}C_{r}$ = mathematical abbreviation for a 'combination' = $\dfrac{n!}{r!\,(n - r)!}$
n! = n x (n - 1) x (n - 2) x ... x 2 x 1

Example 4: From past records, the probability that a machine will need correcting adjustments during a day's production run is 0.2. If there are 6 of these machines running on a particular day, find the probability that:
a) no machines need correcting;
b) just one machine needs correcting;
c) exactly two machines need correcting;
d) more than two machines need correcting.

Answer

First, it is necessary to identify the binomial situation precisely.
Trial = identifying a particular machine;
Trial success = the machine needs a correcting adjustment;
Number of trials = 6 (the number of machines in use).
Thus, n = 6 and we are given that p = P(success) = P(a machine needs correcting) = 0.2.

The binomial probability formula is $P(r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$.

a) Putting r = 0 (together with n = 6 and p = 0.2) in the above formula gives:
P(0) = P(no machines need correcting) = ${}^{6}C_{0}(0.2)^{0}(1 - 0.2)^{6} = (0.8)^{6} = 0.2621$

b) Putting r = 1 in the formula gives:
P(1) = P(one machine needs correcting) = ${}^{6}C_{1}(0.2)^{1}(1 - 0.2)^{5} = 6(0.2)(0.8)^{5} = 0.3932$

c) Putting r = 2 in the formula gives:
P(2) = P(two machines need correcting) = ${}^{6}C_{2}(0.2)^{2}(1 - 0.2)^{4} = 15(0.2)^{2}(0.8)^{4} = 0.2458$

d) P(more than 2 machines need correcting)
= 1 - P(2 or fewer machines need correcting)
= 1 - P(0 or 1 or 2 machines need correcting)
= 1 - [P(0) + P(1) + P(2)]
= 1 - [0.2621 + 0.3932 + 0.2458], from (a), (b) and (c)
= 1 - 0.9011
= 0.0989

Finding probabilities of the Binomial Distribution using the Binomial Table
Table A.1, Binomial Probabilities (4 pages)
Find the same answers to the above Example question, using the Binomial Table.

Cjlaman & I-Jusoh

Note:
Two parameters affect the skewness of the distribution: the sample size n and the population proportion of elements of the first type, p.
– Distributions with different parameters, while displaying the same broad characteristics, will be different.
– Distributions with the same n and p will be identical.
– Right-skewed shapes occur when p is small (close to 0).
– Left-skewed shapes occur when p is large (close to 1).
– Symmetrical shapes occur when p is 0.5 or when n is large.

Poisson Distribution

– A discrete distribution.
– It describes the occurrence of "isolated events within a continuum", e.g. road accidents over a given period of time (the road accidents are the isolated events and time is the continuum).
– Similar to the binomial but with an infinite sample size.
– The sample population has two elements: the occurrence of events (e.g. phone calls received) and the non-occurrence of events (e.g. non-arrival of calls).
– The Poisson distribution is derived from the binomial, using the binomial formula for probabilities:

$P(r \text{ of Type 1}) = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$

– Its shape varies from right skewed to almost symmetrical.

The Poisson distribution formula is given as:

$P(r \text{ events}) = \dfrac{e^{-\lambda} \lambda^{r}}{r!}$

where
λ (lambda) = the average number of events per sample (and the parameter of the distribution)
e = a constant approximately equal to 2.718
r = the variable number of events

Example 5: The safety of a dangerous intersection was being investigated. Past police records indicate a mean of three accidents per month at this intersection. The number of accidents is distributed according to a Poisson distribution.
(a) The Highway Safety Division wants us to calculate the probability, in any month, of exactly 0, 1, 2, 3, 4 and 5 accidents, and of more than 5 accidents.
(b) Action would be taken if the probability of more than three accidents per month exceeds 0.30. Should we act?

$P(r \text{ events}) = \dfrac{e^{-\lambda} \lambda^{r}}{r!}$, with r = 0, 1, 2, 3, 4, 5; λ = 3.0; e = a constant, 2.718.

(a) The probability of no accident:
$P(0) = \dfrac{e^{-3} \cdot 3^{0}}{0!} = \dfrac{2.718^{-3} \cdot 1}{1} = 0.0498$
$P(1) = \dfrac{e^{-3} \cdot 3^{1}}{1!} = \dfrac{2.718^{-3} \cdot 3}{1} = 0.1494$

Hence,
P(0) = 0.0498
P(1) = 0.1494
P(2) = 0.2240
P(3) = 0.2240
P(4) = 0.1680
P(5) = 0.1008
P(6 or more) = 0.0504 + 0.0216 + 0.0081 + 0.0027 + ...
= 1 - (0.0498 + 0.1494 + 0.2240 + 0.2240 + 0.1680 + 0.1008)
= 1 - 0.916
= 0.084
Total probability = 1.0

(b) P(more than 3 accidents) = 1.0 - (0.0498 + 0.1494 + 0.2240 + 0.2240) = 1.0 - 0.6472 = 0.3528
This exceeds the limit given (0.30), therefore action should be taken to improve the safety of the intersection.

Finding probabilities of the Poisson Distribution using the Poisson Table
Table A.13, Poisson Probabilities (1 page)
Find the same answers to the above Example question, using the Poisson Table.

Cjlaman & I-Jusoh 10/17/2021

Poisson Distribution as an Approximation of the Binomial Distribution

The approximation is used for convenience, because the binomial tables and probability formula are lengthy and tedious to use.
The rule most often used by statisticians is that the Poisson is a good approximation of the binomial when n is greater than or equal to 20 and p is less than or equal to 0.05. (λ is defined as being equal to the mean of the binomial, i.e. λ = n.p)

Example 6: Given a binomial distribution with n = 28 trials and p = 0.025, use the Poisson approximation to the binomial to find:
(a) P(r ≥ 3)
(b) P(r = 6)
(c) P(r = 9)

λ = n.p = 28 x 0.025 = 0.70

(a) From table, with λ = 0.70:
P(r ≥ 3) = 1.0 - (0.4966 + 0.3476 + 0.1217) = 1.0 - 0.9659 = 0.0341

Check using the formula (the result may differ slightly from the table value, but agreement to 3 decimal places is acceptable):
$P(r \ge 3) = 1 - P(r \le 2) = 1 - \left[\dfrac{0.7^{0} e^{-0.7}}{0!} + \dfrac{0.7^{1} e^{-0.7}}{1!} + \dfrac{0.7^{2} e^{-0.7}}{2!}\right] = 1 - 0.9659 = 0.0341$

(b) From table, with λ = 0.7: P(r = 6) = 0.0001

(c) From table, with λ = 0.7: P(r = 9) = 0.0000
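The Poisson results in Examples 5 and 6 can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `poisson_pmf` and `binomial_pmf` are illustrative, and the comparison with the exact binomial probability is an addition for illustration rather than part of the original examples. For Example 6(a) it follows the worked solution, which evaluates 1 - P(r ≤ 2).

```python
from math import comb, exp, factorial


def poisson_pmf(r, lam):
    """P(r events) = e^(-lam) * lam^r / r!"""
    return exp(-lam) * lam**r / factorial(r)


def binomial_pmf(r, n, p):
    """P(r successes in n trials) = nCr * p^r * (1 - p)^(n - r)"""
    return comb(n, r) * p**r * (1 - p)**(n - r)


# Example 5: mean of three accidents per month.
lam5 = 3.0
p_more_than_3 = 1 - sum(poisson_pmf(r, lam5) for r in range(4))
print(f"Example 5(b): P(more than 3 accidents) = {p_more_than_3:.4f}")
print("Act" if p_more_than_3 > 0.30 else "Do not act")

# Example 6: n = 28 and p = 0.025 satisfy the rule n >= 20 and p <= 0.05.
n, p = 28, 0.025
lam6 = n * p  # the Poisson parameter is the binomial mean
approx = 1 - sum(poisson_pmf(r, lam6) for r in range(3))
exact = 1 - sum(binomial_pmf(r, n, p) for r in range(3))
print(f"Example 6(a): Poisson approximation = {approx:.4f}")
print(f"Example 6(a): exact binomial        = {exact:.4f}")
```

The two values for Example 6(a) agree to within a few thousandths, which is the sense in which the Poisson is described as a good approximation under the n ≥ 20, p ≤ 0.05 rule.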
