Sampling of Population & Principles of Probability PDF


Document Details

Uploaded by EloquentHamster

Dr. Areej Al-Ali

Tags

probability sampling statistics mathematics

Summary

This document is a lecture presentation on sampling of populations and principles of probability. It includes definitions, examples, and types of sampling. It also covers topics such as simple random sampling, systematic random sampling, stratified random sampling, and clustered random sampling, along with advantages and disadvantages of each method.

Full Transcript

Sampling of Population & Principles of Probability
Session 4
Dr. Areej Al-Ali
02/03/2024

This week's office hours
Monday 4/3/2024: 9 am – 2 pm
Wednesday 6/3/2024: 12 pm – 2 pm

Sampling of Populations

Use observed sample characteristics to estimate population characteristics.

[Figure: sampling frame]

A census is a count of a population in which every member of the population is studied. For many purposes this is impractical. Why?
1. The population is huge.
2. The process is time-consuming.
3. The process is costly.
4. Sometimes it simply cannot be done.

For these reasons, we resort to selecting a sample that is a "good representation" of the population, perform the analysis on the sample, and extrapolate the results to the entire population. We want the sample to be representative, i.e., to have the same characteristics as the population.

To achieve representativeness, we must not allow the subjects' own characteristics to influence whether they are included in the sample. We let chance select the sample, giving each subject the same probability of being selected.

Question: How precise are the results obtained using the sample? This depends on:
(1) The sampling scheme: a detailed description of what data will be obtained and how this will be done.
(2) The sample size: the larger the sample, the more precise the results.

Types of Sampling of Populations

Probability sampling includes:
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Clustered random sampling

Non-probability sampling includes:
- Convenience sampling
- Quota sampling

Probability Sampling: Simple Random Sample (SRS)

A simple random sample is a group of subjects taken from a larger population so that each member of the population has an equal chance of being selected.
The characteristics of the subject do not influence the chance of being chosen.

How can an SRS be chosen? Using tables of random numbers, computer-generated random numbers, or the draw method.

Advantages:
- Estimates are easy to calculate; their distribution is known, so inference is possible.
- Ensures high representativeness if all subjects participate.

Disadvantages:
- Not possible without a complete list of population members.
- Minority subgroups in the population may not be sufficiently represented in the sample.

Example 1: Questionnaires were sent to a random sample of 200 GPs in Kuwait, asking them about their diagnosis and treatment of hypertension. Why was a random sample used here?
A random sample is used so that the sample will be representative of the population, and we can use the sample to tell us about the population. Here, the replies of the sample provide information about the views of all GPs in Kuwait, without the expense of asking all of them.

Example 2: It has been suggested that reducing serum cholesterol may increase the risk of death from accident or violence. This was investigated using a random sample of people born between 1913 and 1947 and living in North Karelia, Finland. What population does the random sample in this study represent?
The sample was drawn from all people born between 1913 and 1947 and living in North Karelia, Finland. Since it is a random sample, it is representative of these individuals. The results of the study may or may not generalize to people living in the same area but born at different times, and similarly may or may not generalize to individuals from other places.
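
The computer-generated route to a simple random sample mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 1,000 ID numbers and the seed are hypothetical, and the sample size of 200 simply mirrors Example 1.

```python
import random

# Hypothetical sampling frame: ID numbers 1..1000 standing in for a complete
# list of population members (the frame size is an assumption for illustration).
frame = list(range(1, 1001))

# Fixed seed (arbitrary) so the same draw can be reproduced.
rng = random.Random(2024)

# random.sample draws without replacement, so no member is selected twice and
# every member of the frame has the same chance of being in the sample.
srs = rng.sample(frame, k=200)

print(len(srs), len(set(srs)))  # sample size and number of distinct members
```

Note that the draw needs the complete frame as input, which is exactly the first disadvantage listed for SRS.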

Example 3: Prima Magazine, a magazine for women in the UK, printed a survey on family life in its January 1999 issue. A random selection of the questionnaires was analysed, and the results were presented in the March issue. What biases might there be in this sample of the women of the UK?
The magazine may have a broad readership, but its readers are not a representative sample of women in the UK. For example, the sample might exclude the very poor and the illiterate.

Probability Sampling: Systematic Random Sample

Systematic sampling is a type of probability sampling in which sample members from a larger population are selected according to a random starting point but with a fixed, periodic interval (k).

Advantages:
- Easier to perform in the field and less subject to selection bias by the field worker, especially if a good frame is not available.
- Can provide more information than SRS because it spreads more uniformly over the entire population.

Disadvantages:
- The sample may be biased if a hidden periodicity in the population coincides with that of the selection process.
- It is difficult to assess the precision of the estimate from one survey.
- The choice of k is not free, especially for a small finite population.

Probability Sampling: Stratified Random Sample

A stratified random sample is obtained by separating the population elements into non-overlapping groups (strata) and then selecting a simple random sample from each stratum.

Example: Suppose we are interested in estimating the proportion of people who support stem cell research in Kuwait. We may be interested not only in an overall estimate of the proportion, but also in an estimate for each of the strata: Kuwaiti, non-Kuwaiti Arab, and others.
A simple random sample can be selected from each of the three strata.

Advantages:
1. May produce a smaller bound on the error of estimation than SRS, especially if the measurements within strata are homogeneous.
2. Stratification of the population into convenient groups may reduce the cost per observation.
3. Estimates of population parameters for the subgroups (strata) may be desired. Stratification ensures that specific groups are represented.

Disadvantages:
- More complex and requires greater effort than SRS.
- Strata must be carefully defined.

Probability Sampling: Clustered Random Sample

A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements. City blocks, zip codes, and schools are examples of clusters.

Drawing a cluster sample: Once appropriate clusters have been identified in the population, a frame that lists all clusters is formed. An SRS of clusters is then selected from the frame.

Example 1: A sociologist wants to estimate the per capita income in a city. The clusters (city blocks) can be marked on a map and numbered, and then, say, 20 numbers (blocks) are selected at random. All households in the selected blocks are surveyed.

Example 2: You are interested in the average reading level of all the seventh-graders in Kuwait City. It would be very difficult to obtain a list of all seventh-graders and collect data from a random sample spread across the city. Instead, you obtain a list of all schools and collect data from a subset of these. You thus decide to use the cluster sampling method.

Advantages: cluster sampling is an effective design for obtaining information at minimum cost if:
(1) A frame listing population elements is either not available or costly to obtain, but a frame listing clusters is easily obtained.
(2) The cost of obtaining observations increases as the distance separating the elements increases.

Disadvantages: an expert is needed to balance selecting many small clusters against selecting a few large clusters. The former controls variability; the latter is more economical.

Non-Probability Sampling: Convenience Sampling

Subjects are selected because of their convenient accessibility and proximity to the researcher (family, friends, fellow students).
Advantages: convenient, inexpensive.
Disadvantages: highly unrepresentative; the degree of generalizability is questionable.

Non-Probability Sampling: Quota Sampling

Sampling in which people are selected according to a certain fixed quota. For example, 10% of MD admissions at KU are for non-Kuwaitis. For 200 admissions, 20 are for non-Kuwaitis and 180 for Kuwaitis.
Advantages: ensures some degree of representativeness of strata.
Disadvantages: the degree of generalizability is questionable. The quota frame in each category must be accurate.

Principles of Probability

Basic Terminology

Sample space: the set of all possible outcomes.
Event: any set of outcomes of interest.
Experiment (trial): a procedure that can be repeated many times and has a set of possible outcomes.
Random experiment: an experiment is said to be random if it has more than one possible outcome, and deterministic if it has only one.

Examples of experiment, sample space and events

Example: Toss a fair coin (experiment). Sample space Ω = {H, T}. Event A: the coin shows H.
Example: Roll a fair die (experiment). Sample space Ω = {1, 2, 3, 4, 5, 6}. Event A: obtain an even number, {2, 4, 6}.
Example: Gender of a newborn, single birth (experiment). Sample space Ω = {Boy, Girl}. Event A: have a girl, {Girl}.
Example: Toss two fair coins, Ω = {(H,H), (H,T), (T,H), (T,T)}.
Event A = at least one head = {(H,H), (H,T), (T,H)}. Event B = two heads = {(H,H)}.
Example: The four human blood types are genetic phenotypes, Ω = {O, A, B, AB}. F = people with blood type O, G = people with blood type B.
Example: Gender of two newborns (single pregnancy). Ω = {(B,B), (B,G), (G,B), (G,G)}. Event A is to have at least one girl, so A = {(B,G), (G,B), (G,G)}.

Some definitions

Union of two events A and B: the set combining the elements of A and B, without repeating any element.
Intersection of two events A and B: the set of elements common to both A and B.
Mutually exclusive: events A and B are mutually exclusive if they have NO elements in common. Such events are also called "disjoint".
Empty set (null set) (φ): the set with no elements.
Subset: A is a subset of B if every element of A is also in B, written A ⊂ B.
Complement: the complement of A is the set of all elements of Ω that are not in A; together with A it makes up the whole space Ω.

[Figure: Venn diagram]

Examples

Roll a fair die; the set of all possible outcomes is the whole space Ω. Consider the event A = occurrence of an odd number and the event B = occurrence of a number greater than 3. Then Ω = {1, 2, 3, 4, 5, 6}, A = {1, 3, 5}, B = {4, 5, 6}. The complement of A is A^c = {2, 4, 6}. A ∩ B = {5}, A ∪ B = {1, 3, 4, 5, 6}.
Question: Are the two events A and B disjoint?

Definition of probability

The probability of an event is the likelihood that the event occurs, expressed as a relative frequency. If A is an event from Ω, the probability of A is
P(A) = (number of elements in A) / (number of elements in Ω) = #A / #Ω
This definition requires equally likely outcomes.

Some useful probabilistic notation

The symbol { } is used as shorthand for the phrase "the event."
A ∪ B is the event that either A or B (or both) occurs.
A ∩ B is the event that both A and B occur simultaneously.
Ā (also written A^c) is the event that A does not occur. It is called the complement of A.
Notice that P(Ā) = 1 – P(A), because Ā occurs only when A does not occur.

Example

Toss a fair coin, Ω = {H, T}. Let A = {H}, B = {T}.
- A ∪ B = {H, T}, A ∩ B = φ, A^c = {T}.
- P(A) = 1/2, P(B) = 1/2
- P(A ∩ B) = P(φ) = 0/2 = 0
- P(A ∪ B) = 2/2 = 1
- Since P(A) + P(A^c) = 1, P(A^c) = 1 - P(A) = 1/2
- P(φ) = 0 (null event), P(Ω) = 1 (sure event)
- Hence 0 ≤ P(A) ≤ 1

Definition of probability

Probability of a male livebirth during the period 1965 - 1974

Time period    Number of male livebirths (A)    Total number of livebirths (B)    Probability of a male livebirth (A/B)
1965           1,927,054                        3,760,358                         0.51247
1965 - 1969    9,219,202                        17,989,361                        0.51248
1965 - 1974    17,857,857                       34,832,051                        0.51268

- The probability of a male livebirth was 0.51247 in 1965, 0.51248 in 1965 - 1969, and 0.51268 in 1965 - 1974.

Addition Law of Probability: (1) Mutually exclusive events

If A and B are two events that are mutually exclusive (disjoint, i.e., they have no common elements), then
P(A OR B occurs) = P(A) + P(B)
Similarly, if three events A, B, and C are mutually exclusive, then
P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
If, in addition, A, B, and C together cover all possible outcomes, their probabilities sum to 1:
P(A) + P(B) + P(C) = 1

Example: Consider a single birth in the maternity section of a hospital. What is the probability of having either a boy OR a girl?
Ω = {b, g}. A = {b}, B = {g}; A and B are disjoint.
P(A ∪ B) = P(A) + P(B) = 1/2 + 1/2 = 1

Example: The four human blood types below are genetic phenotypes that are mutually exclusive. Of 5400 individuals examined, the following frequency of each type is observed. Assume that each of the 5400 has an equal chance of being encountered.

Blood type    Frequency
O             2672
A             2041
B             486
AB            201

(1) Find the relative frequency of each blood type.
(2) Estimate the probability of encountering a person with blood type A.
(3) Estimate the probability of encountering a person who has either blood type A OR type B.
(4) Estimate the probability of encountering a person who has either blood type A OR type B OR type AB.

Example: Let A be the event that a person has normotensive diastolic blood pressure (DBP) readings (DBP < 90), and let B be the event that a person has borderline DBP readings (90 ≤ DBP < 95). Suppose that P(A) = 0.7 and P(B) = 0.1. Let C be the event that a person has DBP < 95. Calculate P(C).

Addition Law of Probability: (2) Not mutually exclusive events

If A and B are two events that are not mutually exclusive (they intersect, i.e., they have common elements), then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
Similarly, for three events A, B, and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(A ∩ C) – P(B ∩ C) + P(A ∩ B ∩ C)

Example: Roll a fair die. Let A be the event that an even number occurred and B the event that a number less than 5 occurred. So A = {2, 4, 6} and B = {1, 2, 3, 4}.
- Clearly A and B are NOT mutually exclusive, because A ∩ B = {2, 4}. Therefore
- P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 3/6 + 4/6 – 2/6 = 5/6

Addition Law of Probability: summary

Mutually exclusive events (disjoint, no common elements): if A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B)
Not mutually exclusive events (some intersection, common elements): if A and B are any events, then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Independence of Probability Events

We say that two events A and B are independent if and only if the occurrence of one event does not affect the occurrence of the other. That is, knowing that A has happened does not change the probability that B happens.
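
The two forms of the addition law can be checked by direct counting. The sketch below is a minimal illustration using only the Python standard library; the helper name `prob` is an assumption introduced here, and the die events and blood-type counts are the ones from the examples above.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (all outcomes equally likely).
omega = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Classical probability #event / #omega (illustrative helper)."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}      # an even number occurred
B = {1, 2, 3, 4}   # a number less than 5 occurred

# Not mutually exclusive: subtract the overlap so {2, 4} is not counted twice.
p_union_direct = prob(A | B)
p_union_law = prob(A) + prob(B) - prob(A & B)
print(p_union_direct, p_union_law)  # 5/6 5/6

# Mutually exclusive case: blood-type frequencies out of 5400 individuals.
counts = {"O": 2672, "A": 2041, "B": 486, "AB": 201}
total = sum(counts.values())
rel_freq = {blood_type: Fraction(n, total) for blood_type, n in counts.items()}

# A person has exactly one blood type, so the probabilities simply add.
p_a_or_b = rel_freq["A"] + rel_freq["B"]
print(float(rel_freq["A"]), float(p_a_or_b))
```

Exact fractions are used so that the two sides of the addition law can be compared without rounding error.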

Multiplication Law of Probability

Independent events: if A and B are independent events, then
P(A ∩ B) = P(A) x P(B)
Dependent events: if A and B are dependent events, then
P(A ∩ B) ≠ P(A) x P(B)
Note: ∩ means "and", while ∪ means "or".

Multiplication Law of Probability: Independent events

Consider two single births in the maternity section of a hospital. If the first birth was a baby boy, does it mean the second birth needs to be a baby girl? Or a baby boy? NO, such events are independent. In such a case of independence, we have P(A ∩ B) = P(A) x P(B).

In the previous example, what is the probability of getting two baby girls?
Let A: the first baby is a girl, P(A) = 1/2. Let B: the second baby is a girl, P(B) = 1/2. A and B are independent, thus
P(A ∩ B) = P(A) x P(B) = 1/2 x 1/2 = 1/4 = 0.25

Recall that Ω = {(b,b), (b,g), (g,b), (g,g)} lists all the possibilities for two births. Event A = two baby girls = {(g,g)}, thus P(A) = #A / #Ω = 1/4.

Exercise: If you toss three fair coins, what is the probability of getting three heads in the three tosses, i.e., P(H ∩ H ∩ H)?
Always remember that under independence we have P(A ∩ B ∩ C) = P(A) x P(B) x P(C).

Suppose two doctors, A and B, test all patients coming into a clinic for syphilis. Let events A+ = {doctor A makes a positive diagnosis} and B+ = {doctor B makes a positive diagnosis}. Suppose doctor A diagnoses 10% of all patients as positive, doctor B diagnoses 17% of all patients as positive, and both doctors diagnose 8% of all patients as positive. Are the events A+ and B+ independent? Show your work.

Conditional Probability

Suppose that one is interested in calculating the probability of an event "A" given that, or conditioned on, the occurrence of another event, say "B". We write P(A|B).
P(A|B) = P(A ∩ B) / P(B)
P(B|A) = P(B ∩ A) / P(A)

Consider two single births in the maternity section of a hospital. If the first birth was a boy (A), what is the probability that the second is also a boy (B), i.e., P(B|A)?
Solution: recall Ω = {(b,b), (b,g), (g,b), (g,g)}; then P(B|A) = P(B ∩ A) / P(A).
(1) P(B ∩ A) = 1/4 = 0.25
(2) P(A) = 2/4 = 0.5
Thus P(B|A) = 0.25/0.50 = 0.50

Consider the following data showing the disease status (CHD: yes or no) and test results (ECG abnormality: + or -).

                       Self-reported, doctor-diagnosed CHD
ECG abnormality        YES        NO         TOTAL
+                      44         131        175
-                      574        8470       9044
TOTAL                  618        8601       9219

P(CHD=YES) = 618/9219 = 0.07
P(CHD=NO) = 8601/9219 = 0.93
Note that P(CHD=YES) + P(CHD=NO) = 1.
Similarly,
P(ECG=+) = 175/9219 = 0.02
P(ECG=-) = 9044/9219 = 0.98
Notice again that P(ECG=+) + P(ECG=-) = 1.
The joint probability is P(CHD=+ ∩ ECG=+) = 44/9219 = 0.005.

Question (continuing the previous example): Given that CHD was diagnosed, what is the probability that the ECG was abnormal? Find P(ECG=+|CHD=+).
P(ECG=+|CHD=+) = P(ECG=+ ∩ CHD=+) / P(CHD=+) = 0.005/0.07 = 0.07, or 7% (equivalently, directly from the counts, 44/618 ≈ 0.07).
This number is called the sensitivity of the test.
Can you calculate P(ECG=-|CHD=-)? This number is called the specificity of the test.

Conditional Probability: Try the following

Suppose two doctors, A and B, test all patients coming into a clinic for syphilis. Let events A = {doctor A makes a positive diagnosis} and B = {doctor B makes a positive diagnosis}. Suppose doctor A diagnoses 10% of all patients as positive, doctor B diagnoses 17% of all patients as positive, and both doctors diagnose 8% of all patients as positive. Find the conditional probability that doctor B makes a positive diagnosis of syphilis given that doctor A makes a positive diagnosis.

Thank you
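
As a supplementary sketch, the conditional-probability formula P(A|B) = P(A ∩ B) / P(B) can be applied to the ECG / CHD table directly from its cell counts. The variable names and the helper `conditional` are assumptions made for this illustration; only the four counts come from the table.

```python
# Cell counts from the ECG / CHD table.
ecg_pos_chd_yes = 44
ecg_neg_chd_yes = 574
ecg_pos_chd_no = 131
ecg_neg_chd_no = 8470

n = ecg_pos_chd_yes + ecg_neg_chd_yes + ecg_pos_chd_no + ecg_neg_chd_no

def conditional(p_joint, p_given):
    """P(A|B) = P(A and B) / P(B); illustrative helper, assumes p_given > 0."""
    return p_joint / p_given

# Marginal probabilities of disease status.
p_chd_yes = (ecg_pos_chd_yes + ecg_neg_chd_yes) / n
p_chd_no = (ecg_pos_chd_no + ecg_neg_chd_no) / n

# Sensitivity: P(ECG=+ | CHD=+).
sensitivity = conditional(ecg_pos_chd_yes / n, p_chd_yes)

# Specificity: P(ECG=- | CHD=-).
specificity = conditional(ecg_neg_chd_no / n, p_chd_no)

print(round(sensitivity, 3), round(specificity, 3))  # 0.071 0.985
```

Working from the raw counts avoids the small rounding introduced by dividing the already rounded values 0.005 and 0.07.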
