M2B: Basic Concepts of Sampling Design PDF

Document Details

Uploaded by FuturisticBernoulli

Polytechnic University of the Philippines

Rumel Angelo T. Alfaro

Tags

sampling design, statistical analysis, sampling techniques, software application

Summary

This presentation covers the basic concepts of sampling design, including definitions, terminologies, sample size determination, and various sampling techniques. It's geared towards undergraduate-level students learning about statistical analysis and software application.

Full Transcript

STAT 203: STATISTICAL ANALYSIS with SOFTWARE APPLICATION
M2B: BASIC CONCEPTS OF SAMPLING DESIGN
RUMEL ANGELO T. ALFARO
PUP

COURSE OUTCOME & TOPICS

Where we’re going…
- Define sampling and the different types of samples.
- Develop knowledge of the terminologies used in sampling.
- Calculate sample size.
- Understand the different sampling techniques.

Topics
- Definition of Sampling (2.5)
- Terminologies in Sampling Design (2.6)
- Sample Size Determination (2.7)
- Sampling Techniques (2.8)

DEFINITION OF SAMPLING

Sampling is the basis for inferential statistics. A sample is a segment of a population and is therefore expected to reflect the population. By studying the characteristics of the sample, one can make inferences about the population.

There are several reasons why we study a part of the population rather than taking a full census:
- Sampling takes less time.
- Samples cost less.
- Samples can be more accurate. Sample observations are usually of higher quality because they are better screened for errors in measurement and for duplication and misclassification.
- Samples can be destroyed to gain information about quality (destructive sampling).

A random sample is a sample obtained in such a way that each possible sample of fixed size n has an equal probability of being selected.

[Diagram: a sample (n) drawn from the target population (N)]

Effective sampling produces an n that is representative of N.

Note: n is only ever representative of the N it was drawn from, i.e., not necessarily the general population.

REMINDERS IN SAMPLING

1. Representativeness, not size, is the more important consideration.
2. Use no fewer than 30 subjects, if possible.
3.
If you use complex statistics, you may need a minimum of 100 or more in your sample (varies with method).

TERMINOLOGIES IN SAMPLING

Observation Unit - An object on which a measurement is taken. This is the basic unit of observation, sometimes called an element. In studying human populations, observation units are often individuals.

Target Population - The complete collection of observations we want to study.

Sampled Population - The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.

Sample - A subset of a population.

Sampling Unit - A unit that can be selected for a sample. We may want to study individuals but not have a list of all individuals in the target population. Instead, households serve as the sampling units, and the observation units are the individuals living in the households.

Sampling Frame - A list, map, or other specification of sampling units in the population from which a sample may be selected. For a survey using in-person interviews, the sampling frame might be a list of all street addresses.

Sampling Technique/Sampling Strategy - A plan you set forth to be sure that the sample you use in your research study represents the population from which you drew it.

DECISION CRITERIA IN SAMPLE SIZE

The sample size is typically denoted by n, and it is always a positive integer. Three criteria need to be specified to determine the appropriate sample size:

1. Level of Precision - Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.
2.
Confidence Level - A statistical measure of the number of times out of 100 that results can be expected to fall within a specified range. For example, a confidence level of 90% means that about 90 out of 100 samples drawn in the same way would give results within that range. To find the right z-score to use, refer to the table:

   Confidence Level    Z-score
   90%                 1.645
   95%                 1.96
   98%                 2.33
   99%                 2.575

3. Degree of Variability - Depending on the target population and the attributes under consideration, the degree of variability varies considerably. The more heterogeneous a population is, the larger the sample size required to attain a given level of precision.

SAMPLING AND NON-SAMPLING ERROR

Sampling error is the difference between the value of a sample statistic and the value of the corresponding population parameter. In the case of the mean,

$\text{Sampling Error} = \bar{x} - \mu$

assuming that the sample is random and no non-sampling error has been made. Sampling error occurs only in a sample survey, not in a census. It is important to remember that a sampling error occurs because of chance.

The errors that occur in the collection, recording, and tabulation of data are called non-sampling errors. Non-sampling errors can occur both in a sample survey and in a census.

$\text{Non-Sampling Error} = \text{Incorrect } \bar{x} - \text{Correct } \bar{x}$

SAMPLE SIZE DETERMINATION

1. Slovin’s Formula
2. Mean or Average Estimation
3. Proportion (Infinite Population)
4. Finite Population Correction
5. Proportion (Conservative Approach)

Slovin’s Formula

Slovin’s Formula is considered the most commonly used formula for determining the sample size, that is, how large a sample is needed to represent the whole population.
Its formula is as follows:

$n \ge \dfrac{N}{1 + Ne^2}$

where
n is the minimum sample size,
N is the population size, and
e is the margin of error (usually set at 0.05 or 5%).

NOTE: This formula shall only be used for probability sampling techniques.

Illustrative Example
If the population size is 1,320, what is the minimum sample size required if we assume a 5% margin of error?

SOLUTION: Using Slovin’s Formula,

$n \ge \dfrac{N}{1 + Ne^2} = \dfrac{1320}{1 + 1320(0.05)^2} = \dfrac{1320}{4.3} \approx 306.98$

so the minimum sample size is n = 307.

Estimating the Mean or Average

The sample size required to estimate the population mean μ at a given level of confidence with a specified margin of error e is given by

$n \ge \left(\dfrac{Z\sigma}{e}\right)^2$

where
Z is the z-score corresponding to the level of confidence,
σ is the population standard deviation, and
e is the level of precision.

Take Note: When σ is unknown, it is common practice to conduct a preliminary survey to determine s and use it as an estimate of σ, or to use results from previous studies to obtain an estimate of σ. When using this approach, the size of the sample should be at least 30.

Illustrative Example
A soft drink machine is regulated so that the amount of drink dispensed is approximately normally distributed with a standard deviation equal to 0.5 ounce. Determine the sample size needed if we wish to be 95% confident that our sample mean will be within 0.03 ounce of the true mean.

SOLUTION: The z-score for 95% confidence in the z-table is 1.96.

$n \ge \left(\dfrac{(1.96)(0.5)}{0.03}\right)^2 \approx 1{,}067.11$

so n = 1,068. (The rule is: always round up!)

Proportion (Infinite Population)

This is also known as Cochran’s sample size formula.
To find the sample size needed to determine a confidence interval about a proportion, use this formula:

$n \ge \hat{p}\hat{q}\left(\dfrac{Z}{e}\right)^2$

where $\hat{p} = \dfrac{x}{n}$ and $\hat{q} = 1 - \hat{p}$. Here x is the number of persons or objects possessing a certain characteristic of interest in a prior sample, and n is the size of that sample.

Finite Population Correction Factor

If the population is small, then the sample size can be reduced slightly.

$n \ge \dfrac{n_0}{1 + \dfrac{n_0 - 1}{N}}$

where $n_0 = \hat{p}\hat{q}\left(\dfrac{Z}{e}\right)^2$ is Cochran’s sample size recommendation and N is the population size.

Illustrative Examples

1. A researcher wishes to estimate, with 95% confidence, the proportion of people who own a home computer. A previous study shows that 40% of those interviewed had a computer at home. The researcher wishes to be accurate within 2% of the true proportion. Find the minimum sample size necessary.

SOLUTION:

$n \ge \hat{p}\hat{q}\left(\dfrac{Z}{e}\right)^2 = (0.40)(0.60)\left(\dfrac{1.96}{0.02}\right)^2 = 2{,}304.96$

so n = 2,305.

2. Using the same case from the previous problem, solve for the adjusted sample size using the Finite Population Correction Factor for a population of 1,000.

SOLUTION: Since $n_0 = 2{,}305$,

$n \ge \dfrac{2{,}305}{1 + \dfrac{2{,}305 - 1}{1{,}000}} = \dfrac{2{,}305}{3.304} \approx 697.64$

so n = 698.

Proportion (Conservative Approach)

If there are no prior or similar given values ($\hat{p}$ is unknown), use the conservative formula:

$n \ge \dfrac{Z^2}{4e^2}$

This is considered conservative because it sets $\hat{p} = \hat{q} = 0.5$ in $n \ge \hat{p}\hat{q}\left(\dfrac{Z}{e}\right)^2$, the values at which the product $\hat{p}\hat{q}$, and hence the required sample size, is largest.

Illustrative Example

1. The same researcher wishes to estimate the proportion of executives who own a car phone.
She wants to be 90% confident and be accurate within 5% of the true proportion. Find the minimum sample size necessary.

SOLUTION:

$n \ge \dfrac{Z^2}{4e^2} = \dfrac{1.645^2}{4(0.05)^2} \approx 270.60$

so n = 271.

TYPES OF SAMPLING

Two types of sampling:
1. Probability Sampling - each element of the population has a known and nonzero chance of being selected.
2. Non-Probability Sampling - the chance of selecting each member of the population is unknown, and some members may have no chance of being selected.

PROBABILITY SAMPLING TECHNIQUES

1. Simple random sampling - samples are drawn from a population using a method such as the lottery method, or using a computer or calculator to generate random numbers. Also called lottery or fishbowl sampling.

2. Stratified random sampling - subdivide the population into at least two different subpopulations (or strata) whose members share the same characteristics (such as gender), and then draw a sample from each stratum.

STRATIFIED RANDOM SAMPLING

Proportional Allocation using Stratified Random Sampling:
1. Solve for the sample size using Slovin’s Formula.
2. Determine the population of each stratum.
3. Get the ratio of each stratum’s population to the total population size.
4. Multiply each ratio by the sample size computed in (1).

Illustrative Example
A researcher from Matamis Senior High School will use stratified random sampling for his research project, setting the five sections of the school as his strata. The enrolment statistics of the school are shown below.
   Section    Number of Students
   G101       30
   G102       27
   G103       20
   G104       25
   G105       38
   Total      N = 140

How many participants will he need from each section using proportional allocation?

1. Solve for the sample size using Slovin’s Formula. With e = 0.05,

$n \ge \dfrac{140}{1 + 140(0.05)^2} = \dfrac{140}{1.35} \approx 103.70$

so n = 104 (always round up!).

2. Determine the population of each stratum (the enrolment table above).
3. Get the ratio of each stratum’s population to the total population size.
4. Multiply each ratio by the sample size computed in (1).

   Section    Number of Students    Ratio              Ratio × 104    Sample Size (n)
   G101       30                    30/140 = 0.2143    22.29          23
   G102       27                    27/140 = 0.1929    20.06          21
   G103       20                    20/140 = 0.1429    14.86          15
   G104       25                    25/140 = 0.1786    18.57          19
   G105       38                    38/140 = 0.2714    28.23          29

Because each stratum is rounded up, the allocated total is 107, slightly more than the computed minimum of 104.

PROBABILITY SAMPLING TECHNIQUES

3. Systematic sampling - choose some starting point and then select every kth element in the population, where

$k = \dfrac{N}{n}$

SYSTEMATIC SAMPLING

Obtaining a Systematic Random Sample
1. Decide on a method of assigning a unique serial number, from 1 to N, to each one of the elements in the population.
2. Compute the sampling interval $k = \dfrac{N}{n}$.
3. Select a number r, from 1 to k, using a randomization mechanism. The element in the population assigned to this number is the first element of the sample. The other elements of the sample are those assigned to the numbers r + k, r + 2k, r + 3k, and so on until you get a sample of size n.

Illustrative Example
We want to select a sample of 50 students from a sampling frame of 500 students by taking every kth item.
SOLUTION: The sampling interval is k = N/n = 500/50 = 10. We start the sample at a random number i between 1 and k and then take every kth unit after it. Suppose the random number i is 6; then we select units 6, 16, 26, 36, 46, and so on up to 496.

PROBABILITY SAMPLING TECHNIQUES

4. Cluster sampling - divide the population area into sections (or clusters), randomly select a few of those sections, and then choose all the members from the selected sections.

CLUSTER SAMPLING

Obtaining a Cluster Sample
1. Divide the population into non-overlapping clusters.
2. Number the clusters in the population from 1 to N.
3. Select n distinct numbers from 1 to N using a randomization mechanism. The selected clusters are the clusters associated with the selected numbers.
4. The sample will consist of all the elements in the selected clusters.

Illustrative Example
A researcher wants to survey the academic performance of high school students in MIMAROPA.
1. The researcher can divide the entire population into different clusters.
2. The researcher then selects a number of clusters, depending on the research, through simple or systematic random sampling.
3. From the selected clusters, the researcher can either include all the high school students as subjects or select a number of subjects from each cluster through simple or systematic random sampling.

NON-PROBABILITY SAMPLING TECHNIQUES

1. Convenience, haphazard, or accidental sampling - members of the population are chosen based on their relative ease of access. Sampling friends, co-workers, or shoppers at a single mall are all examples of convenience sampling.
Such samples are biased because researchers may unconsciously approach some kinds of respondents and avoid others, and respondents who volunteer for a study may differ in unknown but important ways from others.

2. Snowball sampling - The first respondent refers an acquaintance, who in turn refers another, and so on. Such samples are biased because they give people with more social connections an unknown but higher chance of selection, although they tend to lead to higher response rates.

3. Judgmental or purposive sampling - The researcher chooses the sample based on who they think would be appropriate for the study. This is used primarily when there is a limited number of people who have expertise in the area being researched, or when the interest of the research is in a specific field or a small group.

Different types of purposive sampling include:
1. Deviant case - The researcher obtains cases that differ substantially from the dominant pattern (a special type of purposive sample). The case is selected in order to obtain information on unusual cases that can be especially problematic or especially good.
2. Case study - The research is limited to one group, often with a similar characteristic or of small size.
3. Quota sampling - A quota is established (e.g., 65% women), and researchers are free to choose any respondent they wish as long as the quota is met.

SAMPLING TECHNIQUES

Identify the type of sampling used (the answer is given in parentheses before each item):

1. (Systematic) Motorola selects every 50th pager from the assembly line for careful testing and analysis.
2. (Simple)
A reporter writes the name of each senator on a separate card, shuffles the cards, and then draws 5 names.
3. (Cluster) A dean at the School of EE-ECE-COE surveys all students from each of 12 randomly selected classes from all classes offered by the department.
4. (Stratified) A dean at the School of Architecture selects 15 men and 15 women from each of 4 classes.
5. (Convenience) Glamour Magazine obtains sample data from readers who decide to mail in a questionnaire printed in the latest issue.
6. (Stratified) A BIR auditor randomly selects 15 taxpayers with less than P250,000 in gross income and 15 taxpayers with gross income of at least P250,000.
7. (Stratified) ABS-CBN News polls 750 men and 750 women about their use of credit cards.
8. (Case Study/Cluster) A medical researcher from UST interviews all leukemia patients in each of 9 randomly selected Metro Manila cities.

QUESTIONS?

You may reach me through the following channels during Consultation Hours:
- Google Classroom
- Microsoft Teams
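
Software application sketch (sample size determination). The five sample size formulas in this module can be checked with a few lines of Python. This is only a minimal sketch, not part of the original slides: the function names (slovin, mean_sample_size, cochran, finite_population_correction) are hypothetical, and each function rounds up with math.ceil, following the "always round up" rule.

```python
import math


def slovin(N, e=0.05):
    """Slovin's formula: n >= N / (1 + N e^2), rounded up."""
    return math.ceil(N / (1 + N * e ** 2))


def mean_sample_size(z, sigma, e):
    """Sample size for estimating a mean: n >= (Z sigma / e)^2, rounded up."""
    return math.ceil((z * sigma / e) ** 2)


def cochran(z, e, p=0.5):
    """Cochran's formula for a proportion: n >= p q (Z / e)^2, rounded up.

    With the default p = 0.5 this reduces to the conservative
    formula n >= Z^2 / (4 e^2).
    """
    q = 1 - p
    return math.ceil(p * q * (z / e) ** 2)


def finite_population_correction(n0, N):
    """Adjust Cochran's n0 for a population of size N, rounded up."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))


# Reworking the illustrative examples from the slides:
print(slovin(1320, 0.05))                        # 307
print(mean_sample_size(1.96, 0.5, 0.03))         # 1068
print(cochran(1.96, 0.02, p=0.40))               # 2305
print(finite_population_correction(2305, 1000))  # 698
print(cochran(1.645, 0.05))                      # 271 (conservative approach)
```

Note that rounding up a floating-point result can overshoot by one when the exact answer is a whole number, so results that land very close to an integer are worth checking by hand.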

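
Software application sketch (sampling techniques). The proportional allocation steps and the systematic selection procedure described in this module can also be sketched in Python. This is an illustration under stated assumptions rather than the method used in the course: the function names proportional_allocation and systematic_sample are hypothetical, each stratum is rounded up as in the Matamis Senior High School example, and the systematic sketch assumes N is a multiple of n so that k = N/n is a whole number.

```python
import random


def proportional_allocation(strata, n):
    """Allocate a total sample size n across strata in proportion to size.

    strata maps a stratum name to its population size. Each share
    (size / N) * n is rounded up, so the allocated total can exceed n.
    """
    N = sum(strata.values())
    # Integer ceiling of size * n / N, avoiding floating-point rounding.
    return {name: (size * n + N - 1) // N for name, size in strata.items()}


def systematic_sample(N, n, start=None):
    """Return the serial numbers (1 to N) chosen by systematic sampling.

    Assumes N is a multiple of n. If start is None, a random start
    between 1 and k is drawn.
    """
    k = N // n  # sampling interval
    if start is None:
        start = random.randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    return [start + j * k for j in range(n)]


sections = {"G101": 30, "G102": 27, "G103": 20, "G104": 25, "G105": 38}
allocation = proportional_allocation(sections, 104)
print(allocation)                # G101: 23, G102: 21, G103: 15, G104: 19, G105: 29
print(sum(allocation.values()))  # 107

units = systematic_sample(500, 50, start=6)
print(units[:5], units[-1])      # [6, 16, 26, 36, 46] 496
```

Passing start explicitly reproduces the worked example with i = 6; leaving it out draws the random start, which is what a real survey would do.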