Geog 380: Geospatial Communication - Sampling Methods - PDF

Summary

This document provides an overview of sampling methods in geospatial communication. It covers topics like defining target populations and sampling frames, considerations for selecting the number of samples, types of sampling techniques and methods, and sample size determination. It also compares the main sampling methods and discusses the advantages and disadvantages of each.

Full Transcript

Geog 380: Geospatial Communication
Source: https://earth.nullschool.net
Topic 13: Sampling
GEOG 380 - Topic 13 © Geoffrey Hay (2023)

Learning outcomes
By the end of this topic a successful student should be able to:
§ Identify the similarities between non-spatial and spatial sampling
§ Recognize the different elements considered when selecting the number of samples
§ Apply different methods for sampling and recognize their relative merits and drawbacks

Samples and Populations
§ Population: the total set of individuals or potential observations in a defined group
  § E.g., all the residents of Calgary
  § In order to measure them all, we would need to conduct a census, and this is often not feasible
§ Sample: a subset of individuals or observations in the population
  § Hopefully, the sample represents the population!
Source: https://www.omniconvert.com/what-is/sample-size/

The Role of Sampling
§ Sampling helps us answer several difficult questions
  1. How large should the sample be?
  2. How/where should the samples be chosen?
  3. How much reliability will we have in results based on this sample?
§ All of these revolve around the fact that we normally can’t conduct a census of an entire population!
Source: https://www.omniconvert.com/what-is/sample-size/

Sampling Units
§ Sampling units: the individual items in a sample, and the basic entity upon which observations are made
§ May be discrete entities (e.g., people, households, cities, etc.), points, or areas (e.g., quadrats, strips, plots, pixels, etc.)
§ Must be explicitly defined!
§ Sampling units must be selected to match the scale of the information desired
  § E.g., household income → households
  § Personal income → individual people
Source: https://ccnmtl.columbia.edu/projects/qmss/samples_and_sampling/types_of_sampling.html

Steps for sampling
[Figure: McGrew and Monroe (2000)]

New Project: Use Machine Learning to Discover Medicine Wheels in Alberta
[Images: Majorville medicine wheel, 2015, 2009, 1966]
§ Alberta is very big; where can I find medicine wheels? Where is the study site?
  § Siksika Native Reserve (700 km^2)
§ Where are the objects of interest located on the site?
  § Alberta Archaeological Site Inventory Records (100+ locations)
§ What scale of imagery do I need?
  § Can I see the objects of interest in the scene/image?
§ How many samples do I need to train the machine learning system?
  § Machine learning heuristics (80% training, 20% test)
[Images: 2009, 2015]

Steps for sampling
[Figure: McGrew and Monroe (2000)]

Q1: How large a sample?
§ One important question to be addressed in a proper sample design is: how large should the sample be to be representative?
§ Less certainty with small samples
§ More certainty with larger samples
§ Larger samples = more cost
§ What factors do you think should go into this decision?
Source: https://www.fredhutch.org/en/news/centernews/2020/02/sample-size.html
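
To make the trade-off in "Q1: How large a sample?" concrete, the following Python sketch uses the standard error of the mean, s / sqrt(n), as one common measure of certainty. The pilot standard deviation of 10 and the candidate sample sizes are assumptions chosen only for illustration; they do not come from the course material.

```python
import math


def standard_error(sample_std: float, n: int) -> float:
    """Standard error of the sample mean: s / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sample_std / math.sqrt(n)


# Hypothetical pilot standard deviation, chosen only for illustration.
PILOT_STD = 10.0

for n in (25, 100, 400):
    # Each fourfold increase in n halves the standard error.
    print(f"n = {n:>3}: standard error = {standard_error(PILOT_STD, n):.2f}")
```

Quadrupling the sample size only halves the standard error, which is one reason cost has to be weighed against certainty.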

Sample size determination
§ Different methods depending on the objectives
§ There are two commonly used strategies for sample-size determination: (i) rules of thumb, and (ii) formulas
§ Formulas: the precision of the estimate of a population parameter is a function of the variance of the population, the sample size, and the allowable error
Source: https://7esl.com/rule-of-thumb/

Random Sampling and Nonsampling Errors
Source: Sampling Designs and Sampling Procedures, Bertina Mitchell
§ Random Sampling Error
  § The difference between the sample result and the result of a census conducted using identical procedures
  § A statistical fluctuation that occurs because of chance variations in the elements selected for a sample
§ Systematic Sampling Error
  § Systematic (non-sampling) error results from non-sampling factors, primarily the nature of a study’s design and the correctness of execution
  § It is not due to chance fluctuations
§ Less than Perfectly Representative Samples
  § Random sampling errors and systematic errors associated with the sampling process may combine to yield a sample that is less than perfectly representative of the population
Source: https://www.investopedia.com/terms/s/samplingerror.asp

Sample size determination (method 1)
§ For determining the sample size necessary to estimate the population mean (McGrew and Monroe, 2000):

$$ n = \left( \frac{Z s}{E} \right)^2 $$

Where:
  n = number of samples
  Z = Z-value for the desired level of confidence (e.g., 1.96 for 95%)
  s = standard deviation of a pilot sample
  E = tolerable error (in the same units as s)
Calculating for Sample size: https://www.youtube.com/watch?v=qVDVAZigXg0

Sample size determination (method 2)
§ From Husch et al. (2003):

$$ n = \frac{t^2 \, CV^2}{E^2} $$

Where:
  n = sample size
  t = Student’s t-value for the specified probability
    t-test: a statistic that checks if two means are reliably different from each other.
    t-value = (difference between group means) / (variability within groups)
    big t-value = different groups; small t-value = similar groups
  CV = coefficient of variation (ratio of the standard deviation to the mean; shows variability in relation to the mean of the population)
  E = tolerable error, expressed as a % of the mean
Statscast, "What is a t-test?" (9.56 mins): https://www.youtube.com/watch?v=0Pd3dc1GcHc

Sample Size Determination - Procedure
1. Make a reasonable guess at the value of n
  § How much time do you have? Resources?
2. Look up the critical Student’s t-value
  § Two-tailed probability of obtaining a larger value
  § The t-value depends on the degrees of freedom (n - 1), which is why an initial guess at n is needed
3. Select a value for E (allowable error)
  § 10-20% is a reasonable place to start
4. Select a value for CV (coefficient of variation)
  § Need a prior estimate of variation in the population: a preliminary (pilot) sample?
5. Calculate n
6. Proceed iteratively until n is reasonable
  § Things to change: n and E
§ Large sample sizes increase statistical power and increase the flexibility of the effect size

Q2: Where/how to choose samples?
§ OK, so we know that we need n samples. Where or how do we choose them?
§ There are many techniques designed to help achieve a sample that is ‘representative’ of the population
§ The major issue to avoid is bias
  § Under-representing or over-representing elements of the population because of inappropriate sample design

Sampling methods/designs
§ Non-probability (selective) methods
  § Judgmental
  § Quota
  § Snowball
§ Probability methods
  § Systematic
  § Simple random
  § Stratified random
  § Clustered random

Selective Sampling
§ Observer manually selects sampling unit locations in areas that appear to be representative
  § Expert knowledge
§ Advantages
  § Many would argue that it has no place in well-designed sampling strategies, BUT
  § Can bail you out in some practical situations
Source: https://www.statology.org/snowball-sampling/
Snowball or chain-referral sampling.
This sampling method is often used when researchers wish to study a population whose subjects are particularly hard to identify or reach, e.g., individuals with rare diseases, homeless individuals, or ex-convicts. It is a non-probability sampling method, so not every population member has the same opportunity to be selected for the study.
§ Disadvantages
  § Relies on human choice, which is prejudiced by individual opinion, and may produce results that are not representative of the population

Simple Random Sampling
§ The fundamental sampling method
§ Each sampling unit in the population has an equal chance of being selected
§ The selection of any individual should not affect the chance of selecting another individual
[Figure: McGrew and Monroe (2000)]

Advantages
§ Given sufficient samples, this produces an unbiased estimate of the population mean and the information needed to assess the sampling error
§ Computers assist greatly through random number generators or random point generators
Source: Wheelofnames.com

Disadvantages
§ The two criteria (equal and independent) are harder to achieve than you might realize
§ Requires developing a system that is consistent with the sampling unit
  § How would you conduct a random sample of all the trees in a forest?
§ Individuals must be sampled with replacement (thus the samples are independent of each other) in order to be a true random sample. This means that an individual can get selected more than once, even if it's unlikely.

… continued …
§ Cost and difficulty in accessing widely dispersed locations, if field work is involved
  § Locating them
  § Traveling between them
§ May miss small groups in the population, or produce estimates that are biased towards larger groups
Source: https://www.pond5.com/stock-footage/item/79413480-cowboy-sits-horse-cliffmonument-valley-utah

Systematic Sampling
§ Samples are selected in a systematic or regular fashion, though the starting point is random
  § E.g., every fourth address in the phone book, or every 10 meters along a transect
[Figure: McGrew and Monroe (2000)]

… continued …
§ Advantages
  § Can provide reliable population estimates by spreading the sample over the entire population
  § Simple to execute in practice, because sample units are located at regular intervals and travel is straightforward
§ Disadvantages
  § Not completely unbiased, because selections are not truly equal and independent
  § Can fare poorly if systematic patterns occur within the population

Stratified Random Sampling
§ Population is stratified, or divided into groups with reduced variability, and sampled randomly within each stratum
  § Proportional allocation: sampling intensity proportional to sizes of strata
  § Optimum allocation: sampling intensity proportional to standard deviation of the distribution variable
[Figure: McGrew and Monroe (2000)]

… continued …
§ Advantages
  § May produce more accurate results because small groups are not missed, and large ones are not over-represented
  § Can generate separate estimates for each stratum
§ Disadvantages
  § A basis for stratification is required
  § Sampling estimates are subject to errors in the stratification criteria

Cluster sampling
§ The population is broken up into clusters
  § Exhaustive
§ Clusters selected at random
§ Two options
  § Randomly select samples from the cluster
  § Use all members of cluster (population)
[Figure: McGrew and Monroe (2000)]
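
The four probability designs described in this topic (simple random, systematic, stratified random, and cluster) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from McGrew and Monroe (2000): the 100 numbered units, the "north"/"south" strata, the ten clusters, and all function names are hypothetical. Simple random sampling draws with replacement, as the topic requires; drawing without replacement inside strata and when choosing clusters is a simplification made here.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, rng):
    """Draw n units with replacement, so each draw is equal and independent."""
    return [rng.choice(units) for _ in range(n)]


def systematic_sample(units, step, rng):
    """Take every step-th unit after a random starting point."""
    start = rng.randrange(step)
    return units[start::step]


def stratified_sample(units, stratum_of, n, rng):
    """Proportional allocation: each stratum's share of n matches its size.

    Rounding means the total can differ slightly from n in general.
    """
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(n * len(members) / len(units)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and use all members of each."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for key in chosen for unit in clusters[key]]


# Hypothetical population: 100 numbered sampling units.
units = list(range(100))
# Hypothetical strata: units 0-79 are "north", 80-99 are "south".
stratum_of = lambda u: "north" if u < 80 else "south"
# Hypothetical exhaustive clusters: ten blocks of ten units each.
clusters = {i: units[i * 10:(i + 1) * 10] for i in range(10)}

rng = random.Random(380)  # fixed seed so the run is repeatable
print(len(simple_random_sample(units, 10, rng)))
print(systematic_sample(units, 10, rng))
print(sorted(stratified_sample(units, stratum_of, 10, rng)))
print(len(cluster_sample(clusters, 2, rng)))
```

With proportional allocation, the 80/20 split between the two hypothetical strata is preserved in the sample (8 and 2 units), which is how small groups avoid being missed.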

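The two sample-size formulas from earlier in this topic, method 1 (McGrew and Monroe, 2000) and method 2 (Husch et al., 2003), can be checked with a short Python sketch. The input values (Z = 1.96 for 95% confidence, s = 15, E = 5; t = 2.045 for a first guess of n = 30, CV = 40%, E = 10%) are assumptions chosen for illustration, and rounding up to a whole sample is a convention adopted here rather than something stated in the slides.

```python
import math


def sample_size_method1(z: float, s: float, e: float) -> int:
    """Method 1: n = (Z * s / E)^2, rounded up to a whole sample.

    s and e must be in the same units.
    """
    return math.ceil((z * s / e) ** 2)


def sample_size_method2(t: float, cv: float, e: float) -> int:
    """Method 2: n = t^2 * CV^2 / E^2, rounded up to a whole sample.

    cv and e are both expressed as a percentage of the mean.
    """
    return math.ceil((t * cv / e) ** 2)


# Method 1 with assumed values: 95% confidence, pilot s = 15, tolerable error 5.
print(sample_size_method1(z=1.96, s=15.0, e=5.0))

# Method 2, first pass: guess n = 30, so t is looked up for 29 degrees
# of freedom (two-tailed, 0.05), with assumed CV = 40% and E = 10%.
print(sample_size_method2(t=2.045, cv=40.0, e=10.0))
```

In the iterative procedure, the n returned by method 2 would be used to look up a new t-value, and the calculation repeated until n stops changing.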