Spatial Analysis Concepts PDF
Uploaded by IndulgentPeace
Queen's University
Summary
This document provides an overview of important concepts in spatial analysis, including the definitions of uncertainty and error, internal and external validation, spatial operations, and map types. It also introduces Thiessen polygons, types of raster functions, methods of selection, and a comparison of spatial estimation methods.
Full Transcript
Week VII

Uncertainty vs. Error
- Uncertainty: a measure of the user's understanding of the difference between the contents of a dataset and the real phenomena that the data are believed to represent.
- Error: the measurement of uncertainty.

Types of uncertainty
- Conceptual uncertainty: conceptions of place are problematic. What is the natural or intrinsic unit for analysis?
- Representational uncertainty: choice of the continuous field or the discrete object view
- Analytical uncertainty: internal & external validation, error propagation, lineage error, MAUP (modifiable areal unit problem), ecological fallacy

Internal & External Validation
Internal:
1. Split the data set into a training set and a test set
2. The training set is used to train the model
3. The test set is set aside for validation
4. The model is tested against the test set to check its accuracy
External:
1. All the data are used to train the model
2. Independent datasets are used to validate the model
3. High accuracy or performance in external validation (EV) means that the model is robust

Error propagation

Ecological Fallacy
- An error in the interpretation of statistical data, made when an inference about the nature or characteristics of individuals is based on statistics computed for the aggregate
- The result is inaccurate or false conclusions about social phenomena
- Aggregation reduces information

Scope
- Local: point to point
- Neighborhood: adjacent regions have input
- Global: the entire input data layer may influence the output

Methods of Selection
- Set algebra: <, =, >, ...
- Boolean algebra: AND, OR, NOT
- Spatial operations: topology, attributes

Spatial Selection Operations
- Topological (e.g., in, adjacent, contains)
- Spatial attributes (e.g., state size, directionality)
- Attributes of regions (e.g., state population)

Classed vs. Unclassed
- Classed: Pro: more control over the map. Con: need to make choices
- Unclassed: Pro: no decisions to make. Con: can be hard to read

Network elements
- Links: conduits for movement
- Nodes: where links join
- Impedance: cost of movement over a link

Week VIII

Choosing a Map Type
- Point data: Dot Map, Picture Symbol, Graduated Symbol
- Line data: Network, Flow, Isopleth
- Polygon data: Choropleth, Area Qualitative, Stepped Surface, Hypsometric, Dasymetric, Cartogram
- Volume data: Gridded Fishnet, Realistic Perspective, Hill-Shaded, Image Map
- Temporal data: Multiple Views, Animation

Differing Map types
- Dot map
- Density dot map
- Picture symbol map
- Graduated symbol map
- Flow map
- Area qualitative map
- Stepped surface map
- Choropleth map
- Continuous / unclassed choropleth map
- Hypsometric map
- Dasymetric map
- Cartogram
- Isoline map
- Image map
- Multivariate map

Differing views of maps
- Fishnet / gridded
- Realistic
- Hillshade relief

Aspatial Descriptive Statistics
- Range: Minimum, Maximum, Min-Max, Outliers
- Central tendency: Mean, Median, Mode
- Variation: Variance, Standard Deviation

Thiessen (Voronoi) Polygons
- A technique for partitioning space into polygons from a set of points, such that any location inside a polygon is closer to that polygon's point than to any of the other points

Cluster analysis
- Moran's index: a statistical measure of spatial autocorrelation, meaning the degree of similarity or dissimilarity between values of a variable in nearby locations.
- Nearest Neighbor Index: a spatial statistic that measures how clustered or dispersed a set of points is in a given area.
- Hot Spot Analysis: detects spatial patterns of concentration or dispersion
  - High Z scores = spatial clustering of high values
  - Low Z scores = spatial clustering of low values

Raster Analysis
- Applications: hazards, climate, environment, people-environment, terrain analysis

Types of Functions
- Local functions: mathematical, Boolean, logical, reclassification, multilayer raster overlay
- Mathematical functions: apply unary or binary functions to cells
- Logical operations: output “True” (1) or “False” (0)

Reclassification
- Approaches: lookup table or ranges, conditional functions, nested functions

Raster overlay
- Issues: account for attributes, grid sizes, regions, and number of cells

Global functions
- Statistical (mean, sum, max of cells in layers)
- Distance
- Friction/cost surfaces

Cost Surfaces
- Friction: cost per unit distance (used to find the least costly path)
- Cost surface: global costs of movement, accounting for friction

Barriers
- Relative barriers: slow movement
- Absolute barriers: completely stop movement

Week IX

Digital terrain model → DTM 1M vs. digital surface model → DSM 1M

Slope
- As a percent: rise/run × 100 = A/B × 100
- As a degree: θ = tan⁻¹(A/B)

Calculating slope (e.g., rise = 10 ft, run = 40 ft)
- Slope = 10/40 × 100 = 25%
- Slope = tan⁻¹(10/40) = 14.0 degrees

Slope calculation
- Measured in the steepest direction of elevation change
- This direction often does not fall parallel to the raster rows/columns → 4 nearest cells, 3rd-order finite difference
- Elevation is x: each cell is assigned a subscript, and its elevation is referred to by the subscripted value
- S: atan (formula is on paper)

Viewshed
- The viewshed for a point is the collection of areas visible from that point
- Image is on paper

Hydrologic functions
- Watershed: an area that contributes flow to a point on the landscape → identified from a flow direction surface
- Drainage networks → see if need to study on Wednesday

Watershed Delineation process
1. Condition DEM
2. Process sinks
3. Flow direction
4. Flow accumulation
5. Stream definition
6. Outlet identification
7. Watershed delineation

Sources of 3D data
- Passive vs. active sensors
- Passive: detect natural energy without emitting their own signals, e.g., cameras, thermal sensors
- Active: emit their own signals and measure the return, e.g., LiDAR, SAR, laser, sonar

Sampling
- Methods: systematic, random, cluster, adaptive
- Systematic sampling
  - Approach: samples spaced uniformly at fixed X, Y intervals
  - Advantages: easy to understand; parallel lines
  - Disadvantages: all areas receive the same attention, difficult to stay on lines, may be biased
- Random sampling
  - Approach: select points based on random numbers
  - Advantages: less biased
  - Disadvantages: not high variety, difficult to explain
- Cluster sampling
  - Approach: samples within each cluster, plotted on a map
  - Advantages: reduces travel time
  - Disadvantages: will inherit the disadvantages of the parent approach (e.g., random or systematic)
- Adaptive sampling
  - Approach: more samples where there is more variability; needs prior knowledge of variability, e.g., two-stage sampling
  - Advantages: more efficient, better representation
  - Disadvantages: needs prior information on variability

Spatial Estimation
- Outputs
  - Raster surface: values are measured at a set of sample points
    - Boundaries and cell dimensions are established
    - The interpolation method estimates the value of each grid cell
  - Contour lines: iterative process
    - From the sample points, estimate points of a given value and connect these points to form a line
    - Estimate the next value, creating another line, with the restriction that lines of different values do not cross

Thiessen Polygons
1. Draw lines connecting nearest neighbors
2. Bisect each line
3. Connect the bisectors and assign the enclosed point's value to the polygon

Trend surface interpolation
- Fitting a statistical model, the trend surface, through the measured points

Kriging
- Statistically based estimator of spatial variables
- Components (spatial trend, autocorrelation, random variation) are combined into a mathematical model that is used for estimation
- Process:
  1. A sample of points is used to estimate the variogram
  2. A variogram model is made
  3. The variogram model is then used to interpolate the entire surface

Lag Distance
- Used in calculating semivariances for kriging
- Semivariance model

Comparing Methods
- Input data, Thiessen polygons, fixed radius, IDW, splines, trend surface, kriging

Spatial modeling → expressing a process or a pattern as a set of GIS layers and operations
- Purpose: define and solve problems, resolve conflicts, solve problems through analysis
- Description: describe spatial patterns or processes with maps, images, or models
- Determine spatial associations and factors affecting spatial distribution

Prediction vs. Prescription
- Prediction: what will happen over time
- Prescription: what would happen if

Types of spatial models + knowing the difference
- Cartographic models
- Simple spatial
- Spatio-temporal

K. Noltie
- Tick-borne illnesses
- DeBeer Internship
- Focused on forest disturbance → and its impacts
- Difficulties with road extraction
- Lyme disease

Week XI

Data Standards
- Used to format, assess, document, and deliver data
- Analysis standards: ensure the most appropriate methods are used
- Professional or certification standards: establish the education, knowledge, or experience of the analyst

Spatial data standards
- Media standards: physical specifics
- Format standards: data file components/structures
- Spatial data accuracy standards: data quality
- Documentation standards: how to describe spatial data lineage

Geospatial certifications
- Geographic Information Systems Professional (GISP)
- Exams and experience from the GIS Certification Institute (GISCI)
- Technical certifications
- GIS certifications

Data Accuracy
- Error: difference between the encoded value and the actual or expected value
- Types of error
  - Random error: random or stochastic noise in the data
  - Systematic error: persistent/identifiable bias in the data
  - Gross error: big mistakes

Documenting spatial data accuracy
- Positional accuracy: how closely the locations of objects represented in a digital data set correspond to the locations of the real-world entities
- Attribute accuracy: summarizes how different the attributes are from the true values
- Logical accuracy: reflects the presence, absence, or frequency of inconsistent data
- Completeness: how well the data set captures all the features
- Lineage: describes the sources, methods, timing, and persons responsible for the development of the data set → helps establish bounds on the other measures of accuracy

Accuracy vs. Precision → look at paper/slides

Determining Accuracy
- Truth vs. data
- Truth: we must know the accuracy of our measure of the “truth”; the truth must be independent and have a higher order of measurement

Expressing Accuracy
- How close is the observation to the truth
- Expressed as a number or probability distribution
- Percent only
- How often the value is wrong

National Standard for Spatial Data Accuracy
- NSSDA (1998)
- In theory, replaces the National Map Accuracy Standards; applies to vector and raster models

Review From Lecture
- Key terms, equations, formulas, concepts, analysis and applications, comparing methods

Spatial estimation
- What equations, and how to know which one to use
- Distribution of data, variation between points
- Thiessen polygons
- Fixed radius: bad for clustered points; tends to be better for systematic or perfectly randomly distributed data; weighs all points interacted with equally
- Inverse distance weighted (IDW)
- Splines: weigh all points interacted with equally
- Trend surface: bad for hotspots and clustered data because of its smooth transitions
- Kriging: based on geostatistics and the measurement of spatial autocorrelation; uses lag methods and distance. Lag distance is simply the distance between a pair of points
  - Statistical model

Variograms
- Plot of the semivariance over a range of lag distances. Summarizes the spatial autocorrelation of a variable.
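As an illustration of the variogram idea above, here is a minimal sketch of an empirical semivariogram in plain Python. It assumes the common estimator (half the mean squared difference between the values of all point pairs whose separation falls in a lag bin); the function name, the `lag_width` and `n_lags` parameters, and the sample points are all made up for this example and do not come from the lecture.

```python
import math


def empirical_semivariogram(points, lag_width, n_lags):
    """Bin point pairs by lag distance and return (lag_center, semivariance, n_pairs).

    points: list of (x, y, value) tuples (hypothetical sample data).
    Semivariance per bin = sum((z_i - z_j)^2) / (2 * number_of_pairs).
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)   # lag distance between the pair
            k = int(h // lag_width)            # index of the lag bin
            if k < n_lags:
                sums[k] += (zi - zj) ** 2
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:                          # skip bins with no pairs
            gamma = sums[k] / (2 * counts[k])
            result.append(((k + 0.5) * lag_width, gamma, counts[k]))
    return result


# Four made-up samples along a line, with values that rise steadily with x.
samples = [(0, 0, 0.0), (1, 0, 1.0), (2, 0, 2.0), (3, 0, 3.0)]
bins = empirical_semivariogram(samples, lag_width=1.0, n_lags=4)
print(bins)  # [(1.5, 0.5, 3), (2.5, 2.0, 2), (3.5, 4.5, 1)]
```

In this toy data set, nearby points have similar values, so the semivariance grows as the lag distance grows. A variogram model for kriging would then be fitted to these binned values.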
- Small lag distance = small semi-variance

Spatial Statistics
- Mean (average): know the equation
- Variance: know the equation
- Standard deviation: distance from the mean in terms of %; know the equation
- How they work on a graph

Difference between accuracy and precision
- Accuracy: how close values are to what they are supposed to be
- Precision: how close values are to each other in repetition

Error Tables (not going to be tested)
- Probability that the map identifies the cell correctly, relative to the field

Vector scopes
- Local: point to point, 1-1
- Neighborhood operation: adjacent features
- Global: entire data set

Raster scopes
- Neighborhood: input cells, center cells, collection of cells
- Zones: zones across the raster
- Global: whole zone
- Ex. water would be zonal
- Moving windows: rook's case neighborhood (kernel is middle??), king's case neighborhood

Reclassification

Unary vs. Binary

Conditions and functions (if-then-else rules)

Cost surfaces (cost does not only mean money)
- Friction: cost per unit distance, so travel cost = friction x distance (used for the least-cost path, time, transit)
- Cost surface: global costs of movement, accounting for friction

Minimum bounding geometries

Surface normals: determine the surface of the model and how light affects it; used in hillshading

Sources of 3D data

Vector Overlay (look at slide)
- Union, intersect, …
- Clip vs. intersect
  - Clip: keeps the boundaries and attributes of only the clipped layer; just clips the top layer, like a cookie cutter
  - Intersect: keeps the boundaries and attributes of both layers, still retaining the information of both, like two cookie cutters

Geocoding: the connection between a point on the map, as expressed in coordinates, and addresses

Methods of selection
- Look at how they are done (these are SQL-style queries)
- How they will look in operations → set operations, topological or spatial, Boolean

Topographic modelling

Weighted linear combination
- Combines layers with weights, vs. AND/OR. Allows the decision maker to make tradeoffs.
- Applying different weights: if slope is more important than aspect, slope gets more weight
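The contrast between a weighted linear combination and a Boolean AND can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two 2 x 2 score grids, the 0.75/0.25 weights, and the 0.5 threshold are made up, and the layers are assumed to be already rescaled to a common 0 to 1 suitability scale.

```python
def weighted_linear_combination(layers, weights):
    """Return the cell-by-cell weighted sum of equally sized score grids.

    layers:  dict of layer name -> 2D list of scores on a common scale.
    weights: dict of layer name -> relative importance (normalized here).
    """
    total = sum(weights.values())
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    out = [[0.0] * cols for _ in range(rows)]
    for name in names:
        w = weights[name] / total              # normalize so weights sum to 1
        for r in range(rows):
            for c in range(cols):
                out[r][c] += w * layers[name][r][c]
    return out


# Hypothetical suitability scores (1.0 = best, 0.0 = worst).
slope_score = [[1.0, 0.0], [0.5, 1.0]]
aspect_score = [[0.0, 1.0], [0.5, 0.0]]

# Slope is treated as more important than aspect, so it gets more weight.
suitability = weighted_linear_combination(
    {"slope": slope_score, "aspect": aspect_score},
    {"slope": 0.75, "aspect": 0.25},
)
print(suitability)  # [[0.75, 0.25], [0.5, 0.75]]

# Boolean AND for comparison: a cell passes only if both layers reach 0.5.
strict = [
    [slope_score[r][c] >= 0.5 and aspect_score[r][c] >= 0.5 for c in range(2)]
    for r in range(2)
]
print(strict)  # [[False, False], [True, False]]
```

The top-left cell has an ideal slope but a poor aspect: the Boolean AND rejects it outright, while the weighted combination still scores it 0.75 because the strong slope score compensates. That compensation is the tradeoff the notes refer to.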