Geographic Information Systems (GIS) PDF
Document Details
Uploaded by LuxuriantSheep
Arba Minch University
Summary
This document presents the theoretical foundations of Geographic Information Systems (GIS). It covers key concepts, definitions, and the historical development of GIS, including what is meant by geography, information, and system, and why GIS is important.
Full Transcript
**Chapter 1: Basics of Geographic Information Systems (GIS)**

### Concepts and definition of GIS

Almost everything that happens, happens somewhere. As humans, we are largely confined in our activities to the surface and near-surface of the Earth. We travel over it, in the lower levels of the atmosphere, and through tunnels dug just below the surface. We dig ditches and bury pipelines and cables, construct mines to reach mineral deposits, and drill wells to access oil and gas. Keeping track of all this activity is important, and knowing where it occurs can be the most convenient basis for tracking it. Knowing where something happens is critically important if we want to go there ourselves or send someone there, to find other information about the same place, or to inform people who live nearby. In addition, most (perhaps all) decisions have geographic consequences. Geographic location is therefore an important attribute of activities, policies, strategies, and plans.

Geographic information systems are a special class of information systems that keep track not only of events, activities, and things, but also of where these events, activities, and things happen or exist.

**Definitions of GIS**

Before examining some common definitions, we should look at what the words themselves mean (i.e. **Geography + Information + System**).

***Geography:*** the science that deals with the description of the earth's surface, treating its form and physical features, its natural and political divisions, and the climate, production, population, etc. of various countries. It is frequently divided into physical and human geography.

- So by **geography** we mean the real world or spatial realities (for example, the location of a city or the location of a school).

***Information:*** refers to the action of informing; the formation or molding of the mind or character; training, instruction, and teaching; or the communication of instructive knowledge.
It also means knowledge communicated concerning some particular fact, subject, or event: the action of telling, or the fact of being told of, something.

- So **information** is about data and their meaning (for example, the name of a city, the area of a city, or the population density of a city).

***System:*** refers to an organized or connected group of objects. In other words, it is a set or assemblage of things connected, associated, or interdependent so as to form a complex unity; a whole composed of parts in orderly arrangement according to some scheme or plan. The term is rarely applied to a simple or small assemblage of things (which is closer to a group or set).

- In GIS, the **system** also refers to the computer technology (for example, computer hardware and software).

Different definitions of GIS can be given depending on the profession and the purpose for which it is used, but common to all definitions is spatial location. Some of the definitions are:

- *"GIS is a system for capturing, storing, checking, integrating, manipulating, analysing and displaying data which are spatially referenced to the Earth."*
- *"GIS is a special case of information system where the database consists of observations about spatially distributed features, activities or events, which are definable in space as points, lines or areas."*
- *"A GIS is a system, consisting of hardware, software, data, procedures and a proper organizational context, which compiles, stores, manipulates, analyses, models and visualizes spatial data, to solve planning and management problems"* (Christiansen, 1998).

**Why is GIS unique?**

- GIS is unique because it handles **spatial** information, that is, information referenced by its location in space.
- GIS makes connections between activities based on spatial proximity.
- People commonly think of GIS as a single, well-defined, integrated computer system. However, this is not always the case. A GIS can be made up of a variety of software and hardware tools.
The important factor is the level of integration of these tools to provide a smoothly operating, fully functional geographic data processing environment.

### History of GIS

The history of GIS is commonly divided into three eras: the era of innovation, the era of commercialization, and the era of exploitation.

1. **Era of Innovation** (1957-1980): the era in which GIS was introduced to the world. Much of the early development is credited to researchers at the Harvard Laboratory for Computer Graphics and Spatial Analysis. The most important events of this era were the foundation of ESRI (Environmental Systems Research Institute) in 1969 and the launch of Landsat 1 in 1972.
2. **Era of Commercialization** (1981-1999): the era in which GIS became a business. A number of government and private organizations were established, making GIS a worldwide, profit-making industry. The main events of this era were the launch of ARC/INFO, the introduction of GPS operation (used for navigation, surveying, and mapping), and the arrival of Internet GIS products.
3. **Era of Exploitation** (1999-present): the era we are in now. It is distinguished by a large number of GIS users. The prominent developments of this era are a user base of more than one million, the launch of the IKONOS and QuickBird satellites, and the introduction of Google Earth and mobile mapping.

### Coordinate system in GIS

The purpose of a coordinate system is to provide a common basis for communication about a particular place or area on the earth's surface, to integrate datasets within maps, and to perform various integrated analytical operations in GIS.

- The most critical issue in dealing with coordinate systems is *knowing* what the projection is and having the correct coordinate system information associated with a dataset.

***There are two types of coordinate systems: geographic and projected.***

1.
***Geographic coordinate system (GCS)*** uses a three-dimensional spherical surface to define locations on the earth. In a geographic coordinate system, a point is referenced by its longitude and latitude values. Longitude and latitude are angles measured from the earth's center to a point on the earth's surface.

***Figure 1:** geographic coordinate system (illustration of a globe with longitude and latitude values)*

In the GCS, horizontal (east-west) lines are lines of equal latitude, or parallels. Vertical (north-south) lines are lines of equal longitude, or meridians. These lines encompass the globe and form a gridded network called a graticule.

![](media/image3.png)

2. ***Projected coordinate system:*** a projected coordinate system is defined on a flat, two-dimensional surface. It is always based on a geographic coordinate system, which in turn is based on a sphere or spheroid. In a projected coordinate system, locations are identified by x, y coordinates on a grid, with the origin at the center of the grid. Each position has two values that reference it to that central location: one specifies its horizontal position (the x-coordinate) and the other its vertical position (the y-coordinate). Using this notation, the coordinates at the origin are x = 0 and y = 0.

***Figure 3:** projected coordinate system*

**The Universal Transverse Mercator System (UTM)**

UTM is a commonly used system in GIS. This grid system has been widely adopted for topographic maps, for referencing satellite imagery, for natural resource databases, and for other applications that require precise positioning.

- In the UTM grid system, the area of the earth between 84° North and 80° South latitude is divided into north-south columns, each 6° of longitude wide, called zones. These are numbered from 1 to 60 eastward, beginning at the 180° W meridian.
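The zone numbering rule can be turned into a short calculation. The following is a minimal Python sketch, not a full UTM implementation: the function names `utm_zone` and `central_meridian` are hypothetical, the special zone exceptions used around Norway and Svalbard are ignored, and the longitude used in the usage note is only an approximate value for Addis Ababa.

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """Return the UTM zone number (1 to 60) for a longitude in degrees.

    Longitudes east of Greenwich are positive, west are negative.
    """
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    # Zones are 6 degrees wide and are counted eastward from 180 W.
    zone = int(math.floor((longitude_deg + 180.0) / 6.0)) + 1
    # A longitude of exactly +180 would compute as 61; this sketch
    # clamps it to zone 60 (an assumption, not part of the definition).
    return min(zone, 60)


def central_meridian(zone: int) -> float:
    """Return the longitude (degrees) of the meridian at the zone center."""
    if not 1 <= zone <= 60:
        raise ValueError("zone must be between 1 and 60")
    # Zone 1 spans 180 W to 174 W, so its center is at 177 W (-177).
    return -183.0 + 6.0 * zone


if __name__ == "__main__":
    lon = 38.74  # approximate longitude of Addis Ababa (illustrative)
    zone = utm_zone(lon)
    print(zone, central_meridian(zone))
```

For a longitude of about 38.74° E the sketch gives zone 37, whose central meridian is 39° E.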
Zones in the northern hemisphere are labeled with N, and zones in the southern hemisphere are labeled with S. X and Y coordinates in each zone are measured in meters. Within each zone, the meridian in the center of the zone is given an easting value of 500,000 meters.

- UTM zone 37 is used for the central parts of Ethiopia.

**A working GIS integrates the following five key components, as described below.**

A. **Hardware:**

- Input devices, which include digitizers, scanners, and keyboards
- Storage devices, which include hard disks, floppy disks, and CD-ROMs
- Processing devices (the processor)
- Output devices, which include printers, plotters, and monitors

B. **Software:** GIS software provides the functions and tools needed to store, analyze, and display geographic information. Some well-known GIS software packages available on the market are ArcView, ARC/INFO, ArcGIS, and MapInfo. The selection of a GIS software package for a particular project is usually based on criteria such as price, database availability and types, and the capability and flexibility of the software.

C. **Data:** Perhaps the most important component of a GIS is data, which is why geographic data is considered the heart of GIS. Locations and other characteristics of natural features and human activities on, above, and beneath the earth's surface are recorded as geographic data for GIS. Generally, data from primary and secondary sources may have three modes or dimensions: spatial, temporal, and thematic.

The major emphasis in GIS operation is therefore placed on data, from data input to data analysis and the presentation of data. Geographic data and related tabular data can be collected, compiled to custom specifications and requirements, or occasionally purchased from a commercial data provider. A GIS can integrate spatial data with other existing data resources, often stored in a corporate DBMS.
- ***What is a Database Management System?***

The next logical component in a GIS is a ***database management system*** (DBMS). Traditionally, this widely used term refers to a type of software that is used to input, manage, and analyze attribute data. It is used in that sense here, although the need for spatial database management must also be recognized. Thus, a GIS typically incorporates not only a traditional DBMS but also a variety of utilities to manage the spatial and attribute components of the geographic data stored.

With a DBMS, it is possible to enter attribute data, such as tabular information and statistics, and subsequently extract specialized tabulations and statistical summaries to produce new tabular reports. Most importantly, however, a DBMS gives us the ability to analyze attribute data. For example, we might query the system to find all property parcels where the head of the household is single but has one or more children, and produce a map of the results. The final product (a map) is certainly spatial, but the analysis itself has no spatial qualities whatsoever.

***Spatial and Attribute Databases***

It is important to recognize that a GIS is not simply a mapping tool, nor is it solely a fancy way to display images from aerial photos, Landsat scenes, etc. A GIS contains databases in which features are related to each other according to their spatial locations. Thus, for each feature contained within the GIS database, information exists on:

- its identity (what is it?),
- its location (where is it?), and
- its relationship to other features (where is it relative to all other features? which features are adjacent, how far away are they, etc.).

A GIS map contains "layers", which are combined to produce a map. It is important to remember that with GIS you can "peel off" these layers to examine them one by one or in some combination (see the figure below).
Thus, central to a GIS is the database: a collection of maps and associated information in digital form. Because the database is concerned with earth surface features, it comprises two elements: a *spatial database* describing the geography (shape and position) of earth surface features, and an *attribute database* describing the characteristics or qualities of these features. Thus we might have a property parcel defined in the spatial database and qualities such as its land use, ownership, and property valuation in the attribute database.

D. **People:** GIS technology is of limited value without the people who manage the system and develop plans for applying it to real-world problems. GIS users range from technical specialists who design and maintain the system to those who use it to help them perform their everyday work. Distinguishing GIS specialists from end users is often critical to the proper implementation of GIS technology. Based on their information needs and the way they interact with the system, GIS users can be classified into three categories:

- GIS experts
- General GIS users
- GIS viewers

E. **Methods:** A successful GIS operates according to a well-designed implementation plan and business rules, which are the models and operating practices unique to each organization. As in all organizations dealing with sophisticated technology, new tools can only be used effectively if they are properly integrated into the entire business strategy and operation. Doing this properly requires not only the necessary investments in hardware and software, but also the retraining and/or hiring of personnel to use the new technology in the proper organizational context.
Generally, failure to implement a GIS with proper organizational commitment will result in an unsuccessful system. Many of the issues concerning organizational commitment are described under implementation issues and strategies.

![](media/image6.png)

**Chapter 2: GIS data Types and Analysis**

**The data used in GIS is classified into two types: spatial and non-spatial data.**

Spatial data is the data or information that identifies the geographic location of features and boundaries on Earth, such as natural or constructed features. Spatial data is represented using either the raster data model or the vector data model.

**Vector Based Model**

**Vector Data**

Vector data provides a way to represent real-world features within the GIS environment. A vector feature has its shape represented using geometry. The geometry is made up of one or more interconnected vertices. A vertex describes a position in space using an x, a y, and optionally a z coordinate. In the vector data model, features on the earth are represented as:

- **points**
- **lines / routes**
- **polygons / areas**

**Point Features**

- A point feature is a zero-dimensional abstraction of an object, represented by a single X, Y coordinate pair. A point normally represents a geographic feature that is too small to display as a line or area; for example, the location of a building on a small-scale map, or the location of a service on a medium-scale map. Points have attributes that describe the geographic feature they represent.

**Line Features**

- A line is a set of ordered coordinates that represents the shape of a geographic feature too narrow to display as an area at the given scale. Lines have attributes that describe the geographic feature they represent.

![](media/image8.png)

**Area Features (Polygon Features)**

- A polygon is defined by the lines that make up its boundary and a point inside its boundary for identification. Polygons also have attributes that describe the geographic feature they represent.
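To make the point, line, and polygon representations concrete, here is a minimal Python sketch that stores each feature as a list of (x, y) vertices plus an attribute dictionary. The feature names, coordinates, and the helper functions `line_length` and `polygon_area` are hypothetical and chosen only for illustration. Coordinates are assumed to be in a projected system measured in meters.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # one vertex: (x, y)


def line_length(vertices: List[Point]) -> float:
    """Length of a line feature: the sum of its straight segments."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def polygon_area(ring: List[Point]) -> float:
    """Area of a simple polygon from its boundary vertices (shoelace formula).

    The ring is given without repeating the first vertex at the end.
    """
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical features: geometry plus descriptive attributes.
school = {"geometry": [(2.0, 1.0)], "attributes": {"name": "School A"}}
road = {"geometry": [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
        "attributes": {"name": "Road B"}}
parcel = {"geometry": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
          "attributes": {"land_use": "residential"}}

if __name__ == "__main__":
    print(line_length(road["geometry"]))     # 11.0
    print(polygon_area(parcel["geometry"]))  # 12.0
```

A point needs one vertex, a line an ordered list of vertices, and a polygon a closed ring of vertices. In each case the attributes travel alongside the geometry.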
**Advantages of vector data modeling**

- Vector data accurately represents true shape and size.
- Vector data represents non-continuous data (e.g., rivers, political boundaries, road lines, mountain peaks).
- Vector data creates aesthetically pleasing maps.
- The vector data structure requires less disk storage.
- Since most data (e.g. hard-copy maps) is in vector form, no data conversion is required.

**Disadvantages of vector data modeling**

- The data structure is complex.
- Overlaying multiple vector maps is often time consuming.
- Continuous data, such as elevation data, is not effectively represented.
- Spatial analysis and filtering within polygons is impossible.

**Raster Data**

- Raster data is cell-based data such as aerial imagery and other remotely sensed imagery. Raster data is characterized by pixel values. Basically, a raster file is a giant table in which each pixel (cell) is assigned a specific value, for example from 0 to 255 in an 8-bit image. The meaning behind these pixel values is specified by the user: they could represent elevation, temperature, hydrology, etc.

![](media/image10.png)

**Advantages of raster data modeling**

- Representing continuous data (e.g., slope, elevation)
- Representing multiple feature types (e.g., points, lines, and polygons) as a single feature type (cells)
- Rapid computations ("map algebra") in which raster layers are treated as elements in mathematical expressions
- Analysis of multi-layer or multivariate data (e.g., satellite image processing and analysis)

**Disadvantages of raster data modeling**

- Raster data needs a large amount of disk space.
- Cell size determines the resolution at which the data is represented.
- It is difficult to represent linear features.
- Projections and transformations are more difficult.

**The Choice between Raster and Vector Models**

The choice between the raster and vector models depends on the type of data analysis and other operations required. However, there is always scope to convert one form to another.
That is, data can be converted from raster to vector or from vector to raster.

- Certain kinds of data manipulation, such as polygon intersection, union, clipping, and merging, are more complex in the raster data model than in the vector model.
- Multi-theme overlay operations are easier in the raster data model than in the vector model.
- Representation of surfaces is more common in the raster-based model than in the vector model.

**Non-spatial data** is the attribute information that describes the spatial information. Attribute data can be numeric and/or text (e.g. the attributes of a land parcel might include its address, owner's name, and other property values). Attribute data is typically stored in tabular format.

**GIS spatial analysis**

Spatial analysis comprises the analytical techniques associated with the study of the location of geographical entities together with their spatial dimensions. Spatial data analysis is a core GIS operation that is applied to a raw data set to retrieve derivative information for interpretation and decision-making. It is used to create new information, and it allows us to study and understand real-world processes by developing and applying manipulation and analysis criteria. Spatial analysis involves a range of different operations.

- Spatial analysis can be done in two ways:
  - **Vector based analysis**
  - **Raster based analysis**
- Making maps alone does not justify the high cost of building a GIS. Maps could be produced using a simpler cartographic package. Likewise, if the purpose is to generate tabular output, a simpler database management system could do it.
- It is spatial analysis that requires the logical connection between attribute data and map features.
- Spatial analysis ranges from the simple display of features to complex, multistep analytical models.
- Some types of spatial analysis include:

I. **Showing the geographic distribution of data**

II.
**Querying GIS data**

Another type of GIS analysis is querying, or selecting from the database. **Querying** lets us identify and focus on a specific set of features. There are two types of GIS queries: *attribute* queries and *location* queries.

Attribute queries find features based on their attributes. A police department, for example, could use an attribute query of its database to obtain a table of crimes that fall into a particular category. A query on the CRIM\_CAT field would return the records where the value in the field is 9, and a map could then show the results of the query.

Location queries, also called spatial queries, find features based on where they are. The police department could use a location query of the database to find crimes that occurred within a given area.

III. **Identifying what is nearby**

This type of GIS analysis finds what is near a feature. One way to find what is nearby is to create a buffer around the feature. A buffer is an area of specified distance (radius) around a map feature (point, line, or polygon). For example, a city planning commission could identify the area within 1,000 meters of a proposed airport by buffering the airport feature.

![](media/image15.png)

Using buffer operations we can generate one or more polygons (*multiple buffers*) surrounding existing geographic features. We can buffer any type of feature.

- Buffers around points form *circular areas*.
- Buffers around lines form "*worms*".
- Buffers around polygons form larger *regions*.

![](media/image19.png)

**Unit 3: GIS data Sources and management**

- Data collection is split into *data capture* (direct data input) and *data transfer* (input of data from other systems).
- Data collection is one of the most time-consuming and expensive, yet important, GIS tasks.
- There are many diverse sources of geographic data and many methods available to enter them into a GIS.
**Data capture**

The two main types of data capture are primary and secondary:

- ***Primary data sources*** are those collected in digital format specifically for use in a GIS project.
- ***Secondary sources*** are digital and analog datasets that were originally captured for another purpose and need to be converted into a suitable digital format for use in a GIS project.

**Primary geographic data capture**

- GIS data captured from primary sources can be raster, vector, or attribute data.

1. **Raster data capture**

- Remote sensing is a technique used to derive information about the physical, chemical, and biological properties of objects without direct physical contact.
- Information is derived from measurements of the amount of electromagnetic radiation reflected, emitted, or scattered from objects.
- ***Resolution*** is a key physical characteristic of remote sensing systems.
- ***Spatial resolution*** refers to the size of object that can be resolved; the most usual measure is the pixel size.
- ***Spectral resolution*** refers to the parts of the electromagnetic spectrum that are measured.
- ***Temporal resolution***, or repeat cycle, describes the frequency with which images are collected for the same area.

2. **Vector data capture**

**Surveying**

- Ground surveying is based on the principle that the 3-D location of any point can be determined by measuring angles and distances from other known points.
- Traditional equipment such as transits and theodolites has been replaced by total stations that can measure both angles and distances to an accuracy of 1 mm.
- Ground survey is a very time-consuming and expensive activity, but it is still the best way to obtain highly accurate point locations.
- It is typically used for capturing buildings, land and property boundaries, manholes, and other objects that need to be located accurately. It is also employed to obtain reference marks for use in other data capture projects.
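The ground-surveying principle above, locating a new point from an angle and a distance measured at a known point, can be sketched in two dimensions as follows. This is a simplified illustration under stated assumptions: the function name `locate_point` is hypothetical, the angle is taken as an azimuth measured clockwise from grid north, the distance is horizontal, and height is ignored.

```python
import math
from typing import Tuple


def locate_point(known_easting: float, known_northing: float,
                 azimuth_deg: float, distance_m: float) -> Tuple[float, float]:
    """Return (easting, northing) of a point observed from a known station.

    azimuth_deg is measured clockwise from grid north (assumption).
    distance_m is the horizontal distance to the new point in meters.
    """
    azimuth = math.radians(azimuth_deg)
    # East offset uses the sine of the azimuth, north offset the cosine,
    # because the angle is measured from north rather than from east.
    easting = known_easting + distance_m * math.sin(azimuth)
    northing = known_northing + distance_m * math.cos(azimuth)
    return easting, northing


if __name__ == "__main__":
    # A point 50 m due east of a station at (1000, 2000).
    print(locate_point(1000.0, 2000.0, 90.0, 50.0))
```

Repeating this step from each newly fixed point is how a chain of surveyed positions is built up from a few known reference marks.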
**LiDAR (Light Detection and Ranging)**

- A relatively new technology that employs a scanning laser rangefinder to produce accurate topographic surveys.
- It is typically carried on a low-altitude aircraft that also has an inertial navigation system and a differential GPS to provide location.

**Secondary geographic data capture**

**Vector data capture**

- Secondary vector data capture involves digitizing vector objects from maps and other geographic data sources.

**Heads-up digitizing and vectorization**

- Vectorization is the process of converting raster data into vector data.
- The simplest way to create vectors from raster layers is to digitize vector objects manually, straight off a computer screen, using a mouse or digitizing cursor.

**Raster data capture using scanners**

The main reasons to scan hardcopy media include:

- Documents are scanned to reduce wear and tear, improve access, provide integrated database storage, and index them geographically.
- Maps, aerial photographs, and images are scanned prior to vectorization.

3. **Capturing attribute data**

- Attributes can be entered by direct data loggers, manual keyboard entry, optical character recognition (OCR) or, increasingly, voice recognition.
- An essential requirement for separate data entry is a common identifier (also called a key) that can be used to relate object geometry and attributes together following data capture.

**Data Entry Techniques in GIS**

- Digitizing paper maps
- Scanning
- Keyboard entry
- Spatial data from elsewhere

A. **Digitizing Paper Map**

A cost-effective method of data capture is the digitizing of existing maps. This requires the conversion of an analogue map into a digital map. A number of digitizing techniques exist:

- On-tablet digitizing (manual)
- On-screen digitizing

![](media/image24.png)

B. **Scanning**

A digital scanner illuminates the document and uses a sensor to measure the intensity of the reflected or transmitted light.

C. **Keyboard entry**

Keyboard entry is entering GIS data through key coding.
Keyboard entry is the best method for entering attribute data.

D. **Spatial Data elsewhere**

- **Number of satellites visible**

**Database management systems**

**Database management Concepts**

- A database is a *large, computerized collection of structured data*.
- Databases have been in use since the 1960s for various purposes, such as *bank account administration, stock monitoring, salary administration, order bookkeeping, and flight reservation systems, to name just a few.*
- The common denominator between these applications is that the amount of data is usually *quite large, but the data itself has a simple and regular structure*.
- Designing a database is **not an easy task**.
  - **Firstly**, one has to consider carefully *what the purpose of the database is*, and who its users will be.
  - **Secondly**, one needs to identify the **available data sources** and **define the format** in which the data will be organized within the database. This format is usually called the database structure.
  - **Lastly**, data can be entered into the database.
- It is important to keep the data **up to date**, and it is therefore wise to set up the processes for this and **make someone responsible** for regular maintenance of the database.
- *A database management system (DBMS) is a software package that allows the user to set up, use, and maintain a database.*

**Reasons for using a DBMS**

There are various reasons why one would want to use a DBMS for data storage and processing.

I. A DBMS supports the **storage** and **manipulation of very large data sets**.

II. A DBMS can be instructed to **guard** over data **correctness**.

- An important aspect of data correctness is data entry **checking**: ensuring that the data entered into the database does not contain obvious errors.
- For instance, since we know the study area we are working in, we also know the range of possible geographic coordinates, so we can ensure the DBMS checks them.

III.
A DBMS supports the **simultaneous use** of the same data set by many users.

- Many users can work with the database at the same time without affecting each other's activities.

IV. A DBMS provides a high-level, declarative **query language**.

- A query is a computer program that extracts from the database the data that meet the conditions indicated in the query.

V. A DBMS supports the use of a **data model**.

- A data model is a language with which one can define a database structure and manipulate the data stored in it.
- The most prominent data model is the relational data model.

VI. A DBMS includes **data backup** and **recovery** functions to ensure data availability at all times.

- Regular backups of the data set and automatic recovery schemes provide insurance against loss of data.

VII. A DBMS allows the control of **data redundancy**.

- A well-designed database takes care of storing single facts only once.
- Storing a fact multiple times, a phenomenon known as data redundancy, is what such a design avoids.

**The relational data model**

- The relational DBMS is the **most widely** accepted approach for managing the attributes of geographic data.
- The relational DBMS is attractive because of its:
  - *Simplicity in organization and data modeling.*
  - *Flexibility: data can be manipulated in an ad hoc manner by joining tables.*
  - *Efficiency of storage: by the proper design of data tables, redundant data can be minimized.*
  - *Non-procedural nature: queries on a relational database do not need to take into account the internal organization of the data.*
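Several of the ideas in this section can be seen together in a small example: data-entry checking on coordinate ranges, a common key relating location to attributes, joining tables, and a declarative query. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a relational DBMS. The table names, column names, sample rows, and the coordinate bounds of the study area are all hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one location and one attribute table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Location table: CHECK constraints reject coordinates that fall
        -- outside the (assumed) study area.
        CREATE TABLE parcel_location (
            parcel_id INTEGER PRIMARY KEY,
            lon REAL NOT NULL CHECK (lon BETWEEN 33.0 AND 48.0),
            lat REAL NOT NULL CHECK (lat BETWEEN 3.0 AND 15.0)
        );
        -- Attribute table: parcel_id is the common key shared with the
        -- location table, so each fact is stored only once.
        CREATE TABLE parcel_attribute (
            parcel_id INTEGER PRIMARY KEY,
            land_use TEXT NOT NULL,
            owner TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO parcel_location VALUES (?, ?, ?)",
                     [(1, 37.55, 6.03), (2, 38.74, 9.03)])
    conn.executemany("INSERT INTO parcel_attribute VALUES (?, ?, ?)",
                     [(1, "residential", "Owner A"),
                      (2, "commercial", "Owner B")])
    conn.commit()
    return conn


def parcels_by_land_use(conn: sqlite3.Connection, land_use: str) -> list:
    """Declarative query: state what is wanted, not how to find it."""
    return conn.execute(
        """SELECT l.parcel_id, l.lon, l.lat, a.owner
           FROM parcel_location AS l
           JOIN parcel_attribute AS a ON a.parcel_id = l.parcel_id
           WHERE a.land_use = ?
           ORDER BY l.parcel_id""",
        (land_use,),
    ).fetchall()


def insert_is_rejected(conn: sqlite3.Connection, parcel_id: int,
                       lon: float, lat: float) -> bool:
    """Return True if the DBMS refuses the row (e.g. a failed CHECK)."""
    try:
        conn.execute("INSERT INTO parcel_location VALUES (?, ?, ?)",
                     (parcel_id, lon, lat))
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    db = build_demo_db()
    print(parcels_by_land_use(db, "residential"))
    print(insert_is_rejected(db, 3, 120.0, 6.0))  # longitude outside study area
```

The query names only the tables, the key, and the condition. How the rows are physically stored and searched is left to the DBMS, which is the non-procedural property described above.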