Introduction to GIS Programming and Fundamentals with Python and ArcGIS® PDF
2017
Chaowei Yang
Summary
This textbook introduces GIS programming and fundamentals using Python and ArcGIS. It covers topics such as object-oriented programming, Python language features, and vector data visualization. The book includes practical exercises and problems.
Introduction to GIS Programming and Fundamentals with Python and ArcGIS®

Chaowei Yang

With the collaboration of Manzhu Yu, Qunying Huang, Zhenlong Li, Min Sun, Kai Liu, Yongyao Jiang, and Jizhe Xia

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2017 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-4665-1008-1 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400.
CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

For Chaowei Yang's parents, Chaoqing Yang and Mingju Tang, for continually instilling curiosity and an exploring spirit

Contents

Preface
Acknowledgments
Editor
Contributors

Section I Overview

1. Introduction
  1.1 Computer Hardware and Software
  1.2 GIS and Programming
  1.3 Python
  1.4 Class and Object
  1.5 GIS Data Models
  1.6 UML
  1.7 Hands-On Experience with Python
  1.8 Chapter Summary
  Problems

2. Object-Oriented Programming
  2.1 Programming Language and Python
  2.2 Class and Object
    2.2.1 Defining Classes
    2.2.2 Object Generation
    2.2.3 Attributes
    2.2.4 Inheritance
    2.2.5 Composition
  2.3 Point, Polyline, and Polygon
  2.4 Hands-On Experience with Python
  2.5 Chapter Summary
  Problems

Section II Python Programming

3. Introduction to Python
  3.1 Object-Oriented Support
  3.2 Syntax
    3.2.1 Case Sensitivity
    3.2.2 Special Characters
    3.2.3 Indentation
    3.2.4 Keywords
    3.2.5 Multiple Assignments
    3.2.6 Namespace
    3.2.7 Scope
  3.3 Data Types
    3.3.1 Basic Data Types
    3.3.2 Composite Data Types
  3.4 Miscellaneous
    3.4.1 Variables
    3.4.2 Code Style
  3.5 Operators
  3.6 Statements
  3.7 Functions
  3.8 Hands-On Experience with Python
  3.9 Chapter Summary
  Problems

4. Python Language Control Structure, File Input/Output, and Exception Handling
  4.1 Making Decisions
  4.2 Loops
  4.3 Other Control Structures
  4.4 File Input/Output
  4.5 Exceptions
  4.6 Hands-On Experience with Python
    4.6.1 Find the Longest Distance between Any Two Points
    4.6.2 Hands-On Experience: I/O, Create and Read a File
    4.6.3 Hands-On Experience: I/O, Flow Control, and File
    4.6.4 Hands-On Experience: Input GIS Point Data from Text File
  4.7 Chapter Summary
  Problems

5. Programming Thinking and Vector Data Visualization
  5.1 Problem: Visualizing GIS Data
  5.2 Transforming Coordinate System
    5.2.1 How to Determine Ratio Value?
  5.3 Visualizing Vector Data
  5.4 Point, Polyline, Polygon
  5.5 Programming Thinking
    5.5.1 Problem Analysis
    5.5.2 Think in Programming
    5.5.3 Match Programming Language Patterns and Structure
    5.5.4 Implement Program
  5.6 Hands-On Experience with Python
    5.6.1 Reading, Parsing, and Analyzing Text File Data
    5.6.2 Create GIS Objects and Check Intersection
  5.7 Chapter Summary
  Problems

6. Shapefile Handling
  6.1 Binary Data Manipulation
  6.2 Shapefile Introduction
  6.3 Shapefile Structure and Interpretation
    6.3.1 Main File Structure of a Shapefile
      6.3.1.1 Main File Header
      6.3.1.2 Feature Record
    6.3.2 Index File Structure (.shx)
    6.3.3 The .dbf File
  6.4 General Programming Sequence for Handling Shapefiles
  6.5 Hands-On Experience with Mini-GIS
    6.5.1 Visualize Polylines and Polygons
    6.5.2 Interpret Polyline Shapefiles
  6.6 Chapter Summary
  Problems

7. Python Programming Environment
  7.1 General Python IDE
    7.1.1 Python Programming Windows
      7.1.1.1 Command-Line GUI
      7.1.1.2 Interactive GUI
      7.1.1.3 File-Based Programming
    7.1.2 Python IDE Settings
      7.1.2.1 Highlighting
      7.1.2.2 General Setting of the Programming Window
      7.1.2.3 Fonts Setup for the Coding
    7.1.3 Debugging
      7.1.3.1 SyntaxError
      7.1.3.2 Run-Time Exceptions
      7.1.3.3 Handling Exceptions
      7.1.3.4 Add Exception Handles and Clean-Up Actions to File Read/Write
  7.2 Python Modules
    7.2.1 Module Introduction
    7.2.2 Set Up Modules
    7.2.3 System Built-In Modules
  7.3 Package Management and Mini-GIS
    7.3.1 Regular GIS Data Organization
    7.3.2 Mini-GIS Package
  7.4 Hands-On Experience with Mini-GIS
    7.4.1 Package Management and Mini-GIS
    7.4.2 Run and Practice the Mini-GIS Package
  7.5 Chapter Summary
  Problems

8. Vector Data Algorithms
  8.1 Centroid
    8.1.1 Centroid of a Triangle
    8.1.2 Centroid of a Rectangle
    8.1.3 Centroid of a Polygon
  8.2 Area
    8.2.1 Area of a Simple Polygon
    8.2.2 Area of a Polygon with Hole(s)
  8.3 Length
    8.3.1 Length of a Straight Line Segment
    8.3.2 Length of a Polyline
  8.4 Line Intersection
    8.4.1 Parallel Lines
    8.4.2 Vertical Lines
  8.5 Point in Polygon
    8.5.1 A Special Scenario
  8.6 Hands-On Experience with Python
    8.6.1 Using Python to Draw a Polygon and Calculate the Centroid
    8.6.2 Using Python to Draw Polygon and Calculate the Area of Polygon
    8.6.3 Using Python to Draw Line Segments and Calculate the Intersection
  8.7 Chapter Summary
  Problems

Section III Advanced GIS Algorithms and Their Programming in ArcGIS

9. ArcGIS Programming
  9.1 ArcGIS Programming
  9.2 Introduction to ArcPy Package
    9.2.1 ArcPy Functions, Classes, and Modules
    9.2.2 Programming with ArcPy in ArcMap
    9.2.3 Programming with ArcPy in Python Window outside ArcMap
    9.2.4 Using Help Documents
  9.3 Automating ArcTools with Python
  9.4 Accessing and Editing Data with Cursors
    9.4.1 SearchCursor
    9.4.2 UpdateCursor
    9.4.3 InsertCursor
    9.4.4 NumPy
  9.5 Describing and Listing Objects
    9.5.1 Describe
    9.5.2 List
  9.6 Manipulating Complex Objects
  9.7 Automating Map Production
  9.8 Creating ArcTools from Scripts
  9.9 Handling Errors and Messages
  9.10 External Document and Video Resources
  9.11 Implementing Spatial Relationship Calculations Using ArcGIS
  9.12 Summary
  9.13 Assignment

10. Raster Data Algorithm
  10.1 Raster Data
  10.2 Raster Storage and Compression
    10.2.1 Run Length Coding
    10.2.2 Quad Tree
  10.3 Raster Data Formats
    10.3.1 TIFF
    10.3.2 GeoTIFF
    10.3.3 IMG
    10.3.4 NetCDF
    10.3.5 BMP
    10.3.6 SVG
    10.3.7 JPEG
    10.3.8 GIF
    10.3.9 PNG
  10.4 Color Representation and Raster Rendering
    10.4.1 Color Representation
    10.4.2 Raster Rendering
  10.5 Raster Analysis
  10.6 Hands-On Experience with ArcGIS
    10.6.1 Hands-On Practice 10.1: Raster Color Renders
    10.6.2 Hands-On Practice 10.2: Raster Data Analysis: Find the Area with the Elevation Range between 60 and 100 and the Land Cover Type as “Forest”
    10.6.3 Hands-On Practice 10.3: Access the Attribute Information of Raster Dataset and Calculate the Area
  10.7 Chapter Summary
  Problems

11. Network Data Algorithms
  11.1 Network Representation
    11.1.1 Basics Network Representation
    11.1.2 Directed and Undirected Networks
    11.1.3 The Adjacency Matrix
    11.1.4 Network Representation in GIS
  11.2 Finding the Shortest Path
    11.2.1 Problem Statement
    11.2.2 A Brute Force Approach for the Shortest Path Algorithm
    11.2.3 Dijkstra Algorithm
  11.3 Types of Network Analysis
    11.3.1 Routing
    11.3.2 Closest Facility
    11.3.3 Service Areas
    11.3.4 OD Cost Matrix
    11.3.5 Vehicle Routing Problem
    11.3.6 Location-Allocation
  11.4 Hands-On Experience with ArcGIS
  11.5 Chapter Summary
  Problems

12. Surface Data Algorithms
  12.1 3D Surface and Data Model
    12.1.1 Surface Data
    12.1.2 Surface Data Model
      12.1.2.1 Discrete Data
      12.1.2.2 Continuous Data
  12.2 Create Surface Model Data
    12.2.1 Create Grid Surface Model
    12.2.2 Creating TIN Surface Model
    12.2.3 Conversion between TIN and Raster Surface Models
  12.3 Surface Data Analysis
    12.3.1 Elevation
    12.3.2 Slope
    12.3.3 Aspect
    12.3.4 Hydrologic Analysis
  12.4 Hands-On Experience with ArcGIS
    12.4.1 Hands-On Practice 12.1: Conversion among DEM, TIN, and Contours
    12.4.2 Hands-On Practice 12.2: Generate Slope and Aspect
    12.4.3 Hands-On Practice 12.3: Flow Direction
  12.5 Chapter Summary
  Problems

Section IV Advanced Topics

13. Performance-Improving Techniques
  13.1 Problems
  13.2 Disk Access and Memory Management
    13.2.1 File Management
    13.2.2 Comprehensive Consideration
  13.3 Parallel Processing and Multithreading
    13.3.1 Sequential and Concurrent Execution
    13.3.2 Multithreading
    13.3.3 Load Multiple Shapefiles Concurrently Using Multithreading
    13.3.4 Parallel Processing and Cluster, Grid, and Cloud Computing
  13.4 Relationship Calculation and Spatial Index
    13.4.1 Bounding Box in GIS
    13.4.2 Spatial Index
  13.5 Hands-On Experience with Mini-GIS
    13.5.1 Data Loading with RAM as File Buffer
    13.5.2 Data Loading with Multithreading
    13.5.3 Bounding Box Checking to Speed Up Intersection
    13.5.4 Line Intersection Using R-Tree Index
  13.6 Chapter Summary
  Problems

14. Advanced Topics
  14.1 Spatial Data Structure
    14.1.1 Raster Data Structure in NetCDF/HDF
    14.1.2 Application of NetCDF/HDF on Climate Study
  14.2 GIS Algorithms and Modeling
    14.2.1 Data
    14.2.2 Density Analysis
    14.2.3 Regression Analysis (OLS and GWR)
  14.3 Distributed GIS
    14.3.1 System Architecture
    14.3.2 User Interface
  14.4 Spatiotemporal Thinking and Computing
    14.4.1 Problem: Dust Simulation and Computing Challenges
    14.4.2 Methodology 1: Utilizing High-Performance Computing to Support Dust Simulation
    14.4.3 Methodology 2: Utilizing Spatiotemporal Thinking to Optimize High-Performance Computing
      14.4.3.1 Dust Storms’ Clustered Characteristics: Scheduling Methods
      14.4.3.2 Dust Storms’ Space–Time Continuity: Decomposition Method
      14.4.3.3 Dust Storm Events Are Isolated: Nested Model
    14.4.4 Methodology 3: Utilizing Cloud Computing to Support Dust Storm Forecasting
  14.5 Chapter Summary
  Problems

References
Index

Preface

Why Another GIS Programming Text?

Geographic information systems (GIS) have become popular tools underpinning many aspects of daily life, from routing for transportation to finding a restaurant to responding to emergencies. Convenient GIS tools are developed with different levels of programming, from scripting with Python for ArcGIS to crafting new suites of tools from scratch. How much programming a project needs depends largely on the GIS software, the type of application, and the knowledge and background of the application designer and developer. For example, simple scripting can integrate online mapping applications using Google Maps. Customized spatial analysis applications are routinely built using ArcGIS with minimal programming. Many developers build applications that leverage open-source software for managing big data, modeling complex phenomena, or responding to concurrent users of popular online systems. The best design and development of such applications require designers and developers to have a thorough understanding of GIS principles as well as the skill to choose between commercial and open-source software options.
For most GIS professionals, this is a challenge because most are either GIS tool end users or information technology (IT) professionals with a limited understanding of GIS.

To fill this gap, over the last decade, Chaowei Yang launched an introductory GIS programming course that was well received. Enrollment continues to rise, and students report positive feedback once they are in the workplace and use the knowledge developed in the class. To benefit a broader spectrum of students and professionals looking for training materials to build GIS programming capabilities, this book was written to integrate and refine the authors’ knowledge accumulated through courses and associated research projects.

The audience for this book includes both IT professionals who want to learn GIS principles and GIS users who want to develop programming skills. On the one hand, this book provides a bridge for GIS students and professionals to learn and practice programming. On the other hand, it also helps IT professionals with programming experience to acquire the fundamentals of GIS to better hone their programming skills for GIS development. Rather than try to compete with the current GIS programming literature, the authors endeavor to interpret GIS from a different angle by integrating GIS algorithms and programming. As a result, this book provides practical knowledge that includes fundamental GIS principles, basic programming skills, open-source GIS development, ArcGIS development, and advanced topics. Structured for developing GIS functions, applications, and systems, this book is expected to help GIS/IT students and professionals become more competitive in the GIS and IT job market with the needed programming skills.

What Is Included in the Text?

This book has four sections. Section I (Chapters 1 and 2) is an overview of GIS programming and introduces computers and programming from a practical perspective.
Python programming (Python is the integral programming language for ArcGIS) is presented extensively in Section II (Chapters 3 through 8) in the context of designing and developing a Mini-GIS, using hands-on experience that follows explanations of fundamental GIS concepts. Section III (Chapters 9 through 12) focuses on advanced GIS algorithms and on how to invoke them when programming in ArcGIS. Advanced topics and performance optimization are introduced in Section IV (Chapters 13 and 14) using the Mini-GIS developed.

Chapter 1 introduces computers, computer programming, and GIS. In addition, the Unified Modeling Language (UML) is discussed for capturing GIS models implemented through simple Python programming. Chapter 2 introduces object-oriented programming and its characteristics, with examples of the basic GIS vector data types Point, Polyline, and Polygon.

Chapter 3 introduces Python syntax, operators, statements, miscellaneous features of functions, and Python support for object-oriented programming. Using GIS examples, Chapter 4 introduces Python language control structures, file input/output, and exception handling. Chapter 5 presents programming thinking, using the visualization of vector data as an example of the workflow of this critical process in programming. Chapter 6 introduces the Python integrated development environment (IDE), modules, package management, and the Mini-GIS package. Chapter 7 discusses shapefile formats and the steps for handling shapefiles within the Mini-GIS. Chapter 8 introduces vector data processing algorithms, including line intersection, centroid, area, length, and point in polygon, and shows how Mini-GIS and ArcGIS support these algorithms.

Chapter 9 bridges Sections II and III by introducing ArcGIS programming in Python using ArcPy, the ArcGIS programming environment, automating tools, accessing data, describing objects, and fixing errors.
Chapter 10 introduces raster data algorithms, including raster data format, storage, and compression, with hands-on experience using ArcGIS. Chapter 11 addresses network data algorithms for representing networks and calculating the shortest path, both in principle and using ArcGIS. Chapter 12 explores surface or 3D data, covering 3D data representation, data format conversion, and 3D analyses for elevation, slope, aspect, and flow direction, with examples in ArcGIS programming.

Chapter 13 introduces performance-improving techniques, including storage access and management, parallel processing and multithreading, spatial indexing, and other techniques for accelerating GIS, as demonstrated in Mini-GIS. Advanced topics, including GIS algorithms and modeling, spatial data structure, distributed GIS, and spatiotemporal thinking and computing, are presented in Chapter 14.

Hands-On Experience

As a practical text for developing programming skills, this book makes every effort to ensure the content is as functional as possible. For every GIS fundamental principle, algorithm, and element introduced, an example is explored as a hands-on experience using Mini-GIS and/or ArcGIS with Python. This learning workflow helps build a thorough understanding of the fundamentals and maps naturally onto programming skills.

For system and open-source development, a step-by-step development of a Python-based Mini-GIS is presented. For application development, ArcGIS is adopted for illustration. The Mini-GIS is open-source software developed for this text and can be adopted for building other GIS applications. ArcGIS, a commercial product from ESRI, is used to experience state-of-the-art commercial software. For learning purposes, ArcGIS is available for free from ESRI.

Online Materials

This book comes with the following online materials:
• Instructional slides for instructors using this text for classroom education and for professionals learning GIS programming.
• Python code for class exercises and hands-on experiences, structured and labeled by chapter to follow each chapter’s sequence.
• Mini-GIS as an open-source package for learning the GIS fundamentals and for exemplifying GIS principles and algorithms.
• Answers to problems for instructors to check their solutions.

The Audience for and How to Use This Text

This text serves two functions: a text for systematically building GIS programming skills, and a reference for identifying a Python solution for specific GIS algorithms or functions, from scratch and/or with ArcGIS. The text is intended to assist four categories of readers:
• Professors teaching GIS programming or GIS students learning with a specific focus on hands-on experience in classroom settings.
• Programmers wanting to learn GIS programming by scanning through Section I and Chapters 3 and 4, followed by a step-by-step study of the remaining chapters.
• GIS system designers most interested in algorithm descriptions and algorithm implementations, both from scratch and in ArcGIS, to assemble practical knowledge about GIS programming that aids GIS choices for future development.
• IT professionals curious about GIS who read for GIS principles but skip the programming exercises.

The authors address such a broad audience out of a desire to cultivate a competitive professional workforce in GIS development, enhance the literature of GIS, and serve as a practical introduction to GIS research.

How Did We Develop This Text?

The text material was first developed by Professor Chaowei Yang in 2004 and offered annually in a classroom setting during the past decade. During that time span, many students developed and advanced their programming skills. Some became professors and lecturers in colleges and were invited to write specific book chapters. Keeping the audience in mind, several professors who teach GIS programming in different cultural backgrounds and university settings were invited to review the book chapters.
The following is the book development workflow:
• Using his course materials, Professor Yang structured this book with Irma Shagla’s help, and the text’s structure was contracted to be published as a book.
• Assistant Professor Qunying Huang, University of Wisconsin, Madison, explored using the earlier versions of the text’s materials.
• Assistant Professors Huang and Zhenlong Li, University of South Carolina, developed Section II of the text in collaboration with Professor Yang.
• Dr. Min Sun, Ms. Manzhu Yu, Mr. Yongyao Jiang, and Mr. Jizhe Xia developed Section III in collaboration with Professor Yang.
• Professor Yang edited and revised all chapters to assure a common structure and composition.
• Ms. Manzhu Yu and Professor Yang edited the course slides.
• Assistant Professor Li, Mr. Kai Liu, Mrs. Joseph George, and Ms. Zifu Wang edited Mini-GIS as the software for the text.
• After the above text and course materials were completed, four professors and two developers were invited to review the text’s content.
• The assembled materials for the text were finally reviewed by several professionals, including Ms. Alena Deveau, Mr. Rob Culbertson, and Professor George Taylor.
• The text was formatted by Ms. Minni Song.
• Ms. Manzhu Yu and Professor Yang completed a final review of the chapters, slides, codes, data, and all relevant materials.

Acknowledgments

This text is a long-term project evolving from the course “Introduction to GIS Programming” developed and refined over the past decade at George Mason University. Many students and professors provided constructive suggestions about what to include, how best to communicate with and challenge the students, and who should be considered the audience of the text.

The outcome reflects Professor Yang’s programming career since his undergraduate thesis at China’s Northeastern University under the mentoring of Professor Jinxing Wang. Professor Yang was further mentored in programming in the GIS domain by Professors Qi Li and Jicheng Chen.
His academic mentors in the United States, Professors David Wong and Menas Kafatos, provided support over many decades, giving him the chance to teach the course that eventually led to this text.

Professor Yang thanks the brilliant and enthusiastic students in his classes at George Mason University. Their questions and critiques honed his teaching skills, improved the content, and prompted this effort of developing a text.

Professor Yang thanks his beloved wife, Yan Xiang, and children, Andrew, Christopher, and Hannah, for accommodating him when he took valuable family time to complete the text.

Ms. Manzhu Yu extends her gratitude to the many colleagues who provided support, and read, wrote, commented, and assisted in the editing, proofreading, and formatting of the text.

Assistant Professor Huang thanks her wonderful husband, Yunfeng Jiang, and lovely daughter, Alica Jiang.

Dr. Min Sun thanks her PhD supervisor, Professor David Wong, for educating her. She also thanks David Wynne, her supervisor at ESRI where she worked as an intern, and her other coworkers who collectively helped her gain a more complete understanding of programming with ESRI products. Last but not least, she thanks her parents and lovely dog who accompanied her when she was writing the text.

Yongyao Jiang thanks his wife Rui Dong, his daughter Laura, and his parents Lixia Yao and Yanqing Jiang.

Editor

Chaowei Yang is a professor of geographic information science at George Mason University (GMU). His research interest is in utilizing spatiotemporal principles to optimize computing infrastructure to support science discoveries. He founded the Center for Intelligent Spatial Computing and the NSF Spatiotemporal Innovation Center. He served as PI or Co-I for projects totaling more than $40 M and funded by more than 15 agencies, organizations, and companies. He has published 150+ articles and developed a number of GIS courses and training programs.
He has advised 20+ postdoctoral and PhD students who serve as professors and scientists in highly acclaimed U.S. and Chinese institutions. He received many national and international awards, such as the U.S. Presidential Environment Protection Stewardship Award in 2009. All his achievements are based on his practical knowledge of GIS and geospatial information systems. This book is a collection of such practical knowledge on how to develop GIS tools from a programming perspective. The content was offered in his programming and GIS algorithm classes during the past 10+ years (2004–2016) and has been adopted by his students and colleagues serving as professors at many universities in the United States and internationally.

Contributors

Fei Hu is a PhD candidate at the NSF Spatiotemporal Innovation Center, George Mason University. He is interested in utilizing high-performance cloud computing technologies to manage and mine big spatiotemporal data. More specifically, he has optimized the distributed storage system (e.g., HDFS) and parallel computing framework (e.g., Spark, MapReduce) to efficiently manage, query, and analyze big multiple-dimensional array-based datasets (e.g., climate data and remote sensing data). He aims to provide scientists with on-demand data analytical capabilities to relieve them from time-consuming computational tasks.

Qunying Huang is an assistant professor in the Department of Geography at the University of Wisconsin, Madison. Her fields of expertise include geographic information science (GIScience), cyber infrastructure, spatiotemporal big data mining, and large-scale environmental modeling and simulation. She is very interested in applying different computing models, such as cluster, grid, GPU, citizen computing, and especially cloud computing, to address contemporary big data and computing challenges in GIScience.
Most recently, she is leveraging and mining social media data for various applications, such as emergency response, disaster mitigation, and human mobility. She has published more than 50 scientific articles and edited two books.

Yongyao Jiang is a PhD candidate in Earth systems and geoinformation sciences at the NSF Spatiotemporal Innovation Center, George Mason University. He earned an MS (2014) in GIScience at Clark University and a BE (2012) in remote sensing at Wuhan University. His research focuses on data discovery, data mining, semantics, and cloud computing. Jiang has received the NSF EarthCube Visiting Graduate Student Early-Career Scientist Award (2016), the Microsoft Azure for Research Award (2015), and first prize in the Robert Raskin CyberGIS Student Competition (2015). He serves as the technical lead for MUDROD, a semantic discovery and search engine project funded by NASA’s AIST Program.

Zhenlong Li is an assistant professor in the Department of Geography at the University of South Carolina. Dr. Li’s research focuses on spatial high-performance computing, big data processing/mining, and geospatial cyberinfrastructure in the area of data- and computation-intensive GIScience. Dr. Li’s research aims to optimize spatial computing infrastructure by integrating cutting-edge computing technologies and spatial principles to support domain applications such as climate change and hazard management.

Kai Liu is a graduate student in the Department of Geography and GeoInformation Sciences (GGS) in the College of Science at George Mason University. Previously, he was a visiting scholar at the Center of Intelligent Spatial Computing for Water/Energy Science (CISC) and worked for 4 years at Heilongjiang Bureau of Surveying and Mapping in China. He earned a BA in geographic information science at Wuhan University, China.
His research focuses on geospatial semantics, geospatial metadata management, spatiotemporal cloud computing, and citizen science.

Min Sun is a research assistant professor in the Department of Geography and Geoinformation Science at George Mason University. Her research interests include measuring attribute uncertainty in spatial data, developing visual analytics to support data exploration, WebGIS, and cloud computing. She is an expert in ArcGIS programming and also serves as the assistant director for the U.S. NSF Spatiotemporal Innovation Center.

Jizhe Xia is a research assistant professor at George Mason University. He earned a PhD in Earth systems and geoinformation sciences at George Mason University in the spring of 2015. Dr. Xia’s research interests are spatiotemporal computing, cloud computing, and their applications in geographical sciences. He proposed a variety of methods to utilize spatiotemporal patterns to optimize big data access, quality of service (QoS) evaluation, and cloud computing applications.

Manzhu Yu is a PhD candidate in the Department of Geography and Geoinformation Science, George Mason University. Her research interests include spatiotemporal methodology, pattern detection, and spatiotemporal applications on natural disasters. She received a Presidential Scholarship from 2012 to 2015. She has published approximately 10 articles in renowned journals, such as PLoS ONE and IJGIS, and contributed as a major author in several book chapters.

Section I Overview

1 Introduction

This chapter introduces the basic concepts of computers, hardware, software, and programming, and sets up the context for GIS programming.

1.1 Computer Hardware and Software

A computer is a device that has the capability to conduct different types of automated tasks based on specific instructions predefined by or through interactions with end users. For example, clicking on the ArcGIS icon will execute ArcGIS software.
We can select a destination and starting point to trigger a routing analysis that identifies a driving route using Google Maps. Computers are among the fastest-evolving technologies, as reflected by processing capabilities that range from small calculators to supercomputers. Device size has shrunk from computers occupying a building to mobile devices that fit in a pocket (Figure 1.1). User interaction ranges from punched cards (early computers) to human–computer interaction, such as speaking to invoke an action or task.

FIGURE 1.1 (a) NASA supercomputer. (From NASA supercomputer at http://www.nas.nasa.gov/hecc/resources/pleiades.html.) (b) Other computers: personal computer (PC), laptop, pad. (From different computers at http://www.computerdoc.com.au/what-are-the-different-types-of-computers.)

There are two important components of a computer (Hwang and Faye 1984): (1) the physical device that can conduct automated processing, and (2) instruction packages that can be configured to provide specific functionality, such as word processing or geographic information processing. The first component of a computer, the hardware, is touchable as physical machines. The second component, the software, may be purchased with the hardware in the form of an operating system, or installed by downloading online. Computer hardware can be configured or programmed to perform different tasks; thus, a computer may also be called a general-purpose device. The software varies greatly, whether it provides document-processing capability, financial management, tax return processing, or scientific simulations such as climate change or the spread of disease. Depending on its type, software is either freely available (freeware) or proprietary (requiring purchase and licensing). Depending on the usage, software can be categorized as system software, application software, or embedded software (Figure 1.2). System software refers to the basic software that must be installed for a computer to operate. Windows and Linux are examples of operating system (OS) software, an essential component of a computer. Application software supports specific groups of tasks, such as Microsoft Word for document processing and Microsoft Outlook for emails. Embedded software is a type of firmware that is burned onto hardware and becomes part of that hardware. Embedded software exists longer on a computer than any other software. The firmware always comes with the hardware when you purchase a computer, and it does not have to be changed as frequently as application software such as a web browser or TurboTax, which is updated routinely.

FIGURE 1.2 Different types of software. [Layers, top to bottom: application software (Word, web browser, ArcGIS); system software (Windows, Linux, ...); embedded software; hardware.]

Geographic information system (GIS) is one type of application software that deals primarily with geographic information (Longley et al. 2001). The global positioning system (GPS, Misra and Enge 2006) is used for locating geographic places, and can be installed in both cars and smart phones for routing. GIS software includes two categories: professional GIS and lightweight GIS. Professional GIS software, such as ArcGIS, provides the most complete set of GIS functionalities for professionals in the GIS domain. Less intensive, but popular, GIS software for viewing the geographic environment includes online mapping applications such as Google Maps and Google Earth.

1.2 GIS and Programming

GIS originates from several domains and refers to the system designed to capture, observe, collect, store, and manage geographic data, and to provide tools for spatial analyses and visualization (Longley et al. 2001). GIS can help obtain geographic data to be used for decision making, such as choosing routes for emergency response.
GIS is known to have started from the Canadian natural resource inventory computer program led by Roger Tomlinson in the 1960s. GIS is becoming increasingly popular on mobile devices as a means of analyzing information and patterns related to traffic and weather.

The term “GIS” can also refer to the field of geographic information science, or GIScience, a term coined by Mike Goodchild for the study of the scientific principles and technologies applied in GIS (Goodchild 1992). According to GIS scientists, GIScience involves remote sensing, global navigation satellite systems, and GIS. Additionally, in various domains, GeoInformatics may be applied to remote sensing, global navigation satellite system, and GIS information. These topics, however, will not be explored in this book.

GIS is the system comprising hardware (computer, mobile devices, GPS), software (ArcGIS or online mapping), and data (geographic information) that can be utilized to accomplish a set of functionalities for a group of users. All three components must be utilized for GIS to work effectively. A significant difference between GIS and other software applications is its ability to manage and manipulate the large volume and complexity of geographic data, which comprises embedded spatiotemporal and attribute information. The complex character of GIS data demands a specific suite of software to extract information for decision making. Mature software packages are publicly available, including the most up-to-date set of ArcGIS software and the latest edition of Google Maps web mapping software.

The process of developing software is called programming. Programming instructs the computer to accomplish a task through a sequence of instructions. There are many different levels of programming (Mitchell 1996). The lowest level is based on the specific hardware instructions supported by the central processing unit (CPU), and is used by smart-instrument developers.
Because CPU instructions are processed as a sequence of 0s and 1s, assembly language was developed to help developers remember those instructions. Both languages are considered low level and are specific to the hardware. Advanced languages have been developed to facilitate human understanding, but are still restricted by the hardware instructions. For example, the C programming language is commonly used to develop software (Kernighan and Ritchie 2006). To make the programming organization more similar to how we view the world, C++ was proposed to support object-oriented programming based on C (Stroustrup 1995). Since then, many different programming languages have been developed and are used in GIS programming. For instance, Java is a language for cross-platform application development proposed by Sun Microsystems (Arnold et al. 2000). JavaScript is used for (simpler) scripting that manipulates objects within a web browser. In addition to Java and JavaScript, ArcGIS has recently added Python to its list of programming languages (Van Rossum 2007).

Why do we need GIS programming? Mature GIS software and application templates provide many tools to accomplish our daily tasks; however, in order to understand the fundamentals of how GIS works and to customize software for specific problems, programming is required. The following list gives programming examples:
• Customizing software for application: The National Park Service is developing a simple web mapping application to allow the general public to interactively select and view information for a particular National Park. Using an online mapping tool such as Google Maps and selecting a park with your mouse will trigger a query of the selected information for that park. In this scenario, we need geographic information about the parks, a program for front-end user interaction, and a database query language that will generate results for the selected park.
• Automating a process: Suppose there are 100 geographic datasets collected in text file format and we need to convert them into shapefiles, a native data file format used by ArcView and ArcGIS, for further processing. ArcGIS can perform the conversion one by one, but doing this manually 100 times is monotonous. Therefore, a simple scripting tool to automatically read and process the 100 datasets into shapefiles would be beneficial. Using Python scripts in ArcGIS provides the capability to do so.
• Satisfying simple GIS needs: Suppose there is a transportation company that needs to track its vehicles’ positions at 5-minute intervals. However, the company cannot afford to purchase a professional GIS software license. To resolve the issue, the company can use Python to create a map that shows the company’s service region and vehicle locations every 5 minutes. This programming may include Zoom In/Out and Move/Pan features, animations based on locations, and a selection of one or many vehicles.
• Cultivating advanced GIS professionals: Suppose a group of students are asked to invent a routing algorithm based on predicted traffic conditions and real-time traffic information. The students will need to organize the road network information, comparing real-time and predicted network speed. It is essential to use the most accurate predicted information in the routing process. Programming is needed throughout the entire process for network management and routing, and for reconstructing the results into map form or written directions.
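The "automating a process" example above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the file layout (one `x,y` pair per line, `.txt` extension) and the function names `read_points` and `convert_all` are hypothetical, and the step that would actually write a shapefile is left as a comment because it depends on the GIS library in use.

```python
import glob
import os


def read_points(path):
    """Read one text file holding an 'x,y' pair per line (assumed layout)."""
    points = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            x, y = line.split(",")
            points.append((float(x), float(y)))
    return points


def convert_all(folder):
    """Process every .txt dataset in a folder; return {dataset name: points}."""
    results = {}
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        name = os.path.splitext(os.path.basename(path))[0]
        results[name] = read_points(path)
        # A real script would write the points to a shapefile at this step.
    return results
```

The loop is what replaces the 100 manual conversions: adding another dataset to the folder requires no change to the script.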
Geographic information has become increasingly important in all walks of human life, whether it is for scientific discovery, forecasting natural disasters, advancing technologies of observations, or creating public awareness about location and routing. While some applications require complete GIS technologies to produce valuable results, many geographic information applications do not require sophisticated geographic information systems. For the latter case, open-source or small geospatial information software is utilized, while commercial GIS systems such as ArcGIS are available for the former case. To better address both needs, it is essential to understand the fundamentals of how GIS works and its basic geographic information processing. This chapter introduces the background structure for building such capabilities: computer hardware and software, GIS and programming, GIS data models and the Unified Modeling Language (UML, Fowler 2004), and Python.

Hands-on programming experience is needed for understanding the concepts and developing the essential skills utilized by GIS professionals in their work and research. Based on GIS fundamentals, this book will help you develop and improve systematic programming skills and will provide a more in-depth understanding of GIS fundamentals. Owing to its popularity within the GIS community, Python will be the primary programming language used in this book.

1.3 Python

Python was originally developed by a Dutch programmer, Guido van Rossum, in 1990. Van Rossum was reportedly a fan of the British comedy series Monty Python’s Flying Circus, and upon developing the open-source programming language, he borrowed the name “Python” for the language and his nonprofit institution, the Python Software Foundation.

Similar to programming languages C++ and Java, Python is an object-oriented and interactive language.
Python is dynamic in that it uses an automatic memory management mechanism to allocate and release memory for data (variables). Python and ArcGIS regularly release new versions of their programs; this book is based on Python release 2.7 and ArcGIS 10.1.

There are many reasons for choosing Python, including the following:*
• It is excellent for programming beginners, yet superb for experts.
• The syntax of Python is very simple and easy to learn. When you become familiar with it, you will find it very handy.
• It is highly scalable and well suited for both large and small projects.
• It is under active development, with new releases appearing regularly.
• It is portable across platforms. This means that a program written in Windows can be run using the Linux or Mac operating systems.
• It is easily extensible. You can always add more classes and functions to your current project.
• It has powerful standard libraries.
• Many third parties also provide highly functional packages for you to utilize. Instead of developing GIS functions from scratch, you can simply download the source code and integrate it into your project.
• It is a fully object-oriented language, simple yet elegant, and stable and mature.

There are several steps to learning Python for GIS programming:
• Get familiar with the concept of class and object (Chapters 1 and 2).
• Learn the syntax of Python, including variables, data types, structures, controls, statements, and other programming structures (Chapters 1 through 4).
• Build Python programs from scratch and integrate open-source libraries to facilitate programming (Chapter 5).
• Become comfortable with the Python programming environment (Python interpreter or Python text editor, Chapter 6).
• Solve GIS problems by writing code for GIS algorithms (Chapters 7 through 13).

These components are introduced in the above order throughout this book.
This chapter introduces important concepts such as object-oriented programming, UML, and GIS models.

* http://pythoncard.sourceforge.net/what_is_python.html.

1.4 Class and Object

Within this section, we will discuss two fundamental concepts: class and object (Rumbaugh et al. 1991). A class uses a set of attributes and behaviors to represent a category of real-world phenomena. For example, Figure 1.3 shows how to extract the student attributes and behaviors.

Another example is online shopping on Amazon or eBay. Both the customers and online products must be abstracted into classes:
• Customers would have a customer ID, shipping address, and billing address. Customer behavior would include adding a product to, or deleting a product from, the shopping cart.
• Products would have a product ID, product name, and product price. Product behavior would include setting the price and totaling the product quantity/amount.

An object is a specific instance of a class. We can consider objects as instances of classes created by assigning values to their attributes. Specifically, a class is the abstraction of a category or collection of real-world entities, while an object is a specific real-world entity within the class. Within a computer, a class is the template and an object is the specific entity that occupies the computer memory. The computer can operate on both the attributes and behaviors of an object. For example, when a student logs in to their college web system with a username and password, the system will create a new student object. The computer reads each student as an independent object with several different attributes (e.g., username, password, and student ID). After logging in, a student is able to search, register, or add/drop classes using the object in the system, which represents him or her specifically. Chapter 2 will introduce how to define classes and objects using Python.

FIGURE 1.3 An example of representing students with the Student class.
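As a small preview of the class/object distinction described above, the following sketch defines a `Student` class and creates two independent objects from it. The attribute and method names (`courses`, `register`, `drop`) and the sample values are illustrative assumptions, not the definitions used later in the book.

```python
class Student(object):
    """Class: the template describing what every student has and can do."""

    def __init__(self, username, password, student_id):
        # Attributes: each object stores its own values.
        self.username = username
        self.password = password
        self.student_id = student_id
        self.courses = []

    def register(self, course):
        """Behavior: add a course if the student is not already enrolled."""
        if course not in self.courses:
            self.courses.append(course)

    def drop(self, course):
        """Behavior: remove a course if the student is enrolled in it."""
        if course in self.courses:
            self.courses.remove(course)


# Objects: two specific students, each occupying its own memory.
alice = Student("alice", "secret", 1001)
bob = Student("bob", "hunter2", 1002)
alice.register("GIS Programming")
```

Registering a course for `alice` leaves `bob` unchanged, which is what it means for each object to be an independent instance of the same class.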
1.5 GIS Data Models

GIS data models are used to capture essential geospatial elements of a specific problem (Longley et al. 2001). There are three types of data models: vector data, raster data, and special data. Vector data models consist of point, polyline, and polygon model types. Raster data models divide space into equal-sized cells, as in digital elevation models and images. Special data are composed of network and linear data. This book highlights different types of GIS data models, but will focus mainly on vector data models.

A point can refer to a class of vector data represented by a pair of x, y coordinates in a two-dimensional (2D) space or a tuple of x, y, and z coordinates in a three-dimensional (3D) space. For example, a city is represented as a point on a world map. Each city has a group of attributes, which would include the city name, population, average household income, and acro-names. Another example using points is a map depicting all the restaurants within a certain region. In addition to its point location, each restaurant will contain other relevant information, including its name, room capacity, cuisine, and the year it opened. In these cases, the point is a general classification, whereas the city or the restaurant is a more specific type of class containing different attributes. In a design diagram, each class can be represented by a rectangle (Figure 1.4). This diagram is also referred to as a UML class diagram. The first row refers to the name of the class: City; the second row refers to the attributes of the class: name and averageIncome; the third row refers to a set of methods: getName, getAverageIncome, and setName.

Polylines are a class of vector data represented by a list of points. For instance, a river can be represented as a polyline on a map, which then can be categorized as a type of polyline class.
A polyline class may include point coordinates, relevant attributes, and a set of methods. Other examples of polyline datasets are roads, highways, and interstates. Both examples are categories of polylines. Rivers can be represented using UML (Figure 1.5). The first row of the UML diagram is the name of the class: River; the second row includes the river’s attributes: name and coordinates; and the third row lists the methods the programmer will use: getName, setCoordinates, and setName.

FIGURE 1.4 A UML diagram for the City class.
FIGURE 1.5 The River class includes three parts.
FIGURE 1.6 The County class includes three parts.

Polygons are another class of vector data that are also represented by a list of points; however, with polygons, the first and last points are the same. For example, on the map of the state of Virginia, a specific county, like Fairfax County, can be represented as a polygon. The county is a type of polygon class, which includes a list of points, relevant attributes, and a set of methods. Countries on a world map may also be represented as polygons. In either case, both the county and country are types of polygons. As shown in Figure 1.6, the first row is the class name: County; the second row is the class’s attributes: name and population; and the third row lists the methods: getName, setPopulation, and setName.

Developing more functionality will require adding more methods and attributes to each class to capture the evolution of the data models and the functionality of software; UML diagrams are used to standardize their representation. This section uses class diagrams and relevant UML standards for the point, polyline, and polygon classes.

1.6 UML

In 1997, the Object Management Group (OMG)* adopted UML as a standard for recording software designs for programming. Software designers and programmers use UML to communicate and share the design. Just as we use English to share ideas through talking or writing, UML is used for modeling an application or problem in an object-oriented fashion. UML modeling can be used to facilitate the entire design and development of software.

* See OMG at http://www.omg.org/.

The UML diagram is used to capture the programming logic. There are two types of diagrams that we will specifically discuss: class diagrams and object diagrams (Figure 1.7).

The UML class diagram can represent a class using three parts: name, attributes, and methods. The attributes and methods have three different accessibilities: public (+), private (-), and protected (#). Attributes and methods are normally represented in the following format:

Attributes: accessibility attributeName: attribute data type, for example, +name: String
Methods: accessibility methodName(method arguments): method return type, for example, +setName(name: String): void

Public methods/attributes can be accessed by other classes. Private methods/attributes cannot be accessed by any other classes. Protected methods/attributes cannot be accessed by other classes except those classes inherited from this class (explained below).

There are several fundamental relationships among different classes: dependency, inheritance, composition, and aggregation. Dependency represents one class depending on another. Inheritance is an important relationship in which a class is a subtype of another class. Figure 1.8 illustrates the dependency between geometry and coordinate systems in that the existence of geometry depends on a coordinate system. This relationship is represented by a dashed line and an arrow from the geometry class to the coordinate system class. Figure 1.8 also illustrates inheritance: the point, line, and polygon classes are subtypes of the geometry class.
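The dependency and inheritance relationships of Figure 1.8 can be sketched in Python as follows. This is one possible reading under stated assumptions, not the book's implementation: the dependency is modelled here by requiring a coordinate system when a geometry is constructed, and the describe() method and all constructor parameters are hypothetical names introduced for illustration.

```python
class CoordinateSystem(object):
    """The class that Geometry depends on."""

    def __init__(self, name):
        self.name = name


class Geometry(object):
    """Parent class; a geometry cannot be built without a coordinate system."""

    def __init__(self, coordinate_system):
        # Dependency (assumed modelling): Geometry needs a CoordinateSystem.
        self.coordinate_system = coordinate_system

    def describe(self):
        # Hypothetical helper, inherited by every subtype below.
        return "%s in %s" % (type(self).__name__, self.coordinate_system.name)


class Point(Geometry):
    """Subtype of Geometry holding one x, y pair."""

    def __init__(self, x, y, coordinate_system):
        Geometry.__init__(self, coordinate_system)
        self.x = x
        self.y = y


class Line(Geometry):
    """Subtype of Geometry holding a list of points."""

    def __init__(self, points, coordinate_system):
        Geometry.__init__(self, coordinate_system)
        self.points = points


class Polygon(Geometry):
    """Subtype of Geometry holding a closed ring of points."""

    def __init__(self, points, coordinate_system):
        Geometry.__init__(self, coordinate_system)
        self.points = points


wgs84 = CoordinateSystem("WGS 1984")
p = Point(1, 2, wgs84)
print(p.describe())  # Point in WGS 1984
```

Because Point, Line, and Polygon are subtypes of Geometry, each of them receives describe() without redefining it.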
Aggregation and composition are two other important relationships in UML. Aggregation represents a “has a” relationship. For example, a state is an aggregation of a number of counties (Figure 1.9a). Composition represents an “owns” relationship. For example, a multipoint class may be composed of two or more points (Figure 1.9b).

The relationship can be quantified by the number of elements involved. For example, a line includes 2+ points and a state includes 0+ counties. There are six different types of this multiplicity relationship (Figure 1.10). A multipoint is composed of two or more points (Figure 1.9b) and a state is an aggregation of zero or more counties.

An object is an instantiation of a class. The object diagram shows a complete or partial view of the model system structure at a specific time, so the state of an object can change from one diagram to the next. In Figure 1.11, worldMap is an object of the Map class, and its coordinate system changed from WGS 1972 to WGS 1984 after reprojection.

FIGURE 1.7 The class diagram and object diagram used in this book. [UML overview: structure diagrams (class, component, object, composite structure, profile, deployment, package) and behavior diagrams (activity, use case, state machine, and the interaction diagrams: sequence, communication, interaction overview, timing).]
FIGURE 1.8 Inheritance and dependency. [Geometry depends on Coordinate system; Point, Line, and Polygon inherit from Geometry.]
FIGURE 1.9 (a) Aggregation and (b) composition are two polar relationships among classes. [(a) State aggregates 0..* Counties, hollow diamond; (b) MultiPoint is composed of 2..* Point, filled diamond.]
FIGURE 1.10 Multiplicity relationship among classes.

1.7 Hands-On Experience with Python

A point is the basic data model within GIS.
This section will examine how to create a point class, including coordinates and the calculation of distances between points. You will learn how to create point objects from the point class.

FIGURE 1.11 worldMap is an object of the Map class and the state is changing with different operations.

1. Open the program (Figure 1.12): Windows→All Programs→ArcGIS→Python 2.7 or Windows→All Programs→Python 2.7→IDLE (Python GUI)

FIGURE 1.12 Launch the Python programming window (GUI).

2. Type in the point class codes as shown in Code 1.1.

    >>> import math
    >>> class Point():
            def __init__(self):
                self.x = 0
                self.y = 0
            def setXY(self, x, y):
                self.x = x
                self.y = y
            def calDis(self, p):
                return math.sqrt((self.x-p.x)**2+(self.y-p.y)**2)

    >>> p1 = Point()
    >>> p2 = Point()
    >>> p1.setXY(1, 2)
    >>> p2.setXY(2, 3)
    >>> p1.calDis(p2)
    1.4142135623730951
    >>>

CODE 1.1 Creating a point class and generating two points, then calculating the distance between the two points.

Programming tips:
1. Coding should be exactly the same as the figure shows.
2. The init method is defined with four underscores: two “_” before and two after “init.”
3. Python is case sensitive, so lower- and uppercase of the same letter will make a difference.
4. There is no need to understand every statement for now; they will be gradually explained in the following chapters.

1.8 Chapter Summary

This chapter briefly introduced GIS programming and included
A general introduction to computer hardware and software
Definitions of GIS and programming
Python in a practical context
Practical knowledge about several GIS data models
The unified modeling language for modeling object-oriented GIS data
Relevant hands-on experience

PROBLEMS

Define computer, programming, software, and GIS.
What are the different methods to categorize software?
What are the three GIS data models found on the UML diagram?
Explain why we need to learn GIS programming.
Use the UML diagram to model the relationship between polylines.
Use the UML diagram to model the relationship between polygons.
Practice Python’s Chapter 3 tutorial: https://docs.python.org/3/tutorial/introduction.html.
Use Python to calculate the distance between Point (1, 2) and Point (2, 2).
Discuss how to identify classes used on a world map and how to use UML to capture those classes.

2 Object-Oriented Programming

This chapter introduces object-oriented programming with Python, covering the programming language, classes and objects, object generation, inheritance, GIS classes and objects, and a general programming experience.

2.1 Programming Language and Python

A programming language is defined as an artificial language used to write instructions that can be translated into machine language and then executed by a computer. This definition includes four important aspects: (1) artificial language, a language created solely for communicating with computers; (2) instruction based, a programming language offers a limited set of instructions supported by a specific computer or CPU; (3) translation, the conversion of human-written instructions into a form that the computer, or CPU, can execute; and (4) translator, of which there are two types: interpreter and compiler (Aho and Ullman 1972).

There are two different methods for translating a program into a form that the computer can execute. One method requires a computer programmer to compile a group of statements written in a specific language and convert them into a machine-readable format prior to running the program. The other method entails translating the language while running the program. For example, in C programming, we need to use a C compiler to translate the program into machine codes before execution. Similarly, C++ and Java are compiled programming languages.
The BASIC programming language is an interpreter language (Lien 1981), in which the interpreter will translate the program while it is running. Likewise, Python, Perl, and PHP are considered interpreter languages. Therefore, in order to successfully use Python on a computer, a Python interpreter must also be installed.

Programming languages have evolved considerably from machine and assembly languages to intermediate and advanced languages (Rawen 2016). Machine language instructions are represented in a specific sequence using 0s and 1s. A single binary digit is called a bit. A group of three bits can be written as one octal digit (one of eight values, 0–7), whereas a group of four bits can be written as one hexadecimal digit (one of 16 values, 0–15). Assembly languages depict the machine bit operations with easy-to-remember text representations. Intermediate languages are typically more powerful and easier to code. Advanced languages are more similar to human language, do not have access to specific hardware functions, and can be executed on several different hardware types. Figure 2.1 shows different representations of the instruction “print the letter ‘A’ 1000 times.”

FIGURE 2.1 Print ‘A’ 1000 times using different types of languages.

Machine language is difficult for humans to understand, and each machine language can be read only by specific CPUs (Hutchins 1986). Therefore, in GIS programming, we normally use advanced languages such as C, Java, or Python instead of machine or assembly language.

C is a typical procedural language that was developed around 1969–1973 and became available to the general public around 1977–1979. It was officially standardized by the ANSI X3J11 committee during the 1980s (the standard was ratified in 1989) and has become one of the most commonly used languages in the computer industry.
The early editions of GRASS (Geographic Resources Analysis Support System, Neteler and Mitasova 2013) GIS* open-source software and ArcGIS were developed using C. Bjarne Stroustrup of Bell Laboratories expanded C to C++ in order to support object-oriented features. C++ supports both C-style function calls and object-oriented classes and objects. Both C and C++ are complex for beginning programmers. Since 1998, ISO/ANSI has standardized C++ to improve and maintain state-of-the-art quality within the industry. C and C++ are commonly used in Linux and have influenced other languages such as C# and Java.

* http://grass.osgeo.org/.

Announced by Sun at SunWorld ’95, Java is a pure object-oriented language developed to target Internet and cross-platform applications. Over time, Java has become increasingly popular among IT companies such as Microsoft, Borland/Eclipse, IBM, and Sun. The official Java resource can be found at java.sun.com and an open-source compiler/programming environment can be found on the Eclipse Foundation website at www.eclipse.org.

Python is an interactive programming language created by Guido van Rossum in 1990. Python is dynamically typed and uses automatic memory management. The nonprofit Python Software Foundation consistently updates and manages this open-source project. Python is mature, and a program written once can run on many different platforms. This book will analyze and explain Python as it is applied to GIS and ArcGIS* programming. You can download any version from Python’s website; however, not all versions work with ArcGIS. Python is easy to learn and use, and is supported by ArcGIS, which is why we have chosen it to be the programming language for this book.

2.2 Class and Object

Classes and objects are widely used in Python. A class defines the template for a category of objects with name, attributes, and methods.
Objects are instances of classes with attributes and methods. The attributes and methods can be referred to using a ‘.’. For example, the coordinate attributes and calDis method of a point object created from a Point class can be referred to using point.x, point.y, and point.calDis().

2.2.1 Defining Classes

Python provides the mechanism to define a class using the keyword class with the syntax ‘class className:’, for example, ‘class Point:’, ‘class Polyline:’, or ‘class Polygon:’. Methods are defined for a class using the ‘def’ keyword, and attributes are created by assigning values to them, typically inside the __init__ method. Figure 2.2 shows the Python code for defining a Point class with the attributes name, x, and y and the method setName(). In the __init__ method, 0, 0, and an empty string are the default values for x, y, and name.

Many classes define a special method named __init__() to create/construct objects. The method will be called when we create an object using the class (such as the Point class here). The __init__ method has four ‘_’ (two before and two after ‘init’) to make it the construction method that will be used when creating an object. For all methods defined by a class, the first parameter is always ‘self’, which refers to the object itself. This can be used to refer to the attributes and methods of the object. For example, the __init__ method will create a point object with self as the first parameter and x, y, and name as initial values for the object.

* http://www.esri.com/software/arcgis.

FIGURE 2.2 An example of defining a Point class with Python. [Annotated code: the keyword class used to define all classes, the name of the class, the colon opening the class body, the __init__ function, the keyword def used to define a function, self as the first argument, x as the second argument with its default value, the colon opening the method body, value assignment, and the dot used to refer to an attribute.]
By default (without specifying the values), the values for x, y, and name will be 0, 0, and a blank string, respectively. The first two statements after the class definition in Code 2.1 create two point objects (point0 and point1). The object point0 is created with default values and point1 is created with the arguments 1, 1, and ‘first point’. If no parameters are given, as when creating point0, the default values 0, 0, and ‘’ will be used. When the values (1, 1, ‘first point’) are given as parameters, the __init__ method will assign them to the attributes of point1.

    >>> class Point:
            def __init__(self, x=0, y=0, name=''):
                self.x = x
                self.y = y
                self.name = name
            def setName(self, name):
                self.name = name

    >>> point0 = Point()
    >>> point1 = Point(1, 1, 'first point')
    >>> point0.x, point0.y, point0.name
    (0, 0, '')
    >>> point1.x, point1.y, point1.name
    (1, 1, 'first point')
    >>> point1.setName('second point')
    >>> point1.name
    'second point'
    >>>

CODE 2.1 Creating a point may pass in value to the object through parameters.

2.2.2 Object Generation

To create an object, type objectName = className() with zero or more parameters, which will be passed to the attributes declared in the __init__() method.

    objectName = className(value1, value2, …)

In Code 2.1, we generated two objects, point0 and point1. No parameter is passed when declaring the object point0, while three values (1, 1, ‘first point’) are used to generate point1.

To refer to an object’s attribute or method, we start with the objectName, followed by a period, and end with the attribute name or method name.

    objectName.attributeName
    objectName.methodName()

Code 2.1 uses .x, .y, and .name following the objects point0 and point1 to refer to the attributes x, y, and name. The instruction point1.setName() is called to change the name of point1 to ‘second point’.

2.2.3 Attributes

Each class may have one or more attributes.
Section 1.6 explains how attributes can be public, private, or protected to indicate different accessibility by other objects. How do you explicitly specify the public and private attributes while declaring a class?

Public: Attributes in Python are, by default, “public” all the time.
Private: Attributes that begin with a double underscore (“__”). Such attributes are treated as private because they cannot be directly accessed. However, they can be accessed by object._ClassName__attributeName, for example, test._Test__foobar, where test is an object of the Test class, and __foobar is a private attribute (Code 2.2).
Protected: Attributes prefixed with a single underscore “_” by convention. However, they still can be accessed outside of the class in Python.

    >>> class Test:
            def __init__(self):
                self.__foobar = "private attr"
                self.foobar = "public attr"

    >>> test = Test()
    >>> test.foobar
    'public attr'
    >>> test.__foobar
    Traceback (most recent call last):
      File "", line 1, in
        test.__foobar
    AttributeError: Test instance has no attribute '__foobar'
    >>> test._Test__foobar
    'private attr'
    >>>

CODE 2.2 Declare public, private, and protected attributes.

Another important attribute in Python is the static attribute, which is used to hold data that is persistent and independent of any object of the class (Code 2.3). For example, we can create a map including different layers, and the layer scale can be static and the same for all layer objects.

    >>> class Test:
            version = 1.0

    >>> Test.version
    1.0
    >>> t1 = Test()
    >>> t2 = Test()
    >>> t1.version
    1.0
    >>> t2.version
    1.0
    >>> Test.version = 2.0
    >>> t1.version
    2.0
    >>> t2.version
    2.0
    >>> t1.version = 3.0
    >>> t1.version
    3.0
    >>> Test.version
    2.0
    >>> t2.version
    2.0
    >>>

CODE 2.3 Declare static attributes.

A class (and instantiated object) can have special built-in attributes. The special class attributes include a class name and description of the class (Code 2.4).

    >>> class Point:
            """Point Class Definition"""
            def __init__(self):
                self.x = 0.0
                self.y = 0.0
            def getDistance(self):
                pass ## ignore here

    >>> Point.__name__
    'Point'
    >>> Point.__doc__
    'Point Class Definition'
    >>> Point.__module__
    '__main__'
    >>>

CODE 2.4 Special class attributes.

__name__: class name
__doc__: description
__bases__: parent classes
__dict__: attributes
__module__: module where class is defined

The special object attributes include a class name and an object’s attributes (Code 2.5).

    >>> p1 = Point()
    >>> p1.__class__
    >>> p1.__dict__
    {'y': 0.0, 'x': 0.0}
    >>>

CODE 2.5 Special object attributes.

__class__: class from which object is instantiated
__dict__: attributes of object

2.2.4 Inheritance

Chapter 1 introduces three important relationships among objects in object-oriented programming: inheritance, encapsulation, and polymorphism. Inheritance is an efficient way to help reuse a developed class. While private attributes and methods cannot be inherited, all other public and protected attributes and methods can be automatically inherited by subclasses.

To inherit a super class in Python, include the super class name in a pair of parentheses after the class name.

    class DerivedClassName(SuperClass1):

We can also inherit multiple classes in Python by entering more than one class name in the parentheses.

    class DerivedClassName(SuperClass1, SuperClass2, SuperClass3):

FIGURE 2.3 An example of inheritance (ParkingLot class inherits from class Polygon, and Polygon inherits from Feature).

Figure 2.3 shows an example of inheritance. Assuming we have a class Feature, which includes a method draw(), then the class Polygon will inherit from the class Feature. With this inheritance, the Polygon class will have the method draw() as well.
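The Feature-to-Polygon step just described might look like the following in Python. This is a minimal sketch under stated assumptions, not the code behind Figure 2.3: the body of draw() simply returns a string as a stand-in for real drawing, and the points attribute and its sample coordinates are hypothetical.

```python
class Feature(object):
    """Super class that provides draw() to every subclass."""

    def draw(self):
        # Stand-in for real drawing logic (assumption): report what is drawn.
        return "drawing a " + type(self).__name__


class Polygon(Feature):
    """Derived class: names its super class in parentheses."""

    def __init__(self, points=None):
        # Hypothetical attribute: the coordinate pairs forming the ring.
        self.points = points if points is not None else []


polygon = Polygon([(0, 0), (1, 0), (1, 1), (0, 0)])
print(polygon.draw())  # drawing a Polygon
```

Polygon defines no draw() of its own, so the call is resolved in Feature; that lookup is the inheritance at work.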
When we define the ParkingLot class with the inheritance from the Polygon, the ParkingLot will have attributes of x and y coordinates as well as the method draw(). The Polygon and ParkingLot may have different drawing implementations; however, you can use the draw() method for both the Polygon and ParkingLot. This use of one method with different implementations for different subclasses is called polymorphism.

2.2.5 Composition

Composition is an efficient way to help us reuse created objects, and to maintain the part-to-whole relationship between objects. To maintain the composition relationship, you must define a class with an attribute that can include a number of other class objects.

FIGURE 2.4 Composition example (a Polygon class includes attribute points as objects generated from class Point).

Figure 2.4 shows an example of composition. The class Point and the class Polygon inherit from the class Feature. The class Polygon border is defined by a sequence of points formed in a ring and is captured by point