Introduction to Computation and Programming Using Python (2nd Edition) PDF

Introduction to Computation and Programming Using Python Introduction to Computation and Programming Using Python Revised and Expanded Edition John V. Guttag The MIT Press Cambridge, Massachusetts London, England © 2013 Massachusetts Institute of Technology All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please email [email protected] or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142. Printed and bound in the United States of America. Library of Congress Cataloging-‐in-‐Publication Data Guttag, John. Introduction to computation and programming using Python / John V. Guttag. — Revised and expanded edition. pages cm Includes index. ISBN 978-‐0-‐262-‐52500-‐8 (pbk. : alk. paper) 1. Python (Computer program language) 2. Computer programming. I. Title. QA76.73.P48G88 2013 005.13'3—dc23 10 9 8 7 6 5 4 3 2 1 To my family: Olga David Andrea Michael Mark Addie CONTENTS PREFACE.......................................................................................................xiii ACKNOWLEDGMENTS..................................................................................... xv 1 GETTING STARTED.................................................................................... 1 2 INTRODUCTION TO PYTHON...................................................................... 7 2.1 The Basic Elements of Python............................................................... 8 2.1.1 Objects, Expressions, and Numerical Types.................................... 9 2.1.2 Variables and Assignment............................................................ 11 2.1.3 IDLE............................................................................................ 13 2.2 Branching Programs........................................................................... 14 2.3 Strings and Input............................................................................... 16 2.3.1 Input............................................................................................ 18 2.4 Iteration.............................................................................................. 18 3 SOME SIMPLE NUMERICAL PROGRAMS.................................................. 21 3.1 Exhaustive Enumeration.................................................................... 21 3.2 For Loops............................................................................................ 23 3.3 Approximate Solutions and Bisection Search...................................... 25 3.4 A Few Words About Using Floats........................................................ 29 3.5 Newton-Raphson................................................................................ 32 4 FUNCTIONS, SCOPING, and ABSTRACTION............................................. 34 4.1 Functions and Scoping....................................................................... 35 4.1.1 Function Definitions..................................................................... 35 4.1.2 Keyword Arguments and Default Values....................................... 36 4.1.3 Scoping........................................................................................ 37 4.2 Specifications..................................................................................... 41 4.3 Recursion........................................................................................... 44 4.3.1 Fibonacci Numbers...................................................................... 45 4.3.2 Palindromes................................................................................. 48 4.4 Global Variables................................................................................. 50 4.5 Modules.............................................................................................. 51 4.6 Files................................................................................................... 53 viii 5 STRUCTURED TYPES, MUTABILITY, AND HIGHER-ORDER FUNCTIONS.. 56 5.1 Tuples................................................................................................ 56 5.1.1 Sequences and Multiple Assignment............................................. 57 5.2 Lists and Mutability............................................................................ 58 5.2.1 Cloning........................................................................................ 63 5.2.2 List Comprehension..................................................................... 63 5.3 Functions as Objects.......................................................................... 64 5.4 Strings, Tuples, and Lists................................................................... 66 5.5 Dictionaries........................................................................................ 67 6 TESTING AND DEBUGGING...................................................................... 70 6.1 Testing................................................................................................ 70 6.1.1 Black-Box Testing........................................................................ 71 6.1.2 Glass-Box Testing........................................................................ 73 6.1.3 Conducting Tests......................................................................... 74 6.2 Debugging.......................................................................................... 76 6.2.1 Learning to Debug........................................................................ 78 6.2.2 Designing the Experiment............................................................ 79 6.2.3 When the Going Gets Tough......................................................... 81 6.2.4 And When You Have Found “The” Bug.......................................... 82 7 EXCEPTIONS AND ASSERTIONS.............................................................. 84 7.1 Handling Exceptions........................................................................... 84 7.2 Exceptions as a Control Flow Mechanism........................................... 87 7.3 Assertions........................................................................................... 90 8 CLASSES AND OBJECT-ORIENTED PROGRAMMING............................... 91 8.1 Abstract Data Types and Classes........................................................ 91 8.1.1 Designing Programs Using Abstract Data Types............................ 96 8.1.2 Using Classes to Keep Track of Students and Faculty................... 96 8.2 Inheritance......................................................................................... 99 8.2.1 Multiple Levels of Inheritance..................................................... 101 8.2.2 The Substitution Principle.......................................................... 102 8.3 Encapsulation and Information Hiding.............................................. 103 8.3.1 Generators................................................................................. 106 8.4 Mortgages, an Extended Example..................................................... 108 ix 9 A SIMPLISTIC INTRODUCTION TO ALGORITHMIC COMPLEXITY............ 113 9.1 Thinking About Computational Complexity....................................... 113 9.2 Asymptotic Notation.......................................................................... 116 9.3 Some Important Complexity Classes................................................. 118 9.3.1 Constant Complexity.................................................................. 118 9.3.2 Logarithmic Complexity.............................................................. 118 9.3.3 Linear Complexity...................................................................... 119 9.3.4 Log-Linear Complexity................................................................ 120 9.3.5 Polynomial Complexity............................................................... 120 9.3.6 Exponential Complexity.............................................................. 121 9.3.7 Comparisons of Complexity Classes............................................ 123 10 SOME SIMPLE ALGORITHMS AND DATA STRUCTURES......................... 125 10.1 Search Algorithms.......................................................................... 126 10.1.1 Linear Search and Using Indirection to Access Elements.......... 126 10.1.2 Binary Search and Exploiting Assumptions.............................. 128 10.2 Sorting Algorithms.......................................................................... 131 10.2.1 Merge Sort................................................................................ 132 10.2.2 Exploiting Functions as Parameters.......................................... 135 10.2.3 Sorting in Python..................................................................... 136 10.3 Hash Tables.................................................................................... 137 11 PLOTTING AND MORE ABOUT CLASSES................................................ 141 11.1 Plotting Using PyLab....................................................................... 141 11.2 Plotting Mortgages, an Extended Example....................................... 146 12 STOCHASTIC PROGRAMS, PROBABILITY, AND STATISTICS................... 152 12.1 Stochastic Programs....................................................................... 153 12.2 Inferential Statistics and Simulation............................................... 155 12.3 Distributions.................................................................................. 166 12.3.1 Normal Distributions and Confidence Levels............................. 168 12.3.2 Uniform Distributions.............................................................. 170 12.3.3 Exponential and Geometric Distributions................................. 171 12.3.4 Benford’s Distribution.............................................................. 173 12.4 How Often Does the Better Team Win?............................................ 174 12.5 Hashing and Collisions................................................................... 177 x 13 RANDOM WALKS AND MORE ABOUT DATA VISUALIZATION................. 179 13.1 The Drunkard’s Walk...................................................................... 179 13.2 Biased Random Walks.................................................................... 186 13.3 Treacherous Fields.......................................................................... 191 14 MONTE CARLO SIMULATION.................................................................. 193 14.1 Pascal’s Problem............................................................................. 194 14.2 Pass or Don’t Pass?......................................................................... 195 14.3 Using Table Lookup to Improve Performance................................... 199 14.4 Finding π........................................................................................ 200 14.5 Some Closing Remarks About Simulation Models............................ 204 15 UNDERSTANDING EXPERIMENTAL DATA.............................................. 207 15.1 The Behavior of Springs.................................................................. 207 15.1.1 Using Linear Regression to Find a Fit....................................... 210 15.2 The Behavior of Projectiles.............................................................. 214 15.2.1 Coefficient of Determination..................................................... 216 15.2.2 Using a Computational Model................................................... 217 15.3 Fitting Exponentially Distributed Data............................................ 218 15.4 When Theory Is Missing.................................................................. 221 16 LIES, DAMNED LIES, AND STATISTICS.................................................. 222 16.1 Garbage In Garbage Out (GIGO)...................................................... 222 16.2 Pictures Can Be Deceiving.............................................................. 223 16.3 Cum Hoc Ergo Propter Hoc............................................................... 225 16.4 Statistical Measures Don’t Tell the Whole Story............................... 226 16.5 Sampling Bias................................................................................. 228 16.6 Context Matters.............................................................................. 229 16.7 Beware of Extrapolation.................................................................. 229 16.8 The Texas Sharpshooter Fallacy...................................................... 230 16.9 Percentages Can Confuse................................................................ 232 16.10 Just Beware.................................................................................. 233 17 KNAPSACK AND GRAPH OPTIMIZATION PROBLEMS.............................. 234 17.1 Knapsack Problems........................................................................ 234 17.1.1 Greedy Algorithms.................................................................... 235 17.1.2 An Optimal Solution to the 0/1 Knapsack Problem................... 238 xi 17.2 Graph Optimization Problems......................................................... 240 17.2.1 Some Classic Graph-Theoretic Problems................................... 244 17.2.2 The Spread of Disease and Min Cut.......................................... 245 17.2.3 Shortest Path: Depth-First Search and Breadth-First Search.... 246 18 DYNAMIC PROGRAMMING..................................................................... 252 18.1 Fibonacci Sequences, Revisited....................................................... 252 18.2 Dynamic Programming and the 0/1 Knapsack Problem................... 254 18.3 Dynamic Programming and Divide-and-Conquer............................. 261 19 A QUICK LOOK AT MACHINE LEARNING................................................ 262 19.1 Feature Vectors.............................................................................. 264 19.2 Distance Metrics............................................................................. 266 19.3 Clustering....................................................................................... 270 19.4 Types Example and Cluster............................................................. 272 19.5 K-means Clustering........................................................................ 274 19.6 A Contrived Example...................................................................... 276 19.7 A Less Contrived Example............................................................... 280 19.8 Wrapping Up................................................................................... 286 PYTHON 2.7 QUICK REFERENCE................................................................. 287 INDEX.......................................................................................................... 289 PREFACE This book is based on an MIT course that has been offered twice a year since 2006. The course is aimed at students with little or no prior programming experience who have desire to understand computational approaches to problem solving. Each year, a few of the students in the class use the course as a stepping stone to more advanced computer science courses. But for most of the students it will be their only computer science course. Because the course will be the only computer science course for most of the students, we focus on breadth rather than depth. The goal is to provide students with a brief introduction to many topics, so that they will have an idea of what’s possible when the time comes to think about how to use computation to accomplish a goal. That said, it is not a “computation appreciation” course. It is a challenging and rigorous course in which the students spend a lot of time and effort learning to bend the computer to their will. The main goal of this book is to help you, the reader, become skillful at making productive use of computational techniques. You should learn to apply computational modes of thoughts to frame problems and to guide the process of extracting information from data in a computational manner. The primary knowledge you will take away from this book is the art of computational problem solving. The book is a bit eccentric. Part 1 (Chapters 1-8) is an unconventional introduction to programming in Python. We braid together four strands of material: The basics of programming, The Python programming language, Concepts central to understanding computation, and Computational problem solving techniques. We cover most of Python’s features, but the emphasis is on what one can do with a programming language, not on the language itself. For example, by the end of Chapter 3 the book has covered only a small fraction of Python, but it has already introduced the notions of exhaustive enumeration, guess-and-check algorithms, bisection search, and efficient approximation algorithms. We introduce features of Python throughout the book. Similarly, we introduce aspects of programming methods throughout the book. The idea is to help you learn Python and how to be a good programmer in the context of using computation to solve interesting problems. Part 2 (Chapters 9-16) is primarily about using computation to solve problems. It assumes no knowledge of mathematics beyond high school algebra, but it does assume that the reader is comfortable with rigorous thinking and not intimidated by mathematical concepts. It covers some of the usual topics found in an introductory text, e.g., computational complexity and simple algorithms. xiv Preface But the bulk of this part of the book is devoted to topics not found in most introductory texts: data visualization, probabilistic and statistical thinking, simulation models, and using computation to understand data. Part 3 (Chapters 17-19) looks at three slightly advanced topics—optimization problems, dynamic programming, and clustering. Part 1 can form the basis of a self-contained course that can be taught in a quarter or half a semester. Experience suggests that it is quite comfortable to fit both Parts 1 and 2 of this book into a full-semester course. When the material in Part 3 is included, the course becomes more demanding than is comfortable for many students. The book has two pervasive themes: systematic problem solving and the power of abstraction. When you have finished this book you should have: Learned a language, Python, for expressing computations, Learned a systematic approach to organizing, writing and debugging medium-sized programs, Developed an informal understanding of computational complexity, Developed some insight into the process of moving from an ambiguous problem statement to a computational formulation of a method for solving the problem, Learned a useful set of algorithmic and problem reduction techniques, Learned how to use randomness and simulations to shed light on problems that don’t easily succumb to closed-form solutions, and Learned how to use computational tools, including simple statistical and visualization tools, to model and understand data. Programming is an intrinsically difficult activity. Just as “there is no royal road to geometry,”1 there is no royal road to programming. It is possible to deceive students into thinking that they have learned how to program by having them complete a series of highly constrained “fill in the blank” programming problems. However, this does not prepare students for figuring out how to harness computational thinking to solve problems. If you really want to learn the material, reading the book will not be enough. At the very least you should try running some of the code in the book. All of the code in the book can be found at http://mitpress.mit.edu/ICPPRE. Various versions of the course have been available on MIT’s OpenCourseWare (OCW) Web site since 2008. The site includes video recordings of lectures and a complete set of problem sets and exams. Since the fall of 2012, edX and MITx, have offered an online version of this course. We strongly recommend that you do the problem sets associated with one of the OCW or edX offerings. 1 This was Euclid’s purported response, circa 300 BC, to King Ptolemy’s request for an easier way to learn mathematics. ACKNOWLEDGMENTS This book grew out of a set of lecture notes that I prepared while teaching an undergraduate course at MIT. The course, and therefore this book, benefited from suggestions from faculty colleagues (especially Eric Grimson, Srinivas Devadas, and Fredo Durand), teaching assistants, and the students who took the course. The process of transforming my lecture notes into a book proved far more onerous than I had expected. Fortunately, this misguided optimism lasted long enough to keep me from giving up. The encouragement of colleagues and family also helped keep me going. Eric Grimson, Chris Terman, and David Guttag provided vital help. Eric, who is MIT’s Chancellor, managed to find the time to read almost the entire book with great care. He found numerous errors (including an embarrassing, to me, number of technical errors) and pointed out places where necessary explanations were missing. Chris also read parts of the manuscript and discovered errors. He also helped me battle Microsoft Word, which we eventually persuaded to do most of what we wanted. David overcame his aversion to computer science, and proofread multiple chapters. Preliminary versions of this book were used in the MIT course 6.00 and the MITx course 6.00x. A number of students in these courses pointed out errors. One 6.00x student, J.C. Cabrejas, was particularly helpful. He found a large number of typos, and more than a few technical errors. Like all successful professors, I owe a great deal to my graduate students. The photo on the back cover of this book depicts me supporting some of my current students. In the lab, however, it is they who support me. In addition to doing great research (and letting me take some of the credit for it), Guha Balakrishnan, Joel Brooks, Ganeshapillai Gartheeban, Jen Gong, Yun Liu, Anima Singh, Jenna Wiens, and Amy Zhao all provided useful comments on this manuscript. I owe a special debt of gratitude to Julie Sussman, P.P.A. Until I started working with Julie, I had no idea how much difference an editor could make. I had worked with capable copy editors on previous books, and thought that was what I needed for this book. I was wrong. I needed a collaborator who could read the book with the eyes of a student, and tell me what needed to be done, what should be done, and what could be done if I had the time and energy to do it. Julie buried me in “suggestions” that were too good to ignore. Her combined command of both the English language and programming is quite remarkable. Finally, thanks to my wife, Olga, for pushing me to finish and for pitching in at critical times. 1 GETTING STARTED A computer does two things, and two things only: it performs calculations and it remembers the results of those calculations. But it does those two things extremely well. The typical computer that sits on a desk or in a briefcase performs a billion or so calculations a second. It’s hard to image how truly fast that is. Think about holding a ball a meter above the floor, and letting it go. By the time it reaches the floor, your computer could have executed over a billion instructions. As for memory, a typical computer might have hundreds of gigabytes of storage. How big is that? If a byte (the number of bits, typically eight, required to represent one character) weighed one ounce (which it doesn’t), 100 gigabytes would weigh more than 3,000,000 tons. For comparison, that’s roughly the weight of all the coal produced in a year in the U.S. For most of human history, computation was limited by the speed of calculation of the human brain and the ability to record computational results with the human hand. This meant that only the smallest problems could be attacked computationally. Even with the speed of modern computers, there are still problems that are beyond modern computational models (e.g., understanding climate change), but more and more problems are proving amenable to computational solution. It is our hope that by the time you finish this book, you will feel comfortable bringing computational thinking to bear on solving many of the problems you encounter during your studies, work, and even everyday life. What do we mean by computational thinking? All knowledge can be thought of as either declarative or imperative. Declarative knowledge is composed of statements of fact. For example, “the square root of x is a number y such that y*y = x.” This is a statement of fact. Unfortunately it doesn’t tell us how to find a square root. Imperative knowledge is “how to” knowledge, or recipes for deducing information. Heron of Alexandria was the first to document a way to compute the square root of a number.2 His method can be summarized as: Start with a guess, g. If g*g is close enough to x, stop and say that g is the answer. Otherwise create a new guess by averaging g and x/g, i.e., (g + x/g)/2. Using this new guess, which we again call g, repeat the process until g*g is close enough to x. 2 Many believe that Heron was not the inventor of this method, and indeed there is some evidence that it was well known to the ancient Babylonians. 2 Chapter 1. Getting Started Consider, for example, finding the square root of 25. 1. Set g to some arbitrary value, e.g., 3. 2. We decide that 3*3 = 9 is not close enough to 25. 3. Set g to (3 + 25/3)/2 = 5.67.3 4. We decide that 5.67*5.67 = 32.15 is still not close enough to 25. 5. Set g to (5.67 + 25/5.67)/2 = 5.04 6. We decide that 5.04*5.04 = 25.4 is close enough, so we stop and declare 5.04 to be an adequate approximation to the square root of 25. Note that the description of the method is a sequence of simple steps, together with a flow of control that specifies when each step is to be executed. Such a description is called an algorithm.4 This algorithm is an example of a guess- and-check algorithm. It is based on the fact that it is easy to check whether or not a guess is a good one. A bit more formally, an algorithm is a finite list of instructions that describe a computation that when executed on a provided set of inputs will proceed through a set of well-defined states and eventually produce an output. An algorithm is a bit like a recipe from a cookbook: 1. Put custard mixture over heat. 2. Stir. 3. Dip spoon in custard. 4. Remove spoon and run finger across back of spoon. 5. If clear path is left, remove custard from heat and let cool. 6. Otherwise repeat. It includes some tests for deciding when the process is complete, as well as instructions about the order in which to execute instructions, sometimes jumping to some instruction based on a test. So how does one capture this idea of a recipe in a mechanical process? One way would be to design a machine specifically intended to compute square roots. Odd as this may sound, the earliest computing machines were, in fact, fixed- program computers, meaning they were designed to do very specific things, and were mostly tools to solve a specific mathematical problem, e.g., to compute the trajectory of an artillery shell. One of the first computers (built in 1941 by Atanasoff and Berry) solved systems of linear equations, but could do nothing else. Alan Turing’s bombe machine, developed during World War II, was designed strictly for the purpose of breaking German Enigma codes. Some very simple computers still use this approach. For example, a four-function calculator is a fixed-program computer. It can do basic arithmetic, but it cannot 3 For simplicity, we are rounding results. 4 The word “algorithm” is derived from the name of the Persian mathematician Muhammad ibn Musa al-Khwarizmi. Chapter 1. Getting Started 3 be used as a word processor or to run video games. To change the program of such a machine, one has to replace the circuitry. The first truly modern computer was the Manchester Mark 1.5 It was distinguished from its predecessors by the fact that it was a stored-program computer. Such a computer stores (and manipulates) a sequence of instructions, and has a set of elements that will execute any instruction in that sequence. By creating an instruction-set architecture and detailing the computation as a sequence of instructions (i.e., a program), we make a highly flexible machine. By treating those instructions in the same way as data, a stored-program machine can easily change the program, and can do so under program control. Indeed, the heart of the computer then becomes a program (called an interpreter) that can execute any legal set of instructions, and thus can be used to compute anything that one can describe using some basic set of instructions. Both the program and the data it manipulates reside in memory. Typically there is a program counter that points to a particular location in memory, and computation starts by executing the instruction at that point. Most often, the interpreter simply goes to the next instruction in the sequence, but not always. In some cases, it performs a test, and on the basis of that test, execution may jump to some other point in the sequence of instructions. This is called flow of control, and is essential to allowing us to write programs that perform complex tasks. Returning to the recipe metaphor, given a fixed set of ingredients a good chef can make an unbounded number of tasty dishes by combining them in different ways. Similarly, given a small fixed set of primitive elements a good programmer can produce an unbounded number of useful programs. This is what makes programming such an amazing endeavor. To create recipes, or sequences of instructions, we need a programming language in which to describe these things, a way to give the computer its marching orders. In 1936, the British mathematician Alan Turing described a hypothetical computing device that has come to be called a Universal Turing Machine. The machine had an unbounded memory in the form of tape on which one could write zeros and ones, and some very simple primitive instructions for moving, reading, and writing to the tape. The Church-Turing thesis states that if a function is computable, a Turing Machine can be programmed to compute it. The “if” in the Church-Turing thesis is important. Not all problems have computational solutions. For example, Turing showed that it is impossible to write a program that given an arbitrary program, call it P, prints true if and only if P will run forever. This is known as the halting problem. 5 This computer was built at the University of Manchester, and ran its first program in 1949. It implemented ideas previously described by John von Neumann and was anticipated by the theoretical concept of the Universal Turing Machine described by Alan Turing in 1936. 4 Chapter 1. Getting Started The Church-Turing thesis leads directly to the notion of Turing completeness. A programming language is said to be Turing complete if it can be used to simulate a universal Turing Machine. All modern programming languages are Turing complete. As a consequence, anything that can be programmed in one programming language (e.g., Python) can be programmed in any other programming language (e.g., Java). Of course, some things may be easier to program in a particular language, but all languages are fundamentally equal with respect to computational power. Fortunately, no programmer has to build programs out of Turing’s primitive instructions. Instead, modern programming languages offer a larger, more convenient set of primitives. However, the fundamental idea of programming as the process of assembling a sequence of operations remains central. Whatever set of primitives one has, and whatever methods one has for using them, the best thing and the worst thing about programming are the same: the computer will do exactly what you tell it to do. This is a good thing because it means that you can make it do all sorts of fun and useful things. It is a bad thing because when it doesn’t do what you want it to do, you usually have nobody to blame but yourself. There are hundreds of programming languages in the world. There is no best language (though one could nominate some candidates for worst). Different languages are better or worse for different kinds of applications. MATLAB, for example, is an excellent language for manipulating vectors and matrices. C is a good language for writing the programs that control data networks. PHP is a good language for building Web sites. And Python is a good general-purpose language. Each programming language has a set of primitive constructs, a syntax, a static semantics, and a semantics. By analogy with a natural language, e.g., English, the primitive constructs are words, the syntax describes which strings of words constitute well-formed sentences, the static semantics defines which sentences are meaningful, and the semantics defines the meaning of those sentences. The primitive constructs in Python include literals (e.g., the number 3.2 and the string 'abc') and infix operators (e.g., + and /). The syntax of a language defines which strings of characters and symbols are well formed. For example, in English the string “Cat dog boy.” is not a syntactically valid sentence, because the syntax of English does not accept sentences of the form. In Python, the sequence of primitives 3.2 + 3.2 is syntactically well formed, but the sequence 3.2 3.2 is not. The static semantics defines which syntactically valid strings have a meaning. In English, for example, the string “I are big,” is of the form , which is a syntactically acceptable sequence. Nevertheless, it is not valid English, because the noun “I” is singular and the verb “are” is plural. This is an example of a static semantic error. In Python, the sequence 3.2/'abc' is syntactically well formed ( ), but Chapter 1. Getting Started 5 produces a static semantic error since it is not meaningful to divide a number by a string of characters. The semantics of a language associates a meaning with each syntactically correct string of symbols that has no static semantic errors. In natural languages, the semantics of a sentence can be ambiguous. For example, the sentence “I cannot praise this student too highly,” can be either flattering or damning. Programming languages are designed so that each legal program has exactly one meaning. Though syntax errors are the most common kind of error (especially for those learning a new programming language), they are the least dangerous kind of error. Every serious programming language does a complete job of detecting syntactic errors, and will not allow users to execute a program with even one syntactic error. Furthermore, in most cases the language system gives a sufficiently clear indication of the location of the error that it is obvious what needs to be done to fix it. The situation with respect to static semantic errors is a bit more complex. Some programming languages, e.g., Java, do a lot of static semantic checking before allowing a program to be executed. Others, e.g., C and Python (alas), do relatively less static semantic checking. Python does do a considerable amount of static semantic checking while running a program. However, it does not catch all static semantic errors. When these errors are not detected, the behavior of a program is often unpredictable. We will see examples of this later in the book. One doesn’t usually speak of a program as having a semantic error. If a program has no syntactic errors and no static semantic errors, it has a meaning, i.e., it has semantics. Of course, that isn’t to say that it has the semantics that its creator intended it to have. When a program means something other than what its creator thinks it means, bad things can happen. What might happen if the program has an error, and behaves in an unintended way? It might crash, i.e., stop running and produce some sort of obvious indication that it has done so. In a properly designed computing system, when a program crashes it does not do damage to the overall system. Of course, some very popular computer systems don’t have this nice property. Almost everyone who uses a personal computer has run a program that has managed to make it necessary to restart the whole computer. Or it might keep running, and running, and running, and never stop. If one has no idea of approximately how long the program is supposed to take to do its job, this situation can be hard to recognize. Or it might run to completion and produce an answer that might, or might not, be correct. 6 Chapter 1. Getting Started Each of these is bad, but the last of them is certainly the worst, When a program appears to be doing the right thing but isn’t, bad things can follow. Fortunes can be lost, patients can receive fatal doses of radiation therapy, airplanes can crash, etc. Whenever possible, programs should be written in such a way that when they don’t work properly, it is self-evident. We will discuss how to do this throughout the book. Finger Exercise: Computers can be annoyingly literal. If you don’t tell them exactly what you want them to do, they are likely to do the wrong thing. Try writing an algorithm for driving between two destinations. Write it the way you would for a person, and then imagine what would happen if that person executed the algorithm exactly as written. For example, how many traffic tickets might they get? 2 INTRODUCTION TO PYTHON Though each programming language is different (though not as different as their designers would have us believe), there are some dimensions along which they can be related. Low-level versus high-level refers to whether we program using instructions and data objects at the level of the machine (e.g., move 64 bits of data from this location to that location) or whether we program using more abstract operations (e.g., pop up a menu on the screen) that have been provided by the language designer. General versus targeted to an application domain refers to whether the primitive operations of the programming language are widely applicable or are fine-tuned to a domain. For example Adobe Flash is designed to facilitate adding animation and interactivity to Web pages, but you wouldn’t want to use it build a stock portfolio analysis program. Interpreted versus compiled refers to whether the sequence of instructions written by the programmer, called source code, is executed directly (by an interpreter) or whether it is first converted (by a compiler) into a sequence of machine-level primitive operations. (In the early days of computers, people had to write source code in a language that was very close to the machine code that could be directly interpreted by the computer hardware.) There are advantages to both approaches. It is often easier to debug programs written in languages that are designed to be interpreted, because the interpreter can produce error messages that are easy to correlate with the source code. Compiled languages usually produce programs that run more quickly and use less space. In this book, we use Python. However, this book is not about Python. It will certainly help readers learn Python, and that’s a good thing. What is much more important, however, is that careful readers will learn something about how to write programs that solve problems. This skill can be transferred to any programming language. Python is a general-purpose programming language that can be used effectively to build almost any kind of program that does not need direct access to the computer’s hardware. Python is not optimal for programs that have high reliability constraints (because of its weak static semantic checking) or that are built and maintained by many people or over a long period of time (again because of the weak static semantic checking). However, Python does have several advantages over many other languages. It is a relatively simple language that is easy to learn. Because Python is designed to be interpreted, it can provide the kind of runtime feedback that is especially helpful to novice programmers. There are also a large number of freely available libraries that interface to Python and provide useful extended functionality. Several of those are used in this book. 8 Chapter 2. Introduction to Python Now we are ready to start learning some of the basic elements of Python. These are common to almost all programming languages in concept, though not necessarily in detail. The reader should be forewarned that this book is by no means a comprehensive introduction to Python. We use Python as a vehicle to present concepts related to computational problem solving and thinking. The language is presented in dribs and drabs, as needed for this ulterior purpose. Python features that we don’t need for that purpose are not presented at all. We feel comfortable about not covering the entire language because there are excellent online resources describing almost every aspect of the language. When we teach the course on which this book is based, we suggest to the students that they rely on these free online resources for Python reference material. Python is a living language. Since its introduction by Guido von Rossum in 1990, it has undergone many changes. For the first decade of its life, Python was a little known and little used language. That changed with the arrival of Python 2.0 in 2000. In addition to incorporating a number of important improvements to the language itself, it marked a shift in the evolutionary path of the language. A large number of people began developing libraries that interfaced seamlessly with Python, and continuing support and development of the Python ecosystem became a community-based activity. Python 3.0 was released at the end of 2008. This version of Python cleaned up many of the inconsistencies in the design of the various releases of Python 2 (often referred to as Python 2.x). However, it was not backward compatible. That meant that most programs written for earlier versions of Python could not be run using implementations of Python 3.0. The backward incompatibility presents a problem for this book. In our view, Python 3.0 is clearly superior to Python 2.x. However, at the time of this writing, some important Python libraries still do not work with Python 3. We will, therefore, use Python 2.7 (into which many of the most important features of Python 3 have been “back ported”) throughout this book. 2.1 The Basic Elements of Python A Python program, sometimes called a script, is a sequence of definitions and commands. These definitions are evaluated and the commands are executed by the Python interpreter in something called the shell. Typically, a new shell is created whenever execution of a program begins. In most cases, a window is associated with the shell. We recommend that you start a Python shell now, and use it to try the examples contained in the remainder of the chapter. And, for that matter, later in the book as well. A command, often called a statement, instructs the interpreter to do something. For example, the statement print 'Yankees rule!' instructs the interpreter to output the string Yankees rule! to the window associated with the shell. Chapter 2. Introduction to Python 9 The sequence of commands print 'Yankees rule!' print 'But not in Boston!' print 'Yankees rule,', 'but not in Boston!' causes the interpreter to produce the output Yankees rule! But not in Boston! Yankees rule, but not in Boston! Notice that two values were passed to print in the third statement. The print command takes a variable number of values and prints them, separated by a space character, in the order in which they appear.6 2.1.1 Objects, Expressions, and Numerical Types Objects are the core things that Python programs manipulate. Every object has a type that defines the kinds of things that programs can do with objects of that type. Types are either scalar or non-scalar. Scalar objects are indivisible. Think of them as the atoms of the language.7 Non-scalar objects, for example strings, have internal structure. Python has four types of scalar objects: int is used to represent integers. Literals of type int are written in the way we typically denote integers (e.g., -3 or 5 or 10002). float is used to represent real numbers. Literals of type float always include a decimal point (e.g., 3.0 or 3.17 or -28.72). (It is also possible to write literals of type float using scientific notation. For example, the literal 1.6E3 stands for 1.6*103, i.e., it is the same as 1600.0.) You might wonder why this type is not called real. Within the computer, values of type float are stored in the computer as floating point numbers. This representation, which is used by all modern programming languages, has many advantages. However, under some situations it causes floating point arithmetic to behave in ways that are slightly different from arithmetic on real numbers. We discuss this in Section 3.4. bool is used to represent the Boolean values True and False. None is a type with a single value. We will say more about this when we get to variables. Objects and operators can be combined to form expressions, each of which evaluates to an object of some type. We will refer to this as the value of the expression. For example, the expression 3 + 2 denotes the object 5 of type int, and the expression 3.0 + 2.0 denotes the object 5.0 of type float. 6 In Python 3, print is a function rather than a command. One would therefore write print('Yankees rule!', 'but not in Boston'). 7 Yes, atoms are not truly indivisible. However, splitting them is not easy, and doing so can have consequences that are not always desirable. 10 Chapter 2. Introduction to Python The == operator is used to test whether two expressions evaluate to the same value, and the != operator is used to test whether two expressions evaluate to different values. The symbol >>> is a shell prompt indicating that the interpreter is expecting the user to type some Python code into the shell. The line below the line with the prompt is produced when the interpreter evaluates the Python code entered at the prompt, as illustrated by the following interaction with the interpreter: >>> 3 + 2 5 >>> 3.0 + 2.0 5.0 >>> 3 != 2 True The built-in Python function type can be used to find out the type of an object: >>> type(3) >>> type(3.0) The operators on types int and float are listed in Figure 2.1. i+j is the sum of i and j. If i and j are both of type int, the result is an int. If either of them is a float, the result is a float. i–j is i minus j. If i and j are both of type int, the result is an int. If either of them is a float, the result is a float. i*j is the product of i and j. If i and j are both of type int, the result is an int. If either of them is a float, the result is a float. i//j is integer division. For example, the value of 6//2 is the int 3 and the value of 6//4 is the int 1. The value is 1 because integer division returns the quotient and ignores the remainder. i/j is i divided by j. In Python 2.7, when i and j are both of type int, the result is also an int, otherwise the result is a float. In this book, we will never use / to divide one int by another. We will use // to do that. (In Python 3, the / operator, thank goodness, always returns a float. For example, in Python 3 the value of 6/4 is 1.5.) i%j is the remainder when the int i is divided by the int j. It is typically pronounced “i mod j,” which is short for “i modulo j.” i**j is i raised to the power j. If i and j are both of type int, the result is an int. If either of them is a float, the result is a float. The comparison operators are == (equal), != (not equal), > (greater), >= (at least), >> name = raw_input('Enter your name: ') Enter your name: George Washington >>> print 'Are you really', name, '?' Are you really George Washington ? >>> print 'Are you really ' + name + '?' Are you really George Washington? Notice that the first print statement introduces a blank before the “?” It does this because when print is given multiple arguments it places a blank space between the values associated with the arguments. The second print statement uses concatenation to produce a string that does not contain the superfluous blank and passes this as the only argument to print. Now consider, >>> n = raw_input('Enter an int: ') Enter an int: 3 >>> print type(n) Notice that the variable n is bound to the str '3' not the int 3. So, for example, the value of the expression n*4 is '3333' rather than 12. The good news is that whenever a string is a valid literal of some type, a type conversion can be applied to it. Type conversions (also called type casts) are used often in Python code. We use the name of a type to convert values to that type. So, for example, the value of int('3')*4 is 12. When a float is converted to an int, the number is truncated (not rounded), e.g., the value of int(3.9) is the int 3. 2.4 Iteration A generic iteration (also called looping) mechanism is depicted in Figure 2.4. Like a conditional statement it begins with a test. If the test evaluates to True, the program executes the loop body once, and then goes back to reevaluate the test. This process is repeated until the test evaluates to False, after which control passes to the code following the iteration statement. 12 Python 3 has only one command, input. Somewhat confusingly, Python 3’s input has the same semantics as raw_input in Python 2.7. Go figure. Chapter 2. Introduction to Python 19 Figure 2.4 Flow chart for iteration Consider the following example: # Square an integer, the hard way x = 3 ans = 0 itersLeft = x while (itersLeft != 0): ans = ans + x itersLeft = itersLeft - 1 print str(x) + '*' + str(x) + ' = ' + str(ans) The code starts by binding the variable x to the integer 3. It then proceeds to square x by using repetitive addition. The following table shows the value associated with each variable each time the test at the start of the loop is reached. We constructed it by hand-simulating the code, i.e., we pretended to be a Python interpreter and executed the program using pencil and paper. Using pencil and paper might seem kind of quaint, but it is an excellent way to understand how a program behaves.13 test # x ans itersLeft 1 3 0 3 2 3 3 2 3 3 6 1 4 3 9 0 The fourth time the test is reached, it evaluates to False and flow of control proceeds to the print statement following the loop. For what values of x will this program terminate? If x == 0, the initial value of itersLeft will also be 0, and the loop body will never be executed. If x > 0, the initial value of itersLeft will be greater than 0, and the loop body will be executed. 13 It is also possible to hand-simulate a program using pen and paper, or even a text editor. 20 Chapter 2. Introduction to Python Each time the loop body is executed, the value of itersLeft is decreased by exactly 1. This means that if itersLeft started out greater than 0, after some finite number of iterations of the loop, itersLeft == 0. At this point the loop test evaluates to False, and control proceeds to the code following the while statement. What if the value of x is -1? Something very bad happens. Control will enter the loop, and each iteration will move itersLeft farther from 0 rather than closer to it. The program will therefore continue executing the loop forever (or until something else bad, e.g., an overflow error, occurs). How might we remove this flaw in the program? Initializing itersLeft to the absolute value of x almost works. The loop terminates, but it prints a negative value. If the assignment statement inside the loop is also changed, to ans = ans+abs(x), the code works properly. We have now covered pretty much everything about Python that we need to know to start writing interesting programs that deal with numbers and strings. We now take a short break from learning the language. In the next chapter, we use Python to solve some simple problems. Finger exercise: Write a program that asks the user to input 10 integers, and then prints the largest odd number that was entered. If no odd number was entered, it should print a message to that effect. 3 SOME SIMPLE NUMERICAL PROGRAMS Now that we have covered some basic Python constructs, it is time to start thinking about how we can combine those constructs to write some simple programs. Along the way, we’ll sneak in a few more language constructs and some algorithmic techniques. 3.1 Exhaustive Enumeration The code in Figure 3.1 prints the integer cube root, if it exists, of an integer. If the input is not a perfect cube, it prints a message to that effect. #Find the cube root of a perfect cube x = int(raw_input('Enter an integer: ')) ans = 0 while ans**3 < abs(x): ans = ans + 1 if ans**3 != abs(x): print x, 'is not a perfect cube' else: if x < 0: ans = -ans print 'Cube root of', x,'is', ans Figure 3.1 Using exhaustive enumeration to find the cube root For what values of x will this program terminate? The answer is, “all integers.” This can be argued quite simply. The value of the expression ans**3 starts at 0, and gets larger each time through the loop. When it reaches or exceeds abs(x), the loop terminates. Since abs(x) is always positive there are only a finite number of iterations before the loop must terminate. Whenever you write a loop, you should think about an appropriate decrementing function. This is a function that has the following properties: 1. It maps a set of program variables into an integer. 2. When the loop is entered, its value is nonnegative. 3. When its value is = abs(x): break if ans**3 != abs(x): print x, 'is not a perfect cube' else: if x < 0: ans = -ans print 'Cube root of', x,'is', ans Figure 3.2 Using for and break statements The for statement can be used to conveniently iterate over characters of a string. For example, total = 0 for c in '123456789': total = total + int(c) print total sums the digits in the string denoted by the literal '123456789' and prints the total. Finger exercise: Let s be a string that contains a sequence of decimal numbers separated by commas, e.g., s = '1.23,2.4,3.123'. Write a program that prints the sum of the numbers in s. 3.3 Approximate Solutions and Bisection Search Imagine that someone asks you to write a program that finds the square root of any nonnegative number. What should you do? You should probably start by saying that you need a better problem statement. For example, what should the program do if asked to find the square root of 2? The square root of 2 is not a rational number. This means that there is no way to precisely represent its value as a finite string of digits (or as a float), so the problem as initially stated cannot be solved. The right thing to have asked for is a program that finds an approximation to the square root—i.e., an answer that is close enough to the actual square root to be useful. We will return to this issue in considerable detail later in the book. But for now, let’s think of “close enough” as an answer that lies within some constant, call it epsilon, of the actual answer. The code in Figure 3.3 implements an algorithm that finds an approximation to a square root. It uses an operator, +=, that we have not previously used. The code ans += step is semantically equivalent to the more verbose code ans = ans+step. The operators -= and *= work similarly. 26 Chapter 3. Some Simple Numerical Programs x = 25 epsilon = 0.01 step = epsilon**2 numGuesses = 0 ans = 0.0 while abs(ans**2 - x) >= epsilon and ans = epsilon: print 'Failed on square root of', x else: print ans, 'is close to square root of', x Figure 3.3 Approximating the square root using exhaustive enumeration Once again, we are using exhaustive enumeration. Notice that this method for finding the square root has nothing in common with the way of finding square roots using a pencil that you might have learned in middle school. It is often the case that the best way to solve a problem with a computer is quite different from how one would approach the problem by hand. When the code is run, it prints numGuesses = 49990 4.999 is close to square root of 25 Should we be disappointed that the program didn’t figure out that 25 is a perfect square and print 5? No. The program did what it was intended to do. Though it would have been OK to print 5, doing so is no better than printing any value close enough to 5. What do you think will happen if we set x = 0.25? Will it find a root close to 0.5? Nope. Exhaustive enumeration is a search technique that works only if the set of values being searched includes the answer. In this case, we are enumerating the values between 0 and x. When x is between 0 and 1, the square root of x does not lie in this interval. One way to fix this is to change the first line of the while loop to while abs(ans**2 - x) >= epsilon and ans*ans n2. So, we can think of the square root of x as lying somewhere on the line 0_________________________________________________________max and start searching that interval. Since we don’t necessarily know where to start searching, let’s start in the middle. 0__________________________guess__________________________max If that is not the right answer (and it won’t be most of the time), ask whether it is too big or too small. If it is too big, we know that the answer must lie to the left. If it is too small, we know that the answer must lie to the right. We then repeat the process on the smaller interval. Figure 3.4 contains an implementation and test of this algorithm. 28 Chapter 3. Some Simple Numerical Programs x = 25 epsilon = 0.01 numGuesses = 0 low = 0.0 high = max(1.0, x) ans = (high + low)/2.0 while abs(ans**2 - x) >= epsilon: print 'low =', low, 'high =', high, 'ans =', ans numGuesses += 1 if ans**2 < x: low = ans else: high = ans ans = (high + low)/2.0 print 'numGuesses =', numGuesses print ans, 'is close to square root of', x Figure 3.4 Using bisection search to approximate square root When run, it prints low = 0.0 high = 25 ans = 12.5 low = 0.0 high = 12.5 ans = 6.25 low = 0.0 high = 6.25 ans = 3.125 low = 3.125 high = 6.25 ans = 4.6875 low = 4.6875 high = 6.25 ans = 5.46875 low = 4.6875 high = 5.46875 ans = 5.078125 low = 4.6875 high = 5.078125 ans = 4.8828125 low = 4.8828125 high = 5.078125 ans = 4.98046875 low = 4.98046875 high = 5.078125 ans = 5.029296875 low = 4.98046875 high = 5.029296875 ans = 5.0048828125 low = 4.98046875 high = 5.0048828125 ans = 4.99267578125 low = 4.99267578125 high = 5.0048828125 ans = 4.99877929688 low = 4.99877929688 high = 5.0048828125 ans = 5.00183105469 numGuesses = 13 5.00030517578 is close to square root of 25 Notice that it finds a different answer than our earlier algorithm. That is perfectly fine, since it still meets the problem statement. More important, notice that at each iteration the size of the space to be searched is cut in half. Because it divides the search space in half at each step, it is called a bisection search. Bisection search is a huge improvement over our earlier algorithm, which reduced the search space by only a small amount at each iteration. Let us try x = 123456 again. This time the program takes only thirty guesses to find an acceptable answer. How about x = 123456789 ? It takes only forty-five guesses. There is nothing special about the fact that we are using this algorithm to find square roots. For example, by changing a couple of 2’s to 3’s, we can use it to approximate a cube root of a nonnegative number. In the next chapter we will introduce a language mechanism that allows us to generalize this code to find any root. Finger exercise: What would the code in Figure 3.4 do if the statement x = 25 were replaced by x = -25? Chapter 3. Some Simple Numerical Programs 29 Finger exercise: What would have to be changed to make the code in Figure 3.4 work for finding an approximation to the cube root of both negative and positive numbers? (Hint: think about changing low to ensure that the answer lies within the region being searched.) 3.4 A Few Words About Using Floats Most of the time, numbers of type float provide a reasonably good approximation to real numbers. But “most of the time” is not all of the time, and when they don’t it can lead to surprising consequences. For example, try running the code x = 0.0 for i in range(10): x = x + 0.1 if x == 1.0: print x, '= 1.0' else: print x, 'is not 1.0' Perhaps you, like most people, find it doubly surprising that it prints, 1.0 is not 1.0 Why does it get to the else clause in the first place? And if it somehow does get there, why is it printing such a nonsensical phrase? To understand why this happens, we need to understand how floating point numbers are represented in the computer during a computation. To understand that, we need to understand binary numbers. When you first learned about decimal numbers, i.e., numbers base 10, you learned that a decimal number is represented by a sequence of the digits 0123456789. The rightmost digit is the 100 place, the next digit towards the left the 101 place, etc. For example, the sequence of decimal digits 302 represents 3*100 + 0*10 + 2*1. How many different numbers can be represented by a sequence of length n? A sequence of length one can represent any one of ten numbers (0 - 9). A sequence of length two can represent one hundred different numbers (0-99). More generally, with a sequence of length n, one can represent 10n different numbers. Binary numbers—numbers base 2—work similarly. A binary number is represented by a sequence of digits each of which is either 0 or 1. These digits are often called bits. The rightmost digit is the 20 place, the next digit towards the left the 21 place, etc. For example, the sequence of binary digits 101 represents 1*4 + 0*2 + 1*1 = 5. How many different numbers can be represented by a sequence of length n? 2n. Finger exercise: What is the decimal equivalent of the binary number 10011? 30 Chapter 3. Some Simple Numerical Programs Perhaps because most people have ten fingers, we seem to like to use decimals to represent numbers. On the other hand, all modern computer systems represent numbers in binary. This is not because computers are born with two fingers. It is because it is easy to build hardware switches, i.e., devices that can be in only one of two states, on or off. That the computer uses a binary representation and people a decimal representation can lead to occasional cognitive dissonance. In almost modern programming languages non-integer numbers are implemented using a representation called floating point. For the moment, let’s pretend that the internal representation is in decimal. We would represent a number as a pair of integers—the significant digits of the number and an exponent. For example, the number 1.949 would be represented as the pair (1949, -3), which stands for the product 1949 X 10-3. The number of significant digits determines the precision with which numbers can be represented. If for example, there were only two significant digits, the number 1.949 could not be represented exactly. It would have to be converted to some approximation of 1.949, in this case 1.9. That approximation is called the rounded value. Modern computers use binary, not decimal, representations. We represent the significant digits and exponents in binary rather than decimal and raise 2 rather than 10 to the exponent. For example, the number 0.625 (5/8) would be represented as the pair (101, -11); because 5/8 is 0.101 in binary and -11 is the binary representation of -3, the pair (101, -11) stands for 5 X 2-3 = 5/8 = 0.625. What about the decimal fraction 1/10, which we write in Python as 0.1? The best we can do with four significant binary digits is (0011, -101). This is equivalent to 3/32, i.e., 0.09375. If we had five significant binary digits, we would represent 0.1 as (11001, -1000), which is equivalent to 25/256, i.e., 0.09765625. How many significant digits would we need to get an exact floating point representation of 0.1? An infinite number of digits! There do not exist integers sig and exp such that sig * 2-exp equals 0.1. So no matter how many bits Python (or any other language) chooses to use to represent floating point numbers, it will be able to represent only an approximation to 0.1. In most Python implementations, there are 53 bits of precision available for floating point numbers, so the significant digits stored for the decimal number 0.1 will be 11001100110011001100110011001100110011001100110011001 This is equivalent to the decimal number 0.1000000000000000055511151231257827021181583404541015625 Pretty close to 1/10, but not exactly 1/10. Chapter 3. Some Simple Numerical Programs 31 Returning to the original mystery, why does x = 0.0 for i in range(10): x = x + 0.1 if x == 1.0: print x, '= 1.0' else: print x, 'is not 1.0' print 1.0 is not 1.0 We now see that the test x == 1.0 produces the result False because the value to which x is bound is not exactly 1.0. What gets printed if we add to the end of the else clause the code print x == 10.0*0.1? It prints False because during at least one iteration of the loop Python ran out of significant digits and did some rounding. It’s not what our elementary school teachers taught us, but adding 0.1 ten times does not produce the same value as multiplying 0.1 by 10. Finally, why does the code print x print 1.0 rather than the actual value of the variable x? Because the designers of Python thought that would be convenient for users if print did some automatic rounding. This is probably an accurate assumption most of the time. However, it is important to keep in mind that what is being displayed does not necessarily exactly match the value stored in the machine. By the way, if you want to explicitly round a floating point number, use the round function. The expression round(x, numDigits) returns the floating point number equivalent to rounding the value of x to numDigits decimal digits following the decimal point. For example print round(2**0.5, 3) will print 1.414 as an approximation to the square root of 2. Does the difference between real and floating point numbers really matter? Most of the time, mercifully, it does not. However, one thing that is almost always worth worrying about is tests for equality. As we have seen, using == to compare two floating point values can produce a surprising result. It is almost always more appropriate to ask whether two floating point values are close enough to each other, not whether they are identical. So, for example, it is better to write abs(x-y) < 0.0001 rather than x == y. Another thing to worry about is the accumulation of rounding errors. Most of the time things work out OK because sometimes the number stored in the computer is a little bigger than intended, and sometimes it is a little smaller than intended. However, in some programs, the errors will all be in the same direction and accumulate over time. 32 Chapter 3. Some Simple Numerical Programs 3.5 Newton-Raphson The most commonly used approximation algorithm is usually attributed to Isaac Newton. It is typically called Newton’s method, but is sometimes referred to as the Newton-Raphson method.15 It can be used to find the real roots of many functions, but we shall look at it only in the context of finding the real roots of a polynomial with one variable. The generalization to polynomials with multiple variables is straightforward both mathematically and algorithmically. A polynomial with one variable (by convention, we will write the variable as x) is either zero or the sum of a finite number of nonzero terms, e.g., 3x2 + 2x + 3. Each term, e.g., 3x2, consists of a constant (the coefficient of the term, 3 in this case) multiplied by the variable (x in this case) raised to a nonnegative integer exponent (2 in this case). The exponent on a variable in a term is called the degree of that term. The degree of a polynomial is the largest degree of any single term. Some examples are, 3 (degree 0), 2.5x + 12 (degree 1), and 3x2 (degree 2). In contrast, 2/x and x0.5 are not polynomials. If p is a polynomial and r a real number, we will write p(r) to stand for the value of the polynomial when x = r. A root of the polynomial p is a solution to the equation p = 0, i.e., an r such that p(r) = 0. So, for example, the problem of finding an approximation to the square root of 24 can be formulated as finding an x such that x2 – 24 ≈ 0. Newton proved a theorem that implies that if a value, call it guess, is an approximation to a root of a polynomial, then guess – p(guess)/p’(guess), where p’ is the first derivative of p, is a better approximation.16 For any constant k and any coefficient c, the first derivative of cx2 + k is 2cx. For example, the first derivative of x2 – k is 2x. Therefore, we know that we can improve on the current guess, call it y, by choosing as our next guess y - (y2 - k)/2y. This is called successive approximation. Figure 3.5 contains code illustrating how to use this idea to quickly find an approximation to the square root. 15 Joseph Raphson published a similar method about the same time as Newton. 16 The first derivative of a function f(x) can be thought of as expressing how the value of f(x) changes with respect to changes in x. If you haven’t previously encountered derivatives, don’t worry. You don’t need to understand them, or for that matter polynomials, to understand the implementation of Newton’s method. Chapter 3. Some Simple Numerical Programs 33 #Newton-Raphson for square root #Find x such that x**2 - 24 is within epsilon of 0 epsilon = 0.01 k = 24.0 guess = k/2.0 while abs(guess*guess - k) >= epsilon: guess = guess - (((guess**2) - k)/(2*guess)) print 'Square root of', k, 'is about', guess Figure 3.5 Newton-Raphson method Finger exercise: Add some code to the implementation of Newton-Raphson that keeps track of the number of iterations used to find the root. Use that code as part of a program that compares the efficiency of Newton-Raphson and bisection search. (You should discover that Newton-Raphson is more efficient.) 4 FUNCTIONS, SCOPING, AND ABSTRACTION So far, we have introduced numbers, assignments, input/output, comparisons, and looping constructs. How powerful is this subset of Python? In a theoretical sense, it is as powerful as you will ever need. Such languages are called Turing complete. This means that if a problem can be solved via computation, it can be solved using only those statements you have already seen. Which isn’t to say that you should use only these statements. At this point we have covered a lot of language mechanisms, but the code has been a single sequence of instructions, all merged together. For example, in the last chapter we looked at the code in Figure 4.1. x = 25 epsilon = 0.01 numGuesses = 0 low = 0.0 high = max(1.0, x) ans = (high + low)/2.0 while abs(ans**2 - x) >= epsilon: numGuesses += 1 if ans**2 < x: low = ans else: high = ans ans = (high + low)/2.0 print 'numGuesses =', numGuesses print ans, 'is close to square root of', x Figure 4.1 Using bisection search to approximate square root This is a reasonable piece of code, but it lacks general utility. It works only for values denoted by the variables x and epsilon. This means that if we want to reuse it, we need to copy the code, possibly edit the variable names, and paste it where we want it. Because of this we cannot easily use this computation inside of some other, more complex, computation. Furthermore, if we want to compute cube roots rather than square roots, we have to edit the code. If we want a program that computes both square and cube roots (or for that matter square roots in two different places), the program would contain multiple chunks of almost identical code. This is a very bad thing. The more code a program contains, the more chance there is for something to go wrong, and the harder the code is to maintain. Imagine, for example, that there was an error in the initial implementation of square root, and that the error came to light when testing the program. It would be all too easy to fix the implementation of square root in one place and forget that there was similar code elsewhere that was also in need of repair. Python provides several linguistic features that make it relatively easy to generalize and reuse code. The most important is the function. Chapter 4. Functions, Scoping, and Abstraction 35 4.1 Functions and Scoping We’ve already used a number of built-in functions, e.g., max and abs in Figure 4.1. The ability for programmers to define and then use their own functions, as if they were built-in, is a qualitative leap forward in convenience. 4.1.1 Function Definitions In Python each function definition is of the form17 def name of function (list of formal parameters): body of function For example, we could define the function max18 by the code def max(x, y): if x > y: return x else: return y def is a reserved word that tells Python that a function is about to be defined. The function name (max in this example) is simply a name that is used to refer to the function. The sequence of names (x,y in this example) within the parentheses following the function name are the formal parameters of the function. When the function is used, the formal parameters are bound (as in an assignment statement) to the actual parameters (often referred to as arguments) of the function invocation (also referred to as a function call). For example, the invocation max(3, 4) binds x to 3 and y to 4. The function body is any piece of Python code. There is, however, a special statement, return, that can be used only within the body of a function. A function call is an expression, and like all expressions it has a value. That value is the value returned by the invoked function. For example, the value of the expression max(3,4)*max(3,2) is 12, because the first invocation of max returns the int 4 and the second returns the int 3. Note that execution of a return statement terminates that invocation of the function. To recapitulate, when a function is called 1. The expressions that make up the actual parameters are evaluated, and the formal parameters of the function are bound to the resulting values. For example, the invocation max(3+4, z) will bind the formal parameter x to 7 and the formal parameter y to whatever value the variable z has when the invocation is evaluated. 17 Recall that italics is used to describe Python code. 18 In practice, you would probably use the built-in function max, rather than define your own. 36 Chapter 4. Functions, Scoping, and Abstraction 2. The point of execution (the next instruction to be executed) moves from the point of invocation to the first statement in the body of the function. 3. The code in the body of the function is executed until either a return statement is encountered, in which case the value of the expression following the return becomes the value of the function invocation, or there are no more statements to execute, in which case the function returns the value None. (If no expression follows the return, the value of the invocation is None.) 4. The value of the invocation is the returned value. 5. The point of execution is transferred back to the code immediately following the invocation. Parameters provide something called lambda abstraction,19 allowing programmers to write code that manipulates not specific objects, but instead whatever objects the caller of the function chooses to use as actual parameters. Finger exercise: Write a function isIn that accepts two strings as arguments and returns True if either string occurs anywhere in the other, and False otherwise. Hint: you might want to use the built-in str operation in. 4.1.2 Keyword Arguments and Default Values In Python, there are two ways that formal parameters get bound to actual parameters. The most common method, which is the only one we have used thus far, is called positional—the first formal parameter is bound to the first actual parameter, the second formal to the second actual, etc. Python also supports what it calls keyword arguments, in which formals are bound to actuals using the name of the formal parameter. Consider the function definition in Figure 4.2. The function printName assumes that firstName and lastName are strings and that reverse is a Boolean. If reverse == True, it prints lastName, firstName, otherwise it prints firstName lastName. def printName(firstName, lastName, reverse): if reverse: print lastName + ', ' + firstName else: print firstName, lastName Figure 4.2 Function that prints a name Each of the following is an equivalent invocation of printName: printName('Olga', 'Puchmajerova', False) printName('Olga', 'Puchmajerova', False) printName('Olga', 'Puchmajerova', reverse = False) printName('Olga', lastName = 'Puchmajerova', reverse = False) printName(lastName='Puchmajerova', firstName='Olga', reverse=False) 19 The name “lambda abstraction” is derived from some mathematics developed by Alonzo Church in the 1930s and 1940s. Chapter 4. Functions, Scoping, and Abstraction 37 Though the keyword arguments can appear in any order in the list of actual parameters, it is not legal to follow a keyword argument with a non-keyword argument. Therefore, an error message would be produced by printName('Olga', lastName = 'Puchmajerova', False) Keyword arguments are commonly used in conjunction with default parameter values. We can, for example, write def printName(firstName, lastName, reverse = False): if reverse: print lastName + ', ' + firstName else: print firstName, lastName Default values allow programmers to call a function with fewer than the specified number of arguments. For example, printName('Olga', 'Puchmajerova') printName('Olga', 'Puchmajerova', True) printName('Olga', 'Puchmajerova', reverse = True) will print Olga Puchmajerova Puchmajerova, Olga Puchmajerova, Olga The last two invocations of printName are semantically equivalent. The last one has the advantage of providing some documentation for the perhaps mysterious parameter True. 4.1.3 Scoping Let’s look at another small example, def f(x): #name x used as formal parameter y = 1 x = x + y print 'x =', x return x x = 3 y = 2 z = f(x) #value of x used as actual parameter print 'z =', z print 'x =', x print 'y =', y When run, this code prints, x = 4 z = 4 x = 3 y = 2 What is going on here? At the call of f, the formal parameter x is locally bound to the value of the actual parameter x. It is important to note that though the actual and formal parameters have the same name, they are not the same variable. Each function defines a new name space, also called a scope. The 38 Chapter 4. Functions, Scoping, and Abstraction formal parameter x and the local variable y that are used in f exist only within the scope of the definition of f. The assignment statement x = x + y within the function body binds the local name x to the object 4. The assignments in f have no effect at all on the bindings of the names x and y that exist outside the scope of f. Here’s one way to think about this: At top level, i.e., the level of the shell, a symbol table keeps track of all names defined at that level and their current bindings. When a function is called, a new symbol table (sometimes called a stack frame) is created. This table keeps track of all names defined within the function (including the formal parameters) and their current bindings. If a function is called from within the function body, yet another stack frame is created. When the function completes, its stack frame goes away. In Python, one can always determine the scope of a name by looking at the program text. This is called static or lexical scoping. Figure 4.3 contains a slightly more elaborate example. def f(x): def g(): x = 'abc' print 'x =', x def h(): z = x print 'z =', z x = x + 1 print 'x =', x h() g() print 'x =', x return g x = 3 z = f(x) print 'x =', x print 'z =', z z() Figure 4.3 Nested scopes The history of the stack frames associated with the code in Figure 4.3 is depicted in Figure 4.4. Chapter 4. Functions, Scoping, and Abstraction 39 Figure 4.4 Stack frames The first column contains the set of names known outside the body of the function f, i.e., the variables x and z, and the function name f. The first assignment statement binds x to 3. The assignment statement z = f(x) first evaluates the expression f(x) by invoking the function f with the value to which x is bound. When f is entered, a stack frame is created, as shown in column 2. The names in the stack frame are x (the formal parameter, not the x in the calling context), g and h. The variables g and h are bound to objects of type function. The properties of each of these functions are given by the function definitions within f. When h is invoked from within f, yet another stack frame is created, as shown in column 3. This frame contains only the local variable z. Why does it not also contain x? A name is added to the scope associated with a function only if that name is either a formal parameter of the function or a variable that is bound to an object within the body of the function. In the body of h, x occurs only on the right-hand side of an assignment statement. The appearance of a name (x in this case) that is not bound anywhere in the function body (the body of h) causes the interpreter to search the previous stack frame associated with the scope within which the function is defined (the stack frame associated with f). If the name is found (which it is in this case) the value to which it is bound (4) is used. If it is not found there, an error message is produced. When h returns, the stack frame associated with the invocation of h goes away (i.e., it is popped off the top of the stack), as depicted in column 4. Note that we never remove frames from the middle of the stack, but only the most recently added frame. It is because it has this “last in first out” behavior that we refer to it as a stack (think of a stack of trays waiting to be taken in a cafeteria). Next g is invoked, and a stack frame containing g’s local variable x is added (column 5). When g returns, that frame is popped (column 6). When f returns, the stack frame containing the names associated with f is popped, getting us back to the original stack frame (column 7). Notice that when f returns, even though the variable g no longer exists, the object of type function to which that name was once bound still exists. This is 40 Chapter 4. Functions, Scoping, and Abstraction because functions are objects, and can be returned just like any other kind of object. So, z can be bound to the value returned by f, and the function call z() can be used to invoke the function that was bound to the name g within f—even though the name g is not known outside the context of f. So, what does the code in Figure 4.3 print? It prints x = 4 z = 4 x = abc x = 4 x = 3 z = x = abc The order in which references to a name occur is not germane. If an object is bound to a name anywhere in the function body (even if it occurs in an expression before it appears as the left-hand-side of an assignment), it is treated as local to that function.20 Consider, for example, the code def f(): print x def g(): print x x = 1 x = 3 f() x = 3 g() It prints 3 when f is invoked, but an error message is printed when it encounters the print statement in g because the assignment statement following the print statement causes x to be local to g. And because x is local to g, it has no value when the print statement is executed. Confused yet? It takes most people a bit of time to get their head around scope rules. Don’t let this bother you. For now, charge ahead and start using functions. Most of the time you will find that you only want to use variables that are local to a function, and the subtleties of scoping will be irrelevant. 20 The wisdom of this language design decision is debatable. Chapter 4. Functions, Scoping, and Abstraction 41 4.2 Specifications Figure 4.5 defines a function, findRoot, that generalizes the bisection search we used to find square roots in Figure 4.1. It also contains a function, testFindRoot, that can be used to test whether or not findRoot works as intended. The function testFindRoot is almost as long as findRoot itself. To inexperienced programmers, writing test functions such as this often seems to be a waste of effort. Experienced programmers know, however, that an investment in writing testing code often pays big dividends. It certainly beats typing test cases into the shell over and over again during debugging (the process of finding out why a program does not work, and then fixing it). It also forces us to think about which tests are likely to be most illuminating. The text between the triple quotation marks is called a docstring in Python. By convention, Python programmers use docstrings to provide specifications of functions. These docstrings can be accessed using the built-in function help. If we enter the shell and type help(abs), the system will display Help on built-in function abs in module __builtin__: abs(...) abs(number) -> number Return the absolute value of the argument. If the code in Figure 4.5 (below) has been loaded into IDLE, typing help(findRoot) in the shell will display Help on function findRoot in module __main__: findRoot(x, power, epsilon) Assumes x and epsilon int or float, power an int, epsilon > 0 & power >= 1 Returns float y such that y**power is within epsilon of x. If such a float does not exist, it returns None If we type findRoot( in either the shell or the editor, the list of formal parameters and the first line of the docstring will be displayed. 42 Chapter 4. Functions, Scoping, and Abstraction def findRoot(x, power, epsilon): """Assumes x and epsilon int or float, power an int, epsilon > 0 & power >= 1 Returns float y such that y**power is within epsilon of x. If such a float does not exist, it returns None""" if x < 0 and power%2 == 0: return None low = min(-1.0, x) high = max(1.0, x) ans = (high + low)/2.0 while abs(ans**power - x) >= epsilon: if ans**power < x: low = ans else: high = ans ans = (high + low)/2.0 return ans def testFindRoot(): epsilon = 0.0001 for x in (0.25, -0.25, 2, -2, 8, -8): for power in range(1, 4): print 'Testing x = ' + str(x) +\ ' and power = ' + str(power) result = findRoot(x, power, epsilon) if result == None: print ' No root' else: print ' ', result**power, '~=', x Figure 4.5 Finding an approximation to a root A specification of a function defines a contract between the implementer of a function and those who will be writing programs that use the function. We will refer to the users of a function as its clients. This contract can be thought of as containing two parts: 1. Assumptions: These describe conditions that must be met by clients of the function. Typically, they describe constraints on the actual parameters. Almost always, they specify the acceptable set of types for each parameter, and not infrequently some constraints on the value of one or

Introduction to Computation and Programming Using Python (2nd Edition) PDF

Document Details

Tags

Related

Summary

Full Transcript

Upgrade to continue