Algorithms 4th Edition by Robert Sedgewick, Kevin Wayne PDF
Document Details
Robert Sedgewick, Kevin Wayne
Tags
Related
Summary
This book covers a wide variety of computer algorithms and data structures suited for use in computational areas. It is based on complete implementations in Java and detailed performance descriptions. The algorithms are suitable for advanced undergraduate computer science students and those pursuing computer science related fields.
Full Transcript
Algorithms FOURTH EDITION http://avaxhome.ws/blogs/ChrisRedfield This page intentionally left blank Algorithms FOURTH EDITION Robert Sedgewick and Kevin Wayne Princeton University Upper Saddle River,...
Algorithms FOURTH EDITION http://avaxhome.ws/blogs/ChrisRedfield This page intentionally left blank Algorithms FOURTH EDITION Robert Sedgewick and Kevin Wayne Princeton University Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris Madrid Capetown Sydney Tokyo Singapore Mexico City Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. The authors and publisher have taken care in the preparation of this book, but make no expressed or im- plied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact: U.S. Corporate and Government Sales (800) 382-3419 [email protected] For sales outside the United States, please contact: International Sales [email protected] Visit us on the Web: informit.com/aw Cataloging-in-Publication Data is on file with the Library of Congress. Copyright © 2011 Pearson Education, Inc. All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to: Pearson Education, Inc. Rights and Contracts Department 501 Boylston Street, Suite 900 Boston, MA 02116 Fax: (617) 671-3447 ISBN-13: 978-0-321-57351-3 ISBN-10: 0-321-57351-X Text printed in the United States on recycled paper at Courier in Westford, Massachusetts. First printing, March 2011 ______________________________ To Adam, Andrew, Brett, Robbie and especially Linda ______________________________ ___________________ To Jackie and Alex ___________________ CONTENTS Preface.........................viii 1 Fundamentals......................3 1.1 Basic Programming Model 8 1.2 Data Abstraction 64 1.3 Bags, Queues, and Stacks 120 1.4 Analysis of Algorithms 172 1.5 Case Study: Union-Find 216 2 Sorting....................... 243 2.1 Elementary Sorts 244 2.2 Mergesort 270 2.3 Quicksort 288 2.4 Priority Queues 308 2.5 Applications 336 3 Searching...................... 361 3.1 Symbol Tables 362 3.2 Binary Search Trees 396 3.3 Balanced Search Trees 424 3.4 Hash Tables 458 3.5 Applications 486 vi 4 Graphs....................... 515 4.1 Undirected Graphs 518 4.2 Directed Graphs 566 4.3 Minimum Spanning Trees 604 4.4 Shortest Paths 638 5 Strings....................... 695 5.1 String Sorts 702 5.2 Tries 730 5.3 Substring Search 758 5.4 Regular Expressions 788 5.5 Data Compression 810 6 Context....................... 853 Index......................... 933 Algorithms...................... 954 Clients........................ 955 vii PREFACE T his book is intended to survey the most important computer algorithms in use today, and to teach fundamental techniques to the growing number of people in need of knowing them. It is intended for use as a textbook for a second course in computer science, after students have acquired basic programming skills and familiarity with computer systems. The book also may be useful for self-study or as a reference for people engaged in the development of computer systems or applications programs, since it contains implemen- tations of useful algorithms and detailed information on performance characteristics and clients. The broad perspective taken makes the book an appropriate introduction to the field. the study of algorithms and data structures is fundamental to any computer- science curriculum, but it is not just for programmers and computer-science students. Every- one who uses a computer wants it to run faster or to solve larger problems. The algorithms in this book represent a body of knowledge developed over the last 50 years that has become indispensable. From N-body simulation problems in physics to genetic-sequencing problems in molecular biology, the basic methods described here have become essential in scientific research; from architectural modeling systems to aircraft simulation, they have become es- sential tools in engineering; and from database systems to internet search engines, they have become essential parts of modern software systems. And these are but a few examples—as the scope of computer applications continues to grow, so grows the impact of the basic methods covered here. Before developing our fundamental approach to studying algorithms, we develop data types for stacks, queues, and other low-level abstractions that we use throughout the book. Then we survey fundamental algorithms for sorting, searching, graphs, and strings. The last chapter is an overview placing the rest of the material in the book in a larger context. viii Distinctive features The orientation of the book is to study algorithms likely to be of practical use. The book teaches a broad variety of algorithms and data structures and pro- vides sufficient information about them that readers can confidently implement, debug, and put them to work in any computational environment. The approach involves: Algorithms. Our descriptions of algorithms are based on complete implementations and on a discussion of the operations of these programs on a consistent set of examples. Instead of presenting pseudo-code, we work with real code, so that the programs can quickly be put to practical use. Our programs are written in Java, but in a style such that most of our code can be reused to develop implementations in other modern programming languages. Data types. We use a modern programming style based on data abstraction, so that algo- rithms and their data structures are encapsulated together. Applications. Each chapter has a detailed description of applications where the algorithms described play a critical role. These range from applications in physics and molecular biology, to engineering computers and systems, to familiar tasks such as data compression and search- ing on the web. A scientific approach. We emphasize developing mathematical models for describing the performance of algorithms, using the models to develop hypotheses about performance, and then testing the hypotheses by running the algorithms in realistic contexts. Breadth of coverage. We cover basic abstract data types, sorting algorithms, searching al- gorithms, graph processing, and string processing. We keep the material in algorithmic con- text, describing data structures, algorithm design paradigms, reduction, and problem-solving models. We cover classic methods that have been taught since the 1960s and new methods that have been invented in recent years. Our primary goal is to introduce the most important algorithms in use today to as wide an audience as possible. These algorithms are generally ingenious creations that, remarkably, can each be expressed in just a dozen or two lines of code. As a group, they represent problem- solving power of amazing scope. They have enabled the construction of computational ar- tifacts, the solution of scientific problems, and the development of commercial applications that would not have been feasible without them. ix Booksite An important feature of the book is its relationship to the booksite algs4.cs.princeton.edu. This site is freely available and contains an extensive amount of material about algorithms and data structures, for teachers, students, and practitioners, in- cluding: An online synopsis. The text is summarized in the booksite to give it the same overall struc- ture as the book, but linked so as to provide easy navigation through the material. Full implementations. All code in the book is available on the booksite, in a form suitable for program development. Many other implementations are also available, including advanced implementations and improvements described in the book, answers to selected exercises, and client code for various applications. The emphasis is on testing algorithms in the context of meaningful applications. Exercises and answers. The booksite expands on the exercises in the book by adding drill exercises (with answers available with a click), a wide variety of examples illustrating the reach of the material, programming exercises with code solutions, and challenging problems. Dynamic visualizations. Dynamic simulations are impossible in a printed book, but the website is replete with implementations that use a graphics class to present compelling visual demonstrations of algorithm applications. Course materials. A complete set of lecture slides is tied directly to the material in the book and on the booksite. A full selection of programming assignments, with check lists, test data, and preparatory material, is also included. Links to related material. Hundreds of links lead students to background information about applications and to resources for studying algorithms. Our goal in creating this material was to provide a complementary approach to the ideas. Generally, you should read the book when learning specific algorithms for the first time or when trying to get a global picture, and you should use the booksite as a reference when pro- gramming or as a starting point when searching for more detail while online. x Use in the curriculum The book is intended as a textbook in a second course in com- puter science. It provides full coverage of core material and is an excellent vehicle for stu- dents to gain experience and maturity in programming, quantitative reasoning, and problem- solving. Typically, one course in computer science will suffice as a prerequisite—the book is intended for anyone conversant with a modern programming language and with the basic features of modern computer systems. The algorithms and data structures are expressed in Java, but in a style accessible to people fluent in other modern languages. We embrace modern Java abstractions (including generics) but resist dependence upon esoteric features of the language. Most of the mathematical material supporting the analytic results is self-contained (or is labeled as beyond the scope of this book), so little specific preparation in mathematics is required for the bulk of the book, although mathematical maturity is definitely helpful. Ap- plications are drawn from introductory material in the sciences, again self-contained. The material covered is a fundamental background for any student intending to major in computer science, electrical engineering, or operations research, and is valuable for any student with interests in science, mathematics, or engineering. Context The book is intended to follow our introductory text, An Introduction to Pro- gramming in Java: An Interdisciplinary Approach, which is a broad introduction to the field. Together, these two books can support a two- or three-semester introduction to computer sci- ence that will give any student the requisite background to successfully address computation in any chosen field of study in science, engineering, or the social sciences. The starting point for much of the material in the book was the Sedgewick series of Al- gorithms books. In spirit, this book is closest to the first and second editions of that book, but this text benefits from decades of experience teaching and learning that material. Sedgewick’s current Algorithms in C/C++/Java, Third Edition is more appropriate as a reference or a text for an advanced course; this book is specifically designed to be a textbook for a one-semester course for first- or second-year college students and as a modern introduction to the basics and a reference for use by working programmers. xi Acknowledgments This book has been nearly 40 years in the making, so full recogni- tion of all the people who have made it possible is simply not feasible. Earlier editions of this book list dozens of names, including (in alphabetical order) Andrew Appel, Trina Avery, Marc Brown, Lyn Dupré, Philippe Flajolet, Tom Freeman, Dave Hanson, Janet Incerpi, Mike Schid- lowsky, Steve Summit, and Chris Van Wyk. All of these people deserve acknowledgement, even though some of their contributions may have happened decades ago. For this fourth edition, we are grateful to the hundreds of students at Princeton and several other institutions who have suffered through preliminary versions of the work, and to readers around the world for sending in comments and corrections through the booksite. We are grateful for the support of Princeton University in its unwavering commitment to excellence in teaching and learning, which has provided the basis for the development of this work. Peter Gordon has provided wise counsel throughout the evolution of this work almost from the beginning, including a gentle introduction of the “back to the basics” idea that is the foundation of this edition. For this fourth edition, we are grateful to Barbara Wood for her careful and professional copyediting, to Julie Nahil for managing the production, and to many others at Pearson for their roles in producing and marketing the book. All were ex- tremely responsive to the demands of a rather tight schedule without the slightest sacrifice to the quality of the result. Robert Sedgewick Kevin Wayne Princeton, NJ January, 2011 xii This page intentionally left blank ONE Fundamentals 1.1 Basic Programming Model......... 8 1.2 Data Abstraction.............. 64 1.3 Bags, Queues, and Stacks....... 120 1.4 Analysis of Algorithms......... 172 1.5 Case Study: Union-Find......... 216 T he objective of this book is to study a broad variety of important and useful algorithms—methods for solving problems that are suited for computer imple- mentation. Algorithms go hand in hand with data structures—schemes for or- ganizing data that leave them amenable to efficient processing by an algorithm. This chapter introduces the basic tools that we need to study algorithms and data structures. First, we introduce our basic programming model. All of our programs are imple- mented using a small subset of the Java programming language plus a few of our own libraries for input/output and for statistical calculations. Section 1.1 is a summary of language constructs, features, and libraries that we use in this book. Next, we emphasize data abstraction, where we define abstract data types (ADTs) in the service of modular programming. In Section 1.2 we introduce the process of im- plementing an ADT in Java, by specifying an applications programming interface (API) and then using the Java class mechanism to develop an implementation for use in client code. As important and useful examples, we next consider three fundamental ADTs: the bag, the queue, and the stack. Section 1.3 describes APIs and implementations of bags, queues, and stacks using arrays, resizing arrays, and linked lists that serve as models and starting points for algorithm implementations throughout the book. Performance is a central consideration in the study of algorithms. Section 1.4 de- scribes our approach to analyzing algorithm performance. The basis of our approach is the scientific method: we develop hypotheses about performance, create mathematical models, and run experiments to test them, repeating the process as necessary. We conclude with a case study where we consider solutions to a connectivity problem that uses algorithms and data structures that implement the classic union-find ADT. 3 4 CHAPTER 1 Fundamentals Algorithms When we write a computer program, we are generally implementing a method that has been devised previously to solve some problem. This method is often independent of the particular programming language being used—it is likely to be equally appropriate for many computers and many programming languages. It is the method, rather than the computer program itself, that specifies the steps that we can take to solve the problem. The term algorithm is used in computer science to describe a finite, deterministic, and effective problem-solving method suitable for implementa- tion as a computer program. Algorithms are the stuff of computer science: they are central objects of study in the field. We can define an algorithm by describing a procedure for solving a problem in a natural language, or by writing a computer program that implements the procedure, as shown at right for Euclid’s algorithm for finding the greatest common divisor of two numbers, a variant of which was devised over 2,300 years ago. If you are not familiar English-language description with Euclid’s algorithm, you are encour- Compute the greatest common divisor of aged to work Exercise 1.1.24 and Exercise two nonnegative integers p and q as follows: 1.1.25, perhaps after reading Section 1.1. In If q is 0, the answer is p. If not, divide p by q and take the remainder r. The answer is the this book, we use computer programs to de- greatest common divisor of q and r. scribe algorithms. One important reason for doing so is that it makes easier the task of Java-language description checking whether they are finite, determin- public static int gcd(int p, int q) { istic, and effective, as required. But it is also if (q == 0) return p; important to recognize that a program in a int r = p % q; return gcd(q, r); particular language is just one way to express } an algorithm. The fact that many of the al- Euclid’s algorithm gorithms in this book have been expressed in multiple programming languages over the past several decades reinforces the idea that each algorithm is a method suitable for implementation on any computer in any programming language. Most algorithms of interest involve organizing the data involved in the computa- tion. Such organization leads to data structures, which also are central objects of study in computer science. Algorithms and data structures go hand in hand. In this book we take the view that data structures exist as the byproducts or end products of algorithms and that we must therefore study them in order to understand the algorithms. Simple algorithms can give rise to complicated data structures and, conversely, complicated algorithms can use simple data structures. We shall study the properties of many data structures in this book; indeed, we might well have titled the book Algorithms and Data Structures. CHAPTER 1 Fundamentals 5 When we use a computer to help us solve a problem, we typically are faced with a number of possible approaches. For small problems, it hardly matters which approach we use, as long as we have one that correctly solves the problem. For huge problems (or applications where we need to solve huge numbers of small problems), however, we quickly become motivated to devise methods that use time and space efficiently. The primary reason to learn about algorithms is that this discipline gives us the potential to reap huge savings, even to the point of enabling us to do tasks that would otherwise be impossible. In an application where we are processing millions of objects, it is not unusual to be able to make a program millions of times faster by using a well- designed algorithm. We shall see such examples on numerous occasions throughout the book. By contrast, investing additional money or time to buy and install a new computer holds the potential for speeding up a program by perhaps a factor of only 10 or 100. Careful algorithm design is an extremely effective part of the process of solving a huge problem, whatever the applications area. When developing a huge or complex computer program, a great deal of effort must go into understanding and defining the problem to be solved, managing its complex- ity, and decomposing it into smaller subtasks that can be implemented easily. Often, many of the algorithms required after the decomposition are trivial to implement. In most cases, however, there are a few algorithms whose choice is critical because most of the system resources will be spent running those algorithms. These are the types of algorithms on which we concentrate in this book. We study fundamental algorithms that are useful for solving challenging problems in a broad variety of applications areas. The sharing of programs in computer systems is becoming more widespread, so although we might expect to be using a large fraction of the algorithms in this book, we also might expect to have to implement only a small fraction of them. For example, the Java libraries contain implementations of a host of fundamental algorithms. However, implementing simple versions of basic algorithms helps us to understand them bet- ter and thus to more effectively use and tune advanced versions from a library. More important, the opportunity to reimplement basic algorithms arises frequently. The pri- mary reason to do so is that we are faced, all too often, with completely new computing environments (hardware and software) with new features that old implementations may not use to best advantage. In this book, we concentrate on the simplest reasonable implementations of the best algorithms. We do pay careful attention to coding the criti- cal parts of the algorithms, and take pains to note where low-level optimization effort could be most beneficial. The choice of the best algorithm for a particular task can be a complicated process, perhaps involving sophisticated mathematical analysis. The branch of computer sci- ence that comprises the study of such questions is called analysis of algorithms. Many 6 CHAPTER 1 Fundamentals of the algorithms that we study have been shown through analysis to have excellent theoretical performance; others are simply known to work well through experience. Our primary goal is to learn reasonable algorithms for important tasks, yet we shall also pay careful attention to comparative performance of the methods. We should not use an algorithm without having an idea of what resources it might consume, so we strive to be aware of how our algorithms might be expected to perform. Summary of topics As an overview, we describe the major parts of the book, giv- ing specific topics covered and an indication of our general orientation toward the material. This set of topics is intended to touch on as many fundamental algorithms as possible. Some of the areas covered are core computer-science areas that we study in depth to learn basic algorithms of wide applicability. Other algorithms that we discuss are from advanced fields of study within computer science and related fields. The algo- rithms that we consider are the products of decades of research and development and continue to play an essential role in the ever-expanding applications of computation. Fundamentals (Chapter 1) in the context of this book are the basic principles and methodology that we use to implement, analyze, and compare algorithms. We consider our Java programming model, data abstraction, basic data structures, abstract data types for collections, methods of analyzing algorithm performance, and a case study. Sorting algorithms (Chapter 2) for rearranging arrays in order are of fundamental importance. We consider a variety of algorithms in considerable depth, including in- sertion sort, selection sort, shellsort, quicksort, mergesort, and heapsort. We also en- counter algorithms for several related problems, including priority queues, selection, and merging. Many of these algorithms will find application as the basis for other algo- rithms later in the book. Searching algorithms (Chapter 3) for finding specific items among large collections of items are also of fundamental importance. We discuss basic and advanced methods for searching, including binary search trees, balanced search trees, and hashing. We note relationships among these methods and compare performance. Graphs (Chapter 4) are sets of objects and connections, possibly with weights and orientation. Graphs are useful models for a vast number of difficult and important problems, and the design of algorithms for processing graphs is a major field of study. We consider depth-first search, breadth-first search, connectivity problems, and sev- eral algorithms and applications, including Kruskal’s and Prim’s algorithms for finding minimum spanning tree and Dijkstra’s and the Bellman-Ford algorithms for solving shortest-paths problems. CHAPTER 1 Fundamentals 7 Strings (Chapter 5) are an essential data type in modern computing applications. We consider a range of methods for processing sequences of characters. We begin with faster algorithms for sorting and searching when keys are strings. Then we consider substring search, regular expression pattern matching, and data-compression algo- rithms. Again, an introduction to advanced topics is given through treatment of some elementary problems that are important in their own right. Context (Chapter 6) helps us relate the material in the book to several other advanced fields of study, including scientific computing, operations research, and the theory of computing. We survey event-based simulation, B-trees, suffix arrays, maximum flow, and other advanced topics from an introductory viewpoint to develop appreciation for the interesting advanced fields of study where algorithms play a critical role. Finally, we describe search problems, reduction, and NP-completeness to introduce the theoretical underpinnings of the study of algorithms and relationships to material in this book. The study of algorithms is interesting and exciting because it is a new field (almost all the algorithms that we study are less than 50 years old, and some were just recently discovered) with a rich tradition (a few algorithms have been known for hun- dreds of years). New discoveries are constantly being made, but few algorithms are completely understood. In this book we shall consider intricate, complicated, and diffi- cult algorithms as well as elegant, simple, and easy ones. Our challenge is to understand the former and to appreciate the latter in the context of scientific and commercial ap- plications. In doing so, we shall explore a variety of useful tools and develop a style of algorithmic thinking that will serve us well in computational challenges to come. 1.1 BASIC PROGRAMMING MODEL Our study of algorithms is based upon implementing them as programs written in the Java programming language. We do so for several reasons: Our programs are concise, elegant, and complete descriptions of algorithms. You can run the programs to study properties of the algorithms. You can put the algorithms immediately to good use in applications. These are important and significant advantages over the alternatives of working with English-language descriptions of algorithms. A potential downside to this approach is that we have to work with a specific pro- gramming language, possibly making it difficult to separate the idea of the algorithm from the details of its implementation. Our implementations are designed to mitigate this difficulty, by using programming constructs that are both found in many modern languages and needed to adequately describe the algorithms. We use only a small subset of Java. While we stop short of formally defining the subset that we use, you will see that we make use of relatively few Java constructs, and that we emphasize those that are found in many modern programming languages. The code that we present is complete, and our expectation is that you will download it and execute it, on our test data or test data of your own choosing. We refer to the programming constructs, software libraries, and operating system features that we use to implement and describe algorithms as our programming model. In this section and Section 1.2, we fully describe this programming model. The treat- ment is self-contained and primarily intended for documentation and for your refer- ence in understanding any code in the book. The model we describe is the same model introduced in our book An Introduction to Programming in Java: An Interdisciplinary Approach, which provides a slower-paced introduction to the material. For reference, the figure on the facing page depicts a complete Java program that illustrates many of the basic features of our programming model. We use this code for examples when discussing language features, but defer considering it in detail to page 46 (it implements a classic algorithm known as binary search and tests it for an applica- tion known as whitelist filtering). We assume that you have experience programming in some modern language, so that you are likely to recognize many of these features in this code. Page references are included in the annotations to help you find answers to any questions that you might have. Since our code is somewhat stylized and we strive to make consistent use of various Java idioms and constructs, it is worthwhile even for experienced Java programmers to read the information in this section. 8 1.1 Basic Programming Model 9 import a Java library (see page 27) import java.util.Arrays; code must be in file BinarySearch.java (see page 26) parameter public class BinarySearch variables { static method (see page 22) public static int rank(int key, int[] a) initializing { return type parameter type declaration statement int lo = 0; (see page 16) int hi = a.length - 1; while (lo a[mid]) lo = mid + 1; else return mid; } return -1; } return statement system calls main() unit test client (see page 26) public static void main(String[] args) { no return value; just side effects (see page 24) int[] whitelist = In.readInts(args); Arrays.sort(whitelist); call a method in a Java library (see page 27) call a method in our standard library; while (!StdIn.isEmpty()) need to download code (see page 27) { int key = StdIn.readInt(); call a local method conditional statement if (rank(key, whitelist) == -1) (see page 27) (see page 15) StdOut.println(key); } } } system passes argument value "whitelist.txt" to main() command line (see page 36) file name (args) % java BinarySearch largeW.txt < largeT.txt StdOut 499569 (see page 37) file redirectd from StdIn 984875 (see page 40)... Anatomy of a Java program and its invocation from the command line 10 CHAPTER 1 Fundamentals Basic structure of a Java program A Java program (class) is either a library of static methods (functions) or a data type definition. To create libraries of static methods and data-type definitions, we use the following five components, the basis of program- ming in Java and many other modern languages: Primitive data types precisely define the meaning of terms like integer, real num- ber, and boolean value within a computer program. Their definition includes the set of possible values and operations on those values, which can be combined into expressions like mathematical expressions that define values. Statements allow us to define a computation by creating and assigning values to variables, controlling execution flow, or causing side effects. We use six types of statements: declarations, assignments, conditionals, loops, calls, and returns. Arrays allow us to work with multiple values of the same type. Static methods allow us to encapsulate and reuse code and to develop programs as a set of independent modules. Strings are sequences of characters. Some operations on them are built in to Java. Input/output sets up communication between programs and the outside world. Data abstraction extends encapsulation and reuse to allow us to define non- primitive data types, thus supporting object-oriented programming. In this section, we will consider the first five of these in turn. Data abstraction is the topic of the next section. Running a Java program involves interacting with an operating system or a program development environment. For clarity and economy, we describe such actions in terms of a virtual terminal, where we interact with programs by typing commands to the system. See the booksite for details on using a virtual terminal on your system, or for information on using one of the many more advanced program development environ- ments that are available on modern systems. For example, BinarySearch is two static methods, rank() and main(). The first static method, rank(), is four statements: two declarations, a loop (which is itself an as- signment and two conditionals), and a return. The second, main(), is three statements: a declaration, a call, and a loop (which is itself an assignment and a conditional). To invoke a Java program, we first compile it using the javac command, then run it us- ing the java command. For example, to run BinarySearch, we first type the command javac BinarySearch.java (which creates a file BinarySearch.class that contains a lower-level version of the program in Java bytecode in the file BinarySearch.class). Then we type java BinarySearch (followed by a whitelist file name) to transfer con- trol to the bytecode version of the program. To develop a basis for understanding the effect of these actions, we next consider in detail primitive data types and expressions, the various kinds of Java statements, arrays, static methods, strings, and input/output. 1.1 Basic Programming Model 11 Primitive data types and expressions A data type is a set of values and a set of operations on those values. We begin by considering the following four primitive data types that are the basis of the Java language: Integers, with arithmetic operations (int) Real numbers, again with arithmetic operations (double) Booleans, the set of values { true, false } with logical operations (boolean) Characters, the alphanumeric characters and symbols that you type (char) Next we consider mechanisms for specifying values and operations for these types. A Java program manipulates variables that are named with identifiers. Each variable is associated with a data type and stores one of the permissible data-type values. In Java code, we use expressions like familiar mathematical expressions to apply the operations associated with each type. For primitive types, we use identifiers to refer to variables, operator symbols such as + - * / to specify operations, literals such as 1 or 3.14 to specify values, and expressions such as (x + 2.236)/2 to specify operations on values. The purpose of an expression is to define one of the data-type values. term examples definition a set of values and a set of primitive int double boolean char operations on those values data type (built in to the Java language) a sequence of letters, digits, identifier a abc Ab$ a_b ab123 lo hi _, and $, the first of which is not a digit variable [any identifier] names a data-type value operator + - * / names a data-type operation int 1 0 -42 double 2.0 1.0e-15 3.14 source-code representation literal boolean true false of a value char 'a' '+' '9' '\n' int lo + (hi - lo)/2 a literal, a variable, or a sequence of operations on expression double 1.0e-15 * t boolean lo ) a construct that we have already defined, to indicate that we can use any instance of that construct where specified. In this case, represents an expression that has a boolean value, such as one involving a comparison operation, and represents a sequence of Java statements. It is possible to make formal definitions of and , but we refrain from going into that level of detail. The meaning of an if statement is self- explanatory: the statement(s) in the block are to be executed if and only if the boolean expression is true. The if-else statement: if () { } else { } allows for choosing between two alternative blocks of statements. Loops. Many computations are inherently repetitive. The basic Java construct for han- dling such computations has the following format: while () { } The while statement has the same form as the if statement (the only difference being the use of the keyword while instead of if), but the meaning is quite different. It is an instruction to the computer to behave as follows: if the boolean expression is false, do nothing; if the boolean expression is true, execute the sequence of statements in the block (just as with if) but then check the boolean expression again, execute the se- quence of statements in the block again if the boolean expression is true, and continue as long as the boolean expression is true. We refer to the statements in the block in a loop as the body of the loop. Break and continue. Some situations call for slightly more complicated control flow than provide by the basic if and while statements. Accordingly, Java supports two ad- ditional statements for use within while loops: The break statement, which immediately exits the loop The continue statement, which immediately begins the next iteration of the loop We rarely use these statements in the code in this book (and many programmers never use them), but they do considerably simplify code in certain instances. 16 CHAPTER 1 Fundamentals Shortcut notations There are several ways to express a given computation; we seek clear, elegant, and efficient code. Such code often takes advantage of the following widely used shortcuts (that are found in many languages, not just Java). Initializing declarations. We can combine a declaration with an assignment to ini- tialize a variable at the same time that it is declared (created). For example, the code int i = 1; creates an int variable named i and assigns it the initial value 1. A best practice is to use this mechanism close to first use of the variable (to limit scope). Implicit assignments. The following shortcuts are available when our purpose is to modify a variable’s value relative to its current value: Increment/decrement operators: i++ is the same as i = i + 1 and has the value i in an expression. Similarly, i-- is the same as i = i - 1. The code ++i and --i are the same except that the expression value is taken after the increment/ decrement, not before. Other compound operations: Prepending a binary operator to the = in an as- signment is equivalent to using the variable on the left as the first operand. For example, the code i/=2; is equivalent to the code i = i/2; Note that i += 1; has the same effect as i = i+1; (and i++). Single-statement blocks. If a block of statements in a conditional or a loop has only a single statement, the curly braces may be omitted. For notation. Many loops follow this scheme: initialize an index variable to some val- ue and then use a while loop to test a loop continuation condition involving the index variable, where the last statement in the while loop increments the index variable. You can express such loops compactly with Java’s for notation: for (; ; ) { } This code is, with only a few exceptions, equivalent to ; while () { ; } We use for loops to support this initialize-and-increment programming idiom. 1.1 Basic Programming Model 17 statement examples definition int i; create a variable of a specified type, declaration double c; named with a given identifier a = b + 3; assignment assign a data-type value to a variable discriminant = b*b - 4.0*c; initializing int i = 1; declaration that also assigns an declaration double c = 3.141592625; initial value implicit i++; i = i + 1; assignment i += 1; execute a statement, conditional (if) if (x < 0) x = -x; depending on boolean expression conditional if (x > y) max = x; execute one or the other statement, (if-else) else max = y; depending on boolean expression int v = 0; while (v 1e-15*t) t = (c/t + t) / 2.0; for (int i = 1; i err * t) statements, enclosed in curly braces). Ex- body t = (c/t + t) / 2.0; amples of static methods are shown in the return t; table on the facing page. } call on another method return statement Invoking a static method. A call on a static Anatomy of a static method method is its name followed by expressions that specify argument values in parenthe- ses, separated by commas. When the method call is part of an expression, the method computes a value and that value is used in place of the call in the expression. For ex- ample the call on rank() in BinarySearch() returns an int value. A method call followed by a semicolon is a statement that generally causes side effects. For example, the call Arrays.sort() in main() in BinarySearch is a call on the system method Arrays.sort() that has the side effect of putting the entries in the array in sorted order. When a method is called, its argument variables are initialized with the values of the corresponding expressions in the call. A return statement terminates a static method, returning control to the caller. If the static method is to compute a value, that value must be specified in a return statement (if such a static method can reach the end of its sequence of statements without a return, the compiler will report the error). 1.1 Basic Programming Model 23 task implementation public static int abs(int x) { absolute value of an if (x < 0) return -x; int value else return x; } public static double abs(double x) { absolute value of a if (x < 0.0) return -x; double value else return x; } public static boolean isPrime(int N) { if (N < 2) return false; primality test for (int i = 2; i*i 0) return Double.NaN; double err = 1e-15; square root double t = c; (Newton’s method) while (Math.abs(t - c/t) > err * t) t = (c/t + t) / 2.0; return t; } hypotenuse of public static double hypotenuse(double a, double b) a right triangle { return Math.sqrt(a*a + b*b); } public static double H(int N) { double sum = 0.0; Harmonic number for (int i = 1; i hi) return -1; int mid = lo + (hi - lo) / 2; if (key < a[mid]) return rank(key, a, lo, mid - 1); else if (key > a[mid]) return rank(key, a, mid + 1, hi); else return mid; } Recursive implementation of binary search 26 CHAPTER 1 Fundamentals Basic programming model. A library of static methods is a set of static methods that are defined in a Java class, by creating a file with the keywords public class followed by the class name, followed by the static methods, enclosed in braces, kept in a file with the same name as the class and a.java extension. A basic model for Java programming is to develop a program that addresses a specific computational task by creating a li- brary of static methods, one of which is named main(). Typing java followed by a class name followed by a sequence of strings leads to a call on main() in that class, with an array containing those strings as argument. After the last statement in main() executes, the program terminates. In this book, when we talk of a Java program for accomplishing a task, we are talking about code developed along these lines (possibly also including a data-type definition, as described in Section 1.2). For example, BinarySearch is a Java program composed of two static methods, rank() and main(), that accomplishes the task of printing numbers on an input stream that are not found in a whitelist file given as command-line argument. Modular programming. Of critical importance in this model is that libraries of stat- ic methods enable modular programming where we build libraries of static methods (modules) and a static method in one library can call static methods defined in other libraries. This approach has many important advantages. It allows us to Work with modules of reasonable size, even in program involving a large amount of code Share and reuse code without having to reimplement it Easily substitute improved implementations Develop appropriate abstract models for addressing programming problems Localize debugging (see the paragraph below on unit testing) For example, BinarySearch makes use of three other independently developed librar- ies, our StdIn and In library and Java’s Arrays library. Each of these libraries, in turn, makes use of several other libraries. Unit testing. A best practice in Java programming is to include a main() in every li- brary of static methods that tests the methods in the library (some other programming languages disallow multiple main() methods and thus do not support this approach). Proper unit testing can be a significant programming challenge in itself. At a minimum, every module should contain a main() method that exercises the code in the module and provides some assurance that it works. As a module matures, we often refine the main() method to be a development client that helps us do more detailed tests as we develop the code, or a test client that tests all the code extensively. As a client becomes more complicated, we might put it in an independent module. In this book, we use main() to help illustrate the purpose of each module and leave test clients for exercises. 1.1 Basic Programming Model 27 External libraries. We use static methods from four different kinds of libraries, each requiring (slightly) differing procedures for code reuse. Most of these are libraries of static methods, but a few are data-type definitions that also include some static methods. The standard system libraries java.lang.*. These include Math, which contains methods for commonly used mathematical functions; Integer and Double, which we use for converting between strings of characters and int and double values; String and StringBuilder, which standard system libraries we discuss in detail later in this section and in Chapter 5; and Math dozens of other libraries that we do not use. Integer† Imported system libraries such as java.util.Arrays. There Double† are thousands of such libraries in a standard Java release, but String† we make scant use of them in this book. An import statement StringBuilder at the beginning of the program is needed to use such libraries System (and signal that we are doing so). imported system libraries Other libraries in this book. For example, another program can java.util.Arrays use rank() in BinarySearch. To use such a program, down- our standard libraries load the source from the booksite into your working directory. The standard libraries Std* that we have developed for use StdIn in this book (and our introductory book An Introduction to StdOut Programming in Java: An Interdisciplinary Approach). These StdDraw libraries are summarized in the following several pages. Source StdRandom code and instructions for downloading them are available on StdStats the booksite. In† To invoke a method from another library (one in the same directory Out† or a specified directory, a standard system library, or a system library † data type definitions that that is named in an import statement before the class definition), we include some static methods prepend the library name to the method name for each call. For ex- Libraries with static ample, the main() method in BinarySearch calls the sort() method methods used in this book in the system library java.util.Arrays, the readInts() method in our library In, and the println() method in our library StdOut. Libraries of methods implemented by ourselves and by others in a modular programming environment can vastly expand the scope of our programming model. Beyond all of the libraries available in a standard Java release, thousands more are avail- able on the web for applications of all sorts. To limit the scope of our programming model to a manageable size so that we can concentrate on algorithms, we use just the libraries listed in the table at right on this page, with a subset of their methods listed in APIs, as described next. 28 CHAPTER 1 Fundamentals APIs A critical component of modular programming is documentation that explains the operation of library methods that are intended for use by others. We will consis- tently describe the library methods that we use in this book in application programming interfaces (APIs) that list the library name and the signatures and short descriptions of each of the methods that we use. We use the term client to refer to a program that calls a method in another library and the term implementation to describe the Java code that implements the methods in an API. Example. The following example, the API for commonly used static methods from the standard Math library in java.lang, illustrates our conventions for APIs: public class Math static double abs(double a) absolute value of a static double max(double a, double b) maximum of a and b static double min(double a, double b) minimum of a and b Note 1: abs(), max(), and min() are defined also for int, long, and float. static double sin(double theta) sine function static double cos(double theta) cosine function static double tan(double theta) tangent function Note 2: Angles are expressed in radians. Use toDegrees() and toRadians() to convert. Note 3: Use asin(), acos(), and atan() for inverse functions. static double exp(double a) exponential (e a) static double log(double a) natural log (loge a, or ln a) static double pow(double a, double b) raise a to the bth power (ab ) static double random() random number in [0, 1) static double sqrt(double a) square root of a static double E value of e (constant) static double PI value of (constant) See booksite for other available functions. API for Java’s mathematics library (excerpts) 1.1 Basic Programming Model 29 These methods implement mathematical functions—they use their arguments to com- pute a value of a specified type (except random(), which does not implement a math- ematical function because it does not take an argument). Since they all operate on double values and compute a double result, you can consider them as extending the double data type—extensibility of this nature is one of the characteristic features of modern programming languages. Each method is described by a line in the API that specifies the information you need to know in order to use the method. The Math li- brary also defines the precise constant values PI (for ) and E (for e), so that you can use those names to refer to those constants in your programs. For example, the value of Math.sin(Math.PI/2) is 1.0 and the value of Math.log(Math.E) is 1.0 (because Math.sin() takes its argument in radians and Math.log() implements the natural logarithm function). Java libraries. Extensive online descriptions of thousands of libraries are part of every Java release, but we excerpt just a few methods that we use in the book, in order to clear- ly delineate our programming model. For example, BinarySearch uses the sort() method from Java’s Arrays library, which we document as follows: public class Arrays static void sort(int[] a) put the array in increasing order Note : This method is defined also for other primitive types and Object. Excerpt from Java’s Arrays library (java.util.Arrays) The Arrays library is not in java.lang, so an import statement is needed to use it, as in BinarySearch. Actually, Chapter 2 of this book is devoted to implementations of sort() for arrays, including the mergesort and quicksort algorithms that are imple- mented in Arrays.sort(). Many of the fundamental algorithms that we consider in this book are implemented in Java and in many other programming environments. For example, Arrays also includes an implementation of binary search. To avoid confusion, we generally use our own implementations, although there is nothing wrong with using a finely tuned library implementation of an algorithm that you understand. 30 CHAPTER 1 Fundamentals Our standard libraries. We have developed a number of libraries that provide useful functionality for introductory Java programming, for scientific applications, and for the development, study, and application of algorithms. Most of these libraries are for input and output; we also make use of the following two libraries to test and analyze our implementations. The first extends Math.random() to allow us to draw random values from various distributions; the second supports statistical calculations: public class StdRandom static void initialize(long seed) initialize static double random() real between 0 and 1 static int uniform(int N) integer between 0 and N-1 static int uniform(int lo, int hi) integer between lo and hi-1 static double uniform(double lo, double hi) real between lo and hi static boolean bernoulli(double p) true with probability p static double gaussian() normal, mean 0, std dev 1 static double gaussian(double m, double s) normal, mean m, std dev s static int discrete(double[] a) i with probability a[i] static void shuffle(double[] a) randomly shuffle the array a[] Note: overloaded implementations of shuffle() are included for other primitive types and for Object. API for our library of static methods for random numbers public class StdStats static double max(double[] a) largest value static double min(double[] a) smallest value static double mean(double[] a) average static double var(double[] a) sample variance static double stddev(double[] a) sample standard deviation static double median(double[] a) median API for our library of static methods for data analysis 1.1 Basic Programming Model 31 The initialize() method in StdRandom allows us to seed the random number gen- erator so that we can reproduce experiments involving random numbers. For reference, implementations of many of these methods are given on page 32. Some of these methods are extremely easy to implement; why do we bother including them in a library? An- swers to this question are standard for well-designed libraries: They implement a level of abstraction that allow us to focus on implement- ing and testing the algorithms in the book, not generating random objects or calculating statistics. Client code that uses such methods is clearer and easier to understand than homegrown code that does the same calculation. Library implementations test for exceptional conditions, cover rarely encoun- tered situations, and submit to extensive testing, so that we can count on them to operate as expected. Such implementations might involve a significant amount of code. For example, we often want implementations for various types of data. For example, Java’s Arrays library includes multiple overloaded implementa- tions of sort(), one for each type of data that you might need to sort. These are bedrock considerations for modular programming in Java, but perhaps a bit overstated in this case. While the methods in both of these libraries are essentially self- documenting and many of them are not difficult to implement, some of them represent interesting algorithmic exercises. Accordingly, you are well-advised to both study the code in StdRandom.java and StdStats.java on the booksite and to take advantage of these tried-and-true implementations. The easiest way to use these libraries (and to examine the code) is to download the source code from the booksite and put them in your working directory; various system-dependent mechanisms for using them with- out making multiple copies are also described on the booksite. Your own libraries. It is worthwhile to consider every program that you write as a li- brary implementation, for possible reuse in the future. Write code for the client, a top-level implementation that breaks the computa- tion up into manageable parts. Articulate an API for a library (or multiple APIs for multiple libraries) of static methods that can address each part. Develop an implementation of the API, with a main() that tests the methods independent of the client. Not only does this approach provide you with valuable software that you can later reuse, but also taking advantage of modular programming in this way is a key to suc- cessfully addressing a complex programming task. 32 CHAPTER 1 Fundamentals intended result implementation random double public static double uniform(double a, double b) value in [a, b) { return a + StdRandom.random() * (b-a); } random int public static int uniform(int N) value in [0..N) { return (int) (StdRandom.random() * N); } random int public static int uniform(int lo, int hi) value in [lo..hi) { return lo + StdRandom.uniform(hi - lo); } public static int discrete(double[] a) { // Entries in a[] must sum to 1. double r = StdRandom.random(); double sum = 0.0; random int value drawn for (int i = 0; i < a.length; i++) from discrete distribution { (i with probability a[i]) sum = sum + a[i]; if (sum >= r) return i; } return -1; } public static void shuffle(double[] a) { int N = a.length; randomly shuffle the for (int i = 0; i < N; i++) { // Exchange a[i] with random element in a[i..N-1] elements in an array of int r = i + StdRandom.uniform(N-i); double values double temp = a[i]; (See Exercise 1.1.36) a[i] = a[r]; a[r] = temp; } } Implementations of static methods in StdRandom library 1.1 Basic Programming Model 33 The purpose of an API is to separate the client from the implementation: the client should know nothing about the implementation other than information given in the API, and the implementation should not take properties of any particular client into account. APIs enable us to separately develop code for various purposes, then reuse it widely. No Java library can contain all the methods that we might need for a given computation, so this ability is a crucial step in addressing complex programming ap- plications. Accordingly, programmers normally think of the API as a contract between the client and the implementation that is a clear specification of what each method is to do. Our goal when developing an implementation is to honor the terms of the contract. Often, there are many ways to do so, and separating client code from implementation code gives us the freedom to substitute new and improved implementations. In the study of algorithms, this ability is an important ingredient in our ability to understand the impact of algorithmic improvements that we develop. 34 CHAPTER 1 Fundamentals Strings A String is a sequence of characters (char values). A literal String is a sequence of characters within double quotes, such as "Hello, World". The data type String is a Java data type but it is not a primitive type. We consider String now be- cause it is a fundamental data type that almost every Java program uses. Concatenation. Java has a built-in concatenation operator (+) for String like the built-in operators that it has for primitive types, justifying the addition of the row in the table below to the primitive-type table on page 12. The result of concatenating two String values is a single String value, the first string followed by the second. typical expressions type set of values typical literals operators expression value "AB" "Hi, " + "Bob" "Hi, Bob" String character "Hello" + "12" + "34" "1234" sequences "2.5" (concatenate) "1" + "+" + "2" "1+2" Java’s String data type Conversion. Two primary uses of strings are to convert values that we can enter on a keyboard into data-type values and to convert data-type values to values that we can read on a display. Java has built-in operations for String to facilitate these operations. In particular, the language includes libraries Integer and Double that contain static methods to convert between String values and int values and between String values and double values, respectively. public class Integer static int parseInt(String s) convert s to an int value static String toString(int i) convert i to a String value public class Double static double parseDouble(String s) convert s to a double value static String toString(double x) convert x to a String value APIs for conversion between numbers and String values 1.1 Basic Programming Model 35 Automatic conversion. We rarely explicitly use the static toString() methods just described because Java has a built-in mechanism that allows us to convert from any data type value to a String value by using concatenation: if one of the arguments of + is a String, Java automatically converts the other argument to a String (if it is not already a String). Beyond usage like "The square root of 2.0 is " + Math.sqrt(2.0) this mechanism enables conversion of any data-type value to a String, by concatenat- ing it with the empty string "". Command-line arguments. One important use of strings in Java programming is to enable a mechanism for passing information from the command line to the program. The mechanism is simple. When you type the java command followed by a library name followed by a sequence of strings, the Java system invokes the main() method in that library with an array of strings as argument: the strings typed after the library name. For example, the main() method in BinarySearch takes one command-line argument, so the system creates an array of size one. The program uses that value, args, to name the file containing the whitelist, for use as the argument to In.readInts(). An- other typical paradigm that we often use in our code is when a command-line argu- ment is intended to represent a number, so we use parseInt() to convert to an int value or parseDouble() to convert to a double value. Computing with strings is an essential component of modern computing. For the moment, we make use of String just to convert between external representation of numbers as sequences of characters and internal representation of numeric data-type values. In Section 1.2, we will see that Java supports many, many more operations on String values that we use throughout the book; in Section 1.4, we will examine the internal representation of String values; and in Chapter 5, we consider in depth al- gorithms that process String data. These algorithms are among the most interesting, intricate, and impactful methods that we consider in this book. 36 CHAPTER 1 Fundamentals Input and output The primary purpose of our standard libraries for input, out- put, and drawing is to support a simple model for Java programs to interact with the outside world. These libraries are built upon extensive capabilities that are available in Java libraries, but are generally much more complicated and much more difficult to learn and use. We begin by briefly reviewing the model. standard input command-line arguments In our model, a Java program takes input values from command-line arguments or from an abstract stream of characters known as the standard input stream and writes to another abstract stream of characters known as the standard output standard output stream. Necessarily, we need to consider the interface between Java and the operating system, so we need to briefly dis- file I/O cuss basic mechanisms that are provided by most modern standard drawing operating systems and program-development environ- ments. You can find more details about your particular A bird’s-eye view of a Java program system on the booksite. By default, command-line argu- ments, standard input, and standard output are associated with an application supported by either the operating system or the program develop- ment environment that takes commands. We use the generic term terminal window to refer to the window maintained by this application, where we type and read text. Since early Unix systems in the 1970s this model has proven to be a convenient and direct way for us to interact with our programs and data. We add to the classical model a standard drawing that allows us to create visual representations for data analysis. Commands and arguments. In the terminal window, we see a prompt, where we type commands to the operating system that may take arguments. We use only a few com- mands in this book, shown in the table below. Most often, we use the.java com- mand, to run our programs. As mentioned on page 35, Java classes have a main() static method that takes a String array args[] as its argument. That array is the sequence of command-line arguments that we type, provided to Java by the operating system. By convention, both Java and command arguments purpose the operating system process javac.java file name compile Java program the arguments as strings. If.class file name (no extension) we intend for an argument to java run Java program be a number, we use a method and command-line arguments such as Integer.parseInt() more any text file name print file contents to convert it from String to Typical operating-system commands the appropriate type. 1.1 Basic Programming Model 37 Standard output. Our StdOut library provides sup- call the static method main() in RandomSeq port for standard output. By default, the system con- prompt nects standard output to the terminal window. The % java RandomSeq 5 100.0 200.0 print() method puts its argument on standard out- args put; the println() method adds a newline; and the invoke args Java printf() method supports formatted output, as de- runtime args scribed next. Java provides a similar method in its Anatomy of a command System.out library; we use StdOut to treat standard input and standard output in a uniform manner (and to provide a few technical improvements). public class StdOut static void print(String s) print s static void println(String s) print s, followed by newline static void println() print a new line static void printf(String f,... ) formatted print Note: overloaded implementations are included for primitive types and for Object. API for our library of static methods for standard output To use these methods, download into public class RandomSeq your working directory StdOut.java { from the booksite and use code such as public static void main(String[] args) { // Print N random values in (lo, hi). StdOut.println("Hello, World"); int N = Integer.parseInt(args); to call them. A sample client is shown double lo = Double.parseDouble(args); at right. double hi = Double.parseDouble(args); for (int i = 0; i < N; i++) Formatted output. In its simplest { double x = StdRandom.uniform(lo, hi); form, printf() takes two arguments. StdOut.printf("%.2f\n", x); The first argument is a format string } that describes how the second argu- } } ment is to be converted to a string for output. The simplest type of format Sample StdOut client string begins with % and ends with a one-letter conversion code. The conversion codes that we % java RandomSeq 5 100.0 200.0 use most frequently are d (for decimal values from Java’s 123.43 153.13 integer types), f (for floating-point values), and s (for 144.38 String values). Between the % and the conversion code 155.18 is an integer value that specifies the field width of the 104.02 38 CHAPTER 1 Fundamentals converted value (the number of characters in the converted output string). By default, blank spaces are added on the left to make the length of the converted output equal to the field width; if we want the spaces on the right, we can insert a minus sign before the field width. (If the converted output string is bigger than the field width, the field width is ignored.) Following the width, we have the option of including a period followed by the number of digits to put after the decimal point (the precision) for a double value or the number of characters to take from the beginning of the string for a String value. The most important thing to remember about using printf() is that the conversion code in the format and the type of the corresponding argument must match. That is, Java must be able to convert from the type of the argument to the type required by the con- version code. The first argument of printf() is a String that may contain characters other than a format string. Any part of the argument that is not part of a format string passes through to the output, with the format string replaced by the argument value (converted to a String as specified). For example, the statement StdOut.printf("PI is approximately %.2f\n", Math.PI); prints the line PI is approximately 3.14 Note that we need to explicitly include the newline character \n in the argument in order to print a new line with printf(). The printf() function can take more than two arguments. In this case, the format string will have a format specifier for each ad- ditional argument, perhaps separated by other characters to pass through to the out- put. You can also use the static method String.format() with arguments exactly as just described for printf() to get a formatted string without printing it. Formatted printing is a convenient mechanism that allows us to develop compact code that can produce tabulated experimental data (our primary use in this book). typical sample converted string type code literal format strings values for output "%14d" " 512" int d 512 "%-14d" "512 " "%14.2f" " 1595.17" double f 1595.1680010754388 "%.7f" "1595.1680011" e "%14.4e" " 1.5952e+03" "%14s" " Hello, World" String s "Hello, World" "%-14s" "Hello, World " "%-14.5s" "Hello " Format conventions for printf() (see the booksite for many other options) 1.1 Basic Programming Model 39 Standard input. Our StdIn library public class Average takes data from the standard input { stream that may be empty or may public static void main(String[] args) { // Average the numbers on StdIn. contain a sequence of values sepa- double sum = 0.0; rated by whitespace (spaces, tabs, int cnt = 0; newline characters, and the like). By while (!StdIn.isEmpty()) { // Read a number and cumulate the sum. default, the system connects stan- sum += StdIn.readDouble(); dard output to the terminal win- cnt++; dow—what you type is the input } double avg = sum / cnt; stream (terminated by or StdOut.printf("Average is %.5f\n", avg); , depending on your termi- } nal window application). Each value } is a String or a value from one of Sample StdIn client Java’s primitive types. One of the key features of the standard input stream % java Average is that your program consumes values when it reads them. Once 1.23456 your program has read a value, it cannot back up and read it again. 2.34567 This assumption is restrictive, but it reflects physical characteristics 3.45678 4.56789 of some input devices and simplifies implementing the abstrac- tion. Within the input stream model, the static methods in this li- Average is 2.90123 brary are largely self-documenting (described by their signatures). public class StdIn static boolean isEmpty() true if no more values, false otherwise static int readInt() read a value of type int static double readDouble() read a value of type double static float readFloat() read a value of type float static long readLong() read a value of type long static boolean readBoolean() read a value of type boolean static char readChar() read a value of type char static byte readByte() read a value of type byte static String readString() read a value of type String static boolean hasNextLine() is there another line in the input stream? static String readLine() read the rest of the line static String readAll() read the rest of the input stream API for our library of static methods for standard input 40 CHAPTER 1 Fundamentals Redirection and piping. Standard input and output enable us to take advantage of command-line extensions supported by many operating-systems. By adding a simple directive to the command that invokes a program, we can redirect its standard output to a file, either for permanent storage or for input to another program at a later time: % java RandomSeq 1000 100.0 200.0 > data.txt This command specifies that the standard output stream is not to be printed in the ter- minal window, but instead is to be written to a text file named data.txt. Each call to StdOut.print() or StdOut.println() redirecting from a file to standard input appends text at the end of that file. In % java Average < data.txt this example, the end result is a file that data.txt contains 1,000 random values. No out- standard input put appears in the terminal window: it goes directly into the file named after Average the > symbol. Thus, we can save away information for later retrieval. Not that redirecting standard output to a file we do not have to change RandomSeq in % java RandomSeq 1000 100.0 200.0 > data.txt any way—it is using the standard out- RandomSeq put abstraction and is unaffected by our use of a different implementation of data.txt standard output that abstraction. Similarly, we can redi- rect standard input so that StdIn reads data from a file instead of the terminal piping the output of one program to the input of another application: % java RandomSeq 1000 100.0 200.0 | java Average % java Average < data.txt RandomSeq This command reads a sequence of standard output standard input numbers from the file data.txt and computes their average value. Specifi- Average cally, the < symbol is a directive that tells the operating system to implement the Redirection and piping from the command line standard input stream by reading from the text file data.txt instead of waiting for the user to type something into the terminal window. When the program calls StdIn.readDouble(), the operating system reads the value from the file. Combining these to redirect the output of one program to the input of another is known as piping: % java RandomSeq 1000 100.0 200.0 | java Average 1.1 Basic Programming Model 41 This command specifies that standard output for RandomSeq and standard input for Average are the same stream. The effect is as if RandomSeq were typing the numbers it generates into the terminal window while Average is running. This difference is pro- found, because it removes the limitation on the size of the input and output streams that we can process. For example, we could replace 1000 in our example with 1000000000, even though we might not have the space to save a billion numbers on our computer (we do need the time to process them). When RandomSeq calls StdOut.println(), a string is added to the end of the stream; when Average calls StdIn.readInt(), a string is removed from the beginning of the stream. The timing of precisely what happens is up to the operating system: it might run RandomSeq until it produces some output, and then run Average to consume that output, or it might run Average until it needs some output, and then run RandomSeq until it produces the needed output. The end result is the same, but our programs are freed from worrying about such details because they work solely with the standard input and standard output abstractions. Input and output from a file. Our In and Out libraries provide static methods that implement the abstraction of reading from and writing to a file the contents of an ar- ray of values of a primitive type (or String). We use readInts(), readDoubles(), and readStrings() in the In library and writeInts(), writeDoubles(), and writeStrings() in the Out library. The named argument can be a file or a web page. For example, this ability allows us to use a file and standard input for two different pur- poses in the same program, as in BinarySearch. The In and Out libraries also imple- ment data types with instance methods that allow us the more general ability to treat multiple files as input and output streams, and web pages as input streams, so we will revisit them in Section 1.2. public class In static int[] readInts(String name) read int values static double[] readDoubles(String name) read double values static String[] readStrings(String name) read String values public class Out static void write(int[] a, String name) write int values static void write(double[] a, String name) write double values static void write(String[] a, String name) write String values Note 1: Other primitive types are supported. Note 2: StdIn and StdOut are supported (omit name argument). APIs for our static methods for reading and writing arrays 42 CHAPTER 1 Fundamentals Standard drawing (basic methods). Up to this point, StdDraw.point(x0, y0); StdDraw.line(x0, y0, x1, y1); our input/output abstractions have focused exclusively on text strings. Now we introduce an abstraction for (1, 1) producing drawings as output. This library is easy to (x1, y1) (x0, y0) use and allows us to take advantage of a visual medi- um to cope with far more information than is possible with just text. As with standard input/output, our stan- (0, 0) (x2, y2) dard drawing abstraction is implemented in a library StdDraw that you can access by downloading the file StdDraw.circle(x, y, r); StdDraw.java from the booksite into your working directory. Standard draw is very simple: we imagine an abstract drawing device capable of drawing lines and r points on a two-dimensional canvas. The device is ca- pable of responding to the commands to draw basic (x, y) geometric shapes that our programs issue in the form of calls to static methods in StdDraw, including meth- ods for drawing lines, points, text strings, circles, rect- StdDraw.square(x, y, r); angles, and polygons. Like the methods for standard input and standard output, these methods are nearly self-documenting: StdDraw.line() draws a straight r line segment connecting the point (x0 , y0) with the r point (x1 , y1) whose coordinates are given as arguments. StdDraw.point() draws a spot centered on the point (x, y) (x, y) whose coordinates are given as arguments, and so forth, as illustrated in the diagrams at right. Geometric shapes can be filled (in black, by default). The default double[] x = {x0, x1, x2, x3}; double[] y = {y0, y1, y2, y3}; scale is the unit square (all coordinates are between 0 StdDraw.polygon(x, y); and 1). The standard implementation displays the can- vas in a window on your computer’s screen, with black (x0, y0) lines and points on a white background. (x1, y1) (x3, y3) (x2, y2) StdDraw examples 1.1 Basic Programming Model 43 public class StdDraw static void line(double x0, double y0, double x1, double y1) static void point(double x, double y) static void text(double x, double y, String s) static void circle(double x, double y, doubl