Software Development 500 PDF
Richfield Graduate Institute of Technology
Summary
This learner guide outlines topics in software development, including computer components, programming process, data types, modules, and decisions, for a first semester curriculum. It's intended for a Diploma in Information Technology.
Full Transcript
Faculty of Information Technology
SOFTWARE DEVELOPMENT 500
Year 1 Semester 1

Registered with the Department of Higher Education as a Private Higher Education Institution under the Higher Education Act, 1997. Registration Certificate No. 2000/HE07/008

FACULTY OF INFORMATION TECHNOLOGY
LEARNER GUIDE
MODULES: SOFTWARE DEVELOPMENT 500 (1ST SEMESTER)

PREPARED ON BEHALF OF RICHFIELD GRADUATE INSTITUTE OF TECHNOLOGY (PTY) LTD

RICHFIELD GRADUATE INSTITUTE OF TECHNOLOGY (PTY) LTD
Registration Number: 2000/000757/07

All rights reserved; no part of this publication may be reproduced in any form or by any means, including photocopying machines, without the written permission of the Institution.

LEARNER GUIDE
MODULE: SOFTWARE DEVELOPMENT 500 (1ST SEMESTER)

TOPIC 1: AN OVERVIEW OF COMPUTERS AND PROGRAMMING
TOPIC 2: WORKING WITH DATA, CREATING MODULES AND DESIGNING HIGH QUALITY PROGRAMS
TOPIC 3: UNDERSTANDING STRUCTURE
TOPIC 4: MODULARIZATION
TOPIC 5: MAKING DECISIONS
TOPIC 6: DESIGNING AND WRITING A COMPLETE PROGRAM
TOPIC 7: LOOPING
TOPIC 8: ARRAYS
TOPIC 9: FILE HANDLING AND APPLICATIONS

Diploma in Information Technology

Topic 1: An Overview of Computers and Programming (Lectures 6-10)
  1.1 Understanding Computer Components and Operations
  1.2 Understanding the Programming Process
    1.2.1 Understanding the Problem
    1.2.2 Planning the Logic
    1.2.3 Coding the Program
    1.2.4 Using Software to Translate the Program into Machine Language
    1.2.5 Testing the Program
    1.2.6 Putting the Program into Production
  Topic Summary
  Key Terms
  Review Questions

Topic 2: Working with Data, Creating Modules and Designing High Quality Programs (Lectures 11-14)
  2.1 Understanding the Data Hierarchy
  2.2 Using Flowchart Symbols and Pseudocode Statements
  2.3 Using and Naming Variables
  2.4 Ending a Program by Using Sentinel Values
  2.5 Using the Connector
  2.6 Assigning Values to Variables
  2.7 Understanding Data Types
  2.8 Understanding the Evolution of Programming Techniques
    2.8.1 Problem Solving Approach
  Topic Summary
  Key Terms
  Review Questions
  Find the Errors
  Exercises

Topic 3: Understanding Structure (Lectures 15-16)
  3.1 Understanding the Three Basic Structures
  3.2 Understanding the Reasons for Structure
  3.3 Three Special Structures: Case, Do While, and Do Until
  Topic Summary
  Key Terms
  Review Questions
  Find the Errors

Topic 4: Modularization (Lectures 17-20)
  4.1 Modules, Subroutines, Procedures, Functions, or Methods
  4.2 Modularization Allows Multiple Programmers to Work on a Problem
  4.3 Modularization Allows You to Reuse Your Work
  4.4 Modularization Makes it Easier to Identify Structures
  4.5 Modularizing a Program
  4.6 Modules Calling Other Modules
  4.7 Declaring Variables
  Topic Summary
  Key Terms
  Review Questions
  Find the Errors
  Exercise

Topic 5: Designing and Writing a Complete Program (Lectures 21-25)
  5.1 Understanding the Mainline Logical Flow Through a Program
  5.2 Declaring Variables
  5.3 Opening Files
  5.4 A One-Time-Only Task: Printing Headings
  5.5 Reading the First Input Record
  5.6 Checking for the End of the File
    5.6.1 Writing the Main Loop
    5.6.2 Performing End-Of-Job Tasks
    5.6.3 Understanding the Need for Good Program Design
    5.6.4 Storing Program Components in Separate Files
    5.6.5 Selecting Variable and Module Names
    5.6.6 Designing Clear Module Statements
  5.7 Avoiding Confusing Line Breaks
  5.8 Using Temporary Variables to Clarify Long Statements
  5.9 Using Constants Where Appropriate
    5.9.1 Maintaining Good Programming Habits
  Chapter Summary
  Key Terms
  Review Questions
  Find the Bugs
  Exercise

Topic 6: Making Decisions (Lectures 26-30)
  6.1 Evaluating Boolean Expressions to Make Comparisons
  6.2 Using the Relational Comparison Operators
  6.3 Writing Nested AND Decisions for Efficiency
  6.4 Combining Decisions in an AND Selection
  6.5 Avoiding Common Errors in an AND Selection
    6.5.1 Understanding OR Logic
  6.6 Avoiding Common Errors in an OR Selection
  6.7 Writing OR Decisions for Efficiency
  6.8 Combining Decisions in an OR Selection
    6.8.1 Using Selections within Ranges
  6.9 Common Errors Using Range Checks
    6.9.1 Understanding Precedence When Combining AND and OR Selections
    6.9.2 Understanding the Case Structure
    6.9.3 Using Decision Tables
  Chapter Summary
  Key Terms
  Review Questions
  Find the Bugs
  Exercise

Topic 7: Looping (Lectures 31-35)
  7.1 Understanding the Advantages of Looping
  7.2 Using a While Loop with a Loop Control Variable
    7.2.1 Using a Counter to Control Looping
    7.2.2 Looping with a Variable Sentinel Value
    7.2.3 Looping by Decrementing
  7.3 Neglecting to Initialize the Loop Control Variable
  7.4 Neglecting to Alter the Loop Control Variable
  7.5 Using the Wrong Comparison with the Loop Control Variable
  7.6 Including Statements Inside the Loop that Belong Outside the Loop
  7.7 Initializing a Variable that Does Not Require Initialization
    7.7.1 Using the For Statement
    7.7.2 Using the Do While and Do Until Loops
    7.7.3 Recognizing the Characteristics Shared by All Loops
    7.7.4 Nesting Loops
    7.7.5 Using a Loop to Accumulate Totals
  7.8 Understanding Control Break Logic
    7.8.1 Performing a Single-Level Control Break to Start a New Page
    7.8.2 Performing Multiple-Level Control Breaks
  Chapter Summary
  Key Terms
  Review Questions
  Find the Bugs
  Exercise

Topic 8: Arrays (Lectures 36-40)
  8.1 Understanding Arrays
  8.2 How Arrays Occupy Computer Memory
  8.3 Manipulating an Array to Replace Nested Decisions
  8.4 Array Declaration and Initialization
  8.5 Declaring and Initializing Constant Arrays
  8.6 Loading an Array from a File
  8.7 Searching for an Exact Match in an Array
  8.8 Using Parallel Arrays
  8.9 Remaining Within Array Bounds
  8.10 Improving Search Efficiency Using an Early Exit
  8.11 Searching an Array for a Range Match
  Chapter Summary
  Key Terms
  Review Questions
  Find the Bugs
  Exercise

Topic 9: File Handling and Applications (Lectures 41-44)
  9.1 Understanding Sequential Data Files and the Need for Merging Files
  9.2 Creating the Mainline and housekeeping() Logic for a Merge Program
  9.3 Creating the mergeFiles() and finishUp() Modules for a Merge Program
  9.4 Modifying the housekeeping() Module in the Merge Program to Check for eof
  9.5 Master and Transaction File Processing
  9.6 Matching Files to Upgrade Fields in Master File Records
  9.7 Allowing Multiple Transactions for a Single Master File Record
  9.8 Updating Records in Sequential Files
  Chapter Summary
  Key Terms
  Review Questions
  Find the Bugs
  Exercise

Review and Mock Exam (Lecture 45)

TOPIC 1

1. AN OVERVIEW OF COMPUTERS AND PROGRAMMING

LEARNING OUTCOMES
After studying this topic you should be able to:
  Understand computer components and operations
  Describe the steps involved in the programming process

1.1 UNDERSTANDING COMPUTER COMPONENTS AND OPERATIONS

Hardware and software are the two major components of any computer system. Hardware is the equipment, or the devices, associated with a computer. For a computer to be useful, however, it needs more than equipment; a computer needs to be given instructions. The instructions that tell the computer what to do are called software, or programs, and are written by programmers.

Software can be classified as application software or system software. Application software comprises all the programs you apply to a task: word-processing programs, spreadsheets, payroll and inventory programs, and even games. System software comprises the programs that you use to manage your computer, including operating systems such as Windows, Linux, or UNIX.

Together, computer hardware and software accomplish four major operations:
  1. Input
  2. Processing
  3. Output
  4. Storage

Hardware devices that perform input include keyboards and mice. Through these devices, data, or facts, enter the computer system. Processing data items may involve organizing them, checking them for accuracy, or performing mathematical operations on them. The piece of hardware that performs these sorts of tasks is the central processing unit, or CPU.
After data items have been processed, the resulting information is sent to a printer, monitor, or some other output device so people can view, interpret, and use the results. Often, you also want to store the output information on storage hardware, such as magnetic disks, tapes, compact discs, or flash media. Computer software consists of all the instructions that control how and when the data items are input, how they are processed, and the form in which they are output or stored.

Data includes all the text, numbers, and other information that are processed by a computer. However, many computer professionals reserve the term “information” for data that has been processed. For example, your name, Social Security number, and hourly pay rate are data items, but your paycheck holds information.

Computer hardware by itself is useless without a programmer’s instructions, or software, just as your stereo equipment doesn’t do much until you provide music on a CD or tape. You can buy prewritten software that is stored on a disk or that you download from the Internet, or you can write your own software instructions. You can enter instructions into a computer system through any of the hardware devices you use for data; most often, you type your instructions using a keyboard and store them on a device such as a disk or CD.

You write computer instructions in a computer programming language, such as Visual Basic, C#, C++, Java, or COBOL. Just as some people speak English and others speak Japanese, programmers also write programs in different languages. Some programmers work exclusively in one language, whereas others know several and use the one that seems most appropriate for the task at hand.

No matter which programming language a computer programmer uses, the language has rules governing its word usage and punctuation. These rules are called the language’s syntax. Unless the syntax is perfect, the computer cannot interpret the programming language instruction at all.
Every computer operates on circuitry that consists of millions of on/off switches. Each programming language uses a piece of software to translate the specific programming language into the computer’s on/off circuitry language, or machine language. The language translation software is called a compiler or interpreter, and it tells you if you have used a programming language incorrectly. Therefore, syntax errors are relatively easy to locate and correct: the compiler or interpreter you use highlights every syntax error. If you write a computer program using a language such as C++ but spell one of its words incorrectly or reverse the proper order of two words, the translator lets you know that it found a mistake by displaying an error message as soon as you try to translate the program.

Although there are differences in how compilers and interpreters work, their basic function is the same: to translate your programming statements into code the computer can use. When you use a compiler, an entire program is translated before it can execute; when you use an interpreter, each instruction is translated just prior to execution. Usually, you do not choose which type of translation to use; it depends on the programming language. However, there are some languages for which both compilers and interpreters are available.

A program without syntax errors can be executed on a computer, but it might not produce correct results. For a program to work properly, you must give the instructions to the computer in a specific sequence, you must not leave any instructions out, and you must not add extraneous instructions. By doing this, you are developing the logic of the computer program.

Programmers often call logical errors semantic errors. For example, if you misspell a programming language word, you commit a syntax error, but if you use an otherwise correct word that does not make any sense in the current context, you commit a semantic error.
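As a minimal sketch of the difference between a program that is free of syntax errors and one that is logically correct, consider the Python snippet below. Python is used only for illustration; the function names and the pay figures are assumptions made for this example and do not come from the guide. Both functions translate and run, but the second performs its instructions in the wrong sequence.

```python
def net_pay_correct(hours_worked, hourly_rate, deduction):
    """Multiply first, then subtract the deduction."""
    gross_pay = hours_worked * hourly_rate
    net_pay = gross_pay - deduction
    return net_pay


def net_pay_wrong_order(hours_worked, hourly_rate, deduction):
    """Same kind of instructions, wrong sequence (hypothetical logical error)."""
    reduced_rate = hourly_rate - deduction   # subtracts too early
    net_pay = hours_worked * reduced_rate
    return net_pay


# Both functions are free of syntax errors, so both execute.
# Only running them with sample data reveals the logical error.
print(net_pay_correct(40, 10, 50))       # 40 * 10 - 50
print(net_pay_wrong_order(40, 10, 50))   # 40 * (10 - 50)
```

The translator accepts both versions without complaint, which is why a program that runs still has to be checked against results you have worked out by hand.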
Once instructions have been input to the computer and translated into machine language, a program can be run, or executed. You can write a program that takes a number (an input step), doubles it (processing), and tells you the answer (output) in a programming language such as Java or C++, but if you were to write it using English-like statements, it would look like this:

Get inputNumber.
Compute calculatedAnswer as inputNumber times 2.
Print calculatedAnswer.

The instruction to Get inputNumber is an example of an input operation. When the computer interprets this instruction, it knows to look to an input device to obtain a number. Computers often have several input devices, perhaps a keyboard, a mouse, a CD drive, and two or more disk drives. Logically, however, it doesn’t really matter which hardware device is used, as long as the computer knows to look for a number. The logic of the input operation (that the computer must obtain a number for input, and that the computer must obtain it before multiplying it by two) remains the same regardless of any specific input hardware device.

Many computer professionals categorize disk drives and CD drives as storage devices rather than input devices. Such devices actually can be used for input, storage, and output.

Processing is the step that occurs when the arithmetic is performed to double the inputNumber; the statement Compute calculatedAnswer as inputNumber times 2 represents processing. Mathematical operations are not the only kind of processing, but they are very typical. After you write a program, the program can be used on computers of different brand names, sizes, and speeds. Whether you use an IBM, Macintosh, Linux, or UNIX operating system, and whether you use a personal computer that sits on your desk or a mainframe that costs hundreds of thousands of dollars and resides in a special building in a university, multiplying by 2 is the same process. The hardware is not important; the processing will be the same.
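The point that the logic stays the same whatever the input device can be sketched in Python as follows. This is an illustrative assumption, not code from the guide: the function name double_number is hypothetical, and io.StringIO is used as a stand-in for an input device so the example runs without anyone typing.

```python
import io


def double_number(source):
    """Read one number from any input source and return twice its value."""
    input_number = float(source.readline())   # Get inputNumber
    calculated_answer = input_number * 2      # Compute calculatedAnswer
    return calculated_answer


# io.StringIO simulates a device; an open file or sys.stdin
# could be passed in instead without changing the function.
simulated_keyboard = io.StringIO("8\n")
print(double_number(simulated_keyboard))      # Print calculatedAnswer
```

Because the function only asks its source for a line of text, the same three steps work whether the number comes from a keyboard, a disk file, or any other device.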
In the number-doubling program, the Print calculatedAnswer statement represents output. Within a particular program, this statement could cause the output to appear on the monitor (which might be a flat panel screen or a cathode-ray tube), or the output could go to a printer (which could be laser or inkjet), or the output could be written to a disk or CD. The logic of the process called “Print” is the same no matter what hardware device you use.

Besides input, processing, and output, the fourth operation in any computer system is storage. When computers produce output, it is for human consumption. For example, output might be displayed on a monitor or sent to a printer. Storage, on the other hand, is meant for future computer use (for example, when data items are saved on a disk).

Computer storage comes in two broad categories. All computers have internal storage, often referred to as memory, main memory, primary memory, or random access memory (RAM). This storage is located inside the system unit of the machine. (For example, if you own a microcomputer, the system unit is the large case that holds your CD or other disk drives. On a laptop computer, the system unit is located beneath the keyboard.) Computers also use external storage, which is persistent (relatively permanent) storage on a device such as a floppy disk, hard disk, flash media, or magnetic tape. In other words, external storage is outside the main memory, not necessarily outside the computer. Both programs and data sometimes are stored on each of these kinds of media.

To use computer programs, you must first load them into memory. You might type a program into memory from the keyboard, or you might use a program that has already been written and stored on a disk. Either way, a copy of the instructions must be placed in memory before the program can be run. A computer system needs both internal memory and external storage.
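The relationship between memory and persistent storage can be imitated in a few lines of Python. This is only a rough sketch under stated assumptions: the file name answer.txt is hypothetical, a temporary directory stands in for a disk, and deleting the variable merely simulates losing the copy held in memory.

```python
import os
import tempfile

calculated_answer = 16          # held in memory only while the program runs

# Save the value to a file, a stand-in for external, persistent storage.
path = os.path.join(tempfile.mkdtemp(), "answer.txt")
with open(path, "w") as storage:
    storage.write(str(calculated_answer))

del calculated_answer           # simulate losing the in-memory copy

# Load the saved value from storage back into memory before using it.
with open(path) as storage:
    restored_answer = int(storage.read())

print(restored_answer)
```

The value survives only because a copy was written to the file; the program still has to load it back into memory before it can work with it again.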
Internal memory is needed to run the programs, but internal memory is volatile; that is, its contents are lost every time the computer loses power. Therefore, if you are going to use a program more than once, you must store it, or save it, on some non-volatile medium. Otherwise, the program in main memory is lost forever when the computer is turned off. External storage (usually disks or tape) provides a non-volatile (or persistent) medium.

Even though a hard disk drive is located inside your computer, the hard disk is not main, internal memory. Internal memory is temporary and volatile; a hard drive is permanent, non-volatile storage. After one or two “tragedies” of losing several pages of a typed computer program due to a power failure or other hardware problem, most programmers learn to periodically save the programs they are in the process of writing, using a non-volatile medium such as a disk.

Once you have a copy of a program in main memory, you want to execute, or run, the program. To do so, you must also place any data that the program requires into memory. For example, after you place the following program into memory and start to run it, you need to provide an actual inputNumber (for example, 8) that you also place in main memory.

Get inputNumber.
Compute calculatedAnswer as inputNumber times 2.
Print calculatedAnswer.

The inputNumber is placed in memory at a specific memory location that the program will call inputNumber. Then, and only then, can the calculatedAnswer, in this case 16, be calculated and printed.

1.2 UNDERSTANDING THE PROGRAMMING PROCESS

A programmer’s job involves writing instructions, but a professional programmer usually does not just sit down at a computer keyboard and start typing. The programmer’s job can be broken down into six programming steps:
  1. Understanding the problem
  2. Planning the logic
  3. Coding the program
  4. Using software to translate the program into machine language
  5. Testing the program
  6. Putting the program into production

1.2.1 UNDERSTANDING THE PROBLEM

Professional computer programmers write programs to satisfy the needs of others. Examples could include a Human Resources Department that needs a printed list of all employees, a Billing Department that wants a list of clients who are 30 or more days overdue on their payments, and an office manager who wants to be notified when specific supplies reach the reorder point. Because programmers are providing a service to these users, programmers must first understand what it is the users want.

1.2.2 PLANNING THE LOGIC

The heart of the programming process lies in planning the program’s logic. During this phase of the programming process, the programmer plans the steps of the program, deciding what steps to include and how to order them. You can plan the solution to a problem in many ways. The two most common planning tools are flowcharts and pseudocode.

You may hear programmers refer to planning a program as “developing an algorithm.” An algorithm is the sequence of steps necessary to solve any problem. The programmer doesn’t worry about the syntax of any particular language at this point, just about figuring out what sequence of events will lead from the available input to the desired output. Planning the logic includes thinking carefully about all the possible data values a program might encounter and how you want the program to handle each scenario. The process of walking through a program’s logic on paper before you actually write the program is called desk-checking.

1.2.3 CODING THE PROGRAM

Once the programmer has developed the logic of a program, only then can he or she write the program in one of more than 400 programming languages. Programmers choose a particular language because some languages have built-in capabilities that make them more efficient than others at handling certain types of operations.
Despite their differences, programming languages are quite alike: each can handle input operations, arithmetic processing, output operations, and other standard functions. The logic developed to solve a programming problem can be executed using any number of languages. It is only after a language is chosen that the programmer must worry about each command being spelled correctly and all of the punctuation getting into the right spots; in other words, using the correct syntax.

Some very experienced programmers can successfully combine the logic planning and the actual instruction writing, or coding, of the program in one step. This may work for planning and writing a very simple program, just as you can plan and write a postcard to a friend using one step.

1.2.4 USING SOFTWARE TO TRANSLATE THE PROGRAM INTO MACHINE LANGUAGE

Even though there are many programming languages, each computer knows only one language, its machine language, which consists of many 1s and 0s. Computers understand machine language because computers themselves are made up of thousands of tiny electrical switches, each of which can be set in either the on or off state, which is represented by a 1 or 0, respectively.

Languages like Java or Visual Basic are available for programmers to use because someone has written a translator program (a compiler or interpreter) that changes the English-like high-level programming language in which the programmer writes into the low-level machine language that the computer understands. If you write a programming language statement incorrectly (for example, by misspelling a word, using a word that doesn’t exist in the language, or using “illegal” grammar), the translator program doesn’t know what to do and issues an error message identifying a syntax error, or misuse of a language’s grammar rules. You receive the same response when you speak nonsense to a human-language translator.
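To see a translator reject a statement, you can ask Python to translate a line of source text without running it. The sketch below is an illustration under assumptions: check_syntax is a hypothetical helper name, and Python’s built-in compile() produces bytecode rather than true machine language, although it reports syntax errors in the same spirit as the translators described above.

```python
def check_syntax(source_text):
    """Return None if the text translates cleanly, else the translator's message."""
    try:
        compile(source_text, "<example>", "exec")   # translate only; nothing is run
        return None
    except SyntaxError as error:
        return error.msg


valid_statement = "calculated_answer = 8 * 2"
invalid_statement = "calculated_answer = 8 times 2"   # 'times' is not Python grammar

print(check_syntax(valid_statement))
print(check_syntax(invalid_statement))
```

The first statement yields no message, while the second is refused before a single instruction executes; the exact wording of the message depends on the Python version.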
Imagine trying to look up a list of words in a Spanish-English dictionary if some of the listed words are misspelled; you can’t complete the task until the words are spelled correctly. Although making errors is never desirable, syntax errors are not a major concern to programmers, because the compiler or interpreter catches every syntax error, and the computer will not execute a program that contains them.

A computer program must be free of syntax errors before you can execute it. Typically, a programmer develops a program’s logic, writes the code, and then compiles the program, receiving a list of syntax errors. The programmer then corrects the syntax errors, and compiles the program again. Correcting the first set of errors frequently reveals a new set of errors that originally were not apparent to the compiler. When writing a program, a programmer might need to recompile the code several times. An executable program is created only when the code is free of syntax errors. When you run an executable program, it might also require input data. Figure 1-1 shows a diagram of this entire process.

FIGURE 1-1: CREATING AN EXECUTABLE PROGRAM

1.2.5 TESTING THE PROGRAM

A program that is free of syntax errors is not necessarily free of logical errors. Once a program is free from syntax errors, the programmer can test it; that is, execute it with some sample data to see whether the results are logically correct. Programs should be tested with many sets of data.

1.2.6 PUTTING THE PROGRAM INTO PRODUCTION

Once the program is tested adequately, it is ready for the organization to use. Putting the program into production might mean simply running the program once, if it was written to satisfy a user’s request for a special list. However, the process might take months if the program will be run on a regular basis, or if it is one of a large system of programs being developed.
Perhaps data-entry people must be trained to prepare the input for the new program, users must be trained to understand the output, or existing data in the company must be changed to an entirely new format to accommodate this program. Conversion, the entire set of actions an organization must take to switch over to using a new program or set of programs, can sometimes take months or years to accomplish.

You might consider maintaining programs as a seventh step in the programming process. After programs are put into production, making required changes is called maintenance.

TOPIC SUMMARY

Together, computer hardware (equipment) and software (instructions) accomplish four major operations: input, processing, output, and storage. You write computer instructions in a computer programming language that requires specific syntax; the instructions are translated into machine language by a compiler or interpreter. When both the syntax and logic of a program are correct, you can run, or execute, the program to produce the desired results.

A programmer’s job involves understanding the problem, planning the logic, coding the program, translating the program into machine language, testing the program, and putting the program into production.

KEY TERMS

Hardware is the equipment of a computer system.
Software consists of the programs that tell the computer what to do.
Input devices include keyboards and mice; through these devices, data items enter the computer system. Data can also enter a system from storage devices such as magnetic disks and CDs.
Data includes all the text, numbers, and other information that are processed by a computer.
Processing data items may involve organizing them, checking them for accuracy, or performing mathematical operations on them.
The central processing unit, or CPU, is the piece of hardware that processes data.
Information is sent to a printer, monitor, or some other output device so people can view, interpret, and work with the results.
Programming languages, such as Visual Basic, C#, C++, Java, or COBOL, are used to write programs.
The syntax of a language consists of its rules.
Machine language is a computer’s on/off circuitry language.
A compiler or interpreter translates a high-level language into machine language and tells you if you have used a programming language incorrectly.
You develop the logic of the computer program when you give instructions to the computer in a specific sequence, without leaving any instructions out or adding extraneous instructions.
A semantic error occurs when a correct word is used in an incorrect context.
The running, or executing, of a program occurs when the computer actually uses the written and compiled program.
Internal storage is called memory, main memory, primary memory, or random access memory (RAM).
External storage is persistent (relatively permanent) storage outside the main memory of the machine, on a device such as a floppy disk, hard disk, or magnetic tape.
Internal memory is volatile; that is, its contents are lost every time the computer loses power.
You save a program on some non-volatile medium.
An algorithm is the sequence of steps necessary to solve any problem.
Desk-checking is the process of walking through a program solution on paper.
Coding a program means writing the statements in a programming language.
High-level programming languages are English-like.
Machine language is the low-level language made up of 1s and 0s that the computer understands.
A syntax error is an error in language or grammar.
Logical errors occur when incorrect instructions are performed, or when instructions are performed in the wrong order.
Conversion is the entire set of actions an organization must take to switch over to using a new program or set of programs.

REVIEW QUESTIONS

1. The two major components of any computer system are its ____.
   A. input and output
   B. data and programs
   C. hardware and software
   D. memory and disk drives

2. The major computer operations include ____.
   A. hardware and software
   B. input, processing, output, and storage
   C. sequence and looping
   D. spreadsheets, word processing, and data communications

3. Another term meaning “computer instructions” is ____.
   A. hardware
   B. software
   C. queries
   D. data

4. Visual Basic, C++, and Java are all examples of computer ____.
   A. operating systems
   B. hardware
   C. machine languages
   D. programming languages

5. A programming language’s rules are its ____.
   A. syntax
   B. logic
   C. format
   D. options

6. The most important task of a compiler or interpreter is to ____.
   A. create the rules for a programming language
   B. translate English statements into a language such as Java
   C. translate programming language statements into machine language
   D. execute machine language programs to perform useful tasks

7. Which of the following is not associated with internal storage?
   A. main memory
   B. hard disk
   C. primary memory
   D. volatile

8. Which of the following pairs of steps in the programming process is in the correct order?
   A. code the program, plan the logic
   B. test the program, translate it into machine language
   C. put the program into production, understand the problem
   D. code the program, translate it into machine language

9. The two most commonly used tools for planning a program’s logic are ____.
   A. flowcharts and pseudocode
   B. ASCII and EBCDIC
   C. Java and Visual Basic
   D. word processors and spreadsheets

10. The most important thing a programmer must do before planning the logic to a program is ____.
   A. decide which programming language to use
   B. code the problem
   C. train the users of the program
   D. understand the problem

11. Writing a program in a language such as C++ or Java is known as ____ the program.
   A. translating
   B. coding
   C. interpreting
   D. compiling

12. A compiler would find all of the following programming errors except ____.
   A. the misspelled word “print” in a language that includes the word “print”
   B. the use of an “X” for multiplication in a language that requires an asterisk
   C. a newBalanceDue calculated by adding a customerPayment to an oldBalanceDue instead of subtracting it
   D. an arithmetic statement written as regularSales + discountedSales = totalSales

TOPIC 2

2. WORKING WITH DATA, CREATING MODULES AND DESIGNING HIGH QUALITY PROGRAMS

LEARNING OUTCOMES
After studying this topic you should be able to:
  Describe the data hierarchy
  Understand how to use flowchart symbols and pseudocode statements
  Use and name variables
  Use a sentinel, or dummy value, to end a program
  Use a connector symbol
  Assign values to variables
  Recognize the proper format of assignment statements
  Understand the evolution of programming techniques

2.1 UNDERSTANDING THE DATA HIERARCHY

Some very simple programs require very simple data. For example, the number-doubling program requires just one value as input. Most business programs, however, use much more data: inventory files list thousands of items, and personnel and customer files list thousands of people. When data items are stored for use on computer systems, they are often stored in what is known as a data hierarchy, where the smallest usable unit of data is the character.

Characters are letters, numbers, and special symbols, such as “A”, “7”, and “$”. Anything you can type from the keyboard in one keystroke (including a space or a tab) is a character. Characters are made up of smaller elements called bits, but just as most human beings can use a pencil without caring whether atoms are flying around inside it, most computer users can store characters without caring about these bits.

Characters are grouped together to form a field. A field is a single data item, such as lastName, streetAddress, or annualSalary.

Related fields are often grouped together to form a record. Records are groups of fields that go together for some logical reason.
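The first three levels of the hierarchy described so far (characters, fields, and records) can be sketched in Python as follows. The class name EmployeeRecord and the sample values are assumptions invented for illustration; only the field names lastName, streetAddress, and annualSalary echo the guide, rewritten here in Python’s naming style.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """A record: related fields grouped together."""
    last_name: str          # field
    street_address: str     # field
    annual_salary: float    # field


# Hypothetical sample data, not taken from the guide.
record = EmployeeRecord("Nkosi", "12 Main Road", 250000.0)

# Each field is itself made up of individual characters.
characters_in_last_name = list(record.last_name)
print(characters_in_last_name)
print(record)
```

Reading the example from the bottom up mirrors the hierarchy: single characters make up the last_name field, and the three fields together make up one record.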
A random name, address, and salary aren’t very useful, but if they’re your name, your address, and your salary, then that’s your record. An inventory record might contain fields for item number, color, size, and price; a student record might contain ID number, grade point average, and major.

Related records, in turn, are grouped together to form a file. Files are groups of records that go together for some logical reason. The individual records of each student in your class might go together in a file called STUDENTS. Records of each person at your company might be in a file called PERSONNEL. Items you sell might be in an INVENTORY file.

Some files can have just a few records; others, such as the file of credit-card holders for a major department-store chain or policyholders of an insurance company, can contain thousands or even millions of records.

Finally, many organizations use database software to organize many files. A database holds a group of files, often called tables, which together serve the information needs of an organization. Database software establishes and maintains relationships between fields in these tables, so that users can write questions called queries. Queries pull related data items together in a format that allows businesspeople to make managerial decisions efficiently.

In summary: a database contains many files. A file contains many records. Each record in a file has the same fields. Each record’s fields contain different data items that consist of one or more stored characters in each field.

2.2 USING FLOWCHART SYMBOLS AND PSEUDOCODE STATEMENTS

When programmers plan the logic for a solution to a programming problem, they often use one of two tools, flowcharts or pseudocode (pronounced “sue-doe-code”). A flowchart is a pictorial representation of the logical steps it takes to solve a problem. Pseudocode is an English-like representation of the same thing.
Pseudo is a prefix that means “false,” and to code a program means to put it in a programming language; therefore, pseudocode simply means “false code,” or sentences that appear to have been written in a computer programming language but don’t necessarily follow all the syntax rules of any specific language. The following five statements constitute a pseudocode representation of a number-doubling problem:

start
   get inputNumber
   compute calculatedAnswer as inputNumber times 2
   print calculatedAnswer
stop

Using pseudocode involves writing down all the steps you will use in a program. Usually, programmers preface their pseudocode statements with a beginning statement like “start” and end them with a terminating statement like “stop”. The statements between “start” and “stop” look like English and are indented slightly so that “start” and “stop” stand out. Some professional programmers prefer writing pseudocode to drawing flowcharts, because using pseudocode is more similar to writing the final statements in the programming language. Others prefer drawing flowcharts to represent the logical flow, because flowcharts allow programmers to visualize more easily how the program statements will connect. Especially for beginning programmers, flowcharts are an excellent tool to help visualize how the statements in a program are interrelated.

Almost every program involves the steps of input, processing, and output. Therefore, most flowcharts need some graphical way to separate these three steps. When you create a flowchart, you draw geometric shapes around the individual statements and connect them with arrows. When you draw a flowchart, you use a parallelogram to represent an input symbol, which indicates an input operation.
You write an input statement, in English, inside the parallelogram, as shown in Figure 2-1.

FIGURE 2-1: INPUT SYMBOL

When you want to represent entering two or more values in a program, you can use one or multiple flowchart symbols or pseudocode statements, whichever seems more reasonable and clear to you. For example, the pseudocode to input a user’s name and address might be written as:

get inputName
get inputAddress

or as:

get inputName, inputAddress

The first version implies two separate input operations, whereas the second implies a single input operation retrieving two data items. If your application will accept user input from a keyboard, using two separate input statements might make sense, because the user will type one item at a time. If your application will accept data from a storage device, obtaining all the data at once is more common. Logically, either format represents the retrieval of two data items. The end result is the same in both cases: after the statements have executed, inputName and inputAddress will have received values from an input device.

Arithmetic operation statements are examples of processing. In a flowchart, you use a rectangle as the processing symbol that contains a processing statement, as shown in Figure 2-2.

FIGURE 2-2: PROCESSING SYMBOL

To represent an output statement, you use the same symbol as for input statements: the output symbol is a parallelogram, as shown in Figure 2-3.

FIGURE 2-3: OUTPUT SYMBOL

As with input, output statements can be organized in whatever way seems most reasonable. A program that prints the length and width of a room might use the statements:

print length
print width

or:

print length, width

In some programming languages, using two print statements places the output values on two separate lines on the monitor or printer, whereas using a single print statement places the values next to each other on the same line.
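The difference between two print statements and one can be tried out in a real language. The following is a minimal sketch in Python, chosen here only for illustration; the helper names format_separately and format_together are hypothetical and are not part of the pseudocode in this topic. Python happens to be one of the languages in which separate print statements produce separate lines.

```python
def format_separately(length, width):
    """Text produced by two print statements: one value per line."""
    return f"{length}\n{width}"


def format_together(length, width):
    """Text produced by one print statement: both values on one line."""
    return f"{length} {width}"


if __name__ == "__main__":
    length = 12
    width = 10

    # Two output statements: the values appear on two lines.
    print(length)
    print(width)

    # One output statement: the values appear side by side.
    print(length, width)

    # The helpers build the same text without printing it.
    print(format_separately(length, width))
    print(format_together(length, width))
```

Other languages may behave differently, so treat this as one example of the behavior described above rather than a general rule.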
To show the correct sequence of these statements, you use arrows, or flow lines, to connect the steps. Whenever possible, most of a flowchart should read from top to bottom or from left to right on a page. To be complete, a flowchart should include two more elements: a terminal symbol, or start/stop symbol, at each end. Often, you place a word like “start” or “begin” in the first terminal symbol and a word like “end” or “stop” in the other. The standard terminal symbol is shaped like a racetrack; many programmers refer to this shape as a lozenge, because it resembles the shape of a medicated candy lozenge you might use to soothe a sore throat. Figure 2-4 shows a complete flowchart for the program that doubles a number, and the pseudocode for the same problem.

FIGURE 2-4: FLOWCHART AND PSEUDOCODE OF PROGRAM THAT DOUBLES A NUMBER

The logic for the program represented by the flowchart and pseudocode in Figure 2-4 is correct no matter what programming language the programmer eventually uses to write the corresponding code. After the flowchart or pseudocode has been developed, the programmer only needs to: buy a computer, buy a language compiler, learn a programming language, code the program, attempt to compile it, fix the syntax errors, compile it again, test it with several sets of data, and put it into production.

2.3 USING AND NAMING VARIABLES
Programmers commonly refer to the locations in memory called inputNumber and calculatedAnswer as variables. Variables are memory locations, whose contents can vary or differ over time. Sometimes, inputNumber can hold a 2 and calculatedAnswer will hold a 4; at other times, inputNumber can hold a 6 and calculatedAnswer will hold a 12. It is the ability of memory variables to change in value that makes computers and programming worthwhile. Because one memory location can be used over and over again with different values, you can write program instructions once and then use them for thousands of separate calculations.
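As a rough illustration of instructions written once and reused with different values, here is a short Python sketch of the number-doubling logic. The names input_number and calculated_answer are Python-style spellings of inputNumber and calculatedAnswer, and the function double_number is an assumption made for this example only.

```python
def double_number(input_number):
    """Compute the answer as the input number times 2."""
    calculated_answer = input_number * 2
    return calculated_answer


if __name__ == "__main__":
    # The same instructions run for each value; only the contents
    # of the variables change from one pass to the next.
    for input_number in (2, 6, 150):
        calculated_answer = double_number(input_number)
        print(input_number, "doubled is", calculated_answer)
```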
The number-doubling example requires two variables, inputNumber and calculatedAnswer. These can just as well be named userEntry and programSolution, or inputValue and twiceTheValue. As a programmer, you choose reasonable names for your variables. The language interpreter then associates the names you choose with specific memory addresses. A variable name is also called an identifier. Every computer programming language has its own set of rules for naming identifiers. Most languages allow both letters and digits within variable names. Some languages allow hyphens in variable names—for example, hourly-wage. Others allow underscores, as in hourly_wage. Still others allow neither. Some languages allow dollar signs or other special characters in variable names (for example, hourly$); others allow foreign alphabet characters, such as π or Ω. Even though every language has its own rules for naming variables, when designing the logic of a computer program, you should not concern yourself with the specific syntax of any particular computer language. The logic, after all, works with any language. The variable names used throughout this book follow only two rules: 1. Variable names must be one word. The name can contain letters, digits, hyphens, underscores, or any other characters you choose, with the exception of spaces. Therefore, r is a legal variable name, as is rate, as is interestRate. The variable name interest rate is not allowed because of the space. No programming language allows spaces within a variable name. If you see a name such as interest rate in a flowchart or pseudocode, you should assume that the programmer is discussing two variables, interest and rate, each of which individually would be a fine variable name. 2. Variable names should have some appropriate meaning. This is not a rule of any programming language. When computing an interest rate in a program, the computer does not care if you call the variable g, u84, or fred. 
As long as the correct numeric result is placed in the variable, its actual name doesn’t really matter. However, it’s much easier to follow the logic of a program with a statement in it like compute finalBalance as equal to initialInvestment times interestRate than one with a statement in it like compute someBanana as equal to j89 times myFriendLinda. You might think you will remember how you intended to use a cryptic variable name within a program, but several months or years later when a program requires changes, you, and other programmers working with you, will appreciate clear, descriptive variable names.

2.4 ENDING A PROGRAM BY USING SENTINEL VALUES
Recall that the logic in the flowchart for doubling numbers has a major flaw: the program never ends. This programming situation is known as an infinite loop, a repeating flow of logic with no end. If, for example, the input numbers are being entered at the keyboard, the program will keep accepting numbers and printing doubles forever. Of course, the user could refuse to type in any more numbers. But the computer is very patient, and if you refuse to give it any more numbers, it will sit and wait forever. When you finally type in a number, the program will double it, print the result, and wait for another. The program cannot progress any further while it is waiting for input; meanwhile, the program is occupying computer memory and tying up operating system resources. Refusing to enter any more numbers is not a practical solution. Another way to end the program is simply to turn the computer off. But again, that’s neither the best nor an elegant way to bring the program to an end.

A superior way to end the program is to set a predetermined value for inputNumber that means “Stop the program!” For example, the programmer and the user could agree that the user will never need to know the double of 0 (zero), so the user could enter a 0 when he or she wants to stop.
The program could then test any incoming value contained in inputNumber and, if it is a 0, stop the program. Testing a value is also called making a decision. You represent a decision in a flowchart by drawing a decision symbol, which is shaped like a diamond. The diamond usually contains a question, the answer to which is one of two mutually exclusive options, often yes or no. All good computer questions have only two mutually exclusive answers, such as yes and no or true and false.

One drawback to using 0 to stop a program, of course, is that it won’t work if the user does need to find the double of 0. In that case, some other data-entry value that the user never will need, such as 999 or -1, could be selected to signal that the program should end. A preselected value that stops the execution of a program is often called a dummy value because it does not represent real data, but just a signal to stop. Sometimes, such a value is called a sentinel value because it represents an entry or exit point, like a sentinel who guards a fortress.

Not all programs rely on user data entry from a keyboard; many read data from an input device, such as a disk or tape drive. When organizations store data on a disk or other storage device, they do not commonly use a dummy value to signal the end of the file. For one thing, an input record might have hundreds of fields, and if you store a dummy record in every file, you are wasting a large quantity of storage on “non-data.” Additionally, it is often difficult to choose sentinel values for fields in a company’s data files.

FIGURE 2-5: FLOWCHART FOR NUMBER-DOUBLING PROGRAM WITH SENTINEL VALUE OF 0

Fortunately, programming languages can recognize the end of data in a file automatically, through a code that is stored at the end of the data. Many programming languages use the term eof (for “end of file”) to talk about this marker that automatically acts as a sentinel.
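The sentinel logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name double_until_sentinel and the constant SENTINEL are invented for the example, and a list of numbers stands in for values typed at a keyboard or read from a file.

```python
SENTINEL = 0  # agreed-upon dummy value that means "stop the program"


def double_until_sentinel(values, sentinel=SENTINEL):
    """Double each incoming value until the sentinel value appears."""
    doubles = []
    for input_number in values:
        # The decision: is this the value that signals the end?
        if input_number == sentinel:
            break
        doubles.append(input_number * 2)
    return doubles


if __name__ == "__main__":
    # The 9 after the sentinel is never processed.
    print(double_until_sentinel([2, 6, 75, 0, 9]))
```

If the list runs out before a 0 is seen, the loop simply ends, which is loosely comparable to reaching the end of the data in a file.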
This book, therefore, uses eof to indicate the end of data, regardless of whether the code is a special disk marker or a dummy value such as 0 that comes from the keyboard.

FIGURE 2-6: FLOWCHART USING eof

2.5 USING THE CONNECTOR
By using just the input, processing, output, decision, and terminal symbols, you can represent the flowcharting logic for many diverse applications. When drawing a flowchart segment, you might use another symbol, the connector. You can use a connector when limited page size forces you to continue a flowchart in an unconnected location or on another page. If a flowchart has six processing steps and a page provides room for only three, you might represent the logic as shown in Figure 2-7.

FIGURE 2-7: FLOWCHART USING THE CONNECTOR

By convention, programmers use a circle as an on-page connector symbol, and a symbol that looks like a square with a pointed bottom as an off-page connector symbol. The on-page connector at the bottom of the left column in Figure 2-7 tells someone reading the flowchart that there is more to the flowchart. The circle should contain a number or letter that can then be matched to another number or letter somewhere else, in this case on the right. If a large flowchart needed more connectors, new numbers or letters would be assigned in sequence (1, 2, 3... or A, B, C...) to each successive pair of connectors. The off-page connector at the bottom of the right column in Figure 2-7 tells a reader that there is more to the flowchart on another page.

When you are creating your own flowcharts, you should avoid using any connectors, if at all possible; flowcharts are more difficult to follow when their segments do not fit together on a page. Some programmers would even say that if a flowchart must connect to another page, it is a sign of poor design. Your instructor or future programming supervisor may require that long flowcharts be redrawn so you don’t need to use the connector symbol.
However, when continuing to a new location or page is unavoidable, the connector provides the means.

2.6 ASSIGNING VALUES TO VARIABLES
When you create a flowchart or pseudocode for a program that doubles numbers, you can include the statement compute calculatedAnswer as inputNumber times 2. This statement incorporates two actions. First, the computer calculates the arithmetic value of inputNumber times 2. Second, the computed value is stored in the calculatedAnswer memory location. Most programming languages allow a shorthand expression for assignment statements such as compute calculatedAnswer as inputNumber times 2. The shorthand takes the form calculatedAnswer = inputNumber * 2. The equal sign is the assignment operator; it always requires the name of a memory location on its left side: the name of the location where the result will be stored.

According to the rules of algebra, a statement like calculatedAnswer = inputNumber * 2 should be exactly equivalent to the statement inputNumber * 2 = calculatedAnswer. That’s because in algebra, the equal sign always represents equivalency. In most programming languages, however, the equal sign represents assignment, and calculatedAnswer = inputNumber * 2 means “multiply inputNumber by 2 and store the result in the variable called calculatedAnswer.” Whatever operation is performed to the right of the equal sign results in a value that is placed in the memory location to the left of the equal sign. Therefore, the incorrect statement inputNumber * 2 = calculatedAnswer means to attempt to take the value of calculatedAnswer and store it in a location called inputNumber * 2, but there can’t be a location called inputNumber * 2. For one thing, you should recognize that the expression inputNumber * 2 can’t be a variable because it has spaces in it. For another, a location can’t be multiplied. Its contents can be multiplied, but the location itself cannot be.
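The right-to-left nature of assignment can be checked in Python with the short sketch below. The helper is_valid_statement is a hypothetical name introduced for this example; it relies on Python's built-in compile function, which raises SyntaxError for text that is not a legal statement. The variable names are Python-style spellings of inputNumber and calculatedAnswer.

```python
def is_valid_statement(source):
    """Return True if Python accepts the text as a legal statement."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


input_number = 3
# The right side is evaluated first, then stored in the location on the left.
calculated_answer = input_number * 2

forward = "calculated_answer = input_number * 2"
backward = "input_number * 2 = calculated_answer"

if __name__ == "__main__":
    print(calculated_answer)
    print(is_valid_statement(forward))
    print(is_valid_statement(backward))
```

The check only asks whether the text is a legal statement; it does not run the statement.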
The backward statement inputNumber * 2 = calculatedAnswer contains a syntax error, no matter what programming language you use; a program with such a statement will not execute.

When you create an assignment statement, it may help to imagine the word “let” in front of the statement. Thus, you can read the statement calculatedAnswer = inputNumber * 2 as “Let calculatedAnswer equal inputNumber times two.” The BASIC programming language allows you to use the word “let” in such statements. You might also imagine the word “gets” or “receives” in place of the assignment operator. In other words, calculatedAnswer = inputNumber * 2 means both calculatedAnswer gets inputNumber * 2 and calculatedAnswer receives inputNumber * 2.

Many programming languages allow you to create named constants. A named constant is a named memory location, similar to a variable, except its value never changes during the execution of a program. If you are working with a programming language that allows it, you might create a constant for a value such as PI = 3.14 or COUNTY_SALES_TAX_RATE = .06. Many programmers follow the convention of using camel casing for variable identifiers but all capital letters for constant identifiers.

2.7 UNDERSTANDING DATA TYPES
Computers deal with two basic types of data: text and numeric. When you use a specific numeric value, such as 43, within a program, you write it using the digits and no quotation marks. A specific numeric value is often called a numeric constant, because it does not change: a 43 always has the value 43. When you use a specific text value, or string of characters, such as “Amanda”, you enclose the string constant, or character constant, within quotation marks. Some languages require single quotation marks surrounding character constants, whereas others require double quotation marks. Many languages, including C++, C#, and Java, reserve single quotes for a single character such as ‘A’, and double quotes for a character string such as “Amanda”.
Similarly, most computer languages allow at least two distinct types of variables. A variable’s data type describes the kind of values the variable can hold and the types of operations that can be performed with it. One type of variable can hold a number, and is often called a numeric variable. A numeric variable is one that can have mathematical operations performed on it; it can hold digits, and usually can hold a decimal point and a sign indicating positive or negative if you want. In the statement calculatedAnswer = inputNumber * 2, both calculatedAnswer and inputNumber are numeric variables; that is, their intended contents are numeric values, such as 6 and 3, 150 and 75, or –18 and –9.

Most programming languages have a separate type of variable that can hold letters of the alphabet and other special characters such as punctuation marks. Depending on the language, these variables are called character, text, or string variables. If a working program contains the statement lastName = “Lincoln”, then lastName is a character or string variable.

Programmers must distinguish between numeric and character variables, because computers handle the two types of data differently. Therefore, means are provided within the syntax rules of computer programming languages to tell the computer which type of data to expect. How this is done is different in every language; some languages have different rules for naming the variables, but with others you must include a simple statement (called a declaration) telling the computer which type of data to expect.

Some languages allow for several types of numeric data. Languages such as C++, C#, Visual Basic, and Java distinguish between integer (whole number) numeric variables and floating-point (fractional) numeric variables that contain a decimal point. Thus, in some languages, the values 4 and 4.3 would be stored in different types of numeric variables.
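Python is another language that separates whole numbers from fractional numbers, although it works out the type from the value instead of requiring a declaration. The sketch below is an illustration only; type_name is a hypothetical helper written for this example.

```python
def type_name(value):
    """Return the name of the data type Python uses for a value."""
    return type(value).__name__


whole_number = 4         # stored as an integer
fractional_number = 4.3  # stored as a floating-point number
last_name = "Lincoln"    # stored as a string of characters

if __name__ == "__main__":
    print(type_name(whole_number))
    print(type_name(fractional_number))
    print(type_name(last_name))
    # Mixing an integer with a floating-point value gives a floating-point result.
    print(type_name(whole_number * 2.5))
```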
Some programming languages allow even more specific variable types, but the character versus numeric distinction is universal. For the programs you develop in this book, assume that each variable is one of the two broad types. If a variable called taxRate is supposed to hold a value of 2.5, assume that it is a numeric variable. If a variable called inventoryItem is supposed to hold a value of “monitor”, assume that it is a character variable.

Values such as “monitor” and 2.5 are called constants or literal constants because they never change. A variable value can change. Thus, inventoryItem can hold “monitor” at one moment during the execution of a program, and later you can change its value to “modem”.

By convention, character data like “monitor” is included within quotation marks to distinguish the characters from yet another variable name. Also by convention, numeric data values are not enclosed within quotation marks. According to these conventions, then, taxRate = 2.5 and inventoryItem = “monitor” are both valid statements. The statement inventoryItem = monitor is a valid statement only if monitor is also a character variable. In other words, if monitor = “color”, and subsequently inventoryItem = monitor, then the end result is that the memory address named inventoryItem contains the string of characters “color”.

Every computer handles text or character data differently from the way it handles numeric data. You may have experienced these differences if you have used application software such as spreadsheets or database programs. For example, in a spreadsheet, you cannot sum a column of words. Similarly, every programming language requires that you distinguish variables as to their correct type, and that you use each type of variable appropriately. Identifying your variables correctly as numeric or character is one of the first steps you have to take when writing programs in any programming language.
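The quoting convention and the different handling of the two data types can be seen in the following Python sketch. It is an illustration under stated assumptions: the names mirror taxRate, inventoryItem, and monitor from the text in Python spelling, and try_add is a hypothetical helper added for this example.

```python
def try_add(left, right):
    """Add two values, or return None if their types cannot be added."""
    try:
        return left + right
    except TypeError:
        return None


tax_rate = 2.5                   # numeric value: no quotation marks
inventory_item = "monitor"       # character value: quotation marks
first_value = inventory_item     # remember what the variable held first

monitor = "color"                # monitor is itself a character variable
inventory_item = monitor         # no quotes: copies the contents of monitor

if __name__ == "__main__":
    print(first_value)
    print(inventory_item)
    print(try_add(tax_rate, 1))          # numeric plus numeric works
    print(try_add(tax_rate, "monitor"))  # numeric plus text is rejected
```

Python rejects the last addition with a TypeError, which the helper turns into None; other languages report the same kind of mistake in their own ways.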
The process of naming program variables and assigning a type to them is called making declarations, or declaring variables. Table 2-1 provides you with a few examples of legal and illegal variable assignment statements.

2.8 UNDERSTANDING THE EVOLUTION OF PROGRAMMING TECHNIQUES
People have been writing computer programs since the 1940s. The oldest programming languages required programmers to work with memory addresses and to memorize awkward codes associated with machine languages. Newer programming languages look much more like natural language and are easier for programmers to use. Part of the reason it is easier to use newer programming languages is that they allow programmers to name variables instead of using awkward memory addresses. Another reason is that newer programming languages provide programmers with the means to create self-contained modules or program segments that can be pieced together in a variety of ways. The oldest computer programs were written in one piece, from start to finish; modern programs are rarely written that way; they are created by teams of programmers, each developing his or her own reusable and connectable program procedures. Writing several small modules is easier than writing one large program, and most large tasks are easier when you break the work into units and get other workers to help with some of the units.

Currently, there are two major techniques used to develop programs and their procedures. One technique, called procedural programming, focuses on the procedures that programmers create. That is, procedural programmers focus on the actions that are carried out, for example, getting input data for an employee and writing the calculations needed to produce a paycheck from the data. Procedural programmers would approach the job of producing a paycheck by breaking down the paycheck-producing process into manageable subtasks.
The other popular programming technique, called object-oriented programming, focuses on objects, or “things,” and describes their features, or attributes, and their behaviors. For example, object-oriented programmers might design a payroll application by thinking about employees and paychecks, and describing their attributes (such as last name or check amount) and behaviors (such as the calculations that result in the check amount).

With either approach, procedural or object-oriented, you can produce a correct paycheck, and both techniques employ reusable program modules. The major difference lies in the focus the programmer takes during the earliest planning stages of a project. The skills you gain in programming procedurally (declaring variables, accepting input, making decisions, producing output, and so on) will serve you well whether you eventually write programs in a procedural or object-oriented fashion, or in both.

2.8.1 PROBLEM SOLVING APPROACH
There are two approaches to solving problems in any programming language: top-down and bottom-up. In both, the problem is broken up into procedures which will call each other, and pass information back and forth using arguments (input values) and return values (output values). Ideally the arguments and return values should be the only method of interaction between two procedures. Then when you use a procedure there’s no need to concern yourself with how it actually works, just what the result is – i.e. you only need to know about the interface to the procedure, not its internals. That is, you can think of the procedure as a “black box” that takes one or more input values, performs some process on them, and produces one or more output values.
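The black-box idea can be sketched in Python using the paycheck example. Everything in this sketch is hypothetical: the procedure names, the parameters, and the flat tax rate are assumptions made for illustration, not a real payroll calculation.

```python
def compute_gross_pay(hours_worked, hourly_rate):
    """Black box: two input values (arguments) in, one return value out."""
    return hours_worked * hourly_rate


def compute_net_pay(gross_pay, tax_rate):
    """Black box: subtract tax, worked out as a fraction of gross pay."""
    return gross_pay - gross_pay * tax_rate


def produce_paycheck(hours_worked, hourly_rate, tax_rate):
    """Combine the two procedures using only their arguments and results."""
    gross_pay = compute_gross_pay(hours_worked, hourly_rate)
    return compute_net_pay(gross_pay, tax_rate)


if __name__ == "__main__":
    print(produce_paycheck(40, 10.0, 0.25))
```

The caller of produce_paycheck needs to know only what goes in and what comes out; either inner procedure could be rewritten without changing the others, as long as its arguments and return value stay the same.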
In programming this is particularly important when it comes to testing, working as part of a large programming team, or when trying out different ways of implementing a procedure – as it allows each procedure to be dealt with in isolation, whilst still giving some degree of confidence that they will operate as expected when all connected together.

Top-Down Approach
The top-down approach is the most useful when the overall problem is known in advance, but the intricate detail isn’t. Start by splitting the whole problem into slightly smaller, slightly better defined, procedures, then keep splitting each of these into even smaller, even better defined, procedures – until you get to a point where the procedures contain very specific low-level steps that you know how to perform. The key to top-down problem solving is:
Break the problem down into smaller and smaller problems.
Look for parts of the problem which are sufficiently similar to each other that they can be replaced by a generalised procedure that uses one or more arguments to specify how it behaves.

Bottom-Up Approach
The bottom-up approach is most useful when the low-level details of the problem are known in advance, but the overall problem isn’t necessarily. You start by taking procedures that you have already developed (reusing them), and connecting them together so as to produce a new procedure that solves a more complex problem. These new procedures can then be combined to produce solutions to yet more complex problems, and so on. The key to bottom-up problem solving is:
Reuse procedures that you have already devised.
Combine them in different ways to solve more complex problems than they were originally designed for.

Algorithms
Algorithm is the name for a formal description of how to go about solving a problem. An algorithm is a finite list of instructions, based on a finite set of atomic commands, which specifically relate to the actions of some device and are executed in a definite, consistent order.
Usually, to be of use, an algorithm must finish executing in a finite time. This means that: An algorithm must have a beginning and an end – otherwise we wouldn’t know where to start and stop. The commands must be as basic as possible – there must be no room for interpretation (either through experience, or guesswork) as there might be with, say, the recipe for baking a cake. The number of possible instructions is limited – otherwise it would take forever to determine what each instruction means. There must be some practical mechanism capable of executing the algorithm; otherwise it is of no use. We must be sure that the instructions will be executed in the same order every time the algorithm operates, otherwise its behaviour will be completely unpredictable. Usually we want the algorithm to have an outcome – so it must come to an end at some point. However this is not always the case, e.g. an algorithm for monitoring a system should probably operate “forever”. Flow Charts In a few weeks we will be describing algorithms using high level programming languages like C, and C++. Often it is not a good idea to start thinking about an algorithm using a programming language, as they require you to be very precise about the description, and the process of getting the program exactly right (debugging it) can get in the way of formulating the algorithm. Initially it is much better to use a less formal description, such as a flow chart. Flow charts provide a convenient graphical way of describing how an algorithm works. The symbols are used as follows: Start/End: Used for the start and end points of the algorithm as a whole, or of one of its procedures. Usually there can only be one start point, but there may be more than one end point (and there should be at least one of each). Process: Any processing that the algorithm does. This could be a specific calculation, or it could be something quite vague that you haven’t figured out how to describe in detail yet. 
Input/Output: Any point where data flows into, or out of, the algorithm (possibly both at the same time).
Connector: Crossing flow lines in a flow chart leads to confusion, so use numbered connector symbols to jump around complicated diagrams as necessary (N.B. there shouldn’t be more than two connectors with the same number).

TOPIC SUMMARY
When data items are stored for use on computer systems, they are stored in a data hierarchy of character, field, record, file, and database.
When programmers plan the logic for a solution to a programming problem, they often use flowcharts or pseudocode. When you draw a flowchart, you use parallelograms to represent input and output operations, and rectangles to represent processing.
Variables are named memory locations, the contents of which can vary. As a programmer, you choose reasonable names for your variables. Every computer programming language has its own set of rules for naming variables; however, all variable names must be written as one word without embedded spaces, and should have appropriate meaning.
Testing a value involves making a decision. You represent a decision in a flowchart by drawing a diamond-shaped decision symbol containing a question, the answer to which is either yes or no. You can stop a program’s execution by using a decision to test for a sentinel value.
A connector symbol is used to continue a flowchart that does not fit together on a page, or must continue on an additional page.
Most programming languages use the equal sign to assign values to variables. Assignment always takes place from right to left.
Programmers must distinguish between numeric and character variables, because computers handle the two types of data differently. A variable declaration tells the computer which type of data to expect. By convention, character data values are included within quotation marks.
Procedural and object-oriented programmers approach program problems differently.
Procedural programmers concentrate on the actions performed with data. Object-oriented programmers focus on objects and their behaviors and attributes.

KEY TERMS
The data hierarchy represents the relationship of databases, files, records, fields, and characters. Characters are letters, numbers, and special symbols such as “A”, “7”, and “$”. A field is a single data item, such as lastName, streetAddress, or annualSalary. Records are groups of fields that go together for some logical reason. Files are groups of records that go together for some logical reason. A database holds a group of files, often called tables, which together serve the information needs of an organization. Queries are questions that pull related data items together from a database in a format that enhances efficient management decision making. A flowchart is a pictorial representation of the logical steps it takes to solve a problem. Pseudocode is an English-like representation of the logical steps it takes to solve a problem. Input symbols, which indicate input operations, are represented as parallelograms in flowcharts. Processing symbols are represented as rectangles in flowcharts. Output symbols, which indicate output operations, are represented as parallelograms in flowcharts. Flow lines, or arrows, connect the steps in a flowchart. A terminal symbol, or start/stop symbol, is used at each end of a flowchart. Its shape is a lozenge. Variables are memory locations, whose contents can vary or differ over time. A variable name is also called an identifier. A mnemonic is a memory device; variable identifiers act as mnemonics for hard-to-remember memory addresses. Camel casing is the format for naming variables in which multiple-word variable names are run together, and each new word within the variable name begins with an uppercase letter. An infinite loop is a repeating flow of logic without an ending. Testing a value is also called making a decision.
You represent a decision in a flowchart by drawing a decision symbol, which is shaped like a diamond.
A yes-or-no decision is called a binary decision, because there are two possible outcomes.
A dummy value is a preselected value that stops the execution of a program. Such a value is sometimes called a sentinel value because it represents an entry or exit point, like a sentinel who guards a fortress.
Many programming languages use the term eof (for “end of file”) to talk about an end-of-data file marker.
A connector is a flowchart symbol used when limited page size forces you to continue the flowchart elsewhere on the same page or on the following page.
An assignment statement stores the result of any calculation performed on its right side to the named location on its left side.
The equal sign is the assignment operator; it always requires the name of a memory location on its left side.
A numeric constant is a specific numeric value.
A string constant, or character constant, is enclosed within quotation marks.
A variable’s data type describes the kind of values the variable can hold and the types of operations that can be performed with it.
Numeric variables hold numeric values.
Character, text, or string variables hold character values. If a working program contains the statement lastName = “Lincoln”, then lastName is a character or string variable.
A declaration is a statement that names a variable and tells the computer which type of data to expect.
Integer values are whole-number numeric values.
Floating-point values are fractional numeric values that contain a decimal point.
The process of naming program variables and assigning a type to them is called making declarations, or declaring variables.
The technique known as procedural programming focuses on the procedures that programmers create.
The technique known as object-oriented programming focuses on objects, or “things,” and describes their features, or attributes, and their behaviors.
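Several of the key terms above (assignment, numeric and string constants, and the sentinel value) can be seen working together in a few lines of real code. The sketch below uses Python, which is an assumption of this example and not a language the guide prescribes. All names in it (SENTINEL, total_until_sentinel, and the sample variables) are hypothetical. Note also that Python does not use explicit declarations; it works out each variable's data type from the value assigned to it.

```python
# Illustrative sketch only; every name here is hypothetical.
SENTINEL = 0  # the preselected "dummy" value that stops processing


def total_until_sentinel(values):
    """Add up values one at a time until the sentinel value appears."""
    total = 0                      # assignment: the right side is stored in the name on the left
    for value in values:
        if value == SENTINEL:      # testing a value is making a (binary) decision
            break                  # the sentinel ends the repetition
        total = total + value      # right side is calculated first, then stored in total
    return total


last_name = "Lincoln"       # string constant, enclosed in quotation marks
annual_salary = 52000.50    # floating-point numeric constant
items_sold = 12             # integer numeric constant

# Values that come after the sentinel are never processed.
print(total_until_sentinel([5, 10, 20, 0, 99]))  # prints 35
```

The value 99 is ignored because the sentinel 0 is reached first, which is the same idea as testing for eof before reading more data.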
REVIEW QUESTIONS

MULTIPLE CHOICE QUESTIONS
1. Which of the following is a typical input instruction?
A. get accountNumber
B. calculate balanceDue
C. print customerIdentificationNumber
D. total = janPurchase + febPurchase
2. Which of the following is a typical processing instruction?
A. print answer
B. get userName
C. pctCorrect = rightAnswers / allAnswers
D. print calculatedPercentage
3. Which of the following is true regarding the data hierarchy?
A. files contain records
B. characters contain fields
C. fields contain files
D. fields contain records
4. The parallelogram is the flowchart symbol representing ____.
A. input
B. output
C. both a and b
D. none of the above
5. Which of the following is not a legal variable name in any programming language?
A. semester grade
B. fall2005_grade
C. GradeInCIS100
D. MY_GRADE
6. In flowcharts, the decision symbol is a ____.
A. parallelogram
B. rectangle
C. lozenge
D. diamond
7. The term “eof” represents ____.
A. a standard input device
B. a generic sentinel value
C. a condition in which no more memory is available for storage
D. the logical flow in a program
8. The two broadest types of data are ____.
A. internal and external
B. volatile and constant
C. character and numeric
D. permanent and temporary

FIND THE ERRORS
Since the early days of computer programming, program errors have been called “bugs.” The term is often said to have originated from an actual moth that was discovered trapped in the circuitry of a computer at Harvard University in 1945. Actually, the term “bug” was in use prior to 1945 to mean trouble with any electrical apparatus; even during Thomas Edison’s life, it meant an “industrial defect.” However, the process of finding and correcting program errors has come to be known as debugging.
Each of the following pseudocode segments contains one or more bugs that you must find and correct.
1. This pseudocode segment is intended to describe computing your average score of two classroom tests.
input midtermGrade
input finalGrade
average = (inputGrade + final) / 3
print average
2. This pseudocode segment is intended to describe computing the number of miles per gallon you get with your automobile.
input milesTraveled
input gallonsOfGasUsed
gallonsOfGasUsed / milesTravelled = milesPerGallon
print milesPerGal
3. This pseudocode segment is intended to describe computing the cost per day and the cost per week for a vacation.
input totalDollarsSpent
input daysOnTrip
costPerDay = totalMoneySpent * daysOnTrip
weeks = daysOnTrip / 7
costPerWeek = daysOnTrip / numberOfWeeks
print costPerDay, week

EXERCISES
1. Match the definition with the appropriate term.
1. Computer system equipment     a. compiler
2. Another word for programs     b. syntax
3. Language rules                c. logic
4. Order of instructions         d. hardware
5. Language translator           e. software
2. Describe the steps to write a computer program.
3. Consider a student file that contains the following data:
4. Would this set of data be suitable and sufficient to use to test each of the following programs? Explain why or why not.
a. a program that prints a list of Psychology majors
b. a program that prints a list of Art majors
c. a program that prints a list of students on academic probation (those with a grade point average under 2.0)
d. a program that prints a list of students on the dean’s list
e. a program that prints a list of students from Wisconsin
f. a program that prints a list of female students
5. Suggest a good set of test data to use for a program that gives an employee a R50 bonus check if the employee has produced more than 1,000 items in a week.
6. Suggest a good set of test data for a program that computes gross paychecks (that is, before any taxes or other deductions) based on hours worked and rate of pay. The program computes gross as hours times rate, unless hours are over 40.
If so, the program computes gross as regular rate of pay for 40 hours, plus one and a half times the rate of pay for the hours over 40.
7. Suggest a good set of test data for a program that is intended to output a student’s grade point average based on letter grades (A, B, C, D, or F) in five courses.
8. Suggest a good set of test data for a program for an automobile insurance company that wants to increase its premiums by R50 per month for every ticket a driver receives in a three-year period.
9. Assume that a grocery store keeps a file for inventory, where each grocery item has its own record. Two fields within each record are the name of the manufacturer and the weight of the item. Name at least six more fields that might be stored for each record. Provide an example of the data for one record. For example, for one product the manufacturer is Del Monte, and the weight is 12 ounces.
10. Assume that a library keeps a file with data about its collection, one record for each item the library lends out. Name at least eight fields that might be stored for each record. Provide an example of the data for one record.
11. Which of the following names seem like good variable names to you? If a name doesn’t seem like a good variable name, explain why not.
A. c
B. cost
C. costAmount
D. cost amount
E. cstofdngbsns
F. costOfDoingBusinessThisFiscalYear
G. cost2004
12. If myAge and yourRate are numeric variables, and departmentCode is a character variable, which of the following statements are valid assignments? If a statement is not valid, explain why not.
A. myAge = 23
B. myAge = yourRate
C. myAge = departmentCode
D. myAge = “departmentCode”
E. 42 = myAge
F. yourRate = 3.5
G. yourRate = myAge
H. yourRate = departmentCode
I. 6.91 = yourRate
J. departmentCode = Personnel
K. departmentCode = “Personnel”
L. departmentCode = 413
M. departmentCode = “413”
N. departmentCode = myAge
O. departmentCode = yourRate
P. 413 = departmentCode
Q. “413” = departmentCode
13.
Complete the following tasks:
A. Draw a flowchart to represent the logic of a program that allows the user to enter a value. The program multiplies the value by 10 and prints the result.
B. Write pseudocode for the same problem.
14. Complete the following tasks:
A. Draw a flowchart to represent the logic of a program that allows the user to enter a value that represents the radius of a circle. The program calculates the diameter (by multiplying the radius by 2), and then calculates the circumference (by multiplying the diameter by 3.14). The program prints both the diameter and the circumference.
B. Write pseudocode for the same problem.
15. Complete the following tasks:
A. Draw a flowchart to represent the logic of a program that allows the user to enter two values. The program prints the sum of the two values.
B. Write pseudocode for the same problem.
16. Complete the following tasks:
A. Draw a flowchart to represent the logic of a program that allows the user to enter three values. The first value represents hourly pay rate, the second represents the number of hours worked this pay period, and the third represents the percentage of gross salary that is withheld. The program multiplies the hourly pay rate by the number of hours worked, giving the gross pay; then, it multiplies the gross pay by the withholding percentage, giving the withholding amount. Finally, it subtracts the withholding amount from the gross pay, giving the net pay after taxes. The program prints the net pay.
B. Write pseudocode for the same problem.

TOPIC 3
3.
UNDERSTANDING STRUCTURE

LEARNING OUTCOMES
After studying this topic you should be able to:
Describe the three basic structures: sequence, selection, and loop
Understand the need for structure
Describe three special structures: case, do-while, and do-until

3.1 UNDERSTANDING THE THREE BASIC STRUCTURES
In the mid-1960s, mathematicians proved that any program, no matter how complicated, can be constructed using one or more of only three structures. A structure is a basic unit of programming logic; each structure is a sequence, selection, or loop. With these three structures alone, you can diagram any task, from doubling a number to performing brain surgery. You can diagram each structure with a specific configuration of flowchart symbols.
The first of these structures is a sequence, as shown in Figure 3-1. With a sequence structure, you perform an action or task, and then you perform the next action, in order. A sequence can contain any number of tasks, but there is no chance to branch off and skip any of the tasks. Once you start a series of actions in a sequence, you must continue step-by-step until the sequence ends.
FIGURE 3-1: SEQUENCE STRUCTURE
The second structure is called a selection structure or decision structure, as shown in Figure 3-2. With this structure, you ask a question, and, depending on the answer, you take one of two courses of action. Then, no matter which path you follow, you continue with the next task.
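As a small illustration of the sequence structure described in this section, the following Python sketch performs three steps strictly in order, with no opportunity to branch or skip. Python is an assumption of this example (the guide itself uses flowcharts and pseudocode), and the function and variable names are hypothetical.

```python
def double_number(original_number):
    """A pure sequence: each step runs once, in order, with no branching."""
    calculated_answer = original_number * 2                 # step 1: process the input
    message = "Doubled value: " + str(calculated_answer)    # step 2: build the output text
    return message                                          # step 3: hand back the result


print(double_number(21))  # prints Doubled value: 42
```

No step in the function can be skipped or repeated, which is what distinguishes a sequence from a selection or a loop.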
FIGURE 3-2: SELECTION STRUCTURE
Some people call the selection structure an if-then-else because it fits the following statement:
if someCondition is true then
   do oneProcess
else
   do theOtherProcess
For example, while cooking you may decide the following:
if we have brownSugar then
   use brownSugar
else
   use whiteSugar
Similarly, a payroll program might include a statement such as:
if hoursWorked is more than 40 then
   calculate regularPay and overtimePay
else
   calculate regularPay
The previous examples can also be called dual-alternative ifs, because they contain two alternatives: the action taken when the tested condition is true and the action taken when it is false. Note that it is perfectly correct for one branch of the selection to be a “do nothing” branch. For example:
if it is raining then
   take anUmbrella
or
if employee belongs to dentalPlan then
   deduct R40 from employeeGrossPay
The previous examples are single-alternative ifs, and a diagram of their structure is shown in Figure 3-3. In these cases, you don’t take any special action if it is not raining or if the employee does not belong to the dental plan. The case where nothing is done is often called the null case.
FIGURE 3-3: SINGLE-ALTERNATIVE DECISION STRUCTURE
The third structure, shown in Figure 3-4, is a loop. In a loop structure, you continue to repeat actions based on the answer to a question. In the most common type of loop, you first ask a question; if the answer requires an action, you perform the action and ask the original question again. If the answer requires that the action be taken again, you take the action and then ask the original question again. This continues until the answer to the question is such that the action is no longer required; then you exit the structure. You may hear programmers refer to looping as repetition or iteration.
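The dual-alternative and single-alternative selections described earlier in this section might look as follows in Python. This is only a sketch: the language choice, the function names, and the overtime multiplier of 1.5 are assumptions of the example, while the 40-hour threshold and the R40 dental deduction come from the pseudocode above.

```python
OVERTIME_THRESHOLD = 40     # hours, from the payroll example
OVERTIME_MULTIPLIER = 1.5   # assumed rate; the payroll example does not state one
DENTAL_DEDUCTION = 40       # R40, from the dental plan example


def gross_pay(hours_worked, hourly_rate):
    """Dual-alternative if: one action when true, a different action when false."""
    if hours_worked > OVERTIME_THRESHOLD:
        regular_pay = OVERTIME_THRESHOLD * hourly_rate
        overtime_hours = hours_worked - OVERTIME_THRESHOLD
        overtime_pay = overtime_hours * hourly_rate * OVERTIME_MULTIPLIER
    else:
        regular_pay = hours_worked * hourly_rate
        overtime_pay = 0
    # Whichever branch ran, execution continues here with the next task.
    return regular_pay + overtime_pay


def apply_dental_plan(employee_gross_pay, belongs_to_dental_plan):
    """Single-alternative if: the false branch is the null case (do nothing)."""
    if belongs_to_dental_plan:
        employee_gross_pay = employee_gross_pay - DENTAL_DEDUCTION
    return employee_gross_pay


print(gross_pay(45, 100))             # 40 regular hours plus 5 overtime hours
print(apply_dental_plan(4750, True))  # deduction applied
print(apply_dental_plan(4750, False)) # null case: pay unchanged
```

In gross_pay both branches do something, whereas apply_dental_plan has no else at all because nothing needs to happen when the employee is not in the dental plan.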
FIGURE 3-4: LOOP STRUCTURE
Some programmers call this structure a while...do, or more simply, a while loop, because it fits the following statement:
while testCondition continues to be true
   do someProcess
You encounter examples of looping every day, as in:
while you continue to beHungry
   take anotherBiteOfFood
or
while unreadPages remain in the readingAssignment
   read another unreadPage
In a business program, you might write:
while quantityInInventory remains low
   continue to orderItems
or
while there are more retailPrices to be discounted
   compute a discount
All logic problems can be solved using only these three structures: sequence, selection, and loop. The three structures, of course, can be combined in an infinite number of ways. For example, you can have a sequence of tasks followed by a selection, or a loop followed by a sequence. Attaching structures end-to-end is called stacking structures. For example, Figure 3-5 shows a structured flowchart achieved by stacking structures, and shows pseudocode that might follow that flowchart logic.
FIGURE 3-5: STRUCTURED FLOWCHART AND PSEUDOCODE
The pseudocode in Figure 3-5 shows two end-structure statements: endif and endwhile. You can use an endif statement to clearly show where the actions that depend on a decision end. The instruction that follows if occurs when its tested condition is true, the instruction that follows else occurs when the tested condition is false, and the instruction that follows endif occurs in either case; it is not dependent on the if statement at all. In other words, statements beyond the endif statement are “outside” the decision structure. Similarly, you use an endwhile statement to show where a loop structure ends. In Figure 3-5, while conditionF continues to be true, stepG continues to execute. If any statements followed the endwhile statement, they would be outside of, and not a part of, the loop.
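To make stacking concrete, here is a minimal Python sketch that attaches a sequence, a selection, and a while loop end-to-end. It does not reproduce Figure 3-5; it borrows the inventory idea from the looping examples above, and the names REORDER_LEVEL, ORDER_SIZE, and restock are hypothetical. Python has no endif or endwhile keywords, so the point where the indentation stops plays that role.

```python
REORDER_LEVEL = 10   # hypothetical: stock below this level counts as "low"
ORDER_SIZE = 4       # hypothetical: items added by each order


def restock(quantity_in_inventory, supplier_is_open):
    """Stack three structures: a sequence, then a selection, then a loop."""
    log = []

    # Structure 1: sequence
    log.append("start stock check")

    # Structure 2: selection (dual-alternative)
    if supplier_is_open:
        log.append("use main supplier")
    else:
        log.append("use backup supplier")
    # The indentation has ended, like endif: what follows runs in either case.

    # Structure 3: loop, entered only after the selection has finished
    while quantity_in_inventory < REORDER_LEVEL:
        quantity_in_inventory = quantity_in_inventory + ORDER_SIZE
        log.append("ordered items")
    # The indentation has ended, like endwhile: this is outside the loop.

    log.append("stock check done")
    return quantity_in_inventory, log


final_quantity, steps = restock(3, True)
print(final_quantity)  # 3 -> 7 -> 11, so two orders are placed
print(steps)
```

Each structure is entered at one point and left at one point, so the three can simply be placed one after another.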
Besides stacking structures, you can replace any individual tasks or steps in a structured flowchart diagram or pseudocode segment with additional structures. In other words, any sequence, selection, or loop can contain other sequences, selections, or loops. For example, you can have a sequence of three tasks on one side of a selection, as shown in Figure 3-6. Placing a structure within another structure is called nesting the structures.
FIGURE 3-6: FLOWCHART AND PSEUDOCODE SHOWING A SEQUENCE NESTED WITHIN A SELECTION
When you write the pseudocode for the logic shown in Figure 3-6, the convention is to indent all statements that depend on one branch of the decision, as shown in the pseudocode. The indentation and the endif statement both show that all three statements (do stepB, do stepC, and do stepD) must execute if conditionA is not true. The three statements constitute a block, or a group of statements that execute as a single unit.
In place of one of the steps in the sequence in Figure 3-6, you can insert a selection. In Figure 3-7, the process named stepC has been replaced with a selection structure that begins with a test of the condition named conditionF.
FIGURE 3-7: SELECTION IN A SEQUENCE WITHIN A SELECTION
In the pseudocode shown in Figure 3-7, notice that do stepB, if conditionF is true then, else, endif, and do stepD all align vertically with each other. This shows that they are all “on the same level.” If you look at the same problem flowcharted in Figure 3-7, you see that you could draw a vertical line through the symbols containing stepB, conditionF, and stepD. The flowchart and the pseudocode represent exactly the same logic. The stepH and stepG processes, on the other hand, are one level “down”; they are dependent on the answer to the conditionF question. Therefore, the do stepH and do stepG statements are indented one additional level in the pseudocode. Also notice that the pseudocode in Figure 3-7 has two endif statements.
Each is aligned to correspond to an if. An endif always partners with the most recent if that does not already have an endif partner, and an endif should always align vertically with its if partner.
In place of do stepH on one side of the new selection in Figure 3-7, you can insert a loop. This loop, based on conditionI, appears inside the selection that is within the sequence that constitutes the “No” side of the original conditionA selection. In the pseudocode in Figure 3-8, notice that the while aligns with the endwhile, and that the entire while structure is indented within the true (“Yes”) half of the if structure that begins with the decision based on conditionF. The indentation used in the pseudocode reflects the logic you can see laid out graphically in the flowchart.
FIGURE 3-8: FLOWCHART AND PSEUDOCODE FOR LOOP WITHIN SELECTION WITHIN SEQUENCE WITHIN SELECTION
The combinations are endless, but each of a structured program’s segments is a sequence, a selection, or a loop. The three structures are shown together in Figure 3-9. Notice that each structure has one entry and one exit point. One structure can attach to another only at one of these points.
FIGURE 3-9: THE THREE STRUCTURES
In summary, a structured program has the following characteristics:
A structured program includes only combinations of the three basic structures: sequence, selection, and loop. Any structured program might contain one, two, or all three types of structures.
Structures can be stacked or connected to one another only at their entry or exit points.
Any structure can be nested within another structure.
A structured program is never required to contain examples of all three structures; a structured program might contain only one or two of them. For example, many simple programs contain only a sequence of several tasks that execute from start to finish without any needed selections or loops.
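The nesting described for Figure 3-8 (a loop within a selection, within a sequence, within a selection) can be sketched in Python as shown below. This is an approximation built from the prose, not a copy of the figure. The text does not say what happens on the “Yes” side of conditionA or inside the conditionI loop, so stepE and stepJ are placeholder names, and a simple countdown stands in for conditionI. The function returns the list of steps it performed so that the path taken is easy to see.

```python
def run_nested(condition_a, condition_f, loop_count):
    """Return the steps performed, in order, for the given answers."""
    steps = []
    if condition_a:
        steps.append("stepE")              # placeholder: the text does not name this step
    else:
        # Sequence nested in the "No" side of the conditionA selection
        steps.append("stepB")
        if condition_f:                    # selection nested in the sequence
            while loop_count > 0:          # loop nested in the "Yes" side; stands in for conditionI
                steps.append("stepJ")      # placeholder loop body
                loop_count = loop_count - 1
        else:
            steps.append("stepG")
        steps.append("stepD")              # same level as stepB: runs whatever conditionF was
    return steps


print(run_nested(True, True, 3))
print(run_nested(False, True, 2))
print(run_nested(False, False, 5))
```

As in the pseudocode, the indentation level of each statement shows which structure it belongs to: stepB and stepD sit at the same level, while the loop body is three levels deep.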
3.2 UNDERSTANDING THE REASONS FOR STRUCTURE
At first, it can be difficult to appreciate the reasons for using only the three structures: sequence, selection, and loop. However, staying with these three structures is better for the following reasons:
Clarity: The number-doubling program is a small program. As programs get bigger, they get more confusing if they’re not structured.
Professionalism: All other programmers (and programming teachers you might encounter) expect your programs to be structured. It’s the way things are done professionally.
Efficiency: Most newer computer languages are structured languages with syntax that lets you deal efficiently with sequence, selection, and looping. Older languages, such as assembly languages, COBOL, and RPG, were developed before the principles of structured programming were discovered. However, even programs that use those older languages can be written in a structured form, and structured programming is expected on the job today. Newer languages such as C#, C++, and Java enforce structure by their syntax.
Maintenance: You, as well as other programmers, will find it easier to modify and maintain structured programs as changes are required in the future.
Modularity: Structured programs can be easily broken down into routines or modules that can be assigned to any number of programmers. The routines are then pieced back together like modular furniture at each routine’s single entry or exit point. Additionally, often a module can be used in multiple programs, saving development time in the new project.
Consider the college admissions program from the beginning of this chapter. It has been rewritten in structured form in Figure 3-10 and is easier to follow now. Figure 3-10 also shows structured pseudocode for the same problem.
start
   read testScore, classRank
   if testScore >= 90 then
      if classRank >= 25 then
         print “Accept”
      else
         print “Reject”
      endif
   else
      if testScore >= 80 then
         if classRank >= 50 then
            print “Accept”
         else
            print “Reject”
         endif
      else
         if testScore >= 70 then
            if classRank >= 75 then
               print “Accept”
            else
               print “Reject”
            endif
         else
            print “Reject”
         endif
      endif
   endif
stop

...
>= 60 then
print “Pass”
endif
else
print “Fail”
2. This pseudocode segment is intended to describe computing the number of miles per gallon you get with your automobile. The program segment should continue as long as the user enters a positive value for miles travelled.
input gallonsOfGasUsed
input milesTraveled
while milesTraveled > 0
milesPerGallon = gallonsOfGasUsed / milesTraveled
print milesPerGal
endwhile
3. This pseudocode segment is intended to describe computing the cost per day for a vacation. The user enters a value for total dollars available to spend and can continue to enter new dollar amounts while the amount entered is not 0. For each new amount entered, if the amount of money available to spend per day is below R100, a message displays.
input totalDollarsAvailable
while totalDollarsAvailable not = 0
dollarsPerDay = totalMoneyAvailable / 7
print dollarsPerDay
endwhile
input totalDollarsAvailable
if dollarsPerDay > 100 then
print “You better search for a bargain vacation”
endwhile

TOPIC 4
4. MODULARIZATION

LEARNING OUTCOMES
After studying this topic you should be able to:
Describe the advantages of modularization
Modularize a program
Understand how a module can call another module
Explain how to declare variables

4.1 MODULES, SUBROUTINES, PROCEDURES, FUNCTIONS, OR METHODS
Programmers seldom write programs as one long series of steps. Instead, they break down the programming problem into reasonable units, and tackle one small task at a time. These reasonable units are called modules.
Programmers also refer to them as subroutines, procedures, functions, or methods. The name that programmers use for their modul