Unit 8: Automated Testing and Testing Tools

Introduction

Testing software is hard work. If you've done some testing on your own while reading this book, you've seen that the physical task of performing the tests can take a great deal of time and effort. Sure, you could spend more time equivalence partitioning your test cases, reducing the number that you run, but then you take on more risk because you're reducing coverage, choosing not to test important features. You need to test more, but you don't have the time. What can you do? The answer is to do what people have done for years in every other field and industry—develop and use tools to make the job easier and more efficient. That's what this chapter is all about. Highlights of this chapter include:

Why test tools and automation are necessary
Examples of simple test tools you can use
How using tools migrates to test automation
How to feed and care for "monkeys"
Why test tools and automation aren't a panacea

The Benefits of Automation and Tools

Think back to what you've learned about how software is created. In most software development models, the code-test-fix loop can repeat several times before the software is released. If you're testing a particular feature, that means you may need to run your tests not once, but potentially dozens of times. You'll check that the bugs you found in previous test runs were indeed fixed and that no new bugs were introduced. This process of rerunning your tests is known as regression testing. If a small software project had several thousand test cases to run, there might be barely enough time to execute them just once. Running them numerous times might be impossible, not to mention monotonous. Software test tools and automation can help solve this problem by providing a more efficient means to run your tests than manual testing. The principal attributes of tools and automation are:

Speed. Think about how long it would take you to manually try a few thousand test cases for the Windows Calculator. You might average a test case every five seconds or so. Automation might be able to run 10, 100, even 1000 times that fast.

Efficiency. While you're busy running test cases, you can't be doing anything else. If you have a test tool that reduces the time it takes for you to run your tests, you have more time for test planning and thinking up new tests.

Accuracy and Precision. After trying a few hundred cases, your attention span will wane and you'll start to make mistakes. A test tool will perform the same tests and check the results perfectly, each and every time.

Resource Reduction. Sometimes it can be physically impossible to perform a certain test case. The number of people or the amount of equipment required to create the test condition could be prohibitive. A test tool can be used to simulate the real world and greatly reduce the physical resources necessary to perform the testing.

Simulation and Emulation. Test tools are often used to replace hardware or software that would normally interface to your product. This "fake" device or application can then be used to drive or respond to your software in ways that you choose—and ways that might otherwise be difficult to achieve.

Relentlessness. Test tools and automation never tire or give up. They're like that battery-operated bunny of the TV commercials—they can keep going and going.

Test Tools

As a software tester you'll be exposed to a wide range of testing tools.
The types of tools that you'll use are based on the type of software that you're testing and whether you're performing black-box or white-box tests. The beauty of test tools is that you don't always need to be an expert in how they work or exactly what they do to use them. Suppose that you're testing networking software that allows a computer to simultaneously communicate with up to 1 million other computers. It would be difficult, if not impossible, to perform a controlled test with 1 million real connections. But, if someone gave you a special tool that simulated those connections, maybe letting you adjust the number from one to a million, you could perform your tests without having to set up a real-world scenario. You don't need to understand how the tool works, just that it does—that's black-box testing. On the other hand, a tool could be set up to monitor and modify the raw communication that occurs among those million computers. You'd likely need some white-box skills and knowledge of the low-level protocol to effectively use this tool. Some examples are based on tools that are included with most programming languages; others are commercial tools sold individually. You may find, however, that your software or hardware is unique enough that you'll have to develop, or have someone else develop, custom tools that fit your specific needs. They will likely, though, still fall into one of these categories.

Viewers and Monitors

A viewer or monitor test tool allows you to see details of the software's operation that you wouldn't normally be able to see. In the chapter "Testing the Software with X-Ray Glasses," you learned how code coverage analyzers provide a means for you to see what lines of code are executed, what functions are run, and what code paths are followed when you run your tests. A code coverage analyzer is an example of a viewing tool. Most code coverage analyzers are invasive tools because they need to be compiled and linked into the program to access the information they provide.

Drivers

Drivers are tools used to control and operate the software being tested. One of the simplest examples of a driver is a batch file, a simple list of programs or commands that are executed sequentially. In the days of MS-DOS, this was a popular means for testers to execute their test programs. They'd create a batch file containing the names of their test programs, start the batch running, and go home. With today's operating systems and programming languages, there are much more sophisticated methods for executing test programs. For example, a Java or Perl script can take the place of an old MS-DOS batch file, and the Windows Task Scheduler can execute various test programs at certain times throughout the day.

Stubs

Stubs, like drivers, were mentioned earlier as white-box testing techniques. Stubs are essentially the opposite of drivers in that they don't control or operate the software being tested; they instead receive or respond to data that the software sends. If you're testing software that sends data to a printer, one way to test it is to enter data, print it, and look at the resulting paper printout. That would work, but it's fairly slow, inefficient, and error prone. A stub that stands in for the printer, capturing the data sent to it so that the output can be checked automatically, would make the same test much faster and more reliable.
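To make the stub idea concrete, here is a minimal sketch in Python. It is purely illustrative: the PrinterStub class and the format_invoice function are hypothetical stand-ins for a real printer driver and for the software under test, which is assumed to send its printer-bound text through a driver object we can swap out.

LISTING: A Minimal Printer Stub (illustrative sketch)

class PrinterStub:
    """Stands in for the real printer driver and records whatever it receives."""

    def __init__(self):
        self.received_pages = []

    def print_page(self, data: str) -> None:
        # Instead of driving printer hardware, simply capture the data.
        self.received_pages.append(data)


def format_invoice(customer: str, amount: float, printer) -> None:
    # Hypothetical stand-in for the software under test: it formats a page and "prints" it.
    printer.print_page(f"INVOICE\nCustomer: {customer}\nTotal: {amount:.2f}")


if __name__ == "__main__":
    stub = PrinterStub()
    format_invoice("Ada", 19.99, printer=stub)

    # Verification: compare the captured output with what we expect.
    expected = "INVOICE\nCustomer: Ada\nTotal: 19.99"
    assert stub.received_pages == [expected], stub.received_pages
    print("Printer output verified without printing a single page.")

The same pattern applies to modems, networks, or any other device the software under test talks to.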
Stress and Load Tools

Stress and load tools induce stresses and loads on the software being tested. A word processor running as the only application on the system, with all available memory and disk space, probably works just fine. But, if the system runs low on these resources, you'd expect a greater potential for bugs. You could copy files to fill up the disk, run lots of programs to eat up memory, and so on, but these methods are inefficient and inexact. A stress tool specifically designed for this purpose would make testing much easier.

Analysis Tools

You might call this last category of tools analysis tools, a best-of-the-rest group. Most software testers use the following common tools to make their everyday jobs easier. They're not necessarily as fancy as the tools discussed so far. They're often taken for granted, but they get the job done and can save you a great deal of time:

Word processing software
Spreadsheet software
Database software
File comparison software
Screen capture and comparison software
Debugger
Binary-hex calculator
Stopwatch
VCR or camera

Of course, software complexity and direction change all the time. You need to look at your individual situation to decide what the most effective tools would be and how best to apply them.

Software Test Automation

Although test automation is just another class of software testing tools, it's one that deserves special consideration. The software test tools that you've learned about so far are indeed effective, but they still must be operated or monitored manually. What if those tools could be combined, started, and run with little or no intervention from you? They could run your test cases, look for bugs, analyze what they see, and log the results. That's software test automation. The next few sections will walk you through the different types of automation, progressing from the simplest to the most complex.

Macro Recording and Playback

The most basic type of test automation is recording your keyboard and mouse actions as you run your tests for the first time and then playing them back when you need to run them again. If the software you're testing is for Windows or the Mac, recording and playing back macros is a fairly easy process. On the Mac you can use QuicKeys; on Windows the shareware program Macro Magic is a good choice. Many macro record and playback programs are available, so you might want to scan your favorite shareware supplier and find one that best fits your needs. Macro recorders and players are a type of driver tool. As mentioned earlier, drivers are tools used to control and operate the software being tested. With a macro program you're doing just that—the macros you record are played back, repeating the actions that you performed to test the software. The figure below shows a screen from the Macro Magic Setup Wizard, which walks you through the steps necessary to configure and capture your macros.

Fig: The Macro Magic Setup Wizard allows you to configure how your recorded macros are triggered and played back.

Programmed Macros

Programmed macros are a step up in evolution from the simple record and playback variety. Rather than create programmed macros by recording your actions as you run the test for the first time, you create them by programming simple instructions for the playback system to follow. A very simple macro program might look like the one in the listing below (created with the Macro Magic Setup Wizard). This type of macro can be programmed by selecting individual actions from a menu of choices—you don't even need to type in the commands.

LISTING: A Simple Macro That Performs a Test on the Windows Calculator
1: Calculator Test #2
2:
3:
4: 123-100=
5:
6:

Line 1 is a comment line identifying the test. Line 2 executes calc.exe, the Windows Calculator. Line 3 waits up to five seconds for Calculator to start. It does this by pausing until a window appears with the word Calculator in its title bar. Line 4 types the keys 123-100=. Line 5 displays a message prompt stating that the answer should be 23. Line 6 closes the Calculator window and ends the test.
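The same six steps can be expressed as a programmed macro in an ordinary scripting language. The sketch below uses Python with the third-party pyautogui keyboard-automation library; this is only an assumed substitute for Macro Magic's own command format, and it presumes a Windows machine where calc.exe can be launched from the command line.

LISTING: The Same Calculator Test as a Python Script (illustrative sketch)

import subprocess
import time

import pyautogui  # third-party keyboard/mouse automation library

# Calculator Test #2 (line 1 of the macro: a comment identifying the test)
subprocess.Popen("calc.exe")                   # line 2: start the Windows Calculator
time.sleep(5)                                  # line 3: crude wait for the window to appear
pyautogui.typewrite("123-100", interval=0.1)   # line 4: type the keystrokes
pyautogui.press("enter")
print("The answer should be 23")               # line 5: remind the tester of the expected result
pyautogui.hotkey("alt", "f4")                  # line 6: close Calculator and end the test

A more careful script would wait for a window titled Calculator rather than sleeping a fixed five seconds, which is exactly what line 3 of the original macro does.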
Fully Programmable Automated Testing Tools

What if you had the power of a full-fledged programming language, coupled with macro commands that can drive the software being tested, with the additional capacity to perform verification? You'd have the ultimate bug-finding tool! Automated testing tools such as Visual Test provide the means for software testers to create very powerful tests. Many are based on the BASIC programming language, making it very easy for even non-programmers to write test code. The most important feature that comes with these automation tools is the ability to perform verification, actually checking that the software is doing what's expected. There are several ways to do this:

Screen captures. The first time you run your automated tests, you could capture and save screen images at key points that you know are correct. On future test runs, your automation could then compare the saved screens with the current screens. If they're different, something unexpected happened and the automation could flag it as a bug.

Control values. Rather than capture screens, you could check the value of individual elements in the software's window. If you're testing Calculator, your automation could read the value out of the display field and compare it with what you expected. You could also determine whether a button was pressed or a check box was selected. Automation tools provide the means to easily do this within your test program.

File and other output. Similarly, if your program saves data to a file—for example, a word processor—your automation could read the file back after creating it and compare it to a known good file. The same techniques apply if the software being tested sends data over a modem or a network. The automation could be configured to read the data back in and compare it with the data that it expects.
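As a minimal sketch of the third technique, the Python fragment below compares a freshly produced output file against a known good ("golden") copy. The file names are placeholders chosen for illustration; the point is simply that a byte-for-byte comparison turns a manual inspection into an automatic pass/fail check.

LISTING: Verifying File Output Against a Golden Copy (illustrative sketch)

import filecmp
from pathlib import Path

GOLDEN = Path("expected_output.txt")   # captured once from a run known to be correct
CURRENT = Path("current_output.txt")   # produced by the software under test just now


def verify_output(golden: Path, current: Path) -> bool:
    """Return True if the new output matches the known good file exactly."""
    if not current.exists():
        print(f"FAIL: {current} was never written")
        return False
    if filecmp.cmp(golden, current, shallow=False):
        print("PASS: output matches the golden file")
        return True
    print("FAIL: output differs from the golden file; flag as a possible bug")
    return False


if __name__ == "__main__":
    verify_output(GOLDEN, CURRENT)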
Random Testing: Monkeys and Gorillas

The test automation tools and techniques that you've learned about so far have concentrated on making your job as a software tester easier and more efficient. They're designed to help you in running your test cases or, ideally, running your test cases automatically without the need for constant attention. Using tools and automation for this purpose will help you find bugs; while the tools are busy doing regression testing, you'll have time to plan new tests and design new and interesting cases to try. Another type of automated testing, though, isn't designed to help run or automatically run test cases. Its goal is to simulate what your users might do. That type of automation tool is called a test monkey. The term test monkey comes from the idea that if you had a million monkeys typing on a million keyboards for a million years, statistically, they might eventually write a Shakespearean play, The Adventures of Curious George, or some other great work. All that random pounding of keys could accidentally hit the right combination of letters and the monkeys would, for a moment, look brilliant. When your software is released to the public, it will have thousands or possibly millions of people using it. Despite your best efforts at designing test cases to find bugs, some bugs will slip by and be found by those users. What if you could supplement your test case approach with a simulation of what all those users would do, before you released your product? You could potentially find bugs that would have otherwise made it past your testing. That's what a test monkey can do.

Dumb Monkeys

The easiest and most straightforward type of test monkey is a dumb monkey. A dumb monkey doesn't know anything about the software being tested; it just clicks or types randomly. The software running on the PC doesn't know the difference between this random-input program and a real person—except that the monkey happens to work much more quickly. On a reasonably speedy PC a loop of a few thousand random inputs will run in just a few seconds. Imagine how many random inputs you'd get if it ran all night! Remember, this monkey is doing absolutely no verification. It just clicks and types until one of two things happens—either it finishes its loop or the software or the operating system crashes. If the software under test crashes, the monkey won't even know it, and will continue clicking and typing away.

Semi-Smart Monkeys

Dumb monkeys can be extremely effective. They're easy to write and can find serious, crashing bugs. They lack a few important features, though, that would make them even more effective. Adding these features raises your monkey's IQ a bit, making him semi-smart. Say that your monkey ran for several hours, logging thousands of random inputs before the software crashed. You'd know there was a problem, but you couldn't show the programmer exactly how to re-create it. You could rerun your monkey with the same random seed, but if it took several hours again to fail, you'd be wasting a lot of time. The solution is to add logging to your monkey so that everything it does is recorded to a file. When the monkey finds a bug, you need only look at the log file to see what it was doing before the failure.

Smart Monkeys

Moving up on the evolutionary scale is the smart monkey. Such a monkey takes the effectiveness of random testing from his less-intelligent brothers and adds to that an awareness of his surroundings. He doesn't just pound on the keyboard randomly—he pounds on it with a purpose. A true smart monkey knows

Where he is
What he can do there
Where he can go
Where he's been
If what he's seeing is correct

Does this list sound familiar? It should. A smart monkey can read the software's state transition map—the type of map described in "Testing the Software with Blinders On." If all the state information that describes the software can be read by the monkey, it could bounce around the software just like a user would, only much more quickly, and be able to verify things as it went.
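Here is a minimal dumb-monkey sketch in Python, again assuming the third-party pyautogui library as the input-injection mechanism. It also adds the two semi-smart refinements discussed above: a fixed random seed so a failing run can be replayed exactly, and a log file that records every input so the steps leading up to a crash can be reconstructed.

LISTING: A Dumb Monkey with Semi-Smart Logging (illustrative sketch)

import logging
import random
import string

import pyautogui  # assumed third-party library for injecting keyboard/mouse input

SEED = 12345           # fixed seed: the same random run can be repeated
ITERATIONS = 10_000    # how long the monkey pounds away

logging.basicConfig(filename="monkey.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")
random.seed(SEED)

screen_w, screen_h = pyautogui.size()

# No verification is performed; the loop ends or something crashes.
for i in range(ITERATIONS):
    if random.random() < 0.5:
        # Random click somewhere on the screen.
        x, y = random.randrange(screen_w), random.randrange(screen_h)
        logging.info("click %d,%d", x, y)
        pyautogui.click(x, y)
    else:
        # Random keystroke.
        key = random.choice(string.ascii_letters + string.digits)
        logging.info("key %s", key)
        pyautogui.press(key)

A smart monkey would replace the random choices with moves drawn from the software's state transition map and add a verification step after each input.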
Realities of Using Test Tools and Automation

Before you get all excited and want to run out and start using tools and automation on your tests, you need to read this section and take it to heart. Test automation isn't a panacea. When it is properly planned and executed it can make your testing immensely more efficient and find bugs that would have otherwise gone undiscovered. However, countless test automation efforts have been abandoned and cost their projects dearly when they went astray. You should consider these important issues before you begin using the techniques described in this chapter:

The software changes. Specifications are never fixed. New features are added late. The product name can change at the last minute. What if you recorded thousands of macros to run all your tests and, a week before the product was to be released, the software was changed to display an extra screen when it started up? All of your recorded macros would fail to run because they wouldn't know the extra screen was there. You need to write your automation so that it's flexible and can easily and quickly be changed if necessary.

There's no substitute for the human eye and intuition. Smart monkeys can be programmed to be only so smart. They can test only what you tell them to test. They can never see something and say, "Gee, that looks funny. I should do some more checking"—at least, not yet.

Verification is hard to do. If you're testing a user interface, the obvious and simplest method to verify your test results is capturing and comparing screens. But captured screens are huge files, and those screens can be constantly changing during the product's development. Make sure that your tools check only what they need to and can efficiently handle changes during product development.

It's easy to rely on automation too much. Don't ever assume that because all your automation runs without finding a bug that there are no more bugs to find. They're still in there. It's the pesticide paradox.

Don't spend so much time working on tools and automation that you fail to test the software. It's easy and fun to start writing macros or programming a smart monkey, but that's not testing. These tools may help you be more efficient, but you'll need to use them on the software and do some real testing to find bugs.

If you're writing macros, developing a tool, or programming a monkey, you're doing development work. You should follow the same standards and guidelines that you ask of your programmers. Just because you're a tester doesn't mean you can break the rules.

Some tools are invasive and can cause the software being tested to improperly fail. If you use a tool that finds a bug, try to re-create that bug by hand without using the tool. It might turn out to be a simple reproducible bug, or the tool might be the cause of the problem.

Case Study on a Testing Tool: Selenium

Selenium is an open-source and portable automated software testing tool for testing web applications. It has capabilities to operate across different browsers and operating systems. Selenium is not just a single tool but a set of tools that helps testers automate web-based applications more efficiently. You can use multiple programming languages like Java, C#, Python, etc., to create Selenium test scripts. Testing done using the Selenium testing tool is usually referred to as Selenium Testing.

Selenium Tool Suite

Selenium software is not just a single tool but a suite of software, each piece catering to different Selenium QA testing needs of an organization. Here is the list of tools:

Selenium Integrated Development Environment (IDE)
Selenium Remote Control (RC)
WebDriver
Selenium Grid

Selenium Grid

Selenium Grid is a tool used together with Selenium RC to run parallel tests across different machines and different browsers all at the same time. Parallel execution means running multiple tests at once. Features:

Enables simultaneous running of tests in multiple browsers and environments.
Saves time enormously.
Utilizes the hub-and-nodes concept. The hub acts as a central source of Selenium commands for each node connected to it.

Advantages of Selenium

Selenium is an open-source tool.
Can be extended for various technologies that expose the DOM.
Has capabilities to execute scripts across different browsers.
Can execute scripts on various operating systems.
Supports mobile devices.
Executes tests within the browser, so focus is not required while script execution is in progress.
Can execute tests in parallel with the use of Selenium Grid.

Disadvantages of Selenium

Supports only web-based applications.
No built-in object repository or recovery scenario feature.
No IDE for script development, so scripting won't be as fast as with commercial tools.
Cannot access controls outside the browser.
No default test report generation.
For parameterization, users have to rely on the programming language.
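A minimal Selenium WebDriver test in Python looks like the sketch below. It assumes a locally installed Chrome browser and uses example.com as a stand-in application; the verification step checks control values (the page title and a heading) rather than a screenshot, in the spirit of the verification techniques described earlier.

LISTING: A Minimal Selenium WebDriver Test in Python (illustrative sketch)

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()          # launch a browser session
try:
    driver.get("https://example.com")                  # drive the application under test
    heading = driver.find_element(By.TAG_NAME, "h1").text

    # Verification: compare actual values against what we expect.
    assert "Example Domain" in driver.title, driver.title
    assert heading == "Example Domain", heading
    print("PASS: page loaded and heading verified")
finally:
    driver.quit()                    # always close the browser, even on failure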
Unit 7: Formal Approaches of SQA

STATISTICAL SOFTWARE QUALITY ASSURANCE

Statistical quality assurance reflects a growing trend throughout industry to become more quantitative about quality. For software, statistical quality assurance implies the following steps:

1. Information about software errors and defects is collected and categorized.
2. An attempt is made to trace each error and defect to its underlying cause (e.g., nonconformance to specifications, design error, violation of standards, poor communication with the customer).
3. Using the Pareto principle (80 percent of the defects can be traced to 20 percent of all possible causes), isolate the 20 percent (the vital few).
4. Once the vital few causes have been identified, move to correct the problems that have caused the errors and defects.

This relatively simple concept represents an important step toward the creation of an adaptive software process in which changes are made to improve those elements of the process that introduce error.

1 A Generic Example

To illustrate the use of statistical methods for software engineering work, assume that a software engineering organization collects information on errors and defects for a period of one year. Some of the errors are uncovered as software is being developed. Others (defects) are encountered after the software has been released to its end users. Although hundreds of different problems are uncovered, all can be tracked to one (or more) of the following causes:

Incomplete or erroneous specifications (IES)
Misinterpretation of customer communication (MCC)
Intentional deviation from specifications (IDS)
Violation of programming standards (VPS)
Error in data representation (EDR)
Inconsistent component interface (ICI)
Error in design logic (EDL)
Incomplete or erroneous testing (IET)
Inaccurate or incomplete documentation (IID)
Error in programming language translation of design (PLT)
Ambiguous or inconsistent human/computer interface (HCI)
Miscellaneous (MIS)

To apply statistical SQA, the table in Figure 16.2 is built. The table indicates that IES, MCC, and EDR are the vital few causes that account for 53 percent of all errors. It should be noted, however, that IES, EDR, PLT, and EDL would be selected as the vital few causes if only serious errors are considered. Once the vital few causes are determined, the software engineering organization can begin corrective action. For example, to correct MCC, you might implement requirements gathering techniques to improve the quality of customer communication and specifications. To improve EDR, you might acquire tools for data modeling and perform more stringent data design reviews. It is important to note that corrective action focuses primarily on the vital few. As the vital few causes are corrected, new candidates pop to the top of the stack.
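The bookkeeping behind such a table is simple enough to sketch in a few lines of Python. The defect counts below are made-up placeholder numbers, not the data behind Figure 16.2; the point is only how the Pareto "vital few" fall out of a sorted tally.

LISTING: Tallying Defect Causes and Isolating the Vital Few (illustrative sketch)

from collections import Counter

# Hypothetical defect log: each entry is the cause code assigned to one error.
defect_log = ["IES", "MCC", "EDR", "IES", "VPS", "IES", "MCC", "EDL",
              "EDR", "IES", "PLT", "MCC", "EDR", "IET", "IES", "MIS"]

totals = Counter(defect_log)
grand_total = sum(totals.values())

print(f"{'Cause':6} {'Count':>5} {'Percent':>8}")
cumulative = 0.0
vital_few = []
for cause, count in totals.most_common():
    pct = 100.0 * count / grand_total
    cumulative += pct
    print(f"{cause:6} {count:>5} {pct:7.1f}%")
    # Causes that together explain roughly the first half of all errors
    # are the "vital few" that corrective action should target first.
    if cumulative <= 55.0:
        vital_few.append(cause)

print("Vital few causes:", vital_few)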
2 Six Sigma for Software Engineering

Six Sigma is the most widely used strategy for statistical quality assurance in industry today. Originally popularized by Motorola in the 1980s, the Six Sigma strategy "is a rigorous and disciplined methodology that uses data and statistical analysis to measure and improve a company's operational performance by identifying and eliminating defects" in manufacturing and service-related processes [ISI08]. The term Six Sigma is derived from six standard deviations—3.4 instances (defects) per million occurrences—implying an extremely high quality standard. The Six Sigma methodology defines three core steps:

Define customer requirements and deliverables and project goals via well-defined methods of customer communication.
Measure the existing process and its output to determine current quality performance (collect defect metrics).
Analyze defect metrics and determine the vital few causes.

If an existing software process is in place, but improvement is required, Six Sigma suggests two additional steps:

Improve the process by eliminating the root causes of defects.
Control the process to ensure that future work does not reintroduce the causes of defects.

These core and additional steps are sometimes referred to as the DMAIC (define, measure, analyze, improve, and control) method. If an organization is developing a software process (rather than improving an existing process), the core steps are augmented as follows:

Design the process to (1) avoid the root causes of defects and (2) meet customer requirements.
Verify that the process model will, in fact, avoid defects and meet customer requirements.

This variation is sometimes called the DMADV (define, measure, analyze, design, and verify) method.

SOFTWARE RELIABILITY

There is no doubt that the reliability of a computer program is an important element of its overall quality. If a program repeatedly and frequently fails to perform, it matters little whether other software quality factors are acceptable. Software reliability, unlike many other quality factors, can be measured directly and estimated using historical and developmental data. Software reliability is defined in statistical terms as "the probability of failure-free operation of a computer program in a specified environment for a specified time" [Mus87]. To illustrate, suppose program X is estimated to have a reliability of 0.999 over eight elapsed processing hours. In other words, if program X were to be executed 1000 times and require a total of eight hours of elapsed processing time (execution time), it is likely to operate correctly (without failure) 999 times. Whenever software reliability is discussed, a pivotal question arises: What is meant by the term failure? In the context of any discussion of software quality and reliability, failure is nonconformance to software requirements. Yet, even within this definition, there are gradations. Failures can be merely annoying or catastrophic. One failure can be corrected within seconds, while another requires weeks or even months to correct. Complicating the issue even further, the correction of one failure may in fact result in the introduction of other errors that ultimately result in other failures.

1 Measures of Reliability and Availability

Early work in software reliability attempted to extrapolate the mathematics of hardware reliability theory to the prediction of software reliability. Most hardware-related reliability models are predicated on failure due to wear rather than failure due to design defects.
In hardware, failures due to physical wear (e.g., the effects of temperature, corrosion, shock) are more likely than design-related failures. Unfortunately, the opposite is true for software. In fact, all software failures can be traced to design or implementation problems; wear does not enter into the picture. There has been an ongoing debate over the relationship between key concepts in hardware reliability and their applicability to software. Although an irrefutable link has yet to be established, it is worthwhile to consider a few simple concepts that apply to both system elements. If we consider a computer-based system, a simple measure of reliability is mean-time-between-failure (MTBF):

MTBF = MTTF + MTTR

where the acronyms MTTF and MTTR are mean-time-to-failure and mean-time-to-repair, respectively. Many researchers argue that MTBF is a far more useful measure than other quality-related software metrics. Stated simply, an end user is concerned with failures, not with the total defect count. Because each defect contained within a program does not have the same failure rate, the total defect count provides little indication of the reliability of a system. For example, consider a program that has been in operation for 3000 processor hours without failure. Many defects in this program may remain undetected for tens of thousands of hours before they are discovered. The MTBF of such obscure errors might be 30,000 or even 60,000 processor hours. Other defects, as yet undiscovered, might have an MTBF of 4000 or 5000 hours. Even if every one of the first category of errors (those with long MTBF) is removed, the impact on software reliability is negligible. However, MTBF can be problematic for two reasons: (1) it projects a time span between failures, but does not provide us with a projected failure rate, and (2) MTBF can be misinterpreted to mean average life span even though this is not what it implies. An alternative measure of reliability is failures-in-time (FIT)—a statistical measure of how many failures a component will have over one billion hours of operation. Therefore, 1 FIT is equivalent to one failure in every billion hours of operation. In addition to a reliability measure, you should also develop a measure of availability. Software availability is the probability that a program is operating according to requirements at a given point in time and is defined as

Availability = [MTTF / (MTTF + MTTR)] x 100%

The MTBF reliability measure is equally sensitive to MTTF and MTTR. The availability measure is somewhat more sensitive to MTTR, an indirect measure of the maintainability of software.
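The arithmetic behind these measures is easy to check with a few lines of Python. The MTTF and MTTR figures below are invented purely for illustration.

LISTING: Computing MTBF, Availability, and FIT from Sample Figures (illustrative sketch)

# Illustrative figures (hours); not taken from any real system.
mttf = 68.0   # mean time to failure
mttr = 2.0    # mean time to repair

mtbf = mttf + mttr
availability = mttf / (mttf + mttr) * 100.0

# FIT restates a (constant) failure rate per billion hours of operation.
fit = 1e9 / mtbf

print(f"MTBF         = {mtbf:.1f} hours")
print(f"Availability = {availability:.2f} %")      # about 97.14 % for these figures
print(f"FIT          = {fit:,.0f} failures per billion hours")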
2 Software Safety

Software safety is a software quality assurance activity that focuses on the identification and assessment of potential hazards that may affect software negatively and cause an entire system to fail. If hazards can be identified early in the software process, software design features can be specified that will either eliminate or control potential hazards. A modeling and analysis process is conducted as part of software safety. Initially, hazards are identified and categorized by criticality and risk. For example, some of the hazards associated with a computer-based cruise control for an automobile might be: (1) causes uncontrolled acceleration that cannot be stopped, (2) does not respond to depression of brake pedal (by turning off), (3) does not engage when switch is activated, and (4) slowly loses or gains speed. Once these system-level hazards are identified, analysis techniques are used to assign severity and probability of occurrence. To be effective, software must be analyzed in the context of the entire system. For example, a subtle user input error (people are system components) may be magnified by a software fault to produce control data that improperly positions a mechanical device. If and only if a set of external environmental conditions is met, the improper position of the mechanical device will cause a disastrous failure. Analysis techniques [Eri05] such as fault tree analysis, real-time logic, and Petri net models can be used to predict the chain of events that can cause hazards and the probability that each of the events will occur to create the chain. Once hazards are identified and analyzed, safety-related requirements can be specified for the software. That is, the specification can contain a list of undesirable events and the desired system responses to these events. The role of software in managing undesirable events is then indicated. Although software reliability and software safety are closely related to one another, it is important to understand the subtle difference between them. Software reliability uses statistical analysis to determine the likelihood that a software failure will occur. However, the occurrence of a failure does not necessarily result in a hazard or mishap. Software safety examines the ways in which failures result in conditions that can lead to a mishap. That is, failures are not considered in a vacuum, but are evaluated in the context of an entire computer-based system and its environment.

THE ISO 9000 QUALITY STANDARDS

A quality assurance system may be defined as the organizational structure, responsibilities, procedures, processes, and resources for implementing quality management [ANS87]. Quality assurance systems are created to help organizations ensure their products and services satisfy customer expectations by meeting their specifications. These systems cover a wide variety of activities encompassing a product's entire life cycle including planning, controlling, measuring, testing and reporting, and improving quality levels throughout the development and manufacturing process. ISO 9000 describes quality assurance elements in generic terms that can be applied to any business regardless of the products or services offered. To become registered to one of the quality assurance system models contained in ISO 9000, a company's quality system and operations are scrutinized by third-party auditors for compliance to the standard and for effective operation. Upon successful registration, a company is issued a certificate from a registration body represented by the auditors. Semiannual surveillance audits ensure continued compliance to the standard. The requirements delineated by ISO 9001:2000 address topics such as management responsibility, quality system, contract review, design control, document and data control, product identification and traceability, process control, inspection and testing, corrective and preventive action, control of quality records, internal quality audits, training, servicing, and statistical techniques. In order for a software organization to become registered to ISO 9001:2000, it must establish policies and procedures to address each of the requirements just noted (and others) and then be able to demonstrate that these policies and procedures are being followed. ISO 9001:2000 requires an organization to do the following:

Establish the elements of a quality management system.
Develop, implement, and improve the system.
Define a policy that emphasizes the importance of the system.
Document the quality system.
Describe the process.
Produce an operational manual.
Develop methods for controlling (updating) documents.
Establish methods for record keeping.
Support quality control and assurance.
Promote the importance of quality among all stakeholders.
Focus on customer satisfaction.
Define a quality plan that addresses objectives, responsibilities, and authority.
Define communication mechanisms among stakeholders.
Establish review mechanisms for the quality management system.
Identify review methods and feedback mechanisms.
Define follow-up procedures.
Identify quality resources including personnel, training, and infrastructure elements.
Establish control mechanisms:
For planning
For customer requirements
For technical activities (e.g., analysis, design, testing)
For project monitoring and management
Define methods for remediation.
Assess quality data and metrics.
Define an approach for continuous process and quality improvement.

THE SQA PLAN

The SQA Plan provides a road map for instituting software quality assurance. Developed by the SQA group (or by the software team if an SQA group does not exist), the plan serves as a template for SQA activities that are instituted for each software project. A standard for SQA plans has been published by the IEEE [IEE93]. The standard recommends a structure that identifies: (1) the purpose and scope of the plan, (2) a description of all software engineering work products (e.g., models, documents, source code) that fall within the purview of SQA, (3) all applicable standards and practices that are applied during the software process, (4) SQA actions and tasks (including reviews and audits) and their placement throughout the software process, (5) the tools and methods that support SQA actions and tasks, (6) software configuration management procedures, (7) methods for assembling, safeguarding, and maintaining all SQA-related records, and (8) organizational roles and responsibilities relative to product quality.

Unit 6: Quality Concepts & Software Quality Assurance

WHAT IS QUALITY?

Quality... you know what it is, yet you don't know what it is. But that's self-contradictory. But some things are better than others; that is, they have more quality. But when you try to say what the quality is, apart from the things that have it, it all goes poof! There's nothing to talk about. But if you can't say what Quality is, how do you know what it is, or how do you know that it even exists? If no one knows what it is, then for all practical purposes it doesn't exist at all. But for all practical purposes it really does exist. What else are the grades based on? Why else would people pay fortunes for some things and throw others in the trash pile? Obviously some things are better than others...

At a somewhat more pragmatic level, David Garvin [Gar84] of the Harvard Business School suggests that "quality is a complex and multifaceted concept" that can be described from five different points of view. The transcendental view argues (like Persig) that quality is something that you immediately recognize, but cannot explicitly define. The user view sees quality in terms of an end user's specific goals. If a product meets those goals, it exhibits quality. The manufacturer's view defines quality in terms of the original specification of the product. If the product conforms to the spec, it exhibits quality.
The product view suggests that quality can be tied to inherent characteristics (e.g., functions and features) of a product. Finally, the value-based view measures quality based on how much a customer is willing to pay for a product. In reality, quality encompasses all of these views and more.

SOFTWARE QUALITY

Even the most jaded software developers will agree that high-quality software is an important goal. But how do we define software quality? In the most general sense, software quality can be defined as:

An effective software process applied in a manner that creates a useful product that provides measurable value for those who produce it and those who use it.

There is little question that the preceding definition could be modified or extended and debated endlessly. For the purposes of this book, the definition serves to emphasize three important points:

1. An effective software process establishes the infrastructure that supports any effort at building a high-quality software product. The management aspects of process create the checks and balances that help avoid project chaos—a key contributor to poor quality. Software engineering practices allow the developer to analyze the problem and design a solid solution—both critical to building high-quality software. Finally, umbrella activities such as change management and technical reviews have as much to do with quality as any other part of software engineering practice.

2. A useful product delivers the content, functions, and features that the end user desires, but as important, it delivers these assets in a reliable, error-free way. A useful product always satisfies those requirements that have been explicitly stated by stakeholders. In addition, it satisfies a set of implicit requirements (e.g., ease of use) that are expected of all high-quality software.

3. By adding value for both the producer and user of a software product, high-quality software provides benefits for the software organization and the end user community. The software organization gains added value because high-quality software requires less maintenance effort, fewer bug fixes, and reduced customer support. This enables software engineers to spend more time creating new applications and less on rework. The user community gains added value because the application provides a useful capability in a way that expedites some business process. The end result is (1) greater software product revenue, (2) better profitability when an application supports a business process, and/or (3) improved availability of information that is crucial for the business.

1 Garvin's Quality Dimensions

David Garvin [Gar87] suggests that quality should be considered by taking a multidimensional viewpoint that begins with an assessment of conformance and terminates with a transcendental (aesthetic) view. Although Garvin's eight dimensions of quality were not developed specifically for software, they can be applied when software quality is considered:

Performance quality. Does the software deliver all content, functions, and features that are specified as part of the requirements model in a way that provides value to the end user?

Feature quality. Does the software provide features that surprise and delight first-time end users?

Reliability. Does the software deliver all features and capability without failure? Is it available when it is needed? Does it deliver functionality that is error-free?
Conformance. Does the software conform to local and external software standards that are relevant to the application? Does it conform to de facto design and coding conventions? For example, does the user interface conform to accepted design rules for menu selection or data input?

Durability. Can the software be maintained (changed) or corrected (debugged) without the inadvertent generation of unintended side effects? Will changes cause the error rate or reliability to degrade with time?

Serviceability. Can the software be maintained (changed) or corrected (debugged) in an acceptably short time period? Can support staff acquire all information they need to make changes or correct defects? Douglas Adams [Ada93] makes a wry comment that seems appropriate here: "The difference between something that can go wrong and something that can't possibly go wrong is that when something that can't possibly go wrong goes wrong it usually turns out to be impossible to get at or repair."

Aesthetics. There's no question that each of us has a different and very subjective vision of what is aesthetic. And yet, most of us would agree that an aesthetic entity has a certain elegance, a unique flow, and an obvious "presence" that are hard to quantify but are evident nonetheless. Aesthetic software has these characteristics.

Perception. In some situations, you have a set of prejudices that will influence your perception of quality. For example, if you are introduced to a software product that was built by a vendor who has produced poor quality in the past, your guard will be raised and your perception of the current software product quality might be influenced negatively. Similarly, if a vendor has an excellent reputation, you may perceive quality even when it does not really exist.

2 McCall's Quality Factors

McCall, Richards, and Walters [McC77] propose a useful categorization of factors that affect software quality. These software quality factors, shown in Figure 14.1, focus on three important aspects of a software product: its operational characteristics, its ability to undergo change, and its adaptability to new environments. Referring to the factors noted in Figure 14.1, McCall and his colleagues provide the following descriptions:

Correctness. The extent to which a program satisfies its specification and fulfills the customer's mission objectives.

Reliability. The extent to which a program can be expected to perform its intended function with required precision.

Efficiency. The amount of computing resources and code required by a program to perform its function.

Integrity. Extent to which access to software or data by unauthorized persons can be controlled.

Usability. Effort required to learn, operate, prepare input for, and interpret output of a program.

Maintainability. Effort required to locate and fix an error in a program. [This is a very limited definition.]

Flexibility. Effort required to modify an operational program.

Testability. Effort required to test a program to ensure that it performs its intended function.

Portability. Effort required to transfer the program from one hardware and/or software system environment to another.

Reusability. Extent to which a program [or parts of a program] can be reused in other applications—related to the packaging and scope of the functions that the program performs.

Interoperability. Effort required to couple one system to another.

It is difficult, and in some cases impossible, to develop direct measures of these quality factors.
In fact, many of the metrics defined by McCall et al. can be measured only indirectly. However, assessing the quality of an application using these factors will provide you with a solid indication of software quality.

3 ISO 9126 Quality Factors

The ISO 9126 standard was developed in an attempt to identify the key quality attributes for computer software. The standard identifies six key quality attributes:

Functionality. The degree to which the software satisfies stated needs as indicated by the following subattributes: suitability, accuracy, interoperability, compliance, and security.

Reliability. The amount of time that the software is available for use as indicated by the following subattributes: maturity, fault tolerance, recoverability.

Usability. The degree to which the software is easy to use as indicated by the following subattributes: understandability, learnability, operability.

Efficiency. The degree to which the software makes optimal use of system resources as indicated by the following subattributes: time behavior, resource behavior.

Maintainability. The ease with which repair may be made to the software as indicated by the following subattributes: analyzability, changeability, stability, testability.

Portability. The ease with which the software can be transposed from one environment to another as indicated by the following subattributes: adaptability, installability, conformance, replaceability.

Like the other software quality factors discussed in the preceding subsections, the ISO 9126 factors do not necessarily lend themselves to direct measurement.

Targeted Quality Factors

The quality dimensions and factors presented in Sections 14.2.1 and 14.2.2 focus on the software as a whole and can be used as a generic indication of the quality of an application. A software team can develop a set of quality characteristics and associated questions that would probe the degree to which each factor has been satisfied. For example, McCall identifies usability as an important quality factor. If you were asked to review a user interface and assess its usability, how would you proceed? You might start with the subattributes suggested by McCall—understandability, learnability, and operability—but what do these mean in a pragmatic sense? To conduct your assessment, you'll need to address specific, measurable (or at least, recognizable) attributes of the interface. For example [Bro03]:

Intuitiveness. The degree to which the interface follows expected usage patterns so that even a novice can use it without significant training. Is the interface layout conducive to easy understanding? Are interface operations easy to locate and initiate? Does the interface use a recognizable metaphor? Is input specified to economize keystrokes or mouse clicks? Does the interface follow the three golden rules? Do aesthetics aid in understanding and usage?

Efficiency. The degree to which operations and information can be located or initiated. Does the interface layout and style allow a user to locate operations and information efficiently? Can a sequence of operations (or data input) be performed with an economy of motion? Are output data or content presented so that they are understood immediately? Have hierarchical operations been organized in a way that minimizes the depth to which a user must navigate to get something done?

Robustness. The degree to which the software handles bad input data or inappropriate user interaction.
Will the software recognize the error if data at or just outside prescribed boundaries is input? More importantly, will the software continue to operate without failure or degradation? Will the interface recognize common cognitive or manipulative mistakes and explicitly guide the user back on the right track? Does the interface provide useful diagnosis and guidance when an error condition (associated with software functionality) is uncovered? Richness. The degree to which the interface provides a rich feature set. Can the interface be customized to the specific needs of a user? Does the interface provide a macro capability that enables a user to identify a sequence of common operations with a single action or command? As the interface design is developed, the software team would review the design prototype and ask the questions noted. If the answer to most of these questions is “yes,” it is likely that the user interface exhibits high quality. A collection of questions similar to these would be developed for each quality factor to be assessed. THE SOFTWARE QUALITY DILEMMA In an interview [Ven03] published on the Web, Bertrand Meyer discusses what I call the quality dilemma: If you produce a software system that has terrible quality, you lose because no one will want to buy it. If on the other hand you spend infinite time, extremely large effort, and huge sums of money to build the absolutely perfect piece of software, then it’s going to take so long to complete and it will be so expensive to produce that you’ll be out of business anyway. Either you missed the market window, or you simply exhausted all your resources. So people in industry try to get to that magical middle ground where the product is good enough not to be rejected right away, such as during evaluation, but also not the object of so much perfectionism and so much work that it would take too long or cost too much to complete. It’s fine to state that software engineers should strive to produce high-quality systems. It’s even better to apply good practices in your attempt to do so. But the situation discussed by Meyer is real life and represents a dilemma for even the best software engineering organizations. 1 “Good Enough” Software Stated bluntly, if we are to accept the argument made by Meyer, is it acceptable to produce “good enough” software? The answer to this question must be “yes,” because major software companies do it every day. They create software with known bugs and deliver it to a broad population of end users. They recognize that some of the functions and features delivered in Version 1.0 may not be of the highest quality and plan for improvements in Version 2.0. They do this knowing that some customers will complain, but they recognize that time-to-market may trump better quality as long as the delivered product is “good enough.” Exactly what is “good enough”? Good enough software delivers high-quality functions and features that users desire, but at the same time it delivers other more obscure or specialized functions and features that contain known bugs. The software vendor hopes that the vast majority of end users will overlook the bugs because they are so happy with other application functionality. This idea may resonate with many readers. 
If you're one of them, I can only ask you to consider some of the arguments against "good enough."

2 The Cost of Quality

The argument goes something like this—we know that quality is important, but it costs us time and money—too much time and money to get the level of software quality we really want. On its face, this argument seems reasonable (see Meyer's comments earlier in this section). There is no question that quality has a cost, but lack of quality also has a cost—not only to end users who must live with buggy software, but also to the software organization that has built and must maintain it. The real question is this: which cost should we be worried about? To answer this question, you must understand both the cost of achieving quality and the cost of low-quality software. The cost of quality includes all costs incurred in the pursuit of quality or in performing quality-related activities and the downstream costs of lack of quality. To understand these costs, an organization must collect metrics to provide a baseline for the current cost of quality, identify opportunities for reducing these costs, and provide a normalized basis of comparison. The cost of quality can be divided into costs associated with prevention, appraisal, and failure.

Prevention costs include (1) the cost of management activities required to plan and coordinate all quality control and quality assurance activities, (2) the cost of added technical activities to develop complete requirements and design models, (3) test planning costs, and (4) the cost of all training associated with these activities.

Appraisal costs include activities to gain insight into product condition the "first time through" each process. Examples of appraisal costs include:

Cost of conducting technical reviews for software engineering work products
Cost of data collection and metrics evaluation
Cost of testing and debugging

Failure costs are those that would disappear if no errors appeared before or after shipping a product to customers. Failure costs may be subdivided into internal failure costs and external failure costs. Internal failure costs are incurred when you detect an error in a product prior to shipment. Internal failure costs include:

Cost required to perform rework (repair) to correct an error
Cost that occurs when rework inadvertently generates side effects that must be mitigated
Costs associated with the collection of quality metrics that allow an organization to assess the modes of failure

External failure costs are associated with defects found after the product has been shipped to the customer. Examples of external failure costs are complaint resolution, product return and replacement, help line support, and labor costs associated with warranty work. A poor reputation and the resulting loss of business is another external failure cost that is difficult to quantify but nonetheless very real. Bad things happen when low-quality software is produced.

3 Risks

In an earlier chapter, I wrote "people bet their jobs, their comforts, their safety, their entertainment, their decisions, and their very lives on computer software. It better be right." The implication is that low-quality software increases risks for both the developer and the end user. In the preceding subsection, I discussed one of these risks (cost). But the downside of poorly designed and implemented applications does not always stop with dollars and time. An extreme example [Gag04] might serve to illustrate.
Throughout the month of November 2000 at a hospital in Panama, 28 patients received massive overdoses of gamma rays during treatment for a variety of cancers. In the months that followed, 5 of these patients died from radiation poisoning and 15 others developed serious complications. What caused this tragedy? A software package, developed by a U.S. company, was modified by hospital technicians to compute doses of radiation for each patient. 4 Negligence and Liability The story is all too common. A governmental or corporate entity hires a major software developer or consulting company to analyze requirements and then design and construct a software-based “system” to support some major activity. The system might support a major corporate function (e.g., pension management) or some governmental function (e.g., health care administration or homeland security). Work begins with the best of intentions on both sides, but by the time the system is delivered, things have gone bad. The system is late, fails to deliver desired features and functions, is error-prone, and does not meet with customer approval. Litigation ensues. In most cases, the customer claims that the developer has been negligent (in the manner in which it has applied software practices) and is therefore not entitled to payment. The developer often claims that the customer has repeatedly changed its requirements and has subverted the development partnership in other ways. In every case, the quality of the delivered system comes into question. 5 Quality and Security As the criticality of Web-based systems and applications grows, application security has become increasingly important. Stated simply, software that does not exhibit high quality is easier to hack, and as a consequence, low-quality software can indirectly increase the security risk with all of its attendant costs and problems. ACHIEVING SOFTWARE QUALITY Software quality doesn’t just appear. It is the result of good project management and solid software engineering practice. Management and practice are applied within the context of four broad activities that help a software team achieve high software quality: software engineering methods, project management techniques, quality control actions, and software quality assurance. 1 Software Engineering Methods If you expect to build high-quality software, you must understand the problem to be solved. You must also be capable of creating a design that conforms to the problem while at the same time exhibiting characteristics that lead to software that exhibits the quality dimensions and factors. If you apply those concepts and adopt appropriate analysis and design methods, the likelihood of creating high-quality software will increase substantially. 2 Project Management Techniques The impact of poor management decisions on software quality has been discussed. The implications are clear: if (1) a project manager uses estimation to verify that delivery dates are achievable, (2) schedule dependencies are understood and the team resists the temptation to use short cuts, (3) risk planning is conducted so problems do not breed chaos, software quality will be affected in a positive way. In addition, the project plan should include explicit techniques for quality and change management. 3 Quality Control Quality control encompasses a set of software engineering actions that help to ensure that each work product meets its quality goals. Models are reviewed to ensure that they are complete and consistent. 
Code may be inspected in order to uncover and correct errors before testing commences. A series of testing steps is applied to uncover errors in processing logic, data manipulation, and interface communication. A combination of measurement and feedback allows a software team to tune the process when any of these work products fail to meet quality goals. 4 Quality Assurance Quality assurance establishes the infrastructure that supports solid software engineering methods, rational project management, and quality control actions— all pivotal if you intend to build high-quality software. In addition, quality assurance consists of a set of auditing and reporting functions that assess the effectiveness and completeness of quality control actions. The goal of quality assurance is to provide management and technical staff with the data necessary to be informed about product quality, thereby gaining insight and confidence that actions to achieve product quality are working. Of course, if the data provided through quality assurance identifies problems, it is management’s responsibility to address the problems and apply the necessary resources to resolve quality issues. BACKGROUND ISSUES Quality control and assurance are essential activities for any business that produces products to be used by others. Prior to the twentieth century, quality control was the sole responsibility of the craftsperson who built a product. As time passed and mass production techniques became commonplace, quality control became an activity performed by people other than the ones who built the product. The first formal quality assurance and control function was introduced at Bell Labs in 1916 and spread rapidly throughout the manufacturing world. During the 1940s, more formal approaches to quality control were suggested. These relied on measurement and continuous process improvement [Dem86] as key elements of quality management. Today, every company has mechanisms to ensure quality in its products. In fact, explicit statements of a company’s concern for quality have become a marketing ploy during the past few decades. The history of quality assurance in software development parallels the history of quality in hardware manufacturing. During the early days of computing (1950s and 1960s), quality was the sole responsibility of the programmer. Standards for quality assurance for software were introduced in military contract software development during the 1970s and have spread rapidly into software development in the commercial world [IEE93]. Extending the definition presented earlier, software quality assurance is a “planned and systematic pattern of actions” [Sch98c] that are required to ensure high quality in software. The scope of quality assurance responsibility might best be characterized by paraphrasing a once- popular automobile commercial: “Quality Is Job #1.” The implication for software is that many different constituencies have software quality assurance responsibility—software engineers, project managers, customers, salespeople, and the individuals who serve within an SQA group. The SQA group serves as the customer’s in-house representative. That is, the people who perform SQA must look at the software from the customer’s point of view. Has software development been conducted according to preestablished standards? Have technical disciplines properly performed their roles as part of the SQA activity? The SQA group attempts to answer these and other questions to ensure that software quality is maintained. 
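In practice, answering such questions means pulling together records from reviews, audits, and defect tracking. The short Python sketch below is one illustration of that kind of roll-up; the work-product fields are hypothetical and assumed for the example, not a prescribed SQA data model.

    # Hypothetical work-product records; real SQA data would come from the
    # organization's review, audit, and defect-tracking systems.
    work_products = [
        {"name": "requirements model", "reviewed": True,  "meets_standards": True,  "open_defects": 2},
        {"name": "design model",       "reviewed": True,  "meets_standards": False, "open_defects": 5},
        {"name": "module foo.c",       "reviewed": False, "meets_standards": True,  "open_defects": 0},
    ]

    def sqa_status(products):
        unreviewed = [p["name"] for p in products if not p["reviewed"]]
        deviations = [p["name"] for p in products if not p["meets_standards"]]
        open_defects = sum(p["open_defects"] for p in products)
        return unreviewed, deviations, open_defects

    unreviewed, deviations, open_defects = sqa_status(work_products)
    print("Work products never reviewed:", unreviewed or "none")
    print("Standards deviations to track:", deviations or "none")
    print("Open defects across all work products:", open_defects)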
ELEMENTS OF SOFTWARE QUALITY ASSURANCE

Software quality assurance encompasses a broad range of concerns and activities that focus on the management of software quality. These can be summarized in the following manner [Hor03]:

Standards. The IEEE, ISO, and other standards organizations have produced a broad array of software engineering standards and related documents. Standards may be adopted voluntarily by a software engineering organization or imposed by the customer or other stakeholders. The job of SQA is to ensure that standards that have been adopted are followed and that all work products conform to them.

Reviews and audits. Technical reviews are a quality control activity performed by software engineers for software engineers. Their intent is to uncover errors. Audits are a type of review performed by SQA personnel with the intent of ensuring that quality guidelines are being followed for software engineering work. For example, an audit of the review process might be conducted to ensure that reviews are being performed in a manner that will lead to the highest likelihood of uncovering errors.

Testing. Software testing is a quality control function that has one primary goal—to find errors. The job of SQA is to ensure that testing is properly planned and efficiently conducted so that it has the highest likelihood of achieving its primary goal.

Error/defect collection and analysis. The only way to improve is to measure how you're doing. SQA collects and analyzes error and defect data to better understand how errors are introduced and what software engineering activities are best suited to eliminating them.

Change management. Change is one of the most disruptive aspects of any software project. If it is not properly managed, change can lead to confusion, and confusion almost always leads to poor quality. SQA ensures that adequate change management practices have been instituted.

Education. Every software organization wants to improve its software engineering practices. A key contributor to improvement is education of software engineers, their managers, and other stakeholders. The SQA organization takes the lead in software process improvement and is a key proponent and sponsor of educational programs.

Vendor management. Three categories of software are acquired from external software vendors—shrink-wrapped packages (e.g., Microsoft Office), a tailored shell [Hor03] that provides a basic skeletal structure that is custom tailored to the needs of a purchaser, and contracted software that is custom designed and constructed from specifications provided by the customer organization. The job of the SQA organization is to ensure that high-quality software results by suggesting specific quality practices that the vendor should follow (when possible), and incorporating quality mandates as part of any contract with an external vendor.

Security management. With the increase in cyber crime and new government regulations regarding privacy, every software organization should institute policies that protect data at all levels, establish firewall protection for WebApps, and ensure that software has not been tampered with internally. SQA ensures that appropriate process and technology are used to achieve software security.

Safety. Because software is almost always a pivotal component of human-rated systems (e.g., automotive or aircraft applications), the impact of hidden defects can be catastrophic.
SQA may be responsible for assessing the impact of software failure and for initiating those steps required to reduce risk.

Risk management. Although the analysis and mitigation of risk is the concern of software engineers, the SQA organization ensures that risk management activities are properly conducted and that risk-related contingency plans have been established.

SQA TASKS, GOALS, AND METRICS

Software quality assurance is composed of a variety of tasks associated with two different constituencies—the software engineers who do technical work and an SQA group that has responsibility for quality assurance planning, oversight, record keeping, analysis, and reporting. Software engineers address quality (and perform quality control activities) by applying solid technical methods and measures, conducting technical reviews, and performing well-planned software testing.

1 SQA Tasks

The charter of the SQA group is to assist the software team in achieving a high-quality end product. The Software Engineering Institute recommends a set of SQA actions that address quality assurance planning, oversight, record keeping, analysis, and reporting. These actions are performed (or facilitated) by an independent SQA group that:

Prepares an SQA plan for a project. The plan is developed as part of project planning and is reviewed by all stakeholders. Quality assurance actions performed by the software engineering team and the SQA group are governed by the plan. The plan identifies evaluations to be performed, audits and reviews to be conducted, standards that are applicable to the project, procedures for error reporting and tracking, work products that are produced by the SQA group, and feedback that will be provided to the software team.

Participates in the development of the project's software process description. The software team selects a process for the work to be performed. The SQA group reviews the process description for compliance with organizational policy, internal software standards, externally imposed standards (e.g., ISO-9001), and other parts of the software project plan.

Reviews software engineering activities to verify compliance with the defined software process. The SQA group identifies, documents, and tracks deviations from the process and verifies that corrections have been made.

Audits designated software work products to verify compliance with those defined as part of the software process. The SQA group reviews selected work products; identifies, documents, and tracks deviations; verifies that corrections have been made; and periodically reports the results of its work to the project manager.

Ensures that deviations in software work and work products are documented and handled according to a documented procedure. Deviations may be encountered in the project plan, process description, applicable standards, or software engineering work products.

Records any noncompliance and reports to senior management. Noncompliance items are tracked until they are resolved.

In addition to these actions, the SQA group coordinates the control and management of change and helps to collect and analyze software metrics.
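As a small illustration of that last task, the sketch below counts recorded defects by the activity in which they were introduced and by severity—the kind of summary that helps a team see how errors are introduced and which activities are best suited to eliminating them. The record format is an assumption made for the example, not a standard.

    # Minimal defect-data analysis sketch; the fields are assumed for illustration.
    from collections import Counter

    defects = [
        {"id": 101, "introduced_in": "requirements", "severity": "major"},
        {"id": 102, "introduced_in": "design",       "severity": "minor"},
        {"id": 103, "introduced_in": "coding",       "severity": "major"},
        {"id": 104, "introduced_in": "coding",       "severity": "critical"},
    ]

    by_origin = Counter(d["introduced_in"] for d in defects)
    by_severity = Counter(d["severity"] for d in defects)

    print("Defects by originating activity:", dict(by_origin))
    print("Defects by severity:            ", dict(by_severity))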
2 Goals, Attributes, and Metrics

The SQA actions described in the preceding section are performed to achieve a set of pragmatic goals:

Requirements quality. The correctness, completeness, and consistency of the requirements model will have a strong influence on the quality of all work products that follow. SQA must ensure that the software team has properly reviewed the requirements model to achieve a high level of quality.

Design quality. Every element of the design model should be assessed by the software team to ensure that it exhibits high quality and that the design itself conforms to requirements. SQA looks for attributes of the design that are indicators of quality.

Code quality. Source code and related work products (e.g., other descriptive information) must conform to local coding standards and exhibit characteristics that will facilitate maintainability. SQA should isolate those attributes that allow a reasonable analysis of the quality of code.

Quality control effectiveness. A software team should apply limited resources in a way that has the highest likelihood of achieving a high-quality result. SQA analyzes the allocation of resources for reviews and testing to assess whether they are being allocated in the most effective manner.

Figure 16.1 (adapted from [Hya96]) identifies the attributes that are indicators for the existence of quality for each of the goals discussed. Metrics that can be used to indicate the relative strength of an attribute are also shown.
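Figure 16.1 itself is not reproduced here, but the idea of measuring attributes as indicators of a quality goal can be made concrete. The Python sketch below computes two crude indicators for the code-quality goal above—average function length and comment density for a single source file. The choice of attributes, and the notion that shorter functions and reasonable comment density aid maintainability, are assumptions made for the example, not content taken from the figure.

    # Crude, illustrative code-quality indicators; attribute choice is an assumption.
    import ast, io, sys, tokenize

    def code_quality_indicators(path):
        with open(path, encoding="utf-8") as f:
            source = f.read()
        tree = ast.parse(source)
        funcs = [n for n in ast.walk(tree)
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
        lengths = [n.end_lineno - n.lineno + 1 for n in funcs]
        comments = sum(1 for tok in tokenize.generate_tokens(io.StringIO(source).readline)
                       if tok.type == tokenize.COMMENT)
        total_lines = source.count("\n") + 1
        avg_len = sum(lengths) / len(lengths) if lengths else 0.0
        return {"functions": len(funcs),
                "average_function_length": round(avg_len, 1),
                "comment_density": round(comments / total_lines, 2)}

    if __name__ == "__main__":
        # Usage: python quality_scan.py some_module.py
        print(code_quality_indicators(sys.argv[1]))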
Unit 5-Test Planning and Documentation

Properly communicating and documenting the test effort with well-constructed test plans, test cases, and test reports will make it more likely that you and your fellow testers achieve your testing goals.

THE GOAL OF TEST PLANNING

The testing process can't operate in a vacuum. Performing your testing tasks would be very difficult if the programmers wrote their code without telling you what it does, how it works, or when it will be complete. Likewise, if you and the other software testers don't communicate what you plan to test, what resources you need, and what your schedule is, your project will have little chance of succeeding. The software test plan is the primary means by which software testers communicate to the product development team what they intend to do. The IEEE Standard 829–1998 for Software Test Documentation states that the purpose of a software test plan is as follows: To prescribe the scope, approach, resources, and schedule of the testing activities. To identify the items being tested, the features to be tested, the testing tasks to be performed, the personnel responsible for each task, and the risks associated with the plan.

Given that definition and the rest of the IEEE standard, you will notice that the form the test plan takes is a written document. That shouldn't be too surprising, but it's an important point because although the end result is a piece of paper (or online document or web page), that paper isn't what the test plan is all about. The test plan is a by-product of the detailed planning process that's undertaken to create it. It's the planning process that matters, not the resulting document. The title of this chapter is "Planning Your Test Effort," not "Writing Your Test Plan." The distinction is intentional. Too often a written test plan ends up as shelfware—a document that sits on a shelf, never to be read. If the purpose of the planning effort is flipped from the creation of a document to the process of creating it, from writing a test plan to planning the testing tasks, the shelfware problem disappears. This isn't to say that a final test plan document that describes and summarizes the results of the planning process is unnecessary. To the contrary, there still needs to be a test plan for reference and archiving—and in some industries it's required by law. What's important is that the plan is the by-product of, not the fundamental reason for, the planning process. If you spend time with your project team working through the topics presented in the remainder of this chapter, making sure that everyone has been informed and understands what the test team is planning to do, you'll go a long way in meeting this goal.

TEST PLANNING TOPICS

Many software testing books present a test plan template or a sample test plan that you can easily modify to create your own project-specific test plan. The problem with this approach is that it makes it too easy to put the emphasis on the document, not the planning process. Test leads and managers of large software projects have been known to take an electronic copy of a test plan template or an existing test plan, spend a few hours cutting, copying, pasting, searching, and replacing, and turn out a "unique" test plan for their project. They felt they had done a great thing, creating in a few hours what other testers had taken weeks or months to create. They missed the point, though, and their project showed it when no one on the product team knew what the heck the testers were doing or why. For that reason, you won't see a test plan template in this book. What follows, instead, is a list of important topics that should be thoroughly discussed, understood, and agreed to among your entire project team—including all the testers. The list may not map perfectly to all projects, but because it's a list of common and important test-related concerns, it's likely more applicable than a test plan template. By its nature, planning is a very dynamic process, so if you find yourself in a situation where the listed topics don't apply, feel free to adjust them to fit. Of course, the result of the test planning process will be a document of some sort. The format may be predefined—if the industry or the company has a standard. The IEEE Standard 829–1998 for Software Test Documentation suggests a common form. Otherwise, the format will be up to your team and should be what's most effective in communicating the fruits of your work.

High-Level Expectations

The first topics to address in the planning process are the ones that define the test team's high-level expectations. They're fundamental topics that must be agreed to by everyone on the project team, but they're often overlooked. They might be considered "too obvious" and assumed to be understood by everyone—but a good tester knows never to assume anything! What's the purpose of the test planning process and the software test plan? You know the reasons for test planning—okay, you will soon—but do the programmers know, do the technical writers know, does management know? More importantly, do they agree with and support the test planning process? What product is being tested? Of course you believe it's the Ginsumatic v8.0, but is it, for sure? Is this v8.0 release planned to be a complete rewrite or just a maintenance update? Is it one standalone program or thousands of pieces? Is it being developed in house or by a third party? And what is a Ginsumatic anyway?

People, Places, and Things

Test planning needs to identify the people working on the project, what they do, and how to contact them.
If it’s a small project this may seem unnecessary, but even small projects can have team members scattered across long distances or undergo personnel changes that make tracking who does what difficult. A large team might have dozens or hundreds of points of contact. The test team will likely work with all of them and knowing who they are and how to contact them is very important. The test plan should include names, titles, addresses, phone numbers, email addresses, and areas of responsibility for all key people on the project. Definitions Getting everyone on the project team to agree with the high-level quality and reliability goals is a difficult task. Unfortunately, those terms are only the beginning of the words and concepts that need to be defined for a software project. Recall the definition of a bug from Chapter 1, “Software Testing Background”: 1. The software doesn’t do something that the product specification says it should do. 2. The software does something that the product specification says it shouldn’t do. 3. The software does something that the product specification doesn’t mention. 4. The software doesn’t do something that the product specification doesn’t mention but should. Would you say that every person on the team knows, understands, and—more importantly—agrees with that definition? Does the project manager know what your goal as a software tester is? If not, the test planning process should work to make sure they do. Here’s a list of a few common terms and very loose definitions. Don’t take the list to be complete nor the definitions to be fact. They are very dependent on what the project is, the development model the team is following, and the experience level of the people on the team. The terms are listed only to start you thinking about what should be defined for your projects and to show you how important it is for everyone to know the meanings. Build. A compilation of code and content that the programmers put together to be tested. The test plan should define the frequency of builds (daily, weekly) and the expected quality level. Test release document (TRD). A document that the programmers release with each build stating what’s new, different, fixed, and ready for testing. Alpha release. A very early build intended for limited distribution to a few key customers and to marketing for demonstration purposes. It’s not intended to be used in a real-world situation. The exact contents and quality level must be understood by everyone who will use the alpha release. Beta release. The formal build intended for widespread distribution to potential customers. Remember from Chapter 16, “Bug Bashes and Beta Testing,” that the specific reasons for doing the beta need to be defined. Spec complete. A schedule date when the specification is supposedly complete and will no longer change. After you work on a few projects, you may think that this date occurs only in fiction books, but it really should be set, with the specification only undergoing minor and controlled changes after that. Feature complete. A schedule date when the programmers will stop adding new features to the code and concentrate on fixing bugs. Bug committee. A group made up of the test manager, project manager, development manager, and product support manager that meets weekly to review the bugs and determine which ones to fix and how they should be fixed. The bug committee is one of the primary users of the quality and reliability goals set forth in the test plan. 
Inter-Group Responsibilities Inter-group responsibilities identify tasks and deliverables that potentially affect the test effort. The test team’s work is driven by many other functional groups— programmers, project managers, technical writers, and so on. If the responsibilities aren’t planned out, the project—specifically the testing—can become a comedy show of “I’ve got it, no, you take it, didn’t you handle, no, I thought you did,” resulting in important tasks being forgotten. The types of tasks that need to be defined aren’t the obvious ones—testers test, programmers program. The troublesome tasks potentially have multiple owners or sometimes no owner or a shared responsibility. What Will and Won’t Be Tested You might be surprised to find that everything included with a software product isn’t necessarily tested. There may be components of the software that were previously released and have already been tested. Content may be taken as is from another software company. An outsourcing company may supply pre-tested portions of the product. The planning process needs to identify each component of the software and make known whether it will be tested. If it’s not tested, there needs to be a reason it won’t be covered. It would be a disaster if a piece of code slipped through the development cycle completely untested because of a misunderstanding. Test Phases To plan the test phases, the test team will look at the proposed development model and decide whether unique phases, or stages, of testing should be performed over the course of the project. In a code-and-fix model, there’s probably only one test phase—test until someone yells stop. In the waterfall and spiral models, there can be several test phases from examining the product spec to acceptance testing. Yes, test planning is one of the test phases. The test planning process should identify each proposed test phase and make each phase known to the project team. This process often helps the entire team form and understand the overall development model. Test Strategy An exercise associated with defining the test phases is defining the test strategy. The test strategy describes the approach that the test team will use to test the software both overall and in each phase. Think back to what you’ve learned so far about software testing. If you were presented with a product to test, you’d need to decide if it’s better to use black-box testing or white-box testing. If you decide to use a mix of both techniques, when will you apply each and to which parts of the software? It might be a good idea to test some of the code manually and other code with tools and automation. If tools will be used, do they need to be developed or can existing commercial solutions be purchased? If so, which ones? Maybe it would be more efficient to outsource the entire test effort to a specialized testing company and require only a skeleton testing crew to oversee their work. Resource Requirements Planning the resource requirements is the process of deciding what’s necessary to accomplish the testing strategy. Everything that could possibly be used for testing over the course of the project needs to be considered. For example: People. How many, what experience, what expertise? Should they be full-time, part-time, contract, students? Equipment. Computers, test hardware, printers, tools. Office and lab space. Where will they be located? How big will they be? How will they be arranged? Software. Word processors, databases, custom tools. 
What will be purchased, what needs to be written? Outsource companies. Will they be used? What criteria will be used for choosing them? How much will they cost? Miscellaneous supplies. Disks, phones, reference books, training material. What else might be necessary over the course of the project? The specific resource requirements are very project-, team-, and company- dependent, so the test plan effort will need to carefully evaluate what will be needed to test the software. It’s often difficult or even impossible to obtain resources late in the project that weren’t budgeted for at the beginning, so it’s imperative to be thorough when creating the list. Tester Assignments Once the test phases, test strategy, and resource requirements are defined, that information can be used with the product spec to break out the individual tester assignments. The inter-group responsibilities discussed earlier dealt with what functional group (management, test, programmers, and so on) is responsible for what high-level tasks. Planning the tester assignments identifies the testers (this means you) responsible for each area of the software and for each testable feature. Table 17.1 shows a greatly simplified example of a tester assignments table for Windows WordPad. A real-world responsibilities table would go into much more detail to assure that every part of the software has someone assigned to test it. Metrics and Statistics Metrics and statistics are the means by which the progress and the success of the project, and the testing, are tracked. The test planning process should identify exactly what information will be gathered, what decisions will be made with them, and who will be responsible for collecting them. Examples of test metrics that might be useful are Total bugs found daily over the course of the project List of bugs that still need to be fixed Current bugs ranked by how severe they are Total bugs found per tester Number of bugs found per software feature or area Risks and Issues A common and very useful part of test planning is to identify potential problem or risky areas of the project—ones that could have an impact on the test effort. Suppose that you and 10 other new testers, whose total software test experience was reading this book, were assigned to test the software for a new nuclear power plant. That would be a risk. Maybe some new software needs to be tested against 1,500 modems but there’s only time in the project schedule to test 500 of them. As a software tester, you’ll be responsible for identifying risks during the planning process and communicating your concerns to your manager and the project manager. These risks will be identified in the software test plan and accounted for in the schedule. Some will come true, others will turn out to be benign. The important thing is to identify them early so that they don’t appear as a surprise late in the project. TEST CASE PLANNING OVERVIEW So where exactly does test case planning fit into the grand scheme of testing? Figure 18.1 shows the relationships among the different types of test plans. You’re already familiar with the top, or project level, test plan and know that the process of creating it is more important than the resulting document. The next three levels, the test design specification, the test case specification, and the test procedure specification are described in detail in the following sections. 
As you can see in Figure 18.1, moving further away from the top-level test plan puts less emphasis on the process of creation and more on the resulting written document. The reason is that these plans become useful on a daily, sometimes hourly, basis by the testers performing the testing. As you’ll learn, at the lowest level they become step-by-step instructions for executing a test, making it key that they’re clear, concise, and organized—how they got that way isn’t nearly as important. The information presented in this chapter is adapted from the IEEE Std 829-1998 Standard for Software Test Documentation (available from standards.ieee.org). This standard is what many testing teams have adopted as their test planning documentation—intentional or not—because it represents a logical and common- sense method for test planning. The important thing to realize about this standard is that unless you’re bound to follow it to the letter because of the type of software you’re testing or by your corporate or industry policy, you should use it as a guideline and not a standard. The information it contains and approaches it recommends are as valid today as they were when the standard was written in 1983. But, what used to work best as a written document is often better and more efficiently presented today as a spreadsheet or a database. You’ll see an example of this later in the chapter. The bottom line is that you and your test team should create test plans that cover the information outlined in IEEE 829. If paper printouts work best (which would be hard to believe), by all means use them. If, however, you think a central database is more efficient and your team has the time and budget to develop or buy one, you should go with that approach. Ultimately it doesn’t matter. What does matter is that when you’ve completed your work, you’ve met the four goals of test case planning: organization, repeatability, tracking, and proof. TEST CASES This section on documenting test cases will give you a few more options to consider. IEEE 829 states that the test case specification “documents the actual values used for input along with the anticipated outputs. A test case also identifies any constraints on the test procedure resulting from use of that specific test case.” Essentially, the details of a test case should explain exactly what values or conditions will be sent to the software and what result is expected. It can be referenced by one or more test design specs and may reference more than one test procedure. The IEEE 829 standard also lists some other important information that should be included: Identifiers. A unique identifier is referenced by the test design specs and the test procedure specs. Test item. This describes the detailed feature, code module, and so on that’s being tested. It should be more specific than the features listed in the test design spec. If the test design spec said “the addition function of Calculator,” the test case spec would say “upper limit overflow handling of addition calculations.” It should also provide references to product specifications or other design docs from which the test case was based. Input specification. This specification lists all the inputs or conditions given to the software to execute the test case. If you’re testing Calculator, this may be as simple as 1+1. If you’re testing cellular telephone switching software, there could be hundreds or thousands of input conditions. 
If you’re testing a file based product, it would be the name of the file and a description of its contents. Output specification. This describes the result you expect from executing the test case. Did 1+1 equal 2? Were the thousands of output variables set correctly in the cell software? Did all the contents of the file load as expected? Environmental needs. Environmental needs are the hardware, software, test tools, facilities, staff, and so on that are necessary to run the test case. Special procedural requirements. This section describes anything unusual that must be done to perform the test. Testing WordPad probably doesn’t need anything special, but testing nuclear power plant software might. Intercase dependencies. Chapter 1, “Software Testing Background,” included a description of a bug that caused NASA’s Mars Polar Lander to crash on Mars. It’s a perfect example of an undocumented intercase dependency. Figure 18.2 shows an example of a printer compatibility table. Each line of the matrix is a specific test case and has its own identifier. All the other information that goes with a test case—test item, input spec, output spec, environmental needs, special requirements, and dependencies—are most likely common to all these cases and could be written once and attached to the table. TEST PROCEDURES After you document the test designs and test cases, what remains are the procedures that need to be followed to execute the test cases. IEEE 829 states that the test procedure specification “identifies all the steps required to operate the system and exercise the specified test cases in order to implement the associated test design.” The test procedure or test script spec defines the step-by-step details of exactly how to perform the test cases. Here’s the information that needs to be defined: Identifier. A unique identifier that ties the test procedure to the associated test cases and test design. Purpose. The purpose of the procedure and reference to the test cases that it will execute. Special requirements. Other procedures, special testing skills, or special equipment needed to run the procedure. Procedure steps. Detailed description of how the tests are to be run: Log. Tells how and by what method the results and observations will be recorded. Setup. Explains how to prepare for the test. Start. Explains the steps used to start the test. Procedure. Describes the steps used to run the tests. Measure. Describes how the results are to be determined—for example, with a stopwatch or visual determination. Shut down. Explains the steps for suspending the test for unexpected reasons. Restart. Tells the tester how to pick up the test at a certain point if there’s a failure or after shutting down. Stop. Describes the steps for an orderly halt to the test. Wrap up. Explains how to restore the environment to its pre-test condition. Contingencies. Explains what to do if things don’t go as planned. It’s not sufficient for a test procedure to just say, “Try all the following test cases and report back on what you see….” That would be simple and easy but wouldn’t tell a new tester anything about how to perform the testing. It wouldn’t be repeatable and there’d be no way to prove what steps were executed. Using a detailed procedure makes known exactly what will be tested and how. TEST CASE ORGANIZATION AND TRACKING One consideration that you should take into account when creating the test case documentation is how the information will be organized and tracked. 
TEST CASE ORGANIZATION AND TRACKING

One consideration that you should take into account when creating the test case documentation is how the information will be organized and tracked. Think about the questions that a tester or the test team should be able to answer: Which test cases do you plan to run? How many test cases do you plan to run? How long will it take to run them? Can you pick and choose test suites (groups of related test cases) to run on particular features or areas of the software? When you run the cases, will you be able to record which ones pass and which ones fail? Of the ones that failed, which ones also failed the last time you ran them? What percentage of the cases passed the last time you ran them? These examples of important questions might be asked over the course of a typical project. Chapter 20, "Measuring Your Success," will discuss data collection and statistics in more detail, but for now, consider that some sort of process needs to be in place that allows you to manage your test cases and track the results of running them. There are essentially four possible systems:

In your head. Don't even consider this one, even for the simplest projects, unless you're testing software for your own personal use and have no reason to track your testing. You just can't do it.

Paper/documents. It's possible to manage the test cases for very small projects on paper. Tables and charts of checklists have been used effectively. They're obviously a weak method for organizing and searching the data, but they do offer one very important positive—a written checklist that includes a tester's initials or signature denoting that tests were run is excellent proof in a court of law that testing was performed.

Spreadsheet. A popular and very workable method of tracking test cases is by using a spreadsheet. Figure 18.4 shows an example of this. By keeping all the details of the test cases in one place, a spreadsheet can provide an at-a-glance view of your testing status. They're easy to use, relatively easy to set up, and provide good tracking and proof of testing.

Custom database. The ideal method for tracking test cases is to use a Test Case Management Tool, a database programmed specifically to handle test cases. Many commercially available applications are set up to perform just this specific task. Visit some of the web links listed in Chapter 22, "Your Career as a Software Tester," for more information and recommendations from other testers. If you're interested in creating your own tracking system, database software such as FileMaker Pro, Microsoft Access, and many others provide almost drag-and-drop database creation that would let you build a database that mapped to the IEEE 829 standard in just a few hours. You could then set up reports and queries that would allow you to answer just about any question regarding the test cases.

The important thing to remember is that the number of test cases can easily be in the thousands and without a means to manage them, you and the other testers could quickly be lost in a sea of documentation. You need to know, at a glance, the answer to fundamental questions such as, "What will I be testing tomorrow, and how many test cases will I need to run?"
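If you do build your own tracking system, the core of it is small. The sketch below uses SQLite (which ships with Python) to record test cases and run results and to answer two of the questions listed above. The table layout and example identifiers are assumptions made for illustration; a real system would map more of the IEEE 829 fields.

    # Minimal test case tracking sketch using SQLite; schema is illustrative only.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE test_case (id TEXT PRIMARY KEY, feature TEXT, suite TEXT);
        CREATE TABLE run (case_id TEXT, run_date TEXT, result TEXT);  -- 'pass' or 'fail'
    """)
    db.executemany("INSERT INTO test_case VALUES (?, ?, ?)", [
        ("WP-ADD-007", "addition", "smoke"),
        ("WP-PRN-001", "printing", "printer-compatibility"),
    ])
    db.executemany("INSERT INTO run VALUES (?, ?, ?)", [
        ("WP-ADD-007", "2024-03-18", "pass"),
        ("WP-PRN-001", "2024-03-18", "fail"),
    ])

    # Which cases failed on the most recent run date?
    failed = db.execute("""
        SELECT case_id FROM run
        WHERE result = 'fail' AND run_date = (SELECT MAX(run_date) FROM run)
    """).fetchall()
    print("Failed last run:", [row[0] for row in failed])

    # What percentage of executed cases passed?
    passed, total = db.execute("SELECT SUM(result = 'pass'), COUNT(*) FROM run").fetchone()
    print(f"Pass rate: {100 * passed / total:.0f}%")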
GETTING YOUR BUGS FIXED

In "The Realities of Software Testing," you learned that despite your best efforts at planning and executing your tests, not all the bugs you find will be fixed. Some may be dismissed completely, and others may be deferred or postponed for fixing in a subsequent release of the software. At the time, it may have been a bit discouraging or even frightening to think that such a concept was a possibility. Hopefully, now that you know a great deal more about software testing, you can see why not fixing all the bugs is a reality. The reasons listed for not fixing a bug were:

There's not enough time. Every project always has too many software features, too few people to code and test them, and not enough room left in the schedule to finish. If you're working on a tax-preparation program, April 15 isn't going to move—you must have your software ready in time.

It's really not a bug. Maybe you've heard the phrase, "It's not a bug, it's a feature!" It's not uncommon for misunderstandings, test errors, or spec changes to result in would-be bugs being dismissed as features.

It's too risky to fix. Unfortunately, this is all too often true. Software is fragile, intertwined, and sometimes like spaghetti. You might make a bug fix that causes other bugs to appear. Under the pressure to release a product under a tight schedule, it might be too risky to change the software. It may be better to leave in the known bug to avoid the risk of creating new, unknown ones.