CHAPTER 9
Design Production

Objectives

After reading this chapter, you will:
1. Know how to use requirements to drive design
2. Understand the macro view of lifecycle iteration for design
3. Be able to unpack conceptual designs and explore strategies for realization in intermediate design
4. Understand wireframes and how to make and use them
5. Be prepared to use annotated scenarios, prototypes, and wireframes to represent screens and navigation in detailed design
6. Know how to maintain a custom style guide in design
7. Understand the concept of interaction design specifications for software implementation

9.1 INTRODUCTION

9.1.1 You Are Here

We begin each process chapter with a "you are here" picture of the chapter topic in the context of the overall Wheel lifecycle template; see Figure 9-1. This chapter is a continuation of the previous one about designing the new work practice and the new system. In Chapter 7 we did ideation and sketching and in Chapter 8 we conceptualized design alternatives. Now it is time to make sure that we account for all the requirements and envisioned models in those designs. This is especially important for domain-complex systems where it is necessary to maintain connections to contextual data.

The translation from requirements to design is often regarded as the most difficult step in the UX lifecycle process. We should expect it to be difficult because now that we have made the cognitive shift from analysis-mode thinking to synthesis-mode thinking, there are so many possible choices for design to meet any one given requirement, and following requirements does not guarantee an integrated overall solution.

334 THE UX BOOK: PROCESS AND GUIDELINES FOR ENSURING A QUALITY USER EXPERIENCE

Figure 9-1: You are here; the third of three chapters on creating an interaction design in the context of the overall Wheel lifecycle template.

Beyer, Holtzblatt, and Wood (2005, p.
218) remind us that "The design isn't explicit in the data." "The data guides, constrains, and suggests directions" that design "can respond to." The requirements, whether in a requirements document or as an interpretation of the work activity affinity diagram (WAAD), offer a large inventory of things to be supported in the design.

9.2 MACRO VIEW OF LIFECYCLE ITERATIONS FOR DESIGN

In Figure 9-2 we show a "blow up" of how lifecycle iteration plays out on a macroscopic scale for the various types of design. Each type of design has its own iterative cycle with its own kind of prototype and evaluation. Among the very first to talk about iteration for interaction design were Buxton and Sniderman (1980).

The observant reader will note that the progressive series of iterative loops in Figure 9-2 can be thought of as a kind of spiral lifecycle concept. Each loop in turn addresses an increasing level of detail. For each different project context and each stage of progress within the project, you have to adjust the amount and kind of design, prototyping, and evaluation to fit the situation in each of these incarnations of that lifecycle template.

Figure 9-2: Macro view of lifecycle iterations in design.

9.2.1 Ideation Iteration

At "A" in Figure 9-2, iteration for ideation and sketching (Chapter 7) is a lightning-fast, loosely structured iteration for the purpose of exploring design ideas. The role of prototype is played by sketches, and the role of evaluation is carried out by brainstorming, discussion, and critiquing. Output is possibly multiple alternatives for conceptual designs, mostly in the form of annotated rough sketches.

Conceptual Design
A conceptual design is a theme, notion, or idea with the purpose of communicating a design vision about a system or product. It is the part of the system design that brings the designer's mental model to life.

9.2.2 Conceptual Design Iteration

At "B" in Figure 9-2, iteration for conceptual design is to evaluate and compare possibly multiple design concepts and weigh concept feasibility. The type of design prototype evolves with each successive iteration, roughly from paper prototype to low-fidelity wireframes and storyboards. The type of evaluation here is usually in the form of storytelling via storyboards to key stakeholders. The idea is to communicate how the broader design concepts help users in the envisioned work domain.

Depending on the project context, one or more of the design perspectives may be emphasized in the storyboards. This is usually the stage where key stakeholders such as users or their representatives, business, software engineering, and marketing must be heavily involved. You are planting the seeds for what the entire design will be for the system going forward.

Wireframe
A wireframe is a visual schematic, blueprint, or template of a screen or Web page design in an interaction design. It is a skeletal representation of screen (or page) layout of interaction objects such as tabs, menus, buttons, dialogue boxes, displays, and navigational elements. The focus of wireframes is on screen content and behavior but not graphical specifics such as fonts, colors, or graphics. Often the earliest way design ideas become tangible, wireframes are the basis for iterative rapid prototypes.

9.2.3 Intermediate Design Iteration

At "C" in Figure 9-2, the purpose of intermediate design iteration (coming up soon) is to sort out possibly multiple conceptual design candidates and to arrive at one intermediate design for layout and navigation. For example, for the Ticket Kiosk System, there are at least two conceptual design candidates in the interaction perspective. One is a traditional "drill-in" concept where users are shown available categories (e.g., movies, concerts, MU athletics) from which they choose one. Based on the choice on this first screen, the user is
shown further options and details, navigating with a back button and/or "bread crumb" trail, if necessary, to come back to the category view. A second conceptual design is the one using the three-panel idea described in the previous chapter.

Intermediate prototypes might evolve from low-fidelity to high-fidelity prototypes or wireframes. Fully interactive high-fidelity mockups can be used as a vehicle to demonstrate leading conceptual design candidates to upper management stakeholders if you need this kind of communication at this stage. Using such wireframes or other types of prototypes, the candidate design concepts are validated and a conceptual design forerunner is selected.

9.2.4 Detailed Design Iteration

At "D" in Figure 9-2, iteration for detailed design is to decide screen design and layout details, including "visual comps" (coming up soon) of the "skin" for look and feel appearance. The prototypes might be detailed wireframes and/or high-fidelity interactive mockups. At this stage, the design will be fully specified, with complete descriptions of behavior, look and feel, and information on how all workflows, exception cases, and settings will be handled.

9.2.5 Design Refinement Iteration

At "E" in Figure 9-2, a prototype for refinement evaluation and iteration is usually medium to high fidelity, and evaluation is either a rapid method (Chapter 13) or a full rigorous evaluation process (Chapters 12 and 14 through 18).

9.3 INTERMEDIATE DESIGN

For intermediate design, you will need the same team you have had since ideation and sketching, plus a visual designer if you do not already have one. Intermediate design starts with your conceptual design and moves forward with increasing detail and fidelity. The goal of intermediate design is to create a logical flow of intermediate-level navigational structure and screen designs.
Even though we use the term screen here for ease of discussion, this is also applicable to other product designs where there are no explicit screens.

9.3.1 Unpacking the Conceptual Design: Strategies for Realization

At "C" in Figure 9-2, you are taking the concepts created in conceptual design, decomposing them into logical units, and expanding each unit into different possible design strategies (corresponding to different conceptual design candidates) for concept realization. Eventually you will decide on a design strategy, from which springs an iterated and evaluated intermediate prototype.

Information Object
An information object is an internally stored work object shared by users and the system. Information objects are often data entities central to workflow, being operated on by users; they are searched and browsed for, accessed and displayed, modified and manipulated, and stored back again.

9.3.2 Ground Your Design in Application Ontology with Information Objects

Per Johnson and Henderson (2002, p. 27), you should begin by thinking in terms of the ontological structure of the system, which will now be available in analyzed and structured contextual data. This starts with what we call information objects, which we identified in modeling (Chapter 6). As these information objects move within the envisioned flow model, they are accessed and manipulated by people in work roles. In a graphics-drawing application, for example, information objects might be rectangles, circles, and other graphical objects that are created, modified, and combined by users.

Identify relationships among the application objects—sometimes hierarchical, sometimes temporal, sometimes involving user workflow. With the help of your physical model, cast your ontological net broadly enough to identify other kinds of related objects, for example, telephones and train tickets, and their physical manipulation as done in conjunction with system operation.

In design we also have to think about how users access information objects; from the user perspective, accessing usually means getting an object on the screen so that it can be operated on in some way. Then we have to think about what kinds of operations or manipulations will be performed. For example, in the Ticket Kiosk System, events and tickets are important information objects. Start by thinking about how these can be represented in the design. What are the best design patterns to show an event? What are the design strategies to facilitate ways to manipulate them?

In your modeling you should have already identified information objects, their attributes, and relationships among them. In your conceptual design and later in intermediate design, you should already have decided how information objects will be represented in the user interaction design. Now you can decide how users get at, or access, these information objects. Typically, because systems are too large and complex to show all information objects on the screen at once, how do your users call up a specific information object to operate on it? Think about information seeking, including browsing and searching.

Decide what operations users will carry out on your information objects. For example, a graphics package would have an operation to create a new rectangle object and operations to change its size, location, color, and so on. Think about how users will invoke and perform those operations. Add these new things to your storyboards.
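To make this concrete, here is a minimal Python sketch of the graphics-package example: an information object with attributes, the operations users perform on it, and a way to call up a specific object to operate on it. All names here are hypothetical illustrations; the book prescribes no implementation.

```python
# Hypothetical sketch: an information object from a graphics-drawing
# application, with attributes and the operations users invoke on it.

class Rectangle:
    """An information object: created, accessed, and manipulated by users."""

    def __init__(self, x, y, width, height, color="black"):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.color = color

    # Operations users carry out on the object:
    def move(self, dx, dy):
        self.x += dx
        self.y += dy

    def resize(self, width, height):
        self.width, self.height = width, height

    def recolor(self, color):
        self.color = color


class Drawing:
    """Holds information objects and lets users access (call up) one by name."""

    def __init__(self):
        self.objects = {}

    def create_rectangle(self, name, **attrs):
        self.objects[name] = Rectangle(**attrs)
        return self.objects[name]

    def access(self, name):
        # From the user perspective: get the object "on the screen" to operate on.
        return self.objects[name]


drawing = Drawing()
r = drawing.create_rectangle("r1", x=0, y=0, width=10, height=5)
r.move(3, 4)
r.recolor("red")
```

The point of the sketch is the design question it raises, not the code itself: for each information object, decide what its attributes are, which operations users need, and how users call up the object before operating on it.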
The design of information object operations goes hand in hand with design scenarios (Chapter 6), personas (Chapter 7), and storyboards (Chapter 8), which can add life to the static wireframe images of screens.

9.3.3 Illustrated Scenarios for Communicating Designs

One of the best ways to describe parts of your intermediate interaction design in a document is through illustrated scenarios, which combine the visual communication capability of storyboards and screen sketches with the capability of textual scenarios to communicate details. The result is an excellent vehicle for sharing and communicating designs to the rest of the team and to management, marketing, and all other stakeholders.

Making illustrated scenarios is simple; just intersperse graphical storyboard frames and/or screen sketches as figures in the appropriate places to illustrate the narrative text of a design scenario. The storyboards in initial illustrated scenarios can be sketches or early wireframes (coming up later).

9.3.4 Screen Layout and Navigational Structure

During this phase, all layout and navigation elements are fully fleshed out. Using sequences of wireframes, key workflows are represented while describing what happens when the user interacts with the different user interface objects in the design. It is not uncommon to have wireframe sets represent part of the workflow or each task sequence using click-through prototypes.

9.4 DETAILED DESIGN

At "D" in Figure 9-2, for detailed design you will need the same team you had for intermediate design, plus documentation and language experts, to make sure that the tone, vocabulary, and language are accurate, precise, and consistent, both internally and with terminology used in the domain.
9.4.1 Annotated Wireframes

To iterate and evaluate your detailed designs, refine your wireframes more completely by including all user interface objects and data elements, still represented abstractly but annotated with call-out text.

Custom Style Guide
A custom style guide is a document that is fashioned and maintained by designers to capture and describe details of visual and other general design decisions that can be applied in multiple places. Its contents can be specific to one project or an umbrella guide across all projects on a given platform, or over a whole organization.

9.4.2 Visual Design and Visual Comps

As a parallel activity, a visual designer who has been involved in ideation, sketching, and conceptual design now produces what we call visual "comps," meaning variously comprehensive or composite layout (a term originating in the printing industry). All user interface elements are represented, now with a very specific and detailed graphical look and feel.

A visual comp is a pixel-perfect mockup of the graphical "skin," including objects, colors, sizes, shapes, fonts, spacing, and location, plus visual "assets" for user interface elements. An asset is a visual element along with all of its defining characteristics as expressed in style definitions such as cascading style sheets for a Website. The visual designer casts all of this to be consistent with company branding, style guides, and best practices in visual design.

Exercise: See Exercise 9-1, Intermediate and Detailed Design for Your System.

9.5 WIREFRAMES

In Figure 9-3 we show the path from ideation and sketching, task interaction models, and envisioned design scenarios to wireframes as representations of your designs for screen layout and navigational flow. Along with ideation and sketching, task interaction models and design scenarios are the principal inputs to storytelling and communication of designs.
As sequences of sketches, storyboards are a natural extension of sketching. Storyboards, like scenarios, represent only selected task threads. Fortunately, it is a short and natural step from storyboards to wireframes. To be sure, nothing beats pencil/pen and paper or a whiteboard for the sketching needed in ideation (Chapter 7), but, at some point, when the design concept emerges from ideation, it must be communicated to others who pursue the rest of the lifecycle process. Wireframes have long been the choice in the field for documenting, communicating, and prototyping interaction designs.

Figure 9-3: The path from ideation and sketching, task interaction models, and envisioned design scenarios to wireframes.

9.5.1 What Are Wireframes?

Wireframes, a major bread-and-butter tool of interaction designers, are a form of prototype, popular in industry practice. Wireframes comprise lines and outlines (hence the name "wire frame") of boxes and other shapes to represent emerging interaction designs. They are schematic diagrams and "sketches" that define a Web page or screen content and navigational flow. They are used to illustrate high-level concepts, approximate visual layout, behavior, and sometimes even look and feel for an interaction design. Wireframes are embodiments of maps of screen or other state transitions during usage, depicting envisioned task flows in terms of user actions on user interface objects.

The drawing aspects of wireframes are often simple, offering mainly the use of rectangular objects that can be labeled, moved, and resized. Text and graphics representing content and data in the design are placed in those objects. Drawing templates, or stencils, are used to provide quick means to represent the more common kinds of user interface objects (more on this in the following sections). Wireframes are often deliberately unfinished looking; during early stages of design they may not even be to scale.
They usually do not contain much visual content, such as finished graphics, colors, or font choices. The idea is to create design representations quickly and inexpensively by just drawing boxes, lines, and other shapes.

As an example of using wireframes to illustrate high-level conceptual designs, see Figure 9-4. The design concept depicted in this figure comprises a three-column pattern for a photo manipulation application. A primary navigation pane (the "nav bar") on the left-hand side is intended to show a list of all the user's photo collections. The center column is the main content display area for details, thumbnail images, and individual photos from the collection selected in the left pane. The column on the right in Figure 9-4 is envisioned to show related contextual information for the selected collection.

Figure 9-4: An example wireframe illustrating a high-level conceptual design.

Note how a simple wireframe using just boxes, lines, and a little text can be effective in describing a broad interaction conceptual design pattern. Often these kinds of patterns are explored during ideation and sketching, and selected sketches are translated into wireframes.

While wireframes can be used to illustrate high-level ideas, they are used more commonly to illustrate medium-fidelity interaction designs. For example, the idea of Figure 9-4 is elaborated further in Figure 9-5.

Figure 9-5: Further elaboration of the conceptual design and layout of Figure 9-4.

The navigation bar in the left column now shows several picture collections and a default "work bench" where all uploaded images are collected. The selected item in this column, "Italy trip," is shown as the active collection using another box with the same label and a fill color of gray, for example, overlaid on the navigation bar.
The center content area is also elaborated more, using boxes and a few icons to show a scrollable grid of thumbnail images with some controls on the top right. Note how certain details pertaining to the different manipulation options are left incomplete while showing where they are located on the screen.

Wireframes can also be used to show behavior. For example, in Figure 9-6 we show what happens when a user clicks on the vertical "Related information" bar in Figure 9-5: a pane with contextual information for this collection (or individual photo) slides out.

Figure 9-6: The display that results when a user clicks on the "Related information" bar.

In Figure 9-7 we show a different view of the content pane, this time as a result of a user clicking on the "One-up" view switcher button in Figure 9-5 to see a single photo in the content pane. Double-clicking a thumbnail image will also expand that image into a one-up view to fill the content pane.

Figure 9-7: The display that results when a user clicks on the "One-up" view button.

9.5.2 How Are Wireframes Used?

Wireframes are used as conversational props to discuss designs and design alternatives. They are effective tools to elicit feedback from potential users and other stakeholders. A designer can move through a deck of wireframes one slide at a time, simulating a potential scenario by pretending to click on interaction widgets on the screen. These page sequences can represent the flow of user activity within a scenario but cannot show all possible navigational paths. For example, if Figures 9-5, 9-6, and 9-7 are in a deck, a designer can narrate a design scenario where user actions cause the deck to progress through the corresponding images.

Such wireframes can be used for rapid and early lab-based evaluation by printing and converting them into low-fidelity paper prototypes (Chapter 11).
A rough low- to medium-fidelity prototype, using screens like the ones shown in Figures 9-5, 9-6, and 9-7, can also be used for design walkthroughs and expert evaluations. In the course of such an evaluation, the expert can extrapolate intermediate states between wireframes.

What we have described so far is easy to do with almost all wireframing tools. Most wireframing tools also provide hyperlinking capabilities to make the deck a click-through prototype. While this takes more effort to create, and even more to maintain as the deck changes, it provides a more realistic representation of the envisioned behavior of the design. However, the use of this kind of prototype in an evaluation might require elaborating all the states of the design in the workflow that is the focus of the evaluation.

Finally, after the design ideas are iterated and agreed upon by relevant stakeholders, wireframes can be used as interaction design specifications. When wireframes are used as inputs to design production, they are annotated with details to describe the different states of the design and widgets, including mouse-over states, keyboard inputs, and active focus states. Edge cases and transition effects are also described. The goal here is completeness: to enable a developer to implement the designs without the need for any interpretation. Such specifications are usually accompanied by high-fidelity visual comps, discussed previously in this chapter.

9.5.3 How to Build Wireframes?

Wireframes can be built using any drawing or word processing software package that supports creating and manipulating shapes, such as iWork Pages, Keynote, Microsoft PowerPoint, or Word. While such applications suffice for simple wireframing, we recommend tools designed specifically for this purpose, such as OmniGraffle (for Mac), Microsoft Visio (for PC), and Adobe InDesign.
Many tools and templates for making wireframes are used in combination—truly an invent-as-you-go approach serving the specific needs of prototyping. For example, some tools are available to combine the generic-looking placeholders in wireframes with more detailed mockups of some screens or parts of screens. In essence they allow you to add color, graphics, and real fonts, as well as representations of real content, to the wireframe scaffolding structure.

In early stages of design, during ideation and sketching, you started by thinking about the high-level conceptual design. It makes sense to start with that here, too: first by wireframing the design concept and then by going top down to address major parts of the concept. Identify the interaction conceptual design using boxes with labels, as shown in Figure 9-4. Take each box and start fleshing out the design details. What are the different kinds of interaction needed to support each part of the design, and what kinds of widgets work best in each case? What are the best ways to lay them out? Think about relationships among the widgets and any data that need to go with them. Leverage design patterns, metaphors, and other ideas and concepts from the work domain ontology. Do not spend too much time on exact locations of these widgets or on their alignment yet. Such refinement will come in later iterations, after all the key elements of the design are represented.

As you flesh out all the major areas in the design, be mindful of the information architecture on the screen. Make sure the wireframes convey that inherent information architecture. For example, do elements on the screen follow a logical information hierarchy? Are related elements on the screen positioned in such a way that those relationships are evident? Are content areas indented appropriately? Are margins and indents communicating the hierarchy of the content on the screen?

Next it is time to think about sequencing.
If you are representing a workflow, start with the "wake-up" state for that workflow. Then make a wireframe representing the next state, for example, to show the result of a user action such as clicking on a button. In Figure 9-6 we showed what happens when a user clicks on the "Related information" expander widget. In Figure 9-7 we showed what happens if the user clicks on the "One-up" view switcher button.

Once you create the key screens to depict the workflow, it is time to review and refine each screen. Start by specifying all the options that go on the screen (even those not related to this workflow). For example, if you have a toolbar, what are all the options that go into that toolbar? What are all the buttons, view switchers, window controllers (e.g., scrollbars), and so on that need to go on the screen?

At this time you are looking at the scalability of your design. Are the design pattern and layout still working after you add all the widgets that need to go on this screen? Think of cases when the windows or other container elements, such as navigation bars, in the design are resized, or when different data elements that need to be supported are larger than shown in the wireframe. For example, in Figures 9-5 and 9-6, what must happen if the number of photo collections is greater than what fits in the default size of that container? Should the entire page scroll, or should new scrollbars appear on the left-hand navigation bar alone? How about situations where the number of people identified in a collection is large? Should we show the first few (perhaps the ones with the most associated photos) with a "more" option, should we use an independent scrollbar for that pane, or should we scroll the entire page? You may want to make wireframes for such edge cases; remember, they are less expensive and easier to do using boxes and lines than in code.
As you iterate your wireframes, refine them further, increasing the fidelity of the deck. Think about proportions, alignments, spacing, and so on for all the widgets. Refine the wording and language aspects of the design. Get the wireframe as close to the envisioned design as possible within the constraints of using boxes and lines.

9.5.4 Hints and Tips for Wireframing

Because the point of wireframing is to make quick prototypes for exploring design ideas, one of the most important things to remember about wireframing is modularity. Just as in paper prototyping, you want to be able to create multiple design representations quickly. Being modular means not having too many concepts or details "hard coded" in any one wireframe.

Build up concepts and details using "layers." Most good wireframing tools provide support for layers that can be used to abstract related design elements into reusable groups. Use a separate layer for each repeating set of widgets on the screen. For example, the container "window" of the application with its different controls can be specified once as a layer, and this layer can be reused in all subsequent screens that use that window control. Similarly, if there is a navigation area that is not going to change in this wireframe deck, for example, the left-hand collections pane in Figure 9-5, use one shared layer for that. Layers can be stacked upon one another to construct a slide. This stacking also provides support for ordering in the Z axis to show overlapping widgets. Selection highlights, for example, showing that "Italy trip" is the currently selected collection in Figure 9-5, can also be created using a separate "highlight" layer.

Another tip for efficient wireframing is to use stencils, templates, and libraries of widgets.
Good wireframing tools often have a strong community following of users who share wireframing stencils and libraries for most popular domains—for example, for interaction design—and platforms—for example, Web, Apple iOS, Google's Android, Microsoft's Windows, and Apple's Macintosh. Using these libraries, wireframing becomes as easy as dragging and dropping different widgets onto layers on a canvas. Create your own stencil if your design is geared toward a proprietary platform or system. Start with your organization's style guide and build a library of all common design patterns and elements. Apart from efficiency, stencils and libraries afford consistency in wireframing.

Some advanced wireframing tools even provide support for shared objects in a library. When these objects are modified, it is possible to automatically update all instances of those objects in all linked wireframe decks. This makes maintenance and updates to wireframes easier.

Sketchy wireframes

Sometimes, when using wireframes to elicit feedback from users, if you want to convey the impression that the design is still amenable to changes, make wireframes look like sketches. We know from Buxton (2007a) that the style or "language" of a sketch should not convey the perception that it is more developed than it really is. Straight lines and coloring within the lines give the false impression that the design is almost finished and, therefore, that constructive criticism and new ideas are no longer appropriate. However, conventional drawing tools, such as Microsoft Visio, Adobe Illustrator, OmniGraffle, and Adobe InDesign, produce rigid, computer-drawn boxes, lines, and text. In response, "There is a growing popularity toward something in the middle: Computer-based sketchy wireframes. These allow computer wireframes to look more like quick, hand-drawn sketches while retaining the reusability and polish that we expect from digital artifacts" (Travis, 2009).
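The layer reuse recommended in the hints above can be sketched as a simple data structure. This is a hypothetical illustration in Python, not the API of any actual wireframing tool: shared layers are defined once and stacked bottom-to-top to compose each slide in the deck.

```python
# Hypothetical sketch of wireframe "layers": shared layers are defined once
# and reused across slides; each slide is a bottom-to-top (Z-order) stack.

window_chrome = {"name": "window", "widgets": ["title bar", "window controls"]}
nav_pane = {"name": "collections nav", "widgets": ["Work bench", "Italy trip"]}
thumbnail_grid = {"name": "content grid", "widgets": ["thumbnails", "view switcher"]}
highlight = {"name": "highlight", "widgets": ["Italy trip (gray fill)"]}

def compose_slide(*layers):
    """Stack layers bottom-to-top; later layers overlay earlier ones."""
    return [layer["name"] for layer in layers]

# The window and nav layers are shared by every slide in the deck; only the
# top layers (content, selection highlight) change from slide to slide.
slide_grid = compose_slide(window_chrome, nav_pane, thumbnail_grid, highlight)
slide_no_selection = compose_slide(window_chrome, nav_pane, thumbnail_grid)
```

Because the shared layers are defined only once, changing the window chrome or the navigation pane in one place changes every slide that stacks it, which is exactly the maintenance benefit the tools' shared-object libraries provide.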
CHAPTER 10
UX Goals, Metrics, and Targets

Objectives

After reading this chapter, you will:
1. Understand the concepts of UX goals, metrics, and targets
2. Appreciate the need for setting UX target values for the envisioned system
3. Understand the influence of user classes, business goals, and UX goals on UX targets
4. Be able to create UX target tables, including identifying measuring instruments and setting target values
5. Know how UX targets help manage the UX lifecycle process

10.1 INTRODUCTION

10.1.1 You Are Here

We are making splendid progress in moving through the Wheel UX lifecycle template. In this chapter we establish operational targets for user experience to assess the level of success in your designs so that you know when you can move on to the next iteration. UX goals, metrics, and targets help you plan for evaluation that will successfully reveal the user performance and emotional satisfaction bottlenecks. Because UX goals, metrics, and targets are used to guide much of the process from analysis through evaluation, we show it as an arc around the entire lifecycle template, as you can see in Figure 10-1.

10.1.2 Project Context for UX Metrics and Targets

In early stages, evaluation usually focuses on qualitative data for finding UX problems. In these early evaluations the absence of quantitative data precludes the use of UX metrics and targets. But you may still want to establish them at this point if you intend to use them in later evaluations.

However, there is another reason why you might forgo UX metrics and targets. In most practical contexts, specifying UX metrics and targets and following up with them may be too expensive. This level of completeness is possible only in a few organizations where there are established UX resources. In most places, one round of evaluation is all one gets.
Also, as designers, we can know which parts of the design need further investigation just by looking at the results of the first round of evaluation. In such cases, quantitative UX metrics and targets may not be useful, but benchmark tasks are still essential as vehicles for driving evaluation.

Figure 10-1 You are here; the chapter on UX goals, metrics, and targets in the context of the overall Wheel lifecycle template.

Regardless, the trend in the UX field is moving away from a focus on user performance and more toward user satisfaction and enjoyment. We include the full treatment of UX goals, metrics, and targets here, and quantitative data collection and analysis in the later UX evaluation chapters, for completeness and because some readers and practitioners still want coverage of the topic.

Benchmark Task: A benchmark task is a description of a task performed by a participant in formative evaluation so that UX measures such as time-on-task and error rates can be obtained and compared to a baseline value across the performances of multiple participants.

In any case, we find that this pivotal interaction design process activity of specifying UX goals, metrics, and targets is often overlooked, either because of lack of knowledge or because of lack of time. Sometimes this can be unfortunate because it can diminish the potential of what can be accomplished with the resources you will be putting into user experience evaluation. This chapter will help you avoid that pitfall by showing you techniques for specifying UX goals, metrics, and targets.

Fortunately, creating UX metrics and targets, after a little practice, does not take much time. You will then have specific quantified UX goals against which to test, rather than just waiting to see what happens when you put users in front of your interaction design.
Because UX metrics and targets provide feasible objectives for formative evaluation efforts, the results can help you pinpoint where to focus redesign most profitably. And, finally, UX goals, metrics, and targets offer a way to help manage the lifecycle by defining a quantifiable end to what can otherwise seem like endless iteration. Of course, designers and managers can run out of time, money, and patience before they meet their UX targets—sometimes after just one round of evaluation—but at least then they know where things stand.

10.1.3 Roots for UX Metrics and Targets

The concept of formal UX measurement specifications in tabular form, with various metrics operationally defining success, was originally developed by Gilb (1987). The focus of Gilb's work was on using measurements in managing software development resources. Bennett (1984) adapted this approach to usability specifications as a technique for setting planned usability levels and managing the process to meet those levels. These ideas were integrated into usability engineering practice by Good et al. (1986) and further refined by Whiteside, Bennett, and Holtzblatt (1988). Usability engineering, as defined by Good et al. (1986), is a process through which quantitative usability characteristics are specified early and measured throughout the lifecycle process. Carroll and Rosson (1985) also stressed the need for quantifiable usability specifications, associated with appropriate benchmark tasks, in iterative refinement of user interaction designs. And now we have extended the concept to UX targets. Without measurable targets, it is difficult to determine, at least quantitatively, whether the interaction design for a system or product is meeting your UX goals.

10.2 UX GOALS

UX goals are high-level objectives for an interaction design, stated in terms of anticipated user experience.
UX goals can be driven by business goals; they reflect real use of a product and identify what is important to an organization, its customers, and its users. They are expressed as desired effects to be experienced in usage by users of features in the design, and they translate into a set of UX measures. A UX measure is a usage attribute to be assessed in evaluating a UX goal. You will extract your UX goals from user concerns captured in work activity notes, the flow model, social models, and work objectives, some of which will be market driven, reflecting competitive imperatives for the product. User experience goals can be stated for all users in general, in terms of a specific work role or user class, or for specific kinds of tasks. Examples of user experience goals include ease of use, power performance for experts, avoiding errors for intermittent users, safety for life-critical systems, high customer satisfaction, walk-up-and-use learnability for new users, and so on.

Example: User Experience Goals for the Ticket Kiosk System

We can define the primary high-level UX goals for the ticket buyer to include:

- Fast and easy walk-up-and-use user experience, with absolutely no user training
- Fast learning so new user performance (after limited experience) is on par with that of an experienced user [from AB-4-8]
- High customer satisfaction leading to high rate of repeat customers [from BC-6-16]

Some other possibilities:

- High learnability for more advanced tasks [from BB-1-5]
- Draw, engagement, attraction
- Low error rate for completing transactions correctly, especially in the interaction for payment [from CG-13-17]

Exercise: See Exercise 10-1, Identifying User Experience Goals for Your System.

10.3 UX TARGET TABLES

Through years of working with real-world UX practitioners and doing our own user experience evaluations, we have refined the concept of a UX target table, in the form shown in
Table 10-1, from the original conception of a usability specification table, as presented by Whiteside, Bennett, and Holtzblatt (1988). A spreadsheet is an obvious way to implement these tables. For convenience, one row in the table is called a "UX target." The first three columns are for the work role and related user class to which this UX target applies, the associated UX goal, and the UX measure. The three go together because each UX measure is aimed at supporting a UX goal and is specified with respect to a work role and user class combination. Next, we will see where you get the information for these three columns. As a running example to illustrate the use of each column in the UX target table, we will progressively set some UX targets for the Ticket Kiosk System.

Table 10-1 Our UX target table, as evolved from the Whiteside, Bennett, and Holtzblatt (1988) usability specification table
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level | Observed Results

10.4 WORK ROLES, USER CLASSES, AND UX GOALS

Because UX targets are aimed at specific work roles, we label each UX target by work role. Recall that different work roles in the user models perform different task sets. So the key task sets for a given work role will have associated usage scenarios, which will inform benchmark task descriptions we create as measuring instruments to go with UX targets. Within a given work role, different user classes will generally be expected to perform to different standards, that is, at different target levels.

Measuring Instrument: A measuring instrument is the means for providing values for a particular UX measure; it is the vehicle through which values are generated and measured. A typical measuring instrument for generating objective UX data is a benchmark task—for example, user performance of a task gives time and error data—while a typical measuring instrument for generating subjective UX data is a questionnaire.

Example: A Work Role, User Class, and UX Goal for the Ticket Kiosk System

In Table 10-1, we see that the first values to enter for a UX target are work role, a corresponding user class, and related UX goal. As we saw earlier, user class definitions can be based on, among other things, level of expertise, disabilities and limitations, and other demographics. For the Ticket Kiosk System, we are focusing primarily on the ticket buyer. For this work role, user classes include a casual town resident user from Middleburg and a student user from Middleburg University. In this example, we feature the casual town user. Translating the goal of "fast-and-easy walk-up-and-use user experience" into a UX target table entry is straightforward. This goal refers to the ability of a typical occasional user to do at least the basic tasks on the first try, certainly without training or manuals. Typing them in, we see the beginnings of a UX target in the first row of Table 10-2.

Table 10-2 Choosing a work role, user class, and UX goal for a UX target
Work Role: User Class | UX Goal
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user
(remaining columns not yet filled in)

10.5 UX MEASURES

Within a UX target, the UX measure is the general user experience characteristic to be measured with respect to usage of your interaction design. The choice of UX measure implies something about which types of measuring instruments and UX metrics are appropriate. UX targets are based on quantitative data—both objective data, such as observable user performance, and subjective data, such as user opinion and satisfaction.
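As noted above, a spreadsheet is an obvious way to implement UX target tables. A minimal sketch of how the rows and columns might be kept as data and exported to CSV for any spreadsheet tool (field values follow the running Ticket Kiosk System example; the `to_csv` helper is a hypothetical illustration, not part of any published tool):

```python
import csv
import io

# Columns of the UX target table (Table 10-1)
COLUMNS = [
    "Work Role: User Class", "UX Goal", "UX Measure", "Measuring Instrument",
    "UX Metric", "Baseline Level", "Target Level", "Observed Results",
]

# One UX target (one row), partially filled in as in the running example
ux_target = {
    "Work Role: User Class": "Ticket buyer: Casual new user, for occasional personal use",
    "UX Goal": "Walk-up ease of use for new user",
    "UX Measure": "Initial user performance",
    "Measuring Instrument": "BT1: Buy special event ticket",
    "UX Metric": "", "Baseline Level": "", "Target Level": "", "Observed Results": "",
}

def to_csv(rows):
    """Serialize UX target rows to CSV for use in any spreadsheet tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

table = to_csv([ux_target])
```

Keeping the table as plain data like this makes it easy to fill in the remaining columns (metric, baseline, target, observed results) as they are established in the sections that follow.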
Some common UX measures that can be paired with quantitative metrics include:

- Objective UX measures (directly measurable by evaluators)
  - Initial performance
  - Long-term performance (longitudinal, experienced, steady state)
  - Learnability
  - Retainability
  - Advanced feature usage
- Subjective UX measures (based on user opinions)
  - First impression (initial opinion, initial satisfaction)
  - Long-term (longitudinal) user satisfaction

Initial performance refers to a user's performance during the very first use (somewhere between the first few minutes and the first few hours, depending on the complexity of the system). Long-term performance typically refers to performance during more constant use over a longer period of time (fairly regular use over several weeks, perhaps). Long-term usage usually implies a steady-state learning plateau; the user has become familiar with the system and is no longer constantly in a learning state. Initial performance is a key UX measure because any user of a system must, at some point, use it for the first time. Learnability and retainability refer, respectively, to how quickly and easily users can learn to use a system and how well they retain what they have learned over some period of time. Advanced feature usage is a UX measure that helps determine user experience with the more complicated functions of a system. The user's initial opinion of the system can be captured by a first impression UX measure, whereas long-term user satisfaction refers, as the term implies, to the user's opinion after using the system for some greater period of time, after some allowance for learning. Initial performance and first impression are appropriate UX measures for virtually every interaction design. Other UX measures often play support roles to address more specialized UX needs. Conflicts among UX measures are not unheard of. For example, you may need both good learnability and good expert performance.
In the design, those requirements can work against each other. This, however, just reflects a normal kind of design trade-off. UX targets based on the two different UX measures imply user performance requirements pulling in two different directions, forcing the designers to stretch the design and face the trade-off honestly.

Example: UX Measures for the Ticket Kiosk System

For the walk-up ease-of-use goal of our casual new user, let us start simply with just two UX measures: initial performance and first impression. Each UX measure will appear in a separate UX target in the UX target table, with the user class of the work role and UX goal repeated, as in Table 10-3.

10.6 MEASURING INSTRUMENTS

Within a UX target, the measuring instrument is a description of the method for providing values for the particular UX measure. The measuring instrument is how data are generated; it is the vehicle through which values are measured for the UX measure. Although you can get creative in choosing your measuring instruments, objective measures are commonly associated with a benchmark task—for example, a time-on-task measure as timed on a stopwatch, or an error rate measure made by counting user errors—and subjective measures are commonly associated with a user questionnaire—for example, the average user rating-scale scores for a specific set of questions.
Table 10-3 Choosing initial performance and first impression as UX measures
Work Role: User Class | UX Goal | UX Measure
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression
(remaining columns not yet filled in)

For example, we will see that the objective "initial user performance" UX measure in the UX target table for the Ticket Kiosk System is associated with a benchmark task, and the "first impression" UX measure is associated with a questionnaire. Both subjective and objective measures and data can be important for establishing and evaluating the user experience coming from a design.

10.6.1 Benchmark Tasks

According to Reference.com, the term "benchmark" originates in surveying, referring to:

  Chiseled horizontal marks that surveyors made in stone structures, into which an angle-iron could be placed to form a "bench" for a leveling rod, thus ensuring that a leveling rod could be accurately repositioned in the same place in future. These marks were usually indicated with a chiseled arrow below the horizontal line.

As a measuring instrument for an objective UX measure, a benchmark task is a representative task that you will have user participants perform in evaluation, where you can observe their performance and behavior and take qualitative data (observations of critical incidents and user experience problems) and quantitative data (user performance data to compare with UX targets). As such, a benchmark task is a "standardized" task that can be used to compare (as an engineering comparison, not a rigorous scientific comparison) performance among different users and across different design versions.
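Because a benchmark task is a standardized measuring instrument, the quantitative data it yields can be summarized and compared against a target level in a few lines. A hedged Python sketch (the participant IDs, time values, and 180-second target are hypothetical illustration data, not from the book):

```python
from statistics import mean

# Time-on-task (seconds) observed for one benchmark task across participants;
# these values are hypothetical illustration data.
observed_times = {"P1": 185, "P2": 140, "P3": 210, "P4": 165}

def meets_target(times, target_level):
    """Engineering-style check: does mean time-on-task meet the target level?"""
    return mean(times.values()) <= target_level

average = mean(observed_times.values())  # 175.0 seconds for these data
result = meets_target(observed_times, target_level=180)
```

The same comparison can be run against a baseline level from an earlier design version, which is what makes the benchmark task useful for engineering comparisons across iterations.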
Address designer questions with benchmark tasks and UX targets

As designers work on interaction designs, questions arise constantly. Sometimes the design team simply cannot decide an issue for themselves and they defer it to UX testing ("let the users decide"). Perhaps the team does not agree on a way to treat one design feature, but they have to pick something in order to move forward. Maybe you do agree on the design for a feature but are very curious about how it will play out with real users. Perchance you do not believe an input you got in your requirements from contextual analysis, but you used it anyway, and now you want to see if it pans out in the design. We have suggested that you keep a list of design questions as they come up in design activities. Now they play a role in setting benchmark tasks to get feedback from users regarding these questions. Benchmark tasks based on designer issues are often the only way this kind of issue will get considered in evaluation.

Selecting benchmark tasks

In general, of course, the benchmark tasks you choose as measuring instruments should closely represent tasks real users will perform in a real work context. Pick tasks where you think or know the design has weaknesses. Avoiding such tasks violates the spirit of UX targets and user experience evaluation; the point is to find user experience problems so that you can fix them, not to prove you are the best designer. If you think of UX targets as a measure of how good you are as a designer, you will have a conflict of interest because you are setting your own evaluation criteria. That is not the point of UX targets at all. Here are some guidelines for creating effective benchmark tasks.

Create benchmark tasks for a representative spectrum of user tasks. Choose realistic tasks intended to be used by each user class of a work role across the system.
To get the best coverage for your evaluation investment, your choices should represent the cross section of real tasks with respect to frequency of performance and criticality to the goals of the users of the envisioned product. Benchmark tasks are also selected to evaluate new features, "edge cases" (usage at extreme conditions), and business-critical and mission-critical tasks. While some of these tasks may not be performed frequently, getting them wrong could cause serious consequences.

Start with short and easy tasks and then increase difficulty progressively. Because your benchmark tasks will be faced by participant users in a sequence, you should consider their presentation order. In most cases, start with relatively easy ones to get users accustomed to the design and feeling comfortable in their role as evaluators. After building user confidence and engagement, especially with the tasks for the "initial performance" UX measure, you can introduce more features, more breadth, variety, complexity, and higher levels of difficulty. In some cases, you might have your user participants repeat a benchmark task, only using a different task path, to see how users get around in multiple ways. The more advanced benchmark tasks are also a place to try your creativity by introducing intervening circumstances. For example, you might lead the user down a path and then say, "At this point, you change your mind and want to do such and such, departing from where you are now." For our Ticket Kiosk System, maybe start with finding a movie that is currently playing. Then follow with searching for and reserving tickets for a movie that will be showing 20 days from now, and then go to more complex tasks such as purchasing concert tickets with seat and ticket type selection.

Include some navigation where appropriate.
In real usage, because users usually have to navigate to get to where they will do the operations specific to performing a task, you want to include the need for this navigation even in your earliest benchmark tasks. It tests their knowledge of the fact that they do need to go elsewhere, where they need to go, and how to get there.

Avoid large amounts of typing (unless typing skill is being evaluated). Avoid anything in your benchmark task descriptions that causes large user performance variation not related to user experience in the design. For example, large amounts of typing within a benchmark task can cause large variations in user performance, but the variations will be based on differences in typing skills and can obscure performance differences due to user experience or usability issues.

Match the benchmark task to the UX measure. Obviously, if the UX measure is "initial user performance," the task should be among those a first-time user realistically would face. If the UX measure is about advanced feature usage, then, of course, the task should involve use of that feature to match this requirement. If the UX measure is "long-term usage," then the benchmark task should be faced by the user after considerable practice with the system. For a UX measure of "learnability," a set of benchmark tasks of increasing complexity might be appropriate.

Adapt scenarios already developed for design. Design scenarios clearly represent important tasks to evaluate because they have already been selected as key tasks in the design. However, you must remember to remove information about how to perform the tasks, which is usually abundant in a scenario. See the guideline "Tell the user what task to do, but not how to do it" in the next section for more discussion.

Use tasks in realistic combinations to evaluate task flow. To measure user performance related to task flow, use combinations of tasks such as those that will occur together frequently.
In these cases, you should set UX targets for such combinations because difficulties related to user experience that appear during performance of the combined tasks can be different than for the same tasks performed separately. For example, in the Ticket Kiosk System, you may wish to measure user performance on the task thread of searching for an event and then buying tickets for that event. As another example, a benchmark task might require users to buy four tickets for a concert for under a total of $200, while showing tickets in this price range for the upcoming few days as sold out. This would force users to perform the task of searching through other future concert days, looking for the first available day with tickets in this price range.

Do not forget to evaluate with your power users. Often the user experience for power users is addressed inadequately in product testing (Karn, Perry, & Krolczyk, 1997). Do your product business and UX goals include power use by a trained user population? Do they require support for rapid repetition of tasks, or complex and possibly very long tasks? Does their need for productivity demand shortcuts and direct commands over interactive hand-holding? If any of these are true, you must include benchmark tasks that match this kind of skilled and demanding power use. And, of course, these benchmark tasks must be used as the measuring instrument in UX targets that match up with the corresponding user classes and UX goals.

To evaluate error recovery, a benchmark task can begin in an error state. Effective error recovery is a kind of "feature" that designers and evaluators can easily forget to include. Yet no interaction design can guarantee error-free usage, and trying to recover from errors is something most users are familiar with and can relate to. A "forgiving" design will allow users to recover from errors relatively effortlessly.
This ability is definitely an aspect of your design that should be evaluated by one or more benchmark tasks.

Consider tasks to evaluate performance in "degraded modes" due to partial equipment failure. In large interconnected, networked systems, such as military systems or large commercial banking systems, especially those involving multiple kinds of hardware, subsystems can go down. When this happens, will your part of the system give up and die, or can it at least continue some of its intended functionality and give partial service in a "degraded mode"? If your application fits this description, you should include benchmark tasks to evaluate the user's perspective of this ability accordingly.

Do not try to make a benchmark task for everything. Evaluation driven by UX targets is only an engineering sampling process. It will not be possible to establish UX targets for all possible classes of users doing all possible tasks. It is often stated that about 20% of the tasks in an interactive system account for 80% of the usage, and vice versa. While these figures are obviously folkloric guesses, they carry a grain of truth to guide the targeting of users and tasks in establishing UX targets.

Constructing benchmark task content

Here we list a number of tips and hints to consider when creating benchmark task content.

Remove any ambiguities with clear, precise, specific, and repeatable instructions. Unless resolving ambiguity is what we want users to do as part of the task, we must make the instructions in benchmark task descriptions clear and not confusing. Unambiguous benchmark tasks are necessary for consistent results; we want differences in user performance to be due to differences in users or differences in designs, but usually not due to different interpretations of the same benchmark task.
As a subtle example, consider this "add appointment" benchmark task for the "initial performance" UX measure for an interdepartmental event scheduling system: "Schedule a meeting with Dr. Ehrich for a month from today at 10 AM in 133 McBryde Hall concerning the HCI research project." For some users, the phrase "a month from today" can be ambiguous. Why? It can mean, for example, on the same date next month, or it can mean exactly 4 weeks from now, putting it on the same day of the week. If that difference in meaning can make a difference in user task performance, you need to make the wording more specific to the intended meaning. You also want to make your benchmark tasks specific so that participants do not get sidetracked on irrelevant details during testing. If, for example, a "find event" benchmark task is stated simply as "Find an entertainment event for sometime next week," some participants might make it a long, elaborate task, searching around for some "best" combination of event type and date, whereas others would do the minimum and take the first event they see on the screen. To mitigate such differences, add specific information about event selection criteria.

Tell the user what task to do, but not how to do it. This guideline is very important; the success of user experience evaluation based on this task will depend on it. Sometimes we find students in early evaluation exercises presenting users with task instructions that spell out a series of steps to perform. They should not be surprised when the evaluation session leads to uninteresting results. The users are just giving a rote performance of the steps as they read them from the benchmark task description. If you wish to test whether your interaction design helps users discover how to do a given task on their own, you must avoid giving any information about how to do it. Just tell them what task to do and let them figure out how.
Example (to do): “Buy two student tickets for available adjacent seats as close to the stage as possible for the upcoming Ben King concert and pay with a credit card.” Example (not to do): “Click on the Special Events button on the home screen; then select More at the bottom of the screen. Select the Ben King concert and click on Seating Options....” Example (not to do): “Starting at the Main Menu, go to the Music Menu and set it as a Bookmark. Then go back to the Main Menu and use the Bookmark feature to jump back to the Music Menu.” Do not use words in benchmark tasks that appear specifically in the interaction design. In your benchmark task descriptions, you must avoid using any words that appear in menu headings, menu choices, button labels, icon pop-ups, or any place in the interaction design itself. For example, do not say “Find the first event (that has such and such a characteristic)” when there is a button in the interaction design labeled “Find.” Instead, you should use words such as “Look for...” or “Locate...” Otherwise it is very convenient for your users to use a button labeled “Find” when they are told to “Find” something. It does not require them to think and, therefore, does not evaluate whether the design would have helped them find the right button on their own in the course of real usage. Use work context and usage-centered wording, not system-oriented wording. Because benchmark task descriptions are, in fact, descriptions of user tasks and not system functionality, you should use usage-centered words from the user’s work context and not system-centered wording. For example, “Find information about xyz” is better than “Submit query about xyz.” The former is task oriented; the latter is more about a system view of the task. Have clear start and end points for timing. 
In your own mind, be sure that you have clearly observable and distinguishable start and end points for each benchmark task, and make sure you word the benchmark task description to use these end points effectively. These will ensure your ability to measure the time on task accurately, for example. At evaluation time, not only must the evaluators know for sure when the task is completed, but the participant must know when the task is completed. For purposes of evaluation, the task cannot be considered completed until the user experiences closure. The evaluator must also know when the user knows that the task has been completed. Do not depend on the user to say when the task is done, even if you explicitly ask for that in the benchmark task description or user instructions. Therefore, rather than ending task performance with a mental or sensory state (i.e., the user knowing or seeing something), it is better to incorporate a user action confirming the end of the task, as in the (to do) examples that follow.

Example (not to do): "Find out how to set the orientation of the printer paper to 'landscape.'" Completion of this task depends on the user knowing something, and that is not a directly observable state. Instead, you could have the user actually set the paper orientation; this is something you can observe directly.

Example (not to do): "View next week's events." Completion of this task depends on the user seeing something, an action that you may not be able to confirm. Perhaps you could have the user view and read aloud the contents of the first music event next week. Then you know whether and when the user has seen the correct event.
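Clear, observable start and end points are what make time-on-task measurable at all. In a software-instrumented evaluation session, timing between an observable start and a confirming end action might be sketched as follows (a minimal illustration; the class and its behavior are hypothetical, not the API of any real evaluation tool):

```python
import time

class TaskTimer:
    """Times a benchmark task between an observable start action and an
    observable confirming end action (e.g., 'add to shopping cart')."""
    def __init__(self):
        self.start_time = None
        self.elapsed = None

    def start(self):
        # The evaluator (or instrumented prototype) marks the observable start.
        self.start_time = time.monotonic()

    def end(self):
        # The confirming user action marks completion; a mental state cannot.
        if self.start_time is None:
            raise RuntimeError("task was never started")
        self.elapsed = time.monotonic() - self.start_time
        return self.elapsed

timer = TaskTimer()
timer.start()          # participant begins the benchmark task
# ... participant performs the task ...
elapsed = timer.end()  # confirming end action stops the clock
```

Note the use of a monotonic clock for interval timing; wall-clock time can jump (e.g., NTP adjustments) and would corrupt time-on-task data.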
Example (to do): “Find next week’s music event featuring Rachel Snow and add it to the shopping cart.” Example (to do): Or, to include knowing or learning how to select seats, “Find the closest available seat to the stage and add to shopping cart.” Example (to do): “Find the local weather forecast for tomorrow and read it aloud.” Keep some mystery in it for the user. Do not always be too specific about what the users will see or the parameters they will encounter. Remember that real first-time users will approach your application without necessarily knowing how it works. Sometimes try to use benchmark tasks that give approximate values for some parameters to look for, letting the rest be up to the user. You can still create a prototype in such a way that there is only one possible “solution” to this task if you want to avoid different users in the evaluation ending in a different state in the system. Example (to do): “Purchase two movie tickets to Bee Movie within 1.5 hours of the current time and showing at a theatre within 5 miles of this kiosk location.” UX GOALS, METRICS, AND TARGETS 373 Annotate situations where evaluators must ensure pre-conditions for running benchmark tasks. Suppose you write this benchmark task: “Your dog, Mutt, has just eaten your favorite book and you have decided that he is not worth spending money on. Delete your appointment with the vet for Mutt’s annual checkup from your calendar.” Every time a user performs this task during evaluation, the evaluator must be sure to have an existing appointment already in your prototype calendar so that each user can find it and delete it. You must attach a note in the form of rubrics (next point later) to this benchmark task to that effect—a note that will be read Ecological Validity and followed much later, in the evaluation activity. Ecological validity refers to the realism with which a Use “rubrics” for special instructions to evaluators. 
When necessary or design of evaluation setup matches the user’s real work useful, add a “rubrics” section to your benchmark task descriptions as special context. It is about how instructions to evaluators, not to be given to participants in evaluation sessions. accurately the design or evaluation reflects the Use these rubrics to communicate a heads-up about anything that needs to be relevant characteristics of the ecology of interaction, done or set up in advance to establish task preconditions, such as an existing i.e., its context in the world event in the kiosk system, work context for ecological validity, or a particular or its environment. starting state for a task. Benchmark tasks for addressing designer questions are especially good candidates for rubrics. In a note accompanying your benchmark task you can alert evaluators to watch for user performance or behavior that might shed light on these specific designer questions. Put each benchmark task on a separate sheet of paper. Yes, we want to save trees but, in this case, it is necessary to present the benchmark tasks to the participant only one at a time. Otherwise, the participant will surely read ahead, if only out of curiosity, and can become distracted from the task at hand. If a task has a surprise step, such as a midtask change of intention, that step should be on a separate piece of paper, not shown to the participant initially. To save trees you can cut (with scissors) a list of benchmark tasks so that only one task appears on one piece of paper. Write a “task script” for each benchmark task. You should write a “task script” describing the steps of a representative or typical way to do the task and include it in the benchmark task document “package.” This is just for use by the evaluator and is definitely not given to the participant. 
The evaluator may not have been a member of the design team and initially may not be too familiar with how to perform the benchmark tasks, and it helps the evaluator to be able to anticipate a possible task performance path. This is especially useful in cases where the participant cannot determine a way to do the task; then the evaluation facilitator knows at least one way.

Example: Benchmark Tasks as Measuring Instruments for the Ticket Kiosk System

For the Ticket Kiosk System, the first UX target in Table 10-3 contains an objective UX measure for “initial user performance.” An obvious choice for the corresponding measuring instrument is a benchmark task. Here we need a simple and frequently used task that can be done in a short time by a casual new user in a walk-up ease-of-use situation. An appropriate benchmark task would involve buying tickets to an event. Here is a possible description to give the user participant:

“BT1: Go to the Ticket Kiosk System and buy three tickets for the Monster Truck Pull on February 28 at 7:00 PM. Get three seats together as close to the front as possible. Pay with a major credit card.”

In Table 10-4 we add this to the table as the measuring instrument for the first UX target. Let us say we want to add another UX target for the “initial performance” UX measure, but this time we want to add some variety and use a different benchmark task as the measuring instrument—namely, the task of buying a movie ticket. In Table 10-5 we have entered this benchmark task in the second UX target, pushing the “first impression” UX target down by one.
Table 10-4: Choosing “buy special event ticket” benchmark task as measuring instrument for “initial performance” UX measure in first UX target

| Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level | Observed Results |
|---|---|---|---|---|---|---|---|
| Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | | | | |
| Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | | | | | |

Table 10-5: Choosing “buy movie ticket” benchmark task as measuring instrument for second initial performance UX measure

| Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level | Observed Results |
|---|---|---|---|---|---|---|---|
| Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | | | | |
| Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket | | | | |
| Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | | | | | |

How many benchmark tasks and UX targets do you need? As in most things regarding human–computer interaction, it depends. The size and complexity of the system should be reflected in the quantity and complexity of the benchmark tasks and UX targets. We cannot even give you an estimate of a typical number of benchmark tasks. You have to use your engineering judgment and make enough benchmark tasks for reasonable, representative coverage without overburdening the evaluation process. If you are new to this, we can say that we have often seen a dozen UX targets, but 50 would probably be too many—not worth the cost to pursue in evaluation.

How long should your benchmark tasks be (in terms of time to perform)? The typical benchmark task takes from a couple of minutes to 10 or 15 minutes to perform.
Some short and some long are good. Longer sequences of related tasks are needed to evaluate transitions among tasks. Try to avoid really long benchmark tasks because they may be tiring to participants and evaluators during testing.

Ensure ecological validity. The extent to which your evaluation setup matches the user’s real work context is called ecological validity (Thomas & Kellogg, 1989). One of the valid criticisms of lab-based user experience testing is that a UX lab can be a rather sterile environment, not a realistic setting for the user and the tasks. But you can take steps to add ecological validity by asking yourself, as you write your benchmark task descriptions, how the setting can be made more realistic:

- What are the constraints in the user or work context?
- Does the task involve more than one person or role?
- Does the task require a telephone or other physical props?
- Does the task involve background noise?
- Does the task involve interference or interruption?
- Does the user have to deal with multiple simultaneous inputs, for example, multiple audio feeds through headsets?

As an example, for a task that might be triggered by a telephone call, instead of writing your benchmark task description on a piece of paper, try calling the participant on a telephone with a request that will trigger the desired task. Rarely do task triggers arrive written on a piece of paper someone hands you. Of course, you will have to translate the usual boring imperative statements of the benchmark task description into a more lively and realistic dialogue: “Hi, I am Fred Ferbergen and I have an appointment with Dr. Strangeglove for a physical exam tomorrow, but I have to be out of town. Can you change my appointment to next week?”

Telephones can be used in other ways, too, to add realism to work context.
A second telephone ringing incessantly at the desk next door, or someone talking loudly on the phone next door, can add realistic task distraction that you would not get from a “pure” lab-based evaluation.

Example: Ecological Validity in Benchmark Tasks for the Ticket Kiosk System

To evaluate use of the Ticket Kiosk System to manage the work activity of ticket buying, you can make good use of physical prototypes and representative locations. By this we mean building a touchscreen display into a cardboard or wooden kiosk structure and placing it in the hallway of a relatively busy work area. Users will be subject to the gawking and questions of curiosity seekers. Having co-workers join the kiosk queue will add extra realism.

10.6.2 User Satisfaction Questionnaires

As a measuring instrument for a subjective UX measure, a questionnaire related to various user interaction design features can be used to determine a user’s satisfaction with the interaction design. Measuring a user’s satisfaction provides a subjective, but still quantitative, UX metric for the related UX measure.

As an aside, we should point out that objective and subjective measures are not always orthogonal. As an example of a way they can intertwine, user satisfaction can actually affect user performance over a long period of time. The better users like the system, the more likely they are to experience good performance with it over the long term.

In the following examples we use the QUIS questionnaire (description in Chapter 12), but there are other excellent choices, including the System Usability Scale, or SUS (description in Chapter 12).
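If you use SUS, its published scoring rule is simple enough to sketch in code: each of the ten items is answered on a 1–5 scale, odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to yield a 0–100 score. A minimal sketch (the function name is ours):

```python
def sus_score(responses):
    """Score one completed SUS questionnaire.

    responses: ten integers, each 1..5, in item order (item 1 first).
    Returns the standard 0-100 SUS score.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses on a 1-5 scale")
    # Items 1, 3, 5, ... sit at even list indexes; they score (r - 1).
    # Items 2, 4, 6, ... sit at odd list indexes; they score (5 - r).
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# Strong agreement on positive items, strong disagreement on negative ones:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

A per-participant score like this would fill the “Observed Results” column of a UX target whose metric is an average questionnaire score.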
Example: Questionnaire as Measuring Instrument for the Ticket Kiosk System

If you think the first two benchmark tasks (buying tickets) make a good foundation for assessing the “first impression” UX measure, then you can specify that a particular user satisfaction questionnaire, or a specific subset thereof, be administered following those two initial tasks, stipulating it as the measuring instrument in the third UX target of the growing UX target table, as we have done in Table 10-6.

Example: Goals, Measures, and Measuring Instruments

Before moving on to UX metrics, in Table 10-7 we show some examples of the close connections among UX goals, UX measures, and measuring instruments.

Table 10-6: Choosing questionnaire as measuring instrument for first-impression UX measure

| Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level | Observed Results |
|---|---|---|---|---|---|---|---|
| Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | | | | |
| Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket | | | | |
| Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1–Q10 in the QUIS questionnaire | | | | |

Table 10-7: Close connections among UX goals, UX measures, and measuring instruments

| UX Goal | UX Measure | Potential Metrics |
|---|---|---|
| Ease of first-time use | Initial performance | Time on task |
| Ease of learning | Learnability | Time on task or error rate, after given amount of use and compared with initial performance |
| High performance for experienced users | Long-term performance | Time and error rates |
| Low error rates | Error-related performance | Error rates |
| Error avoidance in safety-critical tasks | Task-specific error performance | Error count, with strict target levels (much more important than time on task) |
| Error recovery performance | Task-specific time performance | Time on recovery portion of the task |
| Overall user satisfaction | User satisfaction | Average score on questionnaire |
| User attraction to product | User opinion of attractiveness | Average score on questionnaire, with questions focused on the effectiveness of the “draw” factor |
| Quality of user experience | User opinion of overall experience | Average score on questionnaire, with questions focused on quality of the |
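A row of the UX target table pairs a UX metric with baseline, target, and observed levels, so whether a design meets a target is a mechanical comparison once the observed result is in. The following is a hypothetical sketch of such a row as a record; the field names and numeric levels are ours, and it assumes that lower values are better for time-on-task-style metrics.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UXTarget:
    """One row of a UX target table (hypothetical structure)."""
    work_role_user_class: str
    ux_goal: str
    ux_measure: str
    measuring_instrument: str
    ux_metric: str
    baseline_level: float
    target_level: float
    observed_result: Optional[float] = None
    lower_is_better: bool = True  # true for time on task or error counts

    def met(self) -> Optional[bool]:
        """None before evaluation; otherwise compare observed to target."""
        if self.observed_result is None:
            return None
        if self.lower_is_better:
            return self.observed_result <= self.target_level
        return self.observed_result >= self.target_level

# Illustrative numbers only; real levels come from baselines and judgment.
t1 = UXTarget(
    work_role_user_class="Ticket buyer: casual new user, occasional personal use",
    ux_goal="Walk-up ease of use for new user",
    ux_measure="Initial user performance",
    measuring_instrument="BT1: Buy special event ticket",
    ux_metric="Average time on task (seconds)",
    baseline_level=180, target_level=150, observed_result=140,
)
print(t1.met())  # True: 140 s is within the 150 s target level
```

Keeping `lower_is_better` explicit matters because the table mixes metric directions: time and error counts improve downward, while questionnaire scores improve upward.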