Summary
This document covers evaluation in the design process: what evaluation is, its goals, types of evaluation (formative and summative; user and expert; qualitative, quantitative, and mixed-method), evaluation ethics, the DECIDE framework, predictive models such as GOMS, Norman's action cycle, expert review methods, and approaches to functional, system, performance, exception, security, UX, and usability evaluation.
Full Transcript
TOPIC 1: WHAT IS EVALUATION
What is Evaluation
Evaluation is integral to the design process. It involves collecting and analyzing data about users' or potential users' experiences when interacting with a design artifact such as a screen sketch, prototype, app, computer system, or component of a computer system. A central goal of evaluation is to improve the artifact's design. Evaluation focuses on both the usability of the system (that is, how easy it is to learn and to use) and on the users' experience when interacting with it (for example, how satisfying, enjoyable, or motivating the interaction is).
What is evaluated: functionality, performance, usability, content, quality, user experience (UX), security and privacy, accessibility.
Why evaluate: quality assurance, competitive advantage, user satisfaction, usability improvement, performance enhancement, optimization of features.
Where evaluation takes place: usability lab, field studies, online platform, simulated environment.
Evaluation Goals
Usability assessment, accessibility assessment, user experience (UX) evaluation, feedback collection, identifying design issues, validation of design decisions, performance evaluation, risk mitigation.
Types of Evaluation
Formative
Purpose: To inform and improve the development process. Helps identify strengths and weaknesses in the design, functionality, and usability of the interactive application.
Time/When: Formative evaluation is typically conducted during the early stages of development or implementation. It can be conducted iteratively throughout the development process, with multiple rounds of evaluation and feedback incorporated into the design cycle.
Summative
Purpose: To assess the overall effectiveness and impact of the interactive application.
Time/When: Summative evaluation is typically conducted after the interactive application has been developed and implemented. It is conducted when the application is considered to be in its final form and ready for release or deployment.
User Evaluation
User evaluation, also known as user testing or usability testing, is a method used to assess the effectiveness, efficiency, and satisfaction of an interactive system or product from the perspective of end users. It involves observing users as they interact with the system and collecting feedback to identify usability issues, gather insights, and inform design improvements.
Expert Evaluation
Expert evaluation, also known as heuristic evaluation or usability inspection, is a method used to assess the usability and user experience of an interactive system or product by expert evaluators. In expert evaluation, evaluators analyze the system against a set of established usability principles or heuristics to identify usability issues, design flaws, and areas for improvement.
Approach
Qualitative: Qualitative evaluations use qualitative and naturalistic methods, sometimes alone, but often in combination with quantitative data. Qualitative methods include three kinds of data collection: (1) in-depth, open-ended interviews; (2) direct observation; and (3) written documents.
Quantitative: Quantitative evaluation is a research method used to measure and analyze numerical data to assess the effectiveness, performance, and impact of a particular intervention, program, product, or system.
It collects quantifiable information that researchers can use for mathematical calculations and statistical analysis to make real-life decisions based on these mathematical derivations. Data collection methods include surveys or questionnaires and one-to-one interviews.
Mixed-Method: Mixed methodology is a design for collecting, analyzing, and mixing both quantitative and qualitative data in a single study or series of studies. Combining the two methods pays off in improved instrumentation for all data collection approaches and in sharpening the evaluator's understanding of the findings.
Ethics of Evaluation
EVALUATION ETHICS
Strategies to protect the rights and dignity of evaluation participants should be incorporated into the way that you design and carry out your project. It is also important to consider safeguards that may be needed when your participants are children or other vulnerable populations, including some victims of crime.
Help or benefit to others – promoting others' interests, by helping individuals, organizations, or society as a whole.
Do no harm – bringing no harm, such as physical injury and psychological harm (such as damage to reputation, self-esteem, or emotional well-being).
Act fairly – treating people fairly and without regard to race, gender, socioeconomic status, and other characteristics.
Respect others – respecting individuals' rights to act freely and to make their own choices, while protecting the rights of those who may be unable to fully protect themselves.
Ethical Issues
Informed consent: Everyone who participates in the evaluation should do so willingly. In general, people participating in an evaluation (or research) have the right to choose whether to participate without penalties (e.g., participation should not be a requirement for receiving services) or to withdraw from the project at any time.
Confidentiality: It is not always possible to conduct evaluations without identifying information, such as names. However, all evaluation information should be kept confidential and not shared with others.
Ensuring safety: In conducting an evaluation, you may have concerns for participants' safety, especially when working with victims of crime. Be thoughtful about participants' needs and take care to protect them.
Feedback collection: You may need to comply with certain procedures or obtain approval when collecting data, such as data related to health (KKM) and school children (KPM).
EVALUATION ETHICS
In designing an evaluation, work to maximize benefits and minimize risks. While you may not eliminate risk, you should reduce it to an acceptable level relative to the potential benefits. In addition, consider these suggestions:
Keep evaluation procedures as brief and convenient as possible to minimize disruptions in subjects' lives.
Do not ask emotionally troubling questions, unless they are necessary to help you improve services.
Provide incentives, such as food, money, or gift certificates.
END
FRAMEWORK: DECIDE
DECIDE Framework
To guide our evaluations we use the DECIDE framework, which provides the following checklist to help novice evaluators:
Determine the overall goals that the evaluation addresses.
Explore the specific questions to be answered.
Choose the evaluation paradigm and techniques to answer the questions.
Identify the practical issues that must be addressed, such as selecting participants.
Decide how to deal with the ethical issues.
Evaluate, interpret, and present the data.
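As an illustration only (not from the original slides), the DECIDE checklist can be recorded as a small structured plan so that nothing is forgotten before a study starts. The study topic, participant numbers, and other values below are hypothetical, borrowed loosely from the e-ticket example discussed next.

```python
# Hypothetical sketch: recording a DECIDE evaluation plan as plain data.
# Every value is illustrative; the framework prescribes the six steps,
# not this particular representation.
decide_plan = {
    "Determine the goals": ["Find out why customers prefer paper tickets over e-tickets"],
    "Explore the questions": [
        "Is the e-ticket purchase flow difficult to navigate?",
        "Do customers trust the online payment step?",
    ],
    "Choose paradigm and techniques": ["usability testing", "post-task interviews"],
    "Identify practical issues": ["recruit 8 frequent travellers", "book the lab for 2 days"],
    "Decide ethical issues": ["informed consent form", "anonymise all recordings"],
    "Evaluate, interpret, present": ["task completion rates", "summary report for the design team"],
}

def print_plan(plan: dict) -> None:
    """Print each DECIDE step followed by its planned items."""
    for step, items in plan.items():
        print(step)
        for item in items:
            print(f"  - {item}")

if __name__ == "__main__":
    print_plan(decide_plan)
```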
Determine the goals
Goals should guide an evaluation, so determining what these goals are is the first step in planning an evaluation. For example, we can restate the general goal statements just mentioned more clearly as: check that the evaluators have understood the users' needs; identify the metaphor on which to base the design; check to ensure that the final interface is consistent; investigate the degree to which technology influences working practices; identify how the interface of an existing product could be engineered to improve its usability.
Explore the questions
In order to make goals operational, questions that must be answered to satisfy them have to be identified. For example, the goal of finding out why many customers prefer to purchase paper airline tickets over the counter rather than e-tickets can be broken down into a number of relevant questions for investigation. Questions can be broken down into very specific sub-questions to make the evaluation even more specific. For example, what does it mean to ask, "Is the user interface poor?": Is the system difficult to navigate? Sub-questions can, in turn, be further decomposed into even finer-grained questions, and so on.
Choose the evaluation paradigm and techniques
The evaluation paradigm determines the kinds of techniques that are used. Practical and ethical issues (discussed next) must also be considered and trade-offs made. For example, what seems to be the most appropriate set of techniques may be too expensive, or may take too long, or may require equipment or expertise that is not available, so compromises are needed.
Identify the practical issues
Some issues that should be considered include users, facilities and equipment, schedules and budgets, and evaluators' expertise. Depending on the availability of resources, compromises may involve adapting or substituting techniques.
Users - It goes without saying that a key aspect of an evaluation is involving appropriate users. For laboratory studies, users must be found and screened to ensure that they represent the user population to which the product is targeted. For example, usability tests often need to involve users with a particular level of experience.
Facilities and equipment - There are many practical issues concerned with using equipment in an evaluation. For example, when using video you need to think about how you will do the recording: how many cameras and where do you put them?
Schedule and budget constraints - Time and budget constraints are important considerations to keep in mind. It might seem ideal to have 20 users test your interface, but if you need to pay them, then it could get costly. Planning evaluations that can be completed on schedule is also important, particularly in commercial settings.
Expertise - Does the evaluation team have the expertise needed to do the evaluation? For example, if no one has used models to evaluate systems before, then basing an evaluation on this approach is not sensible.
Decide how to deal with the ethical issues
The Association for Computing Machinery (ACM) and many other professional organizations provide ethical codes that they expect their members to uphold, particularly if their activities involve other human beings. For example, people's privacy should be protected, which means that their name should not be associated with data collected about them or disclosed in written reports.
The following guidelines will help ensure that evaluations are done ethically and that adequate steps to protect users' rights have been taken.
Tell participants the goals of the study and exactly what they should expect if they participate.
Be sure to explain that demographic, financial, health, or other sensitive information that users disclose or that is discovered from the tests is confidential.
Make sure users know that they are free to stop the evaluation at any time.
Pay users when possible because this creates a formal relationship.
Avoid including quotes or descriptions that inadvertently reveal a person's identity. Ask users' permission in advance to quote them, promise them anonymity, and offer to show them a copy of the report before it is distributed.
Evaluate, interpret, and present data
Decisions are also needed about what data to collect, how to analyze it, and how to present the findings to the development team. To a great extent the technique used determines the type of data collected, but there are still some choices. For example, should the data be treated statistically? If qualitative data is collected, how should it be analyzed and represented? Some general questions also need to be asked:
Reliability - The reliability or consistency of a technique is how well it produces the same results on separate occasions under the same circumstances. Different evaluation processes have different degrees of reliability.
Validity - Validity is concerned with whether the evaluation technique measures what it is supposed to measure. This encompasses both the technique itself and the way it is performed.
Biases - Bias occurs when the results are distorted. For example, expert evaluators performing a heuristic evaluation may be much more sensitive to certain kinds of design flaws than others.
Scope - The scope of an evaluation study refers to how much its findings can be generalized. For example, some modeling techniques, like the Keystroke-Level Model, have a narrow, precise scope.
Ecological validity - Ecological validity concerns how the environment in which an evaluation is conducted influences or even distorts the results. For example, laboratory experiments are strongly controlled and are quite different from workplace, home, or leisure environments. Laboratory experiments therefore have low ecological validity because the results are unlikely to represent what happens in the real world.
GOMS
History of GOMS
Developed in 1983 by Stuart Card, Thomas P. Moran and Allen Newell. Explained in their book "The Psychology of Human-Computer Interaction".
GOMS Family
What is GOMS?
Goals – what the user intends to accomplish.
Operators – actions that are performed to reach the goal.
Methods – sequences of operators that accomplish a goal.
Selections – there can be more than one method available to accomplish a single goal; selection rules describe when a user would select a certain method over the others.
Goals
These can be very wide-reaching, from very high-level goals (e.g. write a report) to very low-level goals (e.g. type the word "Red"). Higher-level goals can be divided into smaller, lower-level goals.
Operators
Operators are the simple actions that are used to accomplish your goals (e.g. "left-click mouse button" or "press ALT"). Operators cannot be broken down any further: they are atomic elements (similar to those found in a database).
Generally, it is assumed that each operator requires a fixed quantity of time to perform the action and that this time interval is independent of context. For example, to double left click a mouse button takes 0.40 seconds of execution time, regardless of what you happen to be clicking on.
Methods
Methods are procedures that describe how to accomplish goals. A method is essentially the steps a user has to take to complete a task. For instance, one method to accomplish the goal "highlight word" in a Windows text editor would be to "move cursor" to the beginning of the word and "hold shift and right arrow key". Another method to accomplish the same goal could involve "holding down the left mouse button" and "dragging to the beginning of the word".
Selections
Selection rules specify which method is best to use when completing a goal, based on the given context. Since there could be several ways of achieving the same result, a selection rule captures the user's knowledge of the best method to achieve the required goal. Selection rules generally take the form of a conditional statement, such as "IF the word to highlight is less than five characters, USE the arrow keys and shift, ELSE use the mouse dragging method".
Example (scenario: buy a book)
Goal: get money.
Sub-goal: get money from parents.
Operators: for example insert, enter, collect, provide.
Methods: ask parents to give money; parents take the money from a wallet / under a pillow / in a cupboard / in the car; they give the money to you.
Selection: get money from an ATM machine / get money from parents / get money from friends / get money via a cash app or cash-in at stores.
GOMS
Quantitatively, GOMS offers good predictive models of performance time and learning. For example, when choosing between two systems you can apply a GOMS model: Application 1 has a lower start-up cost, but will be slower to perform frequent tasks; Application 2 will be faster to perform tasks, but has a longer learning time, etc. With these quantitative predictions, you can examine such trade-offs in the light of what is important to your company. Qualitatively, GOMS can be used to design training programs and help systems. This approach has been shown to be an efficient way to organize help systems, tutorials, and training programs as well as user documentation.
NGOMSL
"Natural GOMS Language" allows a more flexible representation of a task using "human" language:
Method for goal: Deleting icon
Step 1. Select icon for deletion (1.10 sec)
Step 2. Drag icon to trash bin (1.1 sec)
Step 3. Update user with audio cue (0.22 sec)
This NGOMSL model predicts that it will take 2.42 seconds to delete an icon. (A small sketch of this calculation appears after the list of problems below.)
Problems with GOMS
The model assumes a certain level of skill – it cannot accurately be applied to beginners.
The model does not take into account time for learning the system or remembering how to use it after a long period of disuse; for example, can you remember where all the options are in Windows 98?
The model removes human error from the equation; even highly skilled users make occasional mistakes.
Mental workload is not addressed in the model; it is far more taxing to remember a longer process than a short one. E.g. it is a far less stressful task to highlight text than to enter network settings manually.
Users can get tired; you are not going to be as quick typing after three hours as you were when you started.
Differences among users are not accounted for within the model. E.g. those who are left-handed are not given special preferences.
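To make the NGOMSL prediction and the selection-rule idea concrete, here is a minimal sketch that is not part of the original slides: it sums assumed per-step operator times and applies the five-character rule quoted above. The step names and times mirror the lecture example; the code itself is purely illustrative and is not output from any standard GOMS tool.

```python
# Illustrative sketch: summing assumed NGOMSL step times and applying a
# GOMS-style selection rule. The numbers mirror the lecture example.
delete_icon_method = [
    ("Select icon for deletion", 1.10),   # assumed operator time in seconds
    ("Drag icon to trash bin", 1.10),
    ("Update user with audio cue", 0.22),
]

def predicted_time(method):
    """Predicted execution time is the sum of the fixed operator times."""
    return sum(seconds for _, seconds in method)

def select_highlight_method(word: str) -> str:
    """Selection rule from the slides: short words use shift + arrow keys,
    longer words use mouse dragging."""
    return "shift + arrow keys" if len(word) < 5 else "mouse dragging"

if __name__ == "__main__":
    print(f"Delete icon: {predicted_time(delete_icon_method):.2f} s")   # 2.42 s
    print("Highlight 'Red' using:", select_highlight_method("Red"))
    print("Highlight 'evaluation' using:", select_highlight_method("evaluation"))
```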
Problems with GOMS (continued)
Predicting whether a system will be functional or acceptable for users is not included in the model. E.g. just because workers can enter data quickly into a new database system, it does not mean it is particularly user-friendly and easy to work with.
How computer-assisted technology integrates into everyday business is not addressed in the model. E.g. entering commands using a keyboard could be the quickest method of data capture for astronauts; however, trying to type in zero gravity may be difficult.
Norman's Cycle
Norman's Cycle - Seven Stages of Action
The human action cycle, also known as the Seven Stages of Action, is a psychological model which describes the steps humans take when they interact with computer systems. The model can be used to help evaluate the efficiency of a user interface (UI). Understanding the cycle requires understanding the user interface design principles of affordance, feedback, visibility and tolerance.
(Diagram: execution stage and evaluation stage.)
The Gulfs
Two of the many challenges people must overcome to successfully interact with technology are:
Execution: taking action to accomplish a specific goal.
Evaluation: understanding the state of the system.
These challenges are described as the "Gulf of Execution" and the "Gulf of Evaluation" because, without effective design elements to support users, they can become insurmountable barriers between users and their goals.
The Gulf of Execution: The difference between a user's intentions and the allowable actions is the Gulf of Execution. Norman illustrates this gulf with the example of a video cassette recorder (VCR). Let us imagine that a user would like to record a television show. They see the solution to this problem as simply pressing the "Record" button. However, in reality, to record a show on a VCR, several actions must be taken:
1. Press the record button.
2. Specify the time of recording, usually involving several steps to change the hour and minute settings.
3. Select the channel to record on – either by entering the channel's number or selecting it with up/down buttons.
4. Save the recording settings, perhaps by pressing an "OK", "menu", or "enter" button.
The Gulf of Evaluation: The Gulf of Evaluation occurs when a user has trouble assessing the state of the system. It reflects the amount of effort that the person must exert to interpret the state of the system and to determine how well the expectations and intentions have been met. Simply put, the user is expecting feedback from an action and not receiving (at best) what they expected or (at worst) nothing at all. Determining whether something is on or off is a classic example of the gulf of evaluation.
The Human Action Cycle by Don Norman
Prof. Ts. Dr. Ariffin Abdul Mutalib
What is the Human Action Cycle?
The human action cycle, also known as the Seven Stages of Action, is a psychological model which describes the steps humans take when they interact with computer systems. The model can be used to help evaluate the efficiency of a user interface (UI). Understanding the cycle requires understanding the user interface design principles of affordance, feedback, visibility and tolerance.
The Human Action Cycle
An affordance is a quality of an object, or an environment, that allows an individual to act.
Feedback describes the situation when output from (or information about the result of) an event or phenomenon in the past will influence an occurrence or occurrences of the same event/phenomenon (or the continuation/development of the original phenomenon) in the present or future. When an event is part of a chain of cause-and-effect that forms a circuit or loop, then the event is said to "feed back" into itself.
Visibility – The design should make all needed options and materials for a given task visible without distracting the user with extraneous or redundant information. Good designs don't overwhelm users with alternatives or confuse them with unneeded information.
Tolerance – The design should be flexible and tolerant, reducing the cost of mistakes and misuse by allowing undoing and redoing, while also preventing errors wherever possible by tolerating varied inputs and sequences and by interpreting all reasonable actions reasonably.
The human action cycle describes how humans may form goals and then develop a series of steps required to achieve that goal, using the computer system. The user then executes the actions, thus the model includes both cognitive activities and physical activities. The framework is divided into three stages of seven steps in total:
Goal formation stage
1. Goal formation.
Execution stage
2. Translation of goals into a set of unordered tasks required to achieve goals.
3. Sequencing the tasks to create an action sequence.
4. Executing the action sequence.
Evaluation stage
5. Perceiving the results after having executed the action sequence.
6. Interpreting the actual outcomes based on the expected outcomes.
7. Comparing what happened with what the user wished to happen.
Example: https://www.youtube.com/watch?v=SK0XIxsFK6Y
EXPERT REVIEW
Prof. Ts. Dr. Ariffin Abdul Mutalib
Expert review methods: heuristic evaluation, guideline review, consistency inspection, cognitive walkthrough, and formal usability inspection. The reviewer is a UI or IxD expert, not a content expert. Reviews can be done at an early or late stage in the design phase and can be scheduled at several points in the development process.
The expert reviewers, like users, should take training courses, read manuals, take tutorials and try the systems in as close as possible to a realistic work environment, complete with noise and distractions. For a detailed review of the screen, experts may need a controlled environment.
Heuristic Evaluation
Experts critique the design for its conformance with a list of heuristics (like Nielsen's 10 heuristics). Experts need to be familiar with the heuristics being used, and there is a need for guiding questions. Heuristic evaluation is very popular. It is a method for quick, cheap, and easy evaluation of a user interface design. It is done as a systematic inspection of a user interface design for usability. The goal of heuristic evaluation is to find the usability problems in the design so that they can be attended to as part of an iterative design process. Heuristic evaluation involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles.
Guideline Review
The UI is analyzed for its conformance with the organization's guideline document, or other appropriate guidelines (like safety and health guidelines). There are normally extensive guidelines (they can run into the thousands), and an exercise normally takes days. A set of guidelines is used and an evaluator tests the interface against this set. One of the most well-known works in this area is by Smith and Mosier, who created 944 guidelines for user interface design.
In their description of the guideline review method, the system analysts and human factors experts are expected to "tailor" the list of 944 guidelines (that is to say, reduce it) for the specific application being evaluated. Challenges in guideline reviews include bias in the tailoring of any list and doubts about the credibility and application of the guidelines being used. A classic example is the guideline that font sizes should be 12 or more – clearly not applicable in mobile contexts.
Consistency Inspection
Experts determine the consistent representation of various design elements: macro and micro interface, terminology, color, layout, input and output formats, and so on, including in training material and online help. It is a quality control technique for evaluating and improving a user interface. The interface is methodically reviewed for consistency in design, both within a screen and between screens, in graphics (colour, typography, layout, icons), text (tone, style, spelling), and interaction (consistency of task steps and command names). Of course, consistency should not be carried too far: things that need to be distinguished should be distinct.
Walkthrough
A number of experts simulate users, walking through the interface to carry out a set of typical tasks. High-frequency tasks are starting points; rare and critical tasks must be walked through too. Each expert does the walkthrough individually. Then the moderator collects their reports, all experts discuss their findings together, and the moderator notes their points in the discussion. It is an approach to evaluating a user interface based on stepping through common tasks that a user would need to perform and evaluating the user's ability to perform each step. This approach is intended especially to help understand the usability of a system for first-time or infrequent users, that is, for users in an exploratory learning mode. Based on a user's goals, a group of evaluators steps through tasks, evaluating at each step how difficult it is for the user to identify and operate the interface element most relevant to their current sub-goal and how clearly the system provides feedback to that action.
Formal Usability Inspection
A review of tasks a user completes while using the product. An inspection team is created that typically includes design engineers, usability engineers and customer (client) support engineers. In some cases customers and/or users can also participate. The purpose of the inspection is to evaluate a product from the user's perspective, to find and fix usability concerns, and in so doing, improve the ease of use of the product. The process begins with a planning phase during which the team is assembled and a package that includes user profiles (sometimes as personas), task scenarios, and the prototype or product being evaluated is prepared. Then the exercise is executed, with designers observing the session. They may also interview participants after the session.
Dilemma
Experts must be able to understand the target users and the context of use. Experts do the exercise based on their experience. Sometimes experts have different findings.
Introduction to Functional Testing
Functional testing is a crucial step in ensuring the quality and reliability of applications across desktop, web, mobile, and tablet platforms. This presentation will explore best practices for effectively testing the functionality of software on various devices and environments.
Importance of Functional Testing in Formative Evaluation
Functional testing is crucial during the formative evaluation stage of product development. It helps ensure the application's core features and user workflows are working as intended, providing valuable feedback to refine the design and improve the user experience. By identifying and addressing functional issues early on, teams can deliver a more polished, high-quality product at launch.
Defining Functional Requirements
Functional requirements are the core capabilities and behaviors a system must exhibit to meet user needs. They describe what the system should do, not how it should do it. Clearly defining these requirements is crucial for successful functional testing during formative evaluation. Identify key user workflows and features. Document expected inputs, outputs, and system responses. Ensure requirements are measurable, testable, and aligned with user goals. In functional testing: can the system do what it was built to do?
Functional vs Non-Functional Testing
Non-functional testing focuses on evaluating non-functional aspects such as the system's performance, reliability, and stability. Can the system do what it was built to do well enough?
Functional Testing for Desktop Applications
- Comprehensive Scenarios: Develop robust test cases that cover the full range of user interactions and workflows for desktop software.
- OS Compatibility: Verify functionality across different operating systems, including Windows, macOS, and Linux.
- Performance Evaluation: Assess the responsiveness and stability of desktop applications under varying workloads and system configurations.
Functional Testing for Web Applications
1. Cross-Browser Compatibility: Ensure consistent functionality and user experience across different web browsers and versions.
2. Responsive Design: Validate the application's adaptability to different screen sizes and resolutions.
3. Transaction Flows: Thoroughly test end-to-end user workflows, such as login, checkout, and form submissions.
Functional Testing for Mobile Devices
- Device Compatibility: Test the application on a wide range of mobile devices, including smartphones and tablets, to identify any platform-specific issues.
- Gesture-based Interactions: Validate the responsiveness and accuracy of touch-based gestures, such as swiping, tapping, and pinch-to-zoom.
- Offline Functionality: Ensure the application can gracefully handle intermittent network connectivity and provide a seamless user experience.
Functional Testing for Tablet Devices
- Screen Size Optimization: Verify the application's layout, navigation, and content display adapt to the larger screen size of tablets.
- Pen-based Interactions: Test the accuracy and responsiveness of stylus input, including handwriting recognition and annotation features.
- Battery Life Considerations: Assess the application's power consumption and optimize for extended battery life on tablet devices.
Challenges in Functional Testing Across Platforms
- Device Fragmentation: Maintaining comprehensive test coverage across the vast array of desktop, mobile, and tablet devices can be a significant challenge.
- User Interactions: Ensuring consistent functionality and user experience across diverse input methods, such as mouse, keyboard, touch, and stylus, can be complex.
- Automation Integration: Integrating automated testing frameworks with different platforms and technologies can require significant effort and expertise.
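To make the idea of an automated functional test concrete, here is a minimal sketch using Python's standard unittest module. The checkout_total function, its discount rule, and the expected values are hypothetical stand-ins for whatever workflow and requirements a team actually defines; the point is only that each documented input/output pair becomes an executable check.

```python
# Minimal, hypothetical sketch of an automated functional test.
# `checkout_total` stands in for application logic under test; expected
# inputs and outputs would come from the functional requirements.
import unittest

def checkout_total(prices, discount_code=None):
    """Hypothetical application logic: sum item prices, apply a known discount."""
    total = sum(prices)
    if discount_code == "SAVE10":
        total *= 0.9
    return round(total, 2)

class CheckoutFunctionalTests(unittest.TestCase):
    def test_total_without_discount(self):
        # Requirement: the total is the sum of item prices.
        self.assertEqual(checkout_total([10.00, 5.50]), 15.50)

    def test_total_with_valid_discount(self):
        # Requirement: the SAVE10 code reduces the total by 10%.
        self.assertEqual(checkout_total([100.00], "SAVE10"), 90.00)

    def test_unknown_discount_is_ignored(self):
        # Requirement: unrecognised codes must not change the total.
        self.assertEqual(checkout_total([20.00], "BOGUS"), 20.00)

if __name__ == "__main__":
    unittest.main()
```

The same cases can be run unchanged in a continuous integration pipeline, which is one of the best practices listed in the next section.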
Best Practices for Effective Functional Testing
1. Prioritize Critical Workflows: Focus on the most important user scenarios and test cases that have the highest impact on the application's core functionality.
2. Leverage Automation: Implement a comprehensive test automation strategy to streamline the testing process and ensure consistent results across platforms.
3. Continuous Integration: Integrate functional testing into the continuous integration and delivery pipeline to catch issues early and improve overall quality.
Continuous Improvement in Functional Testing
1. Regularly review and update functional test cases to address evolving requirements and user scenarios.
2. Leverage data-driven insights from testing to continuously optimize test coverage and efficiency.
3. Collaborate with cross-functional teams to identify emerging trends and incorporate them into the testing strategy.
4. Automate routine functional tests to free up manual testing resources for more complex, high-impact scenarios.
5. Foster a culture of learning and continuous improvement, where the testing team constantly seeks to enhance their skills and practices.
Conclusion and Key Takeaways
Comprehensive functional testing is essential for ensuring the reliability and user-friendliness of applications across desktop, web, mobile, and tablet platforms. By following best practices and addressing the unique challenges of cross-platform testing, organizations can deliver high-quality software that meets the needs of their users.
Introduction to User Experience (UX) Evaluation
User experience evaluation is the process of assessing and understanding how users interact with a product or service. It involves gathering data on user behaviors, attitudes, and preferences to identify areas for improvement and enhance the overall user experience.
Importance of UX Evaluation
Conducting thorough UX evaluation is crucial for ensuring a product or service meets user needs and provides a seamless, delightful experience. It uncovers usability issues, identifies opportunities for improvement, and aligns the design with user expectations, ultimately driving customer satisfaction and business success. By understanding how users interact with a product, teams can make informed decisions to enhance functionality, accessibility, and overall user engagement. UX evaluation is an essential step in the design process, enabling teams to create products that truly resonate with the target audience.
Types of UX Evaluation Methods
- Usability Testing: Observing users as they interact with a product and identifying pain points.
- Heuristic Evaluation: Expert review of a product's design against established usability principles.
- Cognitive Walkthroughs: Analyzing how users would approach and complete specific tasks.
- User Interviews: Gathering qualitative feedback and insights directly from target users.
- Surveys and Questionnaires: Collecting quantitative data on user attitudes, behaviors, and preferences.
- A/B Testing: Comparing user responses to different design variations to identify optimal solutions.
Usability Testing
Usability testing is a cornerstone of UX evaluation, allowing teams to directly observe how users interact with a product and identify pain points. By monitoring user behaviors, researchers uncover design flaws, workflow inefficiencies, and areas for improvement. During usability testing, participants are asked to complete specific tasks while researchers gather valuable insights through observation, interviews, and data collection.
This hands-on approach provides deep understanding of the user experience, informing design decisions to enhance overall product usability.
Heuristic Evaluation
Heuristic evaluation is an expert-driven UX assessment method that involves reviewing a product's design against established usability principles or "heuristics." These include factors like visibility of system status, user control and freedom, consistency and standards, and more. By identifying design flaws and deviations from best practices, heuristic evaluations uncover potential pain points and opportunities for improving the overall user experience.
Heuristic Evaluation Exercise
Cognitive Walkthroughs
- Step-by-Step Observation: Cognitive walkthroughs involve systematically stepping through a user's thought process as they attempt to complete specific tasks. Designers observe and analyze how users approach problem-solving to uncover usability issues.
- Evaluating the User Journey: By closely following the user's actions and decision-making, cognitive walkthroughs provide deep insights into the user experience. Designers can identify breakdowns, confusion points, and opportunities to streamline the interaction flow.
- Collaborative Refinement: Cognitive walkthroughs are often conducted in a collaborative setting, allowing designers to gather real-time feedback from users and stakeholders. This facilitates a shared understanding and drives an iterative design process.
Cognitive Walkthroughs Exercise
User Interviews
1. Gathering Insights: User interviews allow designers to directly engage with target users, uncovering in-depth insights about their needs, pain points, and experiences.
2. Contextual Understanding: By observing users in their natural environments, designers gain a deeper understanding of how a product or service fits into their daily lives.
3. Collaborative Refinement: User interviews facilitate an open dialogue, enabling designers to gather real-time feedback and collaborate with users to refine the product experience.
Surveys and Questionnaires
- Quantitative Data: Surveys and questionnaires enable designers to collect quantitative data on user behaviors, preferences, and satisfaction levels. These structured instruments provide measurable insights that can be analyzed for trends and patterns.
- Reach Broad Audiences: Unlike one-on-one interviews, surveys and questionnaires can reach a wider pool of users, allowing designers to gather feedback from a more representative sample of the target audience.
- Scalable Insights: By deploying surveys and questionnaires through digital platforms, designers can efficiently collect and analyze data from a large number of participants, providing scalable insights to inform design decisions.
- Targeted Questions: Surveys and questionnaires enable designers to ask specific, targeted questions that address their unique research objectives, ensuring the collected data is directly relevant to the design process.
Analyzing UX Evaluation Data
1. Identify Themes and Patterns: Synthesize data from various UX evaluation methods to uncover common themes, pain points, and user behaviors that can inform design improvements.
2. Prioritize Insights: Analyze the significance and impact of each finding to determine which issues should be addressed as high-priority, ensuring the most critical user needs are met.
3. Quantify User Feedback: Leverage quantitative data from surveys and usage metrics to measure the magnitude of user satisfaction, usability, and engagement, providing a data-driven basis for design decisions.
4. Uncover Root Causes: Delve deeper into qualitative insights from user interviews and walkthroughs to understand the underlying reasons behind user behaviors and pain points, enabling more impactful solutions.
Implementing Findings and Improving UX
- Prioritize Fixes: Systematically address the most critical usability issues identified during the evaluation process to quickly enhance the product experience.
- Iterative Refinement: Regularly gather user feedback and test design changes to continually improve the product based on evolving user needs.
- Collaborative Approach: Engage cross-functional teams, including developers and stakeholders, to collectively implement solutions and drive meaningful UX improvements.
- Measure Impact: Establish key performance indicators to track the success of UX enhancements and ensure the product continues to meet user expectations.
Introduction to System Testing in Formative Evaluation
System testing is a crucial step in the formative evaluation process, ensuring the overall functionality and integration of a software system. This introductory section provides an overview of the key aspects of system testing, paving the way for a deeper understanding of its objectives, scope, and approaches.
Defining System Testing
1. Comprehensive Evaluation: System testing examines the system as a whole, verifying that all components work together seamlessly.
2. Integrated Approach: It validates the system's behavior in a real-world, end-to-end scenario, rather than testing individual modules in isolation.
3. Validation of Requirements: System testing ensures that the final product meets the specified functional and non-functional requirements.
Objectives of System Testing
- Functional Verification: Ensuring the system performs all the intended functions correctly.
- Performance Optimization: Identifying and addressing any performance bottlenecks or issues.
- Compliance Validation: Verifying that the system meets relevant standards, regulations, and guidelines.
Scope of System Testing in Formative Evaluation
- End-to-End Functionality: System testing encompasses the entire software system, validating the integration and interactions between all components.
- Non-Functional Requirements: It evaluates the system's performance, security, usability, and other non-functional aspects.
- Integration with External Systems: System testing ensures seamless integration with any third-party systems or services the software interacts with.
- Compliance and Regulations: The testing process verifies that the system adheres to relevant industry standards, laws, and guidelines.
Approaches to System Testing
1. Black-Box Testing: Evaluating the system's functionality without knowledge of its internal structure or implementation details.
2. White-Box Testing: Examining the system's internal components, logic, and code to ensure they are working as intended.
3. Agile Testing: Integrating system testing throughout the development lifecycle, with a focus on continuous feedback and iterative improvements.
Test Planning and Execution
Test Planning: Defining the test scope, objectives, and strategies based on the system requirements.
Test Case Design: Developing comprehensive test cases that cover all the system's functionalities and scenarios.
Test Execution: Systematically running the test cases and documenting the results for analysis.
Analyzing System Test Results
- Defect Identification: Pinpointing and categorizing any issues or defects discovered during system testing.
- Performance Evaluation: Assessing the system's compliance with performance and scalability requirements.
- Acceptance Criteria: Determining whether the system has met the defined acceptance criteria for release.
Incorporating Findings into Iterative Design
- Feedback Collection: Gather insights and feedback from system testing to identify areas for improvement.
- Design Refinement: Incorporate the findings into the next iteration of the design and development process.
- Continuous Improvement: Repeat the system testing and feedback loop to continuously enhance the software product.
System Testing - Performance Testing
Performance testing is a crucial aspect of software development, ensuring that applications can handle the expected user load and deliver a seamless experience. This introduction explores the key principles and techniques for conducting performance evaluations during the formative stage of an application's lifecycle.
Importance of Performance Testing in Formative Evaluation
1. Identify Bottlenecks: Performance testing during the formative stage helps uncover potential performance bottlenecks, allowing developers to address them early in the development process.
2. Optimize User Experience: By understanding an application's performance characteristics, developers can make informed decisions to enhance the user experience and ensure smooth operation.
3. Mitigate Risks: Proactive performance testing reduces the risk of unexpected issues and costly post-launch fixes, leading to a more successful application deployment.
Key Performance Metrics to Measure
- Response Time: Measure the time it takes for the application to respond to user requests, ensuring a smooth and responsive user experience.
- Throughput: Evaluate the application's ability to handle a large number of concurrent users and transactions, identifying potential scaling issues.
- Resource Utilization: Monitor the application's use of system resources, such as CPU, memory, and network, to identify areas for optimization.
Selecting Appropriate Testing Techniques
- Load Testing: Simulate high-traffic scenarios to assess the application's performance under heavy user loads.
- Stress Testing: Intentionally push the application beyond its normal operating capacity to identify its breaking point.
- Endurance Testing: Evaluate the application's ability to maintain performance over extended periods, identifying potential memory leaks or other long-term issues.
- Spike Testing: Assess the application's response to sudden, unexpected increases in user activity or data volume.
Designing Effective Test Scenarios
1. Identify User Profiles: Understand the different types of users and their expected interactions with the application.
2. Define User Journeys: Map out the typical steps users take to accomplish their tasks, ensuring comprehensive test coverage.
3. Incorporate Realistic Data: Use real-world or representative data to create test scenarios that closely mimic actual user behavior.
Conducting Performance Tests in an Interactive Environment
- Plan: Carefully plan the performance test, defining objectives, test scenarios, and expected outcomes.
- Execute: Utilize appropriate testing tools and techniques to execute the performance test in a controlled environment.
- Monitor: Continuously monitor the application's behavior and performance metrics during test execution.
- Analyze: Thoroughly analyze the test results to identify performance issues and potential areas for improvement.
Analyzing and Interpreting Test Results
- Data Aggregation: Collect and consolidate performance data from various sources to get a holistic view of the application's behavior.
- Performance Analytics: Utilize data visualization and analytics tools to identify trends, patterns, and potential bottlenecks in the performance data.
- Reporting: Generate comprehensive reports to communicate findings and recommendations to stakeholders and development teams.
Incorporating Findings into Application Improvements
- Identify Priorities: Based on the performance test results, determine the most critical issues that need to be addressed.
- Implement Optimizations: Collaborate with the development team to implement targeted optimizations and enhancements to the application.
- Validate Improvements: Conduct follow-up performance tests to verify that the implemented changes have positively impacted the application's performance.
- Iterate and Refine: Continuously monitor the application's performance and repeat the testing and improvement process to maintain optimal performance.
System Testing - Exception Handling
Exception handling is a critical aspect of system testing, ensuring software can gracefully manage unexpected scenarios and provide a seamless user experience. This presentation will explore the key principles and techniques for effective exception handling during formative evaluation.
Importance of Exception Handling during Formative Evaluation
1. Identify Vulnerabilities: Exception handling in formative testing helps uncover potential system weaknesses and edge cases that could cause failures.
2. Validate Resilience: Thorough exception testing ensures the system can handle disruptions and maintain functionality, improving overall robustness.
3. Enhance User Experience: Effective exception management results in graceful error handling and a more seamless, user-friendly application.
Identifying Potential Exceptions in the System
- Input Validation: Analyze user inputs and edge cases that could trigger exceptions, such as invalid data formats or out-of-range values.
- Resource Constraints: Identify potential bottlenecks and failures due to limited system resources like memory, storage, or network bandwidth.
- Dependency Failures: Determine external system dependencies that could cause exceptions if they become unavailable or unresponsive.
Designing Test Cases for Exception Scenarios
1. Boundary Conditions: Create test cases to validate system behavior at the limits of expected input or resource usage.
2. Simulated Failures: Design tests that deliberately trigger exceptions by simulating system, network, or dependency failures.
3. Error Propagation: Verify how exceptions are handled and propagated through the application, ensuring graceful degradation.
Implementing Exception Handling Mechanisms
- Structured Exception Handling: Leverage language-specific exception handling constructs (e.g., try-catch-finally) to manage and respond to exceptions.
- Error Logging and Notifications: Implement robust logging and alerting systems to capture and report exceptions for further analysis.
- Graceful Degradation: Design fallback mechanisms to ensure the system can continue functioning, even in the face of exceptions.
- User-Friendly Error Messages: Provide clear, informative error messages to help users understand and recover from exceptions.
Verifying the Effectiveness of Exception Handling
- Fault Injection: Deliberately introduce faults and exceptions to validate the system's ability to detect and respond appropriately.
- Monitoring and Metrics: Collect and analyze exception-related metrics to assess the overall effectiveness of exception handling.
- Comprehensive Testing: Ensure thorough test coverage of exception scenarios, including edge cases and failure modes.
Analyzing and Reporting Exception Handling Outcomes
Exception Type | Frequency | Impact | Resolution
Null Pointer Exception | 25 | Moderate | Improved input validation
Database Timeout | 12 | High | Increased connection pool size
File Not Found | 8 | Low | Implemented graceful fallback
Continuous Improvement of Exception Handling Strategies
- Monitor and Analyze: Continuously track exception-related metrics and analyze trends to identify areas for improvement.
- Refine Test Cases: Update and expand test cases to cover new exception scenarios and edge cases discovered during deployment.
- Enhance Mechanisms: Iteratively improve exception handling mechanisms, such as error handling, logging, and fallback procedures.
System Testing - Security Testing
Security testing is a critical process that helps identify and mitigate vulnerabilities in software systems. It ensures that applications are protected against a wide range of threats, from unauthorized access to data breaches and system failures.
Importance of Security Testing in Formative Evaluation
1. Early Detection: Security testing during formative evaluation helps identify and address vulnerabilities early in the development lifecycle, reducing the risk and cost of remediation.
2. Continuous Improvement: Security testing throughout the development process enables ongoing monitoring and improvement, ensuring that applications remain secure as they evolve.
3. Proactive Approach: By integrating security testing into formative evaluation, teams can adopt a proactive approach to security, building secure applications from the ground up.
Common Security Vulnerabilities
1. Injection Flaws: Vulnerabilities that allow attackers to inject malicious code into applications, compromising data and systems.
2. Broken Authentication: Weaknesses in user authentication and session management that can lead to unauthorized access.
3. Cross-Site Scripting (XSS): Vulnerabilities that allow attackers to inject malicious scripts into web applications, compromising user data and security.
4. Sensitive Data Exposure: Weaknesses that can lead to the disclosure of sensitive information, such as login credentials or personal data.
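The exception-handling mechanisms above (structured try/except, error logging, graceful fallback) and the injection remediation recommended in the next section (parameterized queries) can be illustrated together in one small sketch. This is not from the original slides: the database file, table, and column names are hypothetical, and the fallback behaviour is only one possible design choice.

```python
# Minimal, hypothetical sketch combining two practices named in these slides:
# structured exception handling with logging and graceful fallback, and a
# parameterized query to avoid SQL injection. Table/column names are made up.
import logging
import sqlite3
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("lookup")

def find_user_email(db_path: str, username: str) -> Optional[str]:
    """Look up a user's email; never build SQL by string concatenation."""
    try:
        with sqlite3.connect(db_path, timeout=2.0) as conn:
            # The '?' placeholder lets the driver bind the value safely, so
            # input like "x' OR '1'='1" is treated as data, not as SQL.
            row = conn.execute(
                "SELECT email FROM users WHERE username = ?", (username,)
            ).fetchone()
            return row[0] if row else None
    except sqlite3.Error as exc:
        # Error logging and notification point; the caller gets a safe fallback.
        log.error("User lookup failed for %r: %s", username, exc)
        return None  # graceful degradation instead of crashing the workflow

if __name__ == "__main__":
    print(find_user_email("example.db", "alice"))
```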
Threat Modeling and Risk Assessment
- Threat Modeling: Systematically identifying and analyzing potential threats to an application, including attackers, their goals, and the assets they may target.
- Risk Assessment: Evaluating the likelihood and impact of identified threats, and prioritizing security efforts based on the level of risk.
- Informed Decisions: Threat modeling and risk assessment provide a structured approach to making informed decisions about security controls and countermeasures.
Security Testing Methodologies
- Reconnaissance: Gather information about the target system, including its components, configurations, and potential vulnerabilities.
- Vulnerability Identification: Systematically scan the system for known vulnerabilities and potential attack vectors.
- Exploitation: Attempt to exploit the identified vulnerabilities to gain unauthorized access or disrupt system operations.
- Reporting: Document the findings, including the impact of identified vulnerabilities and recommendations for remediation.
Automated Security Testing Tools
- Burp Suite: A powerful web application security testing platform that automates the process of identifying and exploiting vulnerabilities.
- OWASP ZAP: An open-source web application security scanner that can be used to find and fix vulnerabilities in web applications.
- Nmap: A network scanning tool that can be used to identify active hosts, open ports, and running services on a network.
- SQLmap: An open-source tool that can be used to detect and exploit SQL injection vulnerabilities in web applications.
Integrating Security Testing into the Development Lifecycle
- Requirements: Identify security requirements and incorporate them into the application design and development process.
- Design: Perform threat modeling and risk assessment to guide the implementation of secure design patterns and controls.
- Implementation: Conduct regular security testing throughout the development process to identify and address vulnerabilities.
- Deployment: Ensure that the application is deployed in a secure environment and that ongoing monitoring and testing are in place.
Reporting and Remediation of Security Vulnerabilities
Vulnerability | Description | Risk Level | Remediation
SQL Injection | Attackers can inject malicious SQL code to gain unauthorized access to the database. | High | Implement input validation and parameterized queries to prevent SQL injection.
Cross-Site Scripting (XSS) | Attackers can inject malicious scripts into web pages, compromising user data and security. | High | Properly encode and sanitize all user input to prevent XSS vulnerabilities.
Weak Authentication | Attackers can gain unauthorized access to the system due to weak password policies or lack of multi-factor authentication. | Medium | Implement strong password policies and enable multi-factor authentication to enhance security.
Introduction to Usability Evaluation
Usability evaluation is a critical process that assesses the user-friendliness and effectiveness of a digital product or service. It helps identify areas for improvement and ensures the design meets the needs of the target audience.
Defining Usability
Usability is a measure of how effectively, efficiently, and satisfactorily a product or service can be used to achieve specific goals by its target users.
It encompasses factors like ease of learning, memorability, error tolerance, and overall user satisfaction. At its core, usability focuses on optimizing the interaction between humans and digital interfaces to create intuitive, seamless experiences that meet user needs and expectations. It goes beyond just aesthetics, emphasizing functionality, accessibility, and the overall user experience.
Importance of Usability Evaluation
Conducting usability evaluations is crucial for ensuring digital products and services meet user needs and expectations. It helps:
- Identify pain points and frustrations in the user experience
- Uncover hidden usability issues that may be overlooked during design
- Validate design decisions and gather data-driven insights
- Improve user satisfaction and drive increased adoption of the product
By prioritizing usability, organizations can deliver more intuitive, accessible, and valuable digital experiences to their users.
Usability Techniques
Methods that are used to analyze the quality, effectiveness, and accuracy of any product/service/process: 1. User Evaluation 2. Expert Evaluation.
Types of Usability Evaluation
Usability evaluation can be conducted through a variety of methods, each offering unique insights and advantages. Common approaches include user testing, heuristic evaluation, cognitive walkthroughs, and field studies, among others. The choice of method depends on factors such as project timeline, budget, and the stage of the design process. Combining multiple evaluation techniques can provide a comprehensive understanding of the user experience.
Usability Testing Methods
- Observation: Observing users interacting with the product in a controlled setting, noting their behaviors, challenges, and thought processes to identify usability issues.
- Think-Aloud Protocol: Asking users to verbalize their thoughts, feelings, and actions as they complete tasks, providing valuable insights into the user's mental model.
- Interviews and Surveys: Gathering direct feedback from users through structured interviews or questionnaires to understand their perceptions, needs, and pain points.
- Remote Usability Testing: Conducting testing sessions with geographically dispersed users, often using screen-sharing and video conferencing tools to observe interactions.
Participant Recruitment and Screening
1. Identify Target Users: Clearly define the user profiles and target audience for the usability evaluation to ensure relevant participant selection.
2. Recruit Diverse Participants: Actively seek out a diverse pool of participants that represent the product's intended user base, including variations in age, background, and technical expertise.
3. Pre-Screening Questionnaire: Utilize a pre-screening questionnaire to assess participants' fit, gather relevant demographic information, and identify any special needs or accessibility requirements.
Usability Evaluation Metrics
- Effectiveness: Measures how well users can complete tasks and achieve their goals using the product or service.
- Efficiency: Evaluates the time and effort required for users to accomplish their tasks, highlighting areas for optimization.
- Satisfaction: Assesses the user's overall attitude and feelings towards the product, including their subjective experiences and emotional responses.
- Learnability: Measures how quickly and easily new users can understand and navigate the product's functions and features.
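As an illustration only (not from the slides), effectiveness, efficiency, and satisfaction can be computed from logged test sessions. The session records and the 1-5 satisfaction scale below are hypothetical; a real study would use its own task data and rating instrument.

```python
# Hypothetical sketch: computing simple usability metrics from logged sessions.
# Each record: (participant, task completed?, time on task in seconds,
# satisfaction rating on an assumed 1-5 scale).
sessions = [
    ("P1", True, 74.0, 4),
    ("P2", True, 91.5, 5),
    ("P3", False, 180.0, 2),
    ("P4", True, 66.2, 4),
]

def effectiveness(records):
    """Effectiveness: proportion of participants who completed the task."""
    return sum(1 for _, done, _, _ in records if done) / len(records)

def efficiency(records):
    """Efficiency: mean time on task, here computed over successful attempts."""
    times = [t for _, done, t, _ in records if done]
    return sum(times) / len(times)

def satisfaction(records):
    """Satisfaction: mean post-task rating on the assumed 1-5 scale."""
    return sum(r for _, _, _, r in records) / len(records)

if __name__ == "__main__":
    print(f"Effectiveness: {effectiveness(sessions):.0%}")     # 75%
    print(f"Efficiency:    {efficiency(sessions):.1f} s")      # ~77.2 s
    print(f"Satisfaction:  {satisfaction(sessions):.2f} / 5")  # 3.75
```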
Usability Evaluation Metrics: Effectiveness
Measures how well users can complete tasks and achieve their goals using the product or service. To assess it, prepare some tasks for participants to complete and some questions for them to answer.

Usability Evaluation Metrics: Satisfaction
Assesses the user's overall attitude and feelings towards the product, including their subjective experiences and emotional responses. To assess it, prepare some tasks for participants to complete and some questions for them to answer.

Usability Evaluation Metrics: Learnability
Measures how quickly and easily new users can understand and navigate the product's functions and features. To assess it, prepare some tasks for participants to complete and some questions for them to answer.

Data Collection and Analysis

1. Gathering usability data: collect comprehensive data through methods such as user observation, think-aloud protocols, and post-task interviews to gain deep insights into user behaviors and pain points.
2. Analyzing quantitative metrics: examine key performance indicators such as task completion rates, error frequencies, and time-on-task to identify areas for improvement in the user experience.
3. Interpreting qualitative feedback: carefully review user comments, opinions, and reactions to uncover the underlying reasons behind usability issues and user satisfaction levels.
4. Identifying patterns and trends: look for recurring themes, behaviors, and pain points across the user data to derive meaningful insights and inform design decisions.

Reporting Findings and Recommendations

Comprehensive report: consolidate the key findings from the usability evaluation into a detailed report, highlighting both positive and negative aspects of the user experience.
Data-driven insights: analyze the quantitative and qualitative data collected to uncover meaningful insights and patterns that can inform design decisions.
Actionable recommendations: provide a clear set of recommendations and prioritized action items to address the identified usability issues and improve the overall user experience.
Storytelling approach: present the findings and recommendations in a compelling, easy-to-understand manner, using visual aids and user narratives to bring the insights to life.
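One possible way to keep the report's action items prioritized is to hold each finding in a small structure and sort by severity; the issues, severity scale, and field names in this sketch are purely illustrative assumptions.

```python
# A minimal sketch of a severity-ranked findings list for the evaluation report.
# The issues, severity scale, and field names are purely illustrative.
from dataclasses import dataclass

@dataclass
class Finding:
    issue: str
    severity: int        # 1 = cosmetic ... 4 = blocks task completion
    affected: int        # number of participants who hit the issue
    recommendation: str

findings = [
    Finding("Checkout button is hard to locate", 3, 4, "Increase contrast and move it above the fold"),
    Finding("Error messages use technical jargon", 2, 2, "Rewrite messages in plain language"),
    Finding("Payment form loses data on back navigation", 4, 3, "Persist form state between steps"),
]

# Prioritise by severity, then by how many participants were affected.
for f in sorted(findings, key=lambda x: (x.severity, x.affected), reverse=True):
    print(f"[severity {f.severity}] {f.issue} (seen by {f.affected} participants)")
    print(f"    Recommendation: {f.recommendation}")
```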
Chapter 15 EVALUATION STUDIES: From Controlled to Natural Settings

Goals
- Explain how to do usability testing
- Outline the basics of conducting experiments
- Describe how to do in-the-wild studies
- Discuss why remote studies may be needed, and how to do remote testing and in-the-wild studies

Usability testing
- Involves testing how people perform on products in controlled settings
- The people being tested are observed, and the time it takes them to complete a task and the number and kinds of errors they make are recorded
- Data is recorded on video, and key presses are logged
- The data is used to calculate performance times and to identify and explain errors
- User satisfaction is evaluated using questionnaires and interviews
- Observations about how the product is used in more natural contexts, including in the wild, may be included

Quantitative measures
- Number of participants successfully completing the task
- Time to complete the task
- Time to complete the task after time away from it
- Number and type of errors per task
- Number of errors per unit of time
- Number of times people navigate to an item such as online help
- Number of people making the same or similar errors
Source: Wixon and Wilson, 1997

Usability lab, with designers watching a user and an assistant through a one-way mirror and recording the session.

Layout of a usability lab used by the US Health & Human Services Dept. Source: https://www.usability.gov/how-to-and-tools/guidance/hhs-usability-lab.html

Tobii Glasses mobile eye-tracking system. Source: https://www.tobiipro.com/news-events/on-demand-webinars/Wearable-eye-tracking-for-research/

Equipment is getting smaller
- As more testing is done remotely and in the wild, equipment has become smaller
- New lightweight models of eye-tracking glasses look like ordinary glasses
- Video and audio recording is often done using mobile phones, which produce increasingly good quality with each new model
- These replace bulky recording devices and tripods, and are also more discreet

Remote testing
- Participants can carry out tasks in their own environment without an evaluator being present
- Zoom, Teams, and other digital communication tools are often used
- Testing can be synchronous or asynchronous
- An advantage of remote testing, especially asynchronous testing, is that several participants can be tested at the same time in their own environments
- The Covid-19 pandemic encouraged creative ways of conducting remote testing safely
- Remote testing may make it easier for people with disabilities to participate
- Downside: no one is present to help with equipment problems, and it is less personal than co-present testing

Case 1: iPad usability (Budiu & Nielsen, 2010)
Now a classic study; it was innovative in its day!
- Study conducted in two cities: Fremont, CA and Chicago, IL
- Tests had to be done quickly, as the information was needed by third-party app developers
- They also had to be done secretly, so that the competition was not aware of the study before the iPad was launched
- Seven participants, each with over three months' experience with iPhones

iPad usability testing procedure
Participants signed an informed consent form explaining:
- What the participant would be asked to do
- The length of time needed for the study
- The compensation that would be offered for participating
- The participants' right to withdraw from the study at any time
- A promise that the person's identity would not be disclosed
- An agreement that the data collected would be confidential and available only to the evaluators
Participants were first asked to explore the iPad. Next, they were asked to perform randomly assigned, specified tasks.

Examples of the tasks used in the iPad evaluation. Adapted from Budiu and Nielsen, 2010. Source: iPad App and Website Usability Study. Used courtesy of the Nielsen Norman Group.

Problems and actions
Examples of problems detected:
- Accessing the web was difficult
- Lack of affordance and feedback
- Getting lost in an application
- Knowing where to tap
Actions by the evaluators:
- Reported the findings to developers
- Made them available to the public via the Nielsen Norman Group

Problems and actions (continued)
- Accessibility for all users is important
- The study did not address how the iPad would be used in people's everyday lives
- Another study was done a year later to examine this and other issues that there was insufficient time to address in this study

Case 2: Remote testing with extended reality (XR) (Sanni Siltanen et al., 2021)
- Study at KONE Corporation (Finland) to examine new immersive experiences in VR
- Needed to find ways to test experts and company employees remotely because of the Covid-19 pandemic
- Also needed a sophisticated setup to test the collaborative XR platforms across multiple locations
- Testing was conducted remotely with experts in eight countries: Finland, India, China, Germany, Indonesia, Malaysia, the USA, and the United Arab Emirates
- Some tests were done on participants' premises, where special disinfecting routines had to be carried out; other tests were done remotely
- Little advice was available about how to do remote testing during a pandemic

A participant in a test session. Source: Siltanen et al. (2021) MDPI / CC BY 4.0

Screenshots from the DesignSpace VR environments. Source: Siltanen et al. (2021) MDPI / CC BY 4.0

Testing setup used with the DesignSpace environments. Source: Siltanen et al. (2021) MDPI / CC BY 4.0
Experiments
- Test hypotheses that predict the relationship between two or more variables
- The independent variable is manipulated by the researcher
- The dependent variable is influenced by the independent variable
- Typical experimental designs have one or two independent variables
- Results are validated statistically and must be replicable
(A minimal analysis sketch for a between-participants comparison appears after the notes on long-term studies below.)

Experimental designs
- Different participants (between-participants): a single group of participants is allocated randomly across the experimental conditions, so each person experiences only one condition
- Same participants (within-participants): all participants take part in both conditions
- Matched participants (pairwise design): participants are matched in pairs, for example based on expertise, gender, and so on

Different, same, and matched participant designs (illustration).

In-the-wild studies
- Done in natural settings
- "In-the-wild" refers to prototypes being used freely in natural settings; such studies are broader and typically less controlled than traditional field studies
- They seek to understand what users do naturally and how technology impacts them
- In-the-wild studies are used in product design to: identify opportunities for new technology, determine design requirements, decide how best to introduce new technology, and evaluate technology in use

An in-the-wild study of a pain-monitoring device
- Monitoring patients' pain is a known challenge for physicians
- The goal of the study was to evaluate the use of a pain-monitoring device after ambulatory surgery
- Painpad is a keypad device
- It was usability tested extensively in the lab before being brought into two hospitals
- The aim was to understand how Painpad was used in the natural environment and as part of routines in two UK hospitals, and how pain monitoring changed with Painpad

Painpad: a tangible device for inpatient self-logging of pain. Source: Price et al., 2018. Reproduced with permission of ACM Publications.

Data collection and participants
- Two studies in two hospitals involving 54 people (13 males, 41 females)
- Privacy was an important concern
- Hospital stays ranged from 1 to 7 days; mean age 64.6 years, median 64.5 years
- Patients were given Painpad after surgery and prompted to report pain levels every two hours
- Nurses also collected scores
- All data were entered into charts
- Patients in one hospital were given a user-satisfaction survey when they left, and also rated Painpad on a 1-5 Likert scale

Data analysis and presentation
Three types of data were collected:
- Satisfaction with Painpad, based on questionnaire responses
- Patients' compliance with the two-hour routine
- How data collected from Painpad compared with data collected by nurses
The data showed:
- Satisfaction with Painpad was 4.63 on the Likert scale
- Patients' compliance was mixed: some liked the prompts, while others disliked or did not notice them
- Patients recorded more scores with Painpad than through the nurses

Long-term studies
- Some studies involve leaving designs with participants for extended periods, so that they have time to learn how to use a product in their own setting
- Typically these are complex technical products that take time to learn, such as visualization tools
- These evaluations often start with an interview, and then the product is left with the participant for 2-4 weeks
- Data is then collected, often consisting of structured interviews, daily use diaries, and automatic logging of usage
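As noted above, here is a minimal, hypothetical analysis sketch for a between-participants experiment; the design names, timing data, and the 0.05 threshold are illustrative assumptions, not results from any study discussed in this chapter.

```python
# A minimal sketch of analysing a between-participants experiment: the independent
# variable is the design variant (A or B), the dependent variable is task
# completion time in seconds. All numbers are hypothetical.
from scipy import stats

times_design_a = [38.1, 42.7, 35.0, 44.3, 40.9, 37.6, 41.2]  # participants who used design A
times_design_b = [47.5, 44.0, 51.2, 46.8, 49.9, 45.3, 50.1]  # different participants, design B

t_stat, p_value = stats.ttest_ind(times_design_a, times_design_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: completion times differ between the two designs.")
else:
    print("No statistically significant difference detected at the 0.05 level.")
```

A within-participants design would instead compare paired measurements from the same people, for which a paired test (for example, stats.ttest_rel) is the usual choice.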
How many participants is enough for user testing?
- The number is a practical issue
- It depends on the schedule for testing, the availability of participants, and the cost of running tests
- Typically 5-10 participants, if possible
- Some experts argue that testing should continue until no new insights are gained
- Others suggest that 5 participants are enough to identify any serious problems

How many participants is enough for other types of evaluation?
- Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more variables
- The number of participants needed depends on the type of experiment being conducted
- It is advisable to consult a statistician before deciding (a hypothetical power calculation is sketched after the summary below)
- In-the-wild studies also vary, from a few people in a home, to a software team, to a whole community

Usability testing and research
- Purpose: usability testing aims to improve products; research experiments aim to discover knowledge
- Participants: few for usability testing; often many for experiments
- Results: inform design in usability testing; are validated statistically in experiments
- Replicability: usability testing is usually not completely replicable; experiments must be replicable
- Control: conditions controlled as much as possible in usability testing; strongly controlled conditions in experiments
- Procedure: a planned procedure for usability testing; a formal experimental design for experiments
- Reporting: results reported to developers for usability testing; a scientific report to the scientific community for experiments

Summary
- Usability testing takes place in controlled spaces: usability labs and temporary labs
- Usability testing focuses on performance measures: how long tasks take and how many errors are made when completing a set of predefined tasks
- Indirect observation (video and keystroke logging), user-satisfaction questionnaires, and interviews are also collected
- Remote testing has been conducted since the early 1990s, but it became especially important during the Covid-19 pandemic
- Remote testing uses portable equipment: video and audio recording with smartphones, mobile eye-tracking, and automated keystroke logging

Summary (continued)
- Long-term studies of several weeks are used to evaluate complex products that participants need time to learn and use in their own work
- Experiments test a hypothesis by manipulating certain variables while keeping others constant
- The experimenter controls the independent variable(s) in order to measure changes in the dependent variable(s)
- In-the-wild studies are carried out in natural settings to discover how people interact with technology in the real world
- In-the-wild studies involve the deployment of prototypes or technologies in natural settings
- Sometimes the findings of in-the-wild studies are unexpected, especially for studies that explore how novel technologies are used by participants in their own homes, workplaces, or outdoors
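As a closing illustration of the participant-numbers question, and of why consulting a statistician helps, the following sketch estimates a sample size for a hypothetical between-participants experiment; the effect size, alpha, and power values are assumptions chosen only for illustration.

```python
# A minimal sketch of an a priori sample-size estimate for a between-participants
# experiment, assuming a medium effect size (Cohen's d = 0.5), alpha = 0.05,
# and 80% power. The numbers are illustrative assumptions only.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Approximately {n_per_group:.0f} participants are needed in each group.")
```

Under these assumptions the estimate lands in the dozens per condition, which is why experiments typically need far more participants than the 5-10 commonly used for usability testing.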