ED-209 Unit 2: Performance Assessment

West Visayas State University

Summary

This document provides an overview of performance assessment, its characteristics, strengths, and limitations. It delves into the types of performance tasks and includes activities and worksheets suitable for undergraduate-level education.

Lesson 1. What is Performance Assessment

1.1 Definition of Performance Assessment

According to Lane (2010), as cited by McMillan (2018), performance assessment is an assessment approach that involves a student's demonstration of a skill or competency in creating a product, constructing a response, or making a presentation. Performance assessments are multistep assignments with clear criteria, expectations, and processes that measure how well a student transfers knowledge and applies complex skills to create or refine an original product. Students actually perform the skill or behavior, rather than answer questions about how to do something or choose a correct answer from given alternatives. Performance assessment is one in which the teacher observes and makes a judgment about the student's demonstration of a skill or competency in creating a product, constructing a response, or making a presentation. The term performance is shorthand for performance-based or performance-and-product.

In this type of assessment, students use their knowledge and skills to construct something. Examples include demonstrating a skill such as playing the piano or guitar, delivering a speech on stage, or demo-teaching. The emphasis is on the students' ability to perform tasks by producing their own work with their knowledge and skills. In some cases, students create a product such as a video, short story, collage, or brochure.

Performance assessment is simply applying the teaching/learning methods used successfully for years in the adult world. Musicians, ballet dancers, athletes, architects, and doctors all learn by getting feedback on what they have constructed and demonstrated in practice. This same approach can be applied to learning in all content areas, targeting important skill outcomes.

Other terms, such as alternative assessment and authentic assessment, are sometimes used interchangeably with performance assessment, but they actually mean something different. An alternative assessment is any method that differs from conventional paper-and-pencil tests, most particularly objective tests. Examples of alternative assessments include observations, exhibitions, oral presentations, experiments, portfolios, interviews, and projects. (Some think of essays as a type of alternative assessment because they require students to construct responses.) As we previously learned, authentic assessment involves the direct examination of a student's ability to use knowledge to perform a task that is like what is encountered in real life or in the real world. Authenticity is judged by the nature of the task completed and the context of the task (e.g., the options available, constraints, and access to resources). Authentic classroom assessment is excellent for motivating students: it gets them engaged and requires the application of thinking skills.

Questions for Discussion
1. How would you contrast performance assessment with alternative assessment and authentic assessment?
2. When is the appropriate time to implement performance assessment?

1.2 Characteristics of Performance Assessment

Figure 2.1 summarizes the characteristics of an effective performance assessment.

Figure 2.1 Characteristics of Performance Assessment

Activity 2.1
Form yourselves into 5 groups. Assign one characteristic of performance assessment to 1 or 2 members. Elucidate on the characteristic assigned to you. Each group will consolidate the answers of all members. Share the group output with the class. Comment/react on the answers posted by other groups.
Read: Lane, Suzanne (2010). Performance Assessment: The State of the Art. (SCOPE Student Performance Assessment Series).

Question for Discussion
From what you have read, what characteristics of performance assessment can you add to the list above?

1.3 Strengths and Limitations of Performance Assessment

The major benefits of performance assessments are tied closely to providing effective instruction. This explains much of the appeal of the approach. Learning occurs while students complete the assessment. Teachers interact with students as they engage in the task, hopefully providing feedback and prompts that help students learn through multiple opportunities to demonstrate their skills. Because the assessments are usually tied to real-world challenges and situations, students are better prepared for such thinking and reasoning outside of school. Students justify their thinking and learn that often no single answer is correct. In this way, the assessments influence the instruction to be more meaningful and practical. Students value the task more because they view it as rich rather than superficial, engaging rather than uninteresting, and active rather than passive. For these reasons, there are many significant advantages when you use performance assessments.

Strengths of Performance Assessment

According to McMillan (2018), performance assessments are better suited to measuring complex thinking targets than are selected-response tests or simple constructed-response items. Students are more engaged in active learning as part of the assessment because they need to be engaged to perform successfully. Since the emphasis is on what students do, skills are more directly assessed, and there are more opportunities to observe the process students use to arrive at answers or responses. Because performance assessment engages students in an activity that ultimately leads to a task or product that can be scored, students tend to go "way beyond the things they learn in class." The result is not only a better understanding of students' skills by their teacher, but also a keener knowledge of the topic by the students themselves. Students who traditionally do not perform well on paper-and-pencil tests have an opportunity to demonstrate their learning in a different way.

Another advantage of performance assessments is that multiple, specific criteria for judging success are identified. You should share these criteria with students before the assessment so that the students can use them as they learn. In this way, students learn how to evaluate their own performance through self-assessment. They learn how to ask questions and, in many assessments, how to work effectively with others. Finally, performance assessment motivates educators to explore the purposes and processes of schooling. Because of the nature of the assessments, teachers revisit their learning goals, instructional practices, and standards. They explore how students will use their classroom time differently and whether there are adequate resources for all students.

Limitations of Performance Assessment

The limitations of using performance assessment lie in three areas: reliability/precision, sampling, and time. Unfortunately, performance assessments are subject to considerable measurement error, which lowers reliability/precision. Like essay items, the major source of measurement error with performance assessments is scoring.
Because scoring requires professional judgment, usually by only one person, there will be variations and error due to bias and other factors, similar to what affects evaluating essay answers. Although procedures exist that can minimize scoring error (such as carefully constructed criteria, tasks, and scoring rubrics; systematic scoring procedures; and using more than one rater), reliability/precision is likely to be lower than what is achieved with other types of assessment. Inconsistent student performance also contributes to error. That is, student performance at one time may differ noticeably from what the student would demonstrate at another time (this might occur, for example, if on the day of the performance the student is ill).

Because it takes considerable time for students to do performance assessments, you will have relatively few samples of student proficiency. Furthermore, we know that performance on one task may not provide a very good estimate of student success on other tasks. This means that if you intend to use the results of performance assessment to form conclusions about capability in a larger domain of learning targets, you need to accumulate information from multiple tasks. It also helps to select tasks that can optimize generalization to the learning targets. Suppose the learning target is concerned with skills associated with making a PowerPoint presentation. If the task is relatively restricted (e.g., using only a few PowerPoint features with a short presentation, making a 2-minute speech), generalization is more limited than when the task encompasses additional skills (e.g., the PowerPoint is longer and contains many features, making a 15-minute speech). Your choice, then, is to use many restricted tasks or a few extended ones to reach the same level of generalizability.

The third major limitation of performance assessment concerns time. First, it is very time consuming for teachers to construct good tasks, develop scoring criteria and rubrics, administer the task, observe students, and then apply the rubrics to score the performance or product. For performances that are difficult to score at the time of the performance, such as when a student makes a speech, adequate time needs to be taken with each student as he or she performs the task. Second, it is difficult, in a timely fashion, to interact with all students and give them meaningful feedback as they learn and make decisions. Finally, it is difficult to estimate the amount of time students will need to complete performance assessments, especially if the task is one you haven't used previously and if students are unaccustomed to the format and/or expectations.

The strengths and weaknesses of performance assessments are summarized in Table 1.1.

Table 1.1 Strengths and Limitations of Performance Assessment

Questions for Discussion
1. How can students and teachers benefit from performance assessment?
2. Knowing the limitations of performance assessment, what are your suggestions to avoid or minimize them?

Key Points
- In contrast to paper-and-pencil tests, performance assessment requires students to construct an original response (performance or product) to a task that is scored with teacher judgment.
- Authentic assessment involves a performance task that approximates what students are likely to have to do in real-world settings.
- Performance assessment integrates instruction with evaluation of student achievement and is based on constructivist learning theory.
- Multiple criteria for judging successful performance are developed.
- Effective performance assessment engages students in meaningful activities that enhance their thinking skills and demonstrate their ability to apply what they have learned.
- Limitations of performance assessments include the resources and time needed to conduct them, bias and unreliability in scoring, and a lack of generalization to larger domains of knowledge.

Learning Targets for Performance Assessment

Performance assessments are primarily used for four types of learning targets: deep understanding, reasoning, skills, and products. Deep understanding and reasoning involve in-depth, complex thinking about what is known and application of knowledge and skills in novel and more sophisticated ways. Skills include student proficiency in reasoning, communication, and psychomotor tasks. Products are completed works, such as term papers, projects, and other assignments in which students use their knowledge and skills.

2.1 Deep Understanding and Reasoning

Deep Understanding

The essence of performance assessment includes the development of students' deep understanding of something. The idea is to involve students meaningfully in hands-on activities for extended periods of time so that their understanding is richer and more extensive than what can be attained by more conventional instruction and traditional paper-and-pencil assessments. Deep understanding in performance assessments focuses on the use of knowledge and skills. Student responses are constructed in unique ways to demonstrate depth of thought and subtleties of meaning in novel situations. Students are asked to demonstrate what they understand through the application of knowledge and skills.

Reasoning

Like deep understanding, reasoning is essential in most performance assessments. Students will use reasoning skills as they demonstrate skills and construct products. Typically, students are given a problem to solve or are asked to make a decision or produce some other outcome, such as a letter to the editor or a school newsletter, based on information that is provided. They use cognitive processes such as analysis, synthesis, critical thinking, inference, prediction, generalizing, and hypothesis testing.

2.2 Skills

In addition to reasoning skills, students are required to demonstrate communication, presentation, and/or psychomotor skills. These targets are ideally suited to performance assessment. We'll consider each one.

A. Communication and Presentation Skills. Learning targets focused on communication skills involve student performance in reading, writing, speaking, and listening. For reading, targets can be divided into process (what students do before, during, and after reading) and product (what students get from the reading). Reading targets for elementary students progress from targets such as phonemic awareness skills (e.g., decoding, phonological awareness, blending) to skills needed for comprehension and understanding (such as discrimination, contextual cues, inference, blending, sequencing, and identifying main ideas). For effective performance assessment, each of these areas needs to be delineated as a specific target. For instance, a word identification target may include naming and matching uppercase and lowercase letters, recognizing words by sight, recognizing sounds and symbols for consonants at the beginnings and ends of words, and sounding out three-letter words.
For older students, reading targets focus on comprehension products and strategies and on reading efficiency, including stating main ideas; identifying the setting, characters, and events in stories; drawing inferences from context; and reading speed. More advanced reading skills include sensitivity to word meanings related to origins, nuances, or figurative meanings; identifying contradictions; and identifying possible multiple inferences. All reading targets should include the ability to perform a specific skill with novel reading materials. A variety of formats should also be represented.

Writing skill targets are also related to a student's grade level. The emphasis for young students is on their ability to construct letters and copy words and simple sentences legibly. For writing complete essays or papers, elaborate delineations of skills have been developed. Typically, important dimensions of writing are used as categories, as illustrated in the following writing targets:

- Purpose: Clarity of purpose; awareness of audience and task; clarity of ideas
- Organization: Unity and coherence
- Details: Appropriateness of details to purpose and support for main point(s) of writer's response
- Voice/tone: Personal investment and expression
- Usage, mechanics, and grammar: Correct usage (tense formation, agreement, word choice), mechanics (spelling, capitalization, punctuation), grammar, and sentence construction

Other dimensions can be used when the writing skill being measured is more specific, such as writing a persuasive letter, a research paper, or an editorial. Writing targets, like those in reading, should include the ability to perform the skill in a variety of situations or contexts. That is, if students have been taught persuasive writing by developing letters to editors, a student may write a persuasive advertisement or speech to demonstrate that he or she has attained the skill.

Oral communication skill targets can be generalized to many situations or focused on a specific type of presentation, such as giving a speech, singing a song, speaking a foreign language, or competing in a debate.
When the emphasis is on general oral communication skills, the targets typically center on the following three general categories (Russell & Airasian, 2012):

- Physical expression: Eye contact, posture, facial expressions, gestures, and body movement
- Vocal expression: Articulation, clarity, vocal variation, loudness, pace, and rate
- Verbal expression: Repetition, organization, summarizations, reasoning, completeness of ideas and thoughts, selection of appropriate words to convey precise meanings

A more specific set of oral communication skill targets is illustrated in the following guidelines for high school students:

A. Speaking clearly, expressively, and audibly
   1. Using voice expressively
   2. Speaking articulately and pronouncing words correctly
   3. Using appropriate vocal volume
B. Presenting ideas with appropriate introduction, development, and conclusion
   1. Presenting ideas in an effective order
   2. Providing a clear focus on the central idea
   3. Providing signal words, internal summaries, and transitions
C. Developing ideas using appropriate support materials
   1. Being clear and using reasoning processes
   2. Clarifying, illustrating, exemplifying, and documenting ideas
D. Using nonverbal cues
   1. Using eye contact
   2. Using appropriate facial expressions, gestures, and body movement
E. Selecting language for a specific purpose
   1. Using language and conventions appropriate for the audience

For specific purposes, the skills are more targeted. For example, if a presentation involves a demonstration of how to use a microscope, the target could include such criteria as clarity of explanations, understanding of appropriate steps, appropriateness of examples when adjustments are necessary, dependency on notes, and whether attention is maintained, as well as more general features such as posture, enunciation, and eye contact.

B. Psychomotor Skills. There are two steps in identifying psychomotor skill learning targets. The first step is to describe clearly the physical actions that are required. These may be developmentally appropriate skills or skills that are needed for specific tasks. To help you describe the behavior, the psychomotor area is divided into five categories, as shown in Table 2.1: fine-motor skills (such as holding a pencil, measuring chemicals, and using scissors), gross-motor actions (such as jumping and lifting), more complex athletic skills (such as shooting a basketball or playing golf), some visual skills, and verbal/auditory skills for young children.

Table 2.1 Examples of Psychomotor Skills

The second step is to identify the level at which the skill is to be performed. One effective way to do this is to use an existing classification of the psychomotor domain. This system is hierarchical. At one level there is guided response, which essentially involves imitating a behavior or following directions. At higher levels students show more adaptability and origination, a greater ability to show new actions and make adjustments as needed.

2.3 Products

Performance assessment products are completed works. For years, students have done papers, reports, and projects. What makes these products different when used for performance assessment is that they are more engaging and authentic, and they are scored more systematically with clear criteria and standards.
For example, rather than having sixth graders report on a foreign country by summarizing the history, politics, and economics of the country, students write promotional materials for the country that would help others decide if it would be an interesting place to visit. In chemistry, students are asked to identify an unknown substance. Why not have them identify substances from a local landfill, river, or body of water? In music, students can demonstrate their proficiency and knowledge by creating and playing a new song. Table 2.2 presents some other examples, varying in authenticity.

Table 2.2 Performance Products and Skills Varying in Authenticity

As a learning target, each product needs to be clearly described in some detail so that there is no misunderstanding about what students are required to do. It is insufficient to simply say, for example, "Write a report on one of the planets and present it to the class." Students need to know about the specific elements of the product (e.g., length, types of information needed, nature of the audience, context, materials that can be used, what can be shown to the audience) and how they will be evaluated. One effective way to do this is to show examples of completed projects to students. These are not meant to be copied, but they can be used to communicate standards and expectations. If the examples can demonstrate different levels of proficiency, so much the better.

A good way to generate products is to think about what people in different occupations do. What does a city planner do? What would an expert witness produce for a trial? How does a mapmaker create a map that is easy to understand? What kinds of stories does a newspaper columnist write? How would an advertising agent represent parks to attract tourists?

Activity 2.2
Form a group with 5 to 6 members. Each group will choose a grade level and a period of instruction (e.g., Science 7, 2nd Quarter). Consult the DepEd Curriculum Guide (refer to the content and performance standards, learning competencies, or the grade level standards) for the subject area you have chosen. Identify the learning targets that can be used for performance assessment in the subject area and period of instruction you have chosen. Use Worksheet #2.1 in the Required Assignment to consolidate your work.

Key Points
- Performance assessment is used most frequently with deep understanding, reasoning, skill, and product learning targets.
- Communication skill targets include reading, writing, and speaking.
- Psychomotor skill targets consist of physical actions: fine motor, gross motor, complex athletic, visual, and verbal/auditory.
- Product targets are completed student works, such as papers, written reports, and projects.
- Presentation targets include oral presentations and reports.

Designing Performance Assessment Tasks

Once learning targets have been identified and you have decided that a performance assessment is the method you want to use, three steps will guide you in constructing the complete performance task. The first is to identify the performance task in which students will be engaged; the second is to develop descriptions of the task and the context in which the performance is to be conducted; the third is to write the specific question, prompt, or problem the students will receive (Figure 2.2).

Figure 2.2 Steps in Constructing a Performance Task

3.1 Step 1: Identify the Performance Task

The performance task is what students are required to do in the performance assessment, either individually or in groups. The tasks can vary by subject and by level of complexity.
Some performance tasks are specific to a content area, and others integrate several subjects and skills. With regard to level of complexity, it is useful to distinguish two types: restricted and extended.

Restricted- and Extended-Type Performance Tasks

Restricted-type tasks target a narrowly defined skill and require relatively brief responses. The task is structured and specific. These tasks may look similar to short essay questions and interpretive exercises that have open-ended items. The difference is in the relative emphasis on the characteristics listed in Figure 2.1. Often the performance task is structured to elicit student explanations of their answers. Students may be asked to defend an answer, indicate why a different answer is not correct, tell how they did something, draw a diagram, construct a visual map, graph, or flowchart, or show some other aspect of their reasoning. In contrast, short essay questions and interpretive exercises are designed to infer reasoning from correct answers. Although restricted-type tasks require relatively little time for administration and scoring in comparison with extended-type tasks (providing greater reliability and sampling), it is likely that fewer of the important characteristics of authentic performance assessments are included. Many publishers provide performance assessments in a standardized format, and most of them contain restricted-type tasks. Further examples of restricted-type performance tasks are listed in Table 2.3.

Table 2.3 Examples of Restricted- and Extended-Type Performance Assessment Tasks

Questions for Discussion
When is the best time to use restricted-type performance assessment?

Extended-type tasks are more complex, elaborate, and time consuming. Extended-type tasks often include collaborative work with small groups of students. The assignment usually requires that students use a variety of sources of information (e.g., observations, library, interviews). Judgments will need to be made about which information is most relevant. Products are typically developed over several days or even weeks, with opportunities for revision. This allows students to apply a variety of skills and makes it easier to integrate different content areas and reasoning skills.

It is not too difficult to come up with ideas for what would be an engaging extended-type task. As previously indicated, one effective approach is to think about what people do in different occupations. Another way to generate ideas is to check curriculum guides and teacher's editions of textbooks because most will have activities and assignments that tap student application and reasoning skills. Perhaps the best way to generate ideas is by brainstorming with others, especially members of the community. They can be particularly helpful in thinking about authentic tasks that involve reasoning and communication skills. Some ideas that could be transformed into extended-type tasks are included in Table 2.3. Once you have a general idea for the task, you need to develop it into a more detailed set of specifications. (You can get ideas on performance tasks from Other Learning Resources in this Unit.)

Questions for Discussion
Which purpose of assessment is extended-type performance assessment best suited for: formative or summative? Why?
Activity 2.3
With the same group as in Activity 2.2, choose a learning target or combination of learning targets (from those you identified in Activity 2.2) which you can assess using restricted-type performance assessment. Choose another learning target or set of learning targets which you can assess using extended-type performance assessment. Identify the task for each type of performance assessment only. Use Worksheet #2.2 and Worksheet #2.3 in the Required Assignment to consolidate your work.

3.2 Step 2: Prepare the Task Description

The performance task needs to be specified so that it meets the criteria for good performance assessment and is clear to students. This is accomplished by preparing a task description. The purpose of the task description is to provide a blueprint or listing of specifications to ensure that essential criteria are met, that the task is reasonable, and that it will elicit desired student performance. The task description is not the same as the actual format and wording of the question or prompt that is given to students; it is more like a lesson plan. The task description should include the following:

- Content and skill targets to be assessed
- Description of student activities
- Group or individual work
- Help allowed
- Resources needed
- Teacher role
- Administrative process
- Scoring procedures

It is essential to clearly describe the specific targets to be assessed to make certain that the activities and scoring are well matched, ensuring both valid and practical assessments. Think about what students will actually do to respond to the question or solve the problem by specifying the context in which they will work: Will they consult other experts, use library resources, do experiments? Are they allowed to work together, or is it an individual assignment? What types of help from others are allowed? Is there sufficient time to complete the activities? Once the activities are described, the resources needed to accomplish them can be identified. Are needed materials and resources available for all students? What needs to be obtained before the assessment? It will be helpful to describe your role in the exercise. Will you consult with your students or give them ideas? Are you comfortable with and adequately prepared for what you will do? What administrative procedures are required (such as securing permission)? Finally, identify scoring procedures. Will scoring match the learning targets? Is adequate time available for scoring? Do you have the expertise needed to do the scoring? Is it practical?

One effective way to begin to design the task is to think about what has been done instructionally. The assessment task should be structured to mirror the nature of classroom instruction so that what you are asking students to do is something that they are already at least somewhat familiar with. Once the task description is completed and you are satisfied that the assessment will be valid and practical, you are ready to prepare the specific performance task question or prompt.

Activity 2.4
With your group, continue working on Worksheet #2.2 and Worksheet #2.3. This time you will write a description of the task you identified in Activity 2.3. (Work only on the description of the audience, time constraints, materials and equipment, and administrative procedures.)

3.3 Step 3: Prepare the Performance Task Question or Prompt

The actual question, problem, or prompt that you give to students will be based on the task description.
It needs to be stated so that it clearly identifies what the final outcome or product is, outlines what students are allowed and encouraged to do, and explains the criteria that will be used to judge the product. A good question or prompt also provides a context that helps students understand the meaningfulness and relevance of the task.

It's often best to use or adapt performance tasks that have already been developed. Several professional organizations have organized networks and other resources for developing performance tasks. Many subject-oriented professional organizations have good resources for identifying performance tasks, and the Internet can be used to tap into a vast array of examples. Just Google "performance assessment" with your area of teaching and grade level, and lots of ideas will pop up. Whether you develop your own tasks or use intact or modified existing ones, you will want to evaluate the task on the basis of the following suggestions (summarized in Figure 2.3).

Figure 2.3 Checklist for Writing Performance Tasks

1. The Performance Task Should Integrate the Most Essential Aspects of the Content Being Assessed with the Most Essential Skills. Performance assessment is ideal for focusing student attention and learning on the "big ideas" of a subject, the major concepts, principles, and processes that are important to a discipline. If the task encourages learning of peripheral or tangential topics or specific details, it is not well suited to the goal of performance assessment. Tasks should be broad in scope. Similarly, reasoning and other skills essential to the task should represent essential processes. The task should be written to integrate content with skills. For example, it would be better to debate important content or contemporary issues rather than something relatively unimportant. A good test of whether the task meets these criteria is to decide if what is assessed could be done as well with more objective, less time-consuming measures.

Examples
Poor: Estimate the answers to the following three addition problems. Explain in your own words the strategy used to give your answer.
Improved: Sam and Tyron were planning a trip to a nearby state. They wanted to visit as many different major cities as possible. Using the map, estimate the number of major cities they will be able to visit on a single tank of gas (14 gallons) if their car gets 25 miles to the gallon.

Questions for Discussion
What is meant by the statement "The task should be written to integrate content with skills"?

2. The Task Should Be Authentic. This suggestion lies at the heart of authentic performance assessment. As indicated earlier, authentic tasks are relevant to real-world and real-life contexts (Groeber, 2007, as cited by McMillan, 2018). Grant Wiggins developed a set of six standards for judging the degree of authenticity in an assessment task (Wiggins, 1998). He suggests that a task is authentic if it:

A. Is realistic. The task replicates the ways in which a person's knowledge and abilities are "tested" in real-world situations.
B. Requires judgment and innovation. The student has to use knowledge and skills wisely and effectively to solve unstructured problems, and the solution involves more than following a set routine or procedure or plugging in knowledge.
C. Asks the student to "do" the subject. The student has to carry out exploration and work within the discipline of the subject area, rather than restating what is already known or what was taught.
D. Replicates or simulates the contexts in which adults are "tested" in the workplace, in civic life, and in personal life. Contexts involve specific situations that have particular constraints, purposes, and audiences. Students need to experience what it is like to do tasks in workplace and other real-life contexts.
E. Assesses the student's ability to efficiently and effectively use a repertoire of knowledge and skills to negotiate a complex task. Students should be required to integrate all knowledge and skills needed, rather than to demonstrate competence in isolated knowledge and skills.
F. Allows appropriate opportunities to rehearse, practice, consult resources, and get feedback on and refine performances and products. Rather than relying on secure tests as an audit of performance, learning should be focused through cycles of performance-feedback-revision-performance, on the production of known high-quality products and standards, and on learning in context.

Examples
Poor: Compare and contrast different kinds of literature.
Improved: You have been asked to make a presentation to our school board about different types of literature. Prepare a PowerPoint presentation that you would use to explain different types of literature, including poems, biographies, mysteries, and fictional novels. Provide examples of each type, explain the characteristics of each, and explain why you like some better than others. Create charts or figures as part of your presentation, which should be no longer than 15 minutes.

3. Structure the Task to Assess Multiple Learning Targets. As pointed out in the first suggestion, it is best if the task addresses both content and skill targets. Within each of these areas there may be different types of targets. For instance, assessing content may include both knowledge and understanding and, as in the preceding example, both reasoning and communication skills. It is also common to include different types of communication and reasoning skills in the same task (e.g., students provide both a written and an oral report, or need to think critically and synthesize to arrive at an answer).

4. Structure the Task So That You Can Help Students Succeed. Good performance assessment involves the interaction of instruction with assessment. The task needs to be something that students learn from, which is most likely when there are opportunities for you to increase student proficiency by asking questions, providing resources, and giving feedback. In this kind of active teaching you are intervening as students learn, rather than simply providing information. Part of teachability is being certain that students have the needed prerequisite knowledge and skills to succeed.

5. Think Through What Students Will Do to Be Sure That the Task Is Feasible. Imagine what you would do if given the task. What resources would you need? How much time would you need? What steps would you take? It should be realistic for students to implement the task. This depends both on your own expertise and willingness and on the costs and availability of equipment, materials, and other resources, so that every student has the same opportunity to be successful.

6. The Task Should Allow for Multiple Solutions. If a performance task is properly structured, more than one correct response is not only possible but desirable. The task should not encourage drill or practice for which there is a single solution.
The possibility of multiple solutions encourages students to personalize the process and makes it easier for you to demand that students justify and explain their assumptions, planning, predictions, and other responses. Different students may take different paths in responding to the task.

Questions for Discussion
Can you cite examples of assessment wherein students can give multiple correct answers to a question?

7. The Task Should Be Clear. An unambiguous set of directions that explicitly indicates the nature of the task is essential. If the directions are too vague, students may not focus on the learning targets or may waste time trying to figure out what they should be doing. A task such as "Give an oral report on a foreign country" is too general. Students need to know the reason for the task, and the directions should provide sufficient detail so that students know how to proceed. Do they work alone or with others? What resources are available? How much time do they have? What is the role of the teacher? Here is an example of a clearly defined task:

Your assignment is to construct an original experiment that will show what causes objects to sink. Your answer should include examples that illustrate three characteristics. In demonstrating your answer you will have five minutes to show different objects sinking in water, accompanied by explanations of how each characteristic is important.

8. The Task Should Be Challenging and Stimulating to Students. One of the things you hope for is that students will be motivated to use their skills and knowledge to be involved and engaged, sometimes for many days or weeks. You also want students to monitor themselves and think about their progress. This is more likely to occur when the task is something students can get excited about or can see some relevance for, and when the task is not too easy or too difficult. Persistence is fostered if the task is interesting and thought provoking. This is easier if you know your students' strengths and limitations and are familiar with what kinds of topics would motivate them. One approach is to blend what is familiar with novelty. Tasks that are authentic are not necessarily stimulating and challenging.

9. Include Explicitly Stated Scoring Criteria. By now you are familiar with this admonition. Specifying criteria helps students understand what they need to do and communicates learning priorities and your expectations. Students need to know about the criteria before beginning work on the task. Sometimes criteria are individually tailored to each task; others are more generic for several different kinds of tasks. What is shared with students as part of the task, however, may not be the same instrument or scale you use when evaluating their work. The identification of criteria, and how you translate those criteria into a scale for evaluation, is discussed in the next section. From a practical perspective, the development of the task and scoring criteria is iterative: one influences the other as both are developed.

10. Include Constraints for Completing the Task. It's best if the performance is done under constraints that are defined by context, rules, and regulations. According to Borich and Tombari (2004), these constraints include the following:

- Time. How much time should a learner or group of learners have to plan, revise, and finish the task?
- Reference material. What resources (dictionaries, textbooks, class notes, CD-ROMs) will learners be able to consult while they are completing the assessment task?
- Other people. Will your learners be able to ask for help from peers, teachers, and experts as they take a test or complete a project?
- Equipment. Will your learners have access to computers, calculators, spell checkers, or other aids or materials as they complete the assignment?
- Scoring criteria. Will you inform your learners about the explicit standards that you will use to evaluate the product or performance?

The intent of considering such constraints is to define in a more realistic way the nature of the situation in which the performance or product is demonstrated. Performance tasks will vary, depending on your style of teaching, learning targets, students, and context. Most of the variance will be contained in the following:

- Is the task individual, small group, or large group?
- Does the task focus on process or product, or both?
- Is the task short or long?
- Is the task contained in the classroom, or will it require activities outside of class?
- What modalities for presentation are used: oral, written, or psychomotor?

Activity 2.5
With your group, continue working on Worksheet #2.2 and Worksheet #2.3. This time you will write a clear and detailed description of what the students are expected to do in order to accomplish the task. Write the procedure in the first person, as if you were actually conversing with them.

3.5 Performance Criteria

After students have completed the task, you will then review their work and make a professional judgment about its quality. Rather than relying on unstated rules for making these judgments, performance assessments include performance criteria, which are what you use to determine student proficiency. Performance criteria, then, serve as the basis for evaluating the quality of student work. Performance criteria (or scoring criteria, or simply criteria) are what you look for in student responses to evaluate their progress toward meeting the learning target. In other words, performance criteria are the dimensions or traits in products or performances that are used to illustrate and define understanding, reasoning, and proficiency. Explicitly defined performance criteria help to make what is a subjective process clear, consistent, and defensible.

To establish good criteria, you should begin by identifying the most important dimensions or traits of the performance or product. This is a summary of the essential qualities of student proficiency. These dimensions should reflect your instructional goals as well as teachable and observable aspects of the performance. Ask yourself this question: "What distinguishes an adequate from an inadequate demonstration of the target?" One of the best ways to identify criteria is to work backward from examples of student work. These examples (or exemplars) are analyzed to determine what traits or dimensions distinguish them and are used as the basis for concluding that one student's work meets a specific standard or target. The dimensions become criteria. For example, for evaluating a speech, dimensions could include content, organization, and delivery. Delivery may be composed of additional criteria, such as posture, gestures, facial expressions, and eye contact. For a singing performance, you could include pitch, rhythm, diction, and tone quality as criteria, then determine additional criteria for each of these four. As you might imagine, you can go into great detail describing dimensions. But to be practical, you need to balance specificity with what is manageable.
The following are examples of reasonable criteria for specific learning targets.

Learning target: Students will be able to write a persuasive paper to encourage the reader to accept a specific course of action or point of view.
Criteria:
- Appropriateness of language for the audience
- Plausibility and relevance of supporting arguments
- Level of detail presented
- Evidence of creative, innovative thinking
- Clarity of expression
- Organization of ideas

Learning target: Pupils will be able to write the letters of the alphabet in cursive form correctly.
Criteria:
- Letter formation
- Letter slant
- Neatness
- Relationship to line

Criteria used to evaluate a PowerPoint presentation:
1. The topic has been extensively and accurately researched.
2. A storyboard, consisting of logically and sequentially numbered slides, has been developed.
3. The introduction is interesting and engages the audience.
4. The fonts are easy to read, and font sizes vary appropriately for headings and text.
5. The use of italics, bold, and underlining contributes to the overall presentation.
6. The background and colors enhance the text.
7. The graphics, animation, and sounds enhance the overall presentation.
8. Graphics are of proper size.
9. The text is free of spelling, punctuation, capitalization, and grammatical errors.

Activity 2.6
You are now about to complete your Performance Task Design for Restricted-type and Extended-type Performance Assessment by identifying the performance criteria for the task that your group has identified. Complete the entry for Criteria for Evaluation in Worksheet #2.2 and Worksheet #2.3. Get more ideas about performance criteria from the materials in Other Learning Resources in this Unit.

Key Points
- The performance task defines what students are required to do.
- Restricted-type tasks target a narrowly defined skill and have a brief response.
- Extended-type tasks target complex tasks and have extensive responses; these may take several days or even weeks to complete.
- The task description needs to clearly indicate the target, student activities, resources needed, teacher role, administrative procedures, and scoring procedures.
- Effective tasks have multiple targets that integrate essential content and skills, are grounded in real-world contexts, rely on teacher help, are feasible, allow for multiple solutions, are clear, are challenging and stimulating, and include scoring criteria.
- Criteria are narrative descriptions of the dimensions used to evaluate the students.

Scoring and Evaluating

The second essential part of evaluating performance assessments is to have a well-developed, clear approach to scoring and evaluating the extent to which different levels of the criteria are demonstrated. There are three common approaches to this scoring: checklists, rating scales, and rubrics (see Figure 2.4).

Figure 2.4 Types of Performance Assessment Scoring

4.1 Checklists

A checklist is a simple listing of the criteria or dimensions, such as behaviors, traits, or characteristics, that can be scored as either present or absent. You simply check whether or not each criterion was met or each dimension demonstrated. It is a yes/no type of decision. Checklists are good for evaluating a sequence of steps that are required. For example, it would make sense to use a checklist to evaluate whether a student followed the proper steps in using a microscope or diagnosing a rough-sounding motor. Figure 2.5 shows an example of a checklist that could be used to evaluate a PowerPoint presentation.
Figure 2.5 Checklist for Evaluating a PowerPoint Presentation

Below is a checklist for using a microscope. Each step is marked either "Observed" or "No opportunity to observe":

- Wipes slides with lens paper
- Places a drop or two of culture on slide
- Adds a few drops of water
- Places slide on stage
- Turns to low power
- Looks through eyepiece with one eye
- Adjusts mirror
- Turns to high power
- Adjusts for maximum enlargement and resolution

Checklists are best suited for complex behaviors or performances that can be divided into a series of clearly defined, specific actions. Dissecting a frog, bisecting an angle, balancing a scale, making an audio tape recording, or tying a shoe are behaviors that require sequences of action that can be clearly identified and listed on a checklist. Checklists are scored on a yes/no, present/absent, 0 or 1 point basis and should provide the opportunity for observers to indicate that they had no opportunity to observe the performance. Some checklists also include frequent mistakes that learners make when performing the task. In such cases, a score of +1 may be given for each positive behavior, -1 for each mistake, and 0 for no opportunity to observe.

Activity 2.7
With your group, think of a possible performance assessment that you will implement in the subject you will teach. Create a checklist for the performance that you identified. Use Worksheet #2.4.

4.2 Rating Scales

A rating scale is used to indicate the degree to which a particular dimension is present, beyond a simple yes/no. It provides a way to record and communicate qualitatively different levels of performance. Several types of rating scales are available; we will consider three: numerical, qualitative, and combined numerical/qualitative.

The numerical scale uses numbers on a continuum to indicate different levels of proficiency in terms of frequency or quality. The number of points on the scale can vary, from as few as 2 to 10 or more. The number of points is determined on the basis of the decision that will be made. If you are going to use the scale to indicate low, medium, and high, then 3 points are sufficient. More points on the scale permit greater discrimination, provide more diagnostic information, and permit more specific feedback to students. Here are some examples of numerical scales:

- Complete understanding of the problem 5 4 3 2 1 No understanding of the problem
- Little or no organization 1 2 3 4 5 6 7 Clear and complete organization
- Emergent reader 1 2 3 Fluent reader

A qualitative scale uses verbal descriptions to indicate student performance. There are two types of qualitative descriptors. One type indicates the different gradations of the dimension:

- Minimal, partial, complete
- Never, seldom, occasionally, frequently, always
- Consistently, sporadically, rarely
- Complete understanding, nearly complete understanding, some understanding, limited understanding
- Uses capital letters appropriately most or all of the time; uses capital letters appropriately some of the time; rarely uses capital letters appropriately
- Always speaks clearly; speaks clearly most of the time; speaks clearly some of the time; rarely speaks clearly

A second type of qualitative scale includes gradations of the criteria and some indication of how the performance compares to established standards. This is the most frequently used type of rating scale for performance assessments.
Descriptors such as the following are used:

- Novice, emergent, proficient, advanced
- Inadequate, needs improvement, good, excellent
- Excellent, proficient, needs improvement
- Absent, developing, adequate, fully developed
- Limited, partial, thorough
- Emerging, developing, achieving
- Not there yet, shows growth, proficient

Activity 2.8
With your group, think of a possible performance assessment that you will implement in the subject you will teach. Design a rating scale for the performance or product that you identified. Use Worksheet #2.5. (You may get more ideas from the internet.)

4.3 Rubrics

According to Lane, as cited by McMillan (2018), rubrics are the most common and most effective way to score performance assessments. A rubric is a scoring guide that includes a scale spanning different levels of competency. This scale is used with the criteria to establish a two-dimensional table, with the criteria on one side and the scale on the other. Within the table are descriptions of how teachers differentiate between different scale points for each criterion. That is, a rubric uses descriptions of different levels of quality on each of the criteria. The rubric organizes and gives more detail to the criteria. Rubrics are worded in ways that communicate to students how their teacher evaluates the essence of what is being assessed. Wiggins (1998) uses the following questions to help understand the function of rubrics:

- By what criteria should performance be judged?
- What should we look for to judge performance success?
- What does the range in the quality of the performance look like?
- How should the different levels of quality be described and distinguished from one another?

For example, if a teacher is evaluating the logic of an argument, one of the criteria could be the trustworthiness and relevance of supporting facts. Different levels of quality for that criterion could be expressed as follows:

- No supporting facts
- Facts presented have weak trustworthiness and relevance
- Facts presented have acceptable trustworthiness or relevance
- Facts presented are clearly trustworthy and relevant

The goal of having rubrics, then, is to communicate your standards-based judgments so that it is clear how your judgments will be made. By doing this, students are informed about specific strengths and deficiencies. An example of an excellent rubric is shown in Figure 2.6.

Figure 2.6 Example of Rubric for Essay Response (Recommendation Article)

Developing Rubrics

Combining several different procedures is the best way of developing rubrics. It is helpful to begin by clarifying how the discipline defines different levels of performance. This will give you an idea of the nature and number of gradations that should be used. It is also helpful to obtain samples of how others have described and scored performance in the area to be assessed. Another approach, alluded to earlier, is to gather performance samples and determine the characteristics of the works that distinguish effective from ineffective ones. The samples could be from students as well as so-called experts in the area. You could start by putting a group of student samples into three qualitatively different piles to indicate three levels of performance. Then examine the samples to see what distinguishes them. The identified characteristics provide the basis for the dimensions of the rating scale.
At this point, you can review your initial thinking about the scale with others to see whether they agree with you. With feedback from others, you can write the first draft of the descriptors at each point of the rating scale. Use the first draft of the rubric with additional samples of student work to verify that it functions as intended. Revise as needed, and try it again with more samples of student work until you are satisfied that it provides a valid, reliable/precise, and fair way to judge student performance. Don't forget to use student feedback as part of the process.

Holistic or Analytic?

An important decision is whether the rubric will be holistic or analytic. A holistic rubric is one in which each category of the scale contains several criteria, yielding a single score that gives an overall impression or rating. Advantages of using a holistic rubric are its simplicity and the ability to provide a reasonable summary rating. All the traits are efficiently combined, the work is scored quickly, and only one score results. For example, in gymnastics a student might receive a single holistic score between 1 and 10, in which separate judgments for various dimensions (flexibility, balance, position, etc.) are combined. The disadvantage of a holistic score is that it reveals little about what needs to be improved. Thus, for feedback purposes, holistic scores provide little specific information about what the student did well and what needs further improvement.

When the purpose of the assessment is summative, at the end of a unit or course, a holistic rubric is appropriate. But even when used summatively, holistic scales can vary greatly in the specificity of what is used in the judgments. For example, the following holistic rubric for reading is rather skimpy; very little is indicated about what went into the judgment.

Level 4: Sophisticated understanding of text indicated with constructed meaning.
Level 3: Solid understanding of text indicated with some constructed meaning.
Level 2: Partial understanding of text indicated with tenuous constructed meaning.
Level 1: Superficial understanding of text with little or no constructed meaning.

Popham (2007) refers to this type of holistic rubric as hypergeneral. Such rubrics are so general and limited that there is little indication of the criteria that should be used to make judgments about student proficiency. This does not provide much instructional guidance or student awareness of criteria. Contrast this rubric with the one in Figure 2.7, which is also concerned with reading. It is obvious that this more developed and specific rubric provides a detailed explanation of how the reading was judged and why each level was assigned. Even with this more specific scale, however, how do you judge a student who showed multiple connections between the text and the reader's ideas/experiences but had interpretations that were not directly supported by appropriate text references? This kind of problem, in which the traits being assessed do not all conform within a single category, is almost certain to exist with holistic scales for some students. Another example of a holistic rubric is illustrated in Figure 2.8 for graphing data. Note how several criteria are included in each of the three levels.

Figure 2.7 Example of Holistic Rubric
Figure 2.8 Holistic Rubric for Graphic Display of Data

An analytic rubric (or analytic-trait rubric) is one in which each criterion receives a separate score. If analytic scoring were used in gymnastics, each criterion, such as flexibility, balance, and position, would be scored separately.
This kind of rubric provides much better diagnostic information and feedback for the learner and is more useful for formative assessment. Students are able to see their strengths and weaknesses more clearly, and they are able to connect their preparation and effort with each evaluation. However, analytic rubrics take longer to create and score.

Figure 2.7 Example of Holistic Rubric

Figure 2.8 Holistic Rubric for Graphic Display of Data

In general, to the extent possible based on practical constraints, it is best to use analytic rubrics. Like other good assessment techniques, once established, good analytic rubrics, with appropriate modifications, will serve you well for many years. An analytic rubric is illustrated in Figure 2.9. This rubric transforms the holistic one in Figure 2.8 about graphing data into an analytic one. In this example, four criteria are evaluated separately: title, labels, accuracy, and neatness. The rubric also shows the weight that each criterion will have in determining the overall score.

Figure 2.9 Example of Analytic Rubric for Graphic Display of Data

Actually, an analytic rubric can be as simple as a numerical or labeled scale (for example, outstanding, competent, marginal) that follows each criterion, such as "ideas" or "vivid images" in creative writing. However, such rubrics still do not indicate much about why ideas were "competent" and not "outstanding" or why vivid images were rated "marginal." Analytic rubrics use language that is as descriptive as possible about the nature of each criterion and that differentiates one level from the next. It will be much more helpful, for example, for students to know that "eye contact with the audience was direct and sustained for most of the presentation," rather than receiving feedback such as "excellent" or "completely." The difference between holistic and analytic rubrics is illustrated in Figure 2.10. The following suggestions, summarized in Figure 2.11, will provide further help as you develop rubrics.

1. Be Sure the Criteria Focus on Important Aspects of the Performance. There are many ways to distinguish between different examples of student work. You want to use the criteria that are essential in relation to the learning targets you are assessing. Because it is not feasible to include every possible way in which performances may differ, you need to identify those that are most important. For example, if you are making judgments about writing and use mechanics as one of the criteria, it would not be practical to include every grammatical rule in characterizing the descriptions. Rather, you need to select the few most important aspects, such as tense formation, agreement, and punctuation.

Figure 2.10 Differences Between Holistic and Analytic Rubric

2. Match the Type of Rating with the Purpose of the Assessment. If your purpose is more global and you need an overall judgment, a holistic scale should be used. If the major reason for the assessment is to provide feedback about different aspects of the performance, an analytic approach would be best.

3. Descriptions of the Criteria Should Be Directly Observable. Try to keep the descriptions focused on behaviors or aspects of products or skills that you can observe directly. You want to use clearly visible, overt behaviors for which relatively little inference is required (e.g., behaviors such as loudness, eye contact, and enunciation are easily and reliably observed).
It is best to avoid high-inference criteria that must be judged indirectly from behavior, such as attitudes, interests, and effort, because such behaviors are easily faked and are more susceptible to rater error and bias. This means that when the target is a disposition or affective in nature, the focus needs to be on behaviors that can be directly observed. Avoid the use of adverbs that communicate standards, such as adequately, correctly, and poorly. These evaluative words should be kept separate from what is observed.

Examples
Poor: Demonstrates a positive attitude toward learning keyboarding skills.
Improved: Voluntarily gives to the teacher or other students two reasons why it is important to learn keyboarding skills.

4. Criteria Should Be Written So That Students, Parents, and Others Understand Them. Recall that you should share criteria with students before instruction. The purpose of this is to encourage students to incorporate the descriptions as standards in doing their work and to self-monitor. Obviously, if the descriptions are unclear, students cannot apply them to their work, and the meaningfulness of your feedback is lessened. Consequently, pay attention to wording and phrases; write so that students easily comprehend the criteria. A helpful approach to ensure understanding is simple but often overlooked: ask the students! It is also helpful to provide examples of student work that illustrate the different descriptions.

5. Characteristics and Traits Used in the Scale Should Be Clearly and Specifically Defined. You need to have sufficient detail in your descriptions so that the criteria are not vague. If only a few general terms are used, observed behaviors are open to different interpretations. The wording needs to be clear and unambiguous.

Figure 2.11 Checklist for Writing and Implementing Rubrics

Examples (wood shop assignment to build a letter holder)
Poor: Construction is sound.
Improved: Pieces fit firmly together; sanded to a smooth surface; glue does not show; varnish is even.

Note the clarity and specificity of the analytic scale illustrated in Figure 2.12. This is an example of an excellent rubric, in this case for writing a persuasive essay.

6. Take Appropriate Steps to Minimize Scoring Error. The goal of any scoring system is to be objective and consistent. Because performance assessment involves professional judgment, some types of errors in particular should be avoided to achieve objectivity and consistency. The most common errors are associated with the personal bias and halo effects of the person making the judgment. Personal bias results in three kinds of errors. Generosity error occurs when the teacher tends to give higher scores; severity error results when the teacher uses the low end of the scale and underrates students' performances. A third type of personal bias is central tendency error, in which students are rated in the middle of the scale. As explained earlier, the halo effect occurs when the teacher's general impression of the student affects the scores given on individual traits or performances. If the teacher has an overall favorable impression, he or she may tend to give ratings that are higher than warranted; a negative impression results in the opposite. The halo effect is mitigated by concealing the identity of the student (though this is not possible with most performance assessments), by using clearly and sufficiently described criteria, and by periodically asking others to review your judgments.
Halo effects can also occur if the nature of a response to one dimension, or the general appearance of the student, affects your subsequent judgments of other dimensions. That is, if the student does extremely well on the first dimension, there may be a tendency to rate the next dimensions higher, and students who look and act nice may be rated higher. Perhaps the best way to avoid the halo effect is to be aware of its potential to affect your judgment and to monitor yourself so that it does not occur. Other sources of scoring error, such as order effects and rater exhaustion, should also be avoided.

Figure 2.12 Example of an Elementary Persuasive Essay Rubric

To be consistent in the way you apply the criteria, rescore some of the first products after you have finished scoring all the students, and score one criterion for all students at the same time. This helps avoid order and halo effects that arise from performance on previous dimensions. Scoring each product several times, each time on a different criterion, also allows you to keep the overall purpose of the rubric in mind.

7. The Scoring System Needs to Be Feasible. There are several reasons to limit the number and complexity of the criteria that are judged. First, you need to be practical with respect to the amount of time it takes to develop the scoring criteria and to do the scoring. Generally, five to eight criteria for a single product are sufficient and manageable. Second, students will be able to focus on only a limited number of aspects of the performance. Third, if holistic descriptions are too complex, it is difficult and time consuming to keep all the facets in mind. Finally, it may be difficult to summarize and synthesize too many separate dimensions into a brief report or evaluation.

One last suggestion will be helpful as you design effective rubrics. Because performance assessment is well established, there are numerous examples of rubrics for every subject and grade level. Along with many books and guides, the Internet, just as when finding performance tasks, can be used to access all kinds of rubrics (as with all material on the Internet, the quality of these examples varies, so be a critical consumer!).

Activity 2.9
Using the performance criteria that you identified in Activity 2.6, design a holistic rubric for the performance. Use Worksheet #2.6. Then convert your holistic rubric into an analytic rubric. Use Worksheet #2.7.

Key Points
Scoring performance assessment is done with checklists, rating scales, and rubrics.
Rating scales are used to indicate different levels of performance.
Holistic rubrics contain several dimensions together; analytic rubrics provide a separate score for each dimension.
Complete scoring rubrics include both descriptions and evaluative labels for the different levels of each dimension.
Scoring criteria are based on clear definitions of different levels of proficiency and on samples of student work.
High-quality scoring criteria focus on important aspects of the performance, match the type of rating (holistic or analytic) with the purpose of the assessment, are directly observable, are understandable, are clearly and specifically defined, minimize error, and are feasible.
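For readers who want to see the weighting idea from the analytic-rubric discussion (Figure 2.9) worked out, the short Python sketch below computes a single overall score from separate criterion ratings. The criterion names, weights, ratings, and the 4-point scale are illustrative assumptions only; they are not taken from Figure 2.9 or from any worksheet in this module.

# Minimal sketch of weighted analytic-rubric scoring.
# All criteria, weights, and ratings below are hypothetical examples,
# not values from Figure 2.9 or any worksheet in this module.

rubric = {
    "title":    {"weight": 0.10, "rating": 4},  # each criterion is rated separately (analytic scoring)
    "labels":   {"weight": 0.20, "rating": 3},
    "accuracy": {"weight": 0.50, "rating": 4},  # weights reflect each criterion's share of the overall score
    "neatness": {"weight": 0.20, "rating": 2},
}
MAX_RATING = 4  # assumed 4-point scale

# Overall score = weighted average of the criterion ratings.
overall = sum(c["weight"] * c["rating"] for c in rubric.values())
percent = 100 * overall / MAX_RATING

for name, c in rubric.items():
    print(f"{name:<9} rating {c['rating']}/{MAX_RATING}  weight {c['weight']:.0%}")
print(f"Overall: {overall:.2f}/{MAX_RATING} ({percent:.0f}%)")

With these illustrative numbers the overall score is 3.40 out of 4, or 85 percent. The single weighted score is convenient for summarizing and grading, but it is the separate criterion ratings that carry the diagnostic feedback emphasized in the discussion of analytic rubrics above.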
