Assessment 1 & 2 Review Material PDF
This document provides an overview of assessment and measurement concepts, including types of measurement (norm-referenced and criterion-referenced), test types, and assessment approaches.
URDANETA CITY UNIVERSITY
COLLEGE OF TEACHER EDUCATION
Owned and operated by the City Government of Urdaneta

Assessment refers to the process of gathering, describing, or quantifying information about student performance. It includes paper-and-pencil tests, extended responses, and performance assessments; the latter are usually referred to as "authentic assessment" tasks.

Measurement is the process of obtaining a numerical description of the degree to which an individual possesses a particular characteristic. Measurement answers the question "How much?"

Evaluation refers to the process of examining the performance of students. It also determines whether or not the student has met the lesson's instructional objectives.

A test is an instrument or systematic procedure designed to measure the quality, ability, skill, or knowledge of students by giving a set of questions in a uniform manner. Since a test is a form of assessment, tests also answer the question "How does an individual student perform?"

Testing is a method used to measure the level of achievement or performance of the learners. It also refers to the administration, scoring, and interpretation of an instrument designed to elicit information about performance in a sample of a particular area of behavior.

TYPES OF MEASUREMENT

There are two ways of interpreting student performance in relation to classroom instruction: norm-referenced tests and criterion-referenced tests.

A Norm-referenced Test is a test designed to measure the performance of a student compared with other students. Each individual is compared with the other examinees and assigned a score, usually expressed as a percentile, a grade-equivalent score, or a stanine. The achievement of students is reported for broad skill areas, although some norm-referenced tests do report student achievement for individual skills. The purpose is to rank students with respect to the achievement of others in broad areas of knowledge and to discriminate between high and low achievers.
A Criterion-referenced Test is a test designed to measure the performance of students with respect to some particular criterion or standard. Each individual is compared with a predetermined set of standards for acceptable achievement; the performance of the other examinees is irrelevant. A student's score is usually expressed as a percentage, and student achievement is reported for individual skills. The purpose is to determine whether each student has achieved specific skills or concepts, and to find out how much students know before instruction begins and after it has finished.

Common characteristics of Norm-referenced Tests and Criterion-referenced Tests (Linn et al., 1995):
1. Both require a specification of the achievement domain to be measured.
2. Both require a relevant and representative sample of test items.
3. Both use the same types of test items.
4. Both use the same rules for item writing (except for item difficulty).
5. Both are judged by the same qualities of goodness (validity and reliability).
6. Both are useful in educational assessment.

Differences between Norm-referenced Tests and Criterion-referenced Tests:

Norm-Referenced Tests:
1. Typically covers a large domain of learning tasks, with just a few items measuring each specific task.
2. Emphasizes discrimination among individuals in terms of relative level of learning.
3. Favors items of average difficulty and typically omits very easy and very hard items.
4. Interpretation requires a clearly defined group.

Criterion-Referenced Tests:
1. Typically focuses on a delimited domain of learning tasks, with a relatively large number of items measuring each specific task.
2. Emphasizes what individuals can and cannot perform.
3. Matches item difficulty to learning tasks, without altering item difficulty or omitting easy and hard items.
4. Interpretation requires a clearly defined and delimited achievement domain.
TYPES OF ASSESSMENT

There are four types of assessment in terms of their functional role in relation to classroom instruction: placement assessment, diagnostic assessment, formative assessment, and summative assessment.

A. Placement Assessment is concerned with the entry performance of students. The purpose of placement evaluation is to determine the prerequisite skills, the degree of mastery of the course objectives, and the best mode of learning.

B. Diagnostic Assessment is a type of assessment given before instruction. It aims to identify the strengths and weaknesses of the students regarding the topics to be discussed. The purposes of diagnostic assessment are:
1. to determine the level of competence of the students;
2. to identify the students who already have knowledge about the lessons; and
3. to determine the causes of learning problems and formulate a plan for remedial action.

C. Formative Assessment is a type of assessment used to monitor the learning progress of the students during and after instruction. The purposes of formative assessment are:
1. to provide immediate feedback to both student and teacher regarding the successes and failures of learning;
2. to identify the learning errors that are in need of correction; and
3. to provide information to the teacher for modifying instruction and improving learning.

D. Summative Assessment is a type of assessment usually given at the end of a course or unit. The purposes of summative assessment are:
1. to determine the extent to which the instructional objectives have been met;
2. to certify student mastery of the intended outcomes, and for assigning grades;
3. to provide information for judging the appropriateness of the instructional objectives; and
4. to determine the effectiveness of instruction.

MODES OF ASSESSMENT

A. Traditional Assessment
1. Assessment in which students typically select an answer or recall information to complete the assessment.
Tests may be standardized or teacher-made, and may be multiple-choice, fill-in-the-blanks, true-false, or matching type.
2. Indirect measures of assessment, since the test items are designed to represent competence by extracting knowledge and skills from their real-life context.
3. Items on standardized instruments tend to test only the domain of knowledge and skill, to avoid ambiguity for the test takers.
4. One-time measures that rely on a single correct answer to each item. There is limited potential for traditional tests to measure higher-order thinking skills.

B. Performance Assessment
1. Assessment in which students are asked to perform real-world tasks that demonstrate meaningful application of essential knowledge and skills.
2. Direct measures of student performance, because tasks are designed to incorporate contexts, problems, and solution strategies that students would use in real life.
3. Designed as ill-structured challenges, since the goal is to help students prepare for the complex ambiguities of life.
4. Focus on processes and rationales. There is no single correct answer; instead, students are led to craft polished, thorough, and justifiable responses, performances, and products.
5. Involve long-range projects, exhibits, and performances that are linked to the curriculum.
6. The teacher is an important collaborator in creating tasks, as well as in developing guidelines for scoring and interpretation.

C. Portfolio Assessment
1. A portfolio is a collection of a student's work specifically selected to tell a particular story about the student.
2. A portfolio is not a pile of student work that accumulates over a semester or a year.
3. A portfolio contains a purposefully selected subset of student work.
4. It measures the growth and development of students.

Factors to Consider when Constructing Good Test Items

A. Validity is the degree to which the test measures what it is intended to measure. It is the usefulness of the test for a given purpose.
A valid test is always reliable.
B. Reliability refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.
C. Administrability refers to the uniform administration of the test to all students so that the scores obtained will not vary due to factors other than differences in the students' knowledge and skills. There should be clear instructions for the students, the proctors, and even the scorer who will check the test.
D. Scorability. The test should be easy to score: the directions for scoring are clear, and an answer sheet and answer key are provided.
E. Appropriateness. The test items that the teacher constructs must assess the exact performances called for in the learning objectives. Each test item should require the same performance of the student as specified in the learning objectives.
F. Adequacy. The test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
G. Fairness. The test should not be biased against the examinees. It should not be offensive to any examinee subgroup. A test can only be good if it is also fair to all test takers.
H. Objectivity represents the agreement of two or more raters or test administrators concerning the score of a student. If two raters who assess the same student on the same test cannot agree on the score, the test lacks objectivity, and the score of neither judge is valid; thus, lack of objectivity reduces test validity in the same way that lack of reliability does.

TABLE OF SPECIFICATION

A Table of Specification is a device for describing test items in terms of content and process dimensions; that is, what a student is expected to know and what he or she is expected to do with that knowledge. Each item is described by a combination of content and process in the table of specification.
Sample One-Way Table of Specification in Linear Function

| Content | Number of Class Sessions | Number of Items | Test Item Distribution |
| 1. Definition of Linear Function | 2 | 4 | 1-4 |
| 2. Slope of a Line | 2 | 4 | 5-8 |
| 3. Graph of Linear Function | 2 | 4 | 9-12 |
| 4. Equation of Linear Function | 2 | 4 | 13-16 |
| 5. Standard Forms of a Line | 3 | 6 | 17-22 |
| 6. Parallel and Perpendicular Lines | 4 | 8 | 23-30 |
| 7. Applications of Linear Functions | 5 | 10 | 31-40 |
| TOTAL | 20 | 40 | |

Example: number of items for the topic "Definition of Linear Function"

Number of class sessions = 2
Desired total number of items = 40
Total number of class sessions = 20

Number of items = (Number of class sessions ÷ Total number of class sessions) × Desired total number of items
Number of items = (2 ÷ 20) × 40 = 4

Sample Two-Way Table of Specification in Linear Function (cognitive levels: Knowledge, Comprehension, Application, Analysis, Synthesis, Evaluation)

| Content | Class Hours | Items across Cognitive Levels | Total |
| 1. Definition of Linear Function | 2 | 1, 1, 1, 1 | 4 |
| 2. Slope of a Line | 2 | 1, 1, 1, 1 | 4 |
| 3. Graph of Linear Function | 2 | 1, 1, 1, 1 | 4 |
| 4. Equation of Linear Function | 2 | 1, 1, 1, 1 | 4 |
| 5. Standard Forms of a Line | 3 | 1, 1, 1, 1, 1, 1 | 6 |
| 6. Parallel and Perpendicular Lines | 4 | 2, 2, 2, 2 | 8 |
| 7. Applications of Linear Functions | 5 | 1, 1, 3, 2, 3 | 10 |
| TOTAL (per level: 4, 6, 8, 8, 7, 7) | 20 | | 40 |

ITEM ANALYSIS

Item analysis refers to the process of examining the students' responses to each item in the test. An item has either desirable or undesirable characteristics: an item with desirable characteristics can be retained for subsequent use, while one with undesirable characteristics is either revised or rejected.

There are three criteria in determining the desirability or undesirability of an item:
a. difficulty of the item
b. discriminating power of the item
c. measures of attractiveness

The Difficulty Index (DF) refers to the average proportion of students in the upper and lower groups who answered an item correctly. In a classroom achievement test, the desired indices of difficulty are not lower than 0.20 nor higher than 0.80.
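The item-allocation rule behind the one-way table above can be sketched in a few lines of Python. This is only an illustration (the code is not part of the original material); the topic names and session counts are taken from the sample table.

```python
# Table-of-specification item allocation: each topic receives items in
# proportion to the number of class sessions spent on it.
sessions = {
    "Definition of Linear Function": 2,
    "Slope of a Line": 2,
    "Graph of Linear Function": 2,
    "Equation of Linear Function": 2,
    "Standard Forms of a Line": 3,
    "Parallel and Perpendicular Lines": 4,
    "Applications of Linear Functions": 5,
}
desired_total_items = 40
total_sessions = sum(sessions.values())  # 20

# Number of items = (class sessions / total class sessions) x desired total
items = {topic: round(n / total_sessions * desired_total_items)
         for topic, n in sessions.items()}

print(items["Definition of Linear Function"])  # 4, matching the worked example
print(sum(items.values()))                     # 40
```

Because every topic's share here divides evenly, `round` changes nothing; with uneven shares, the rounded counts would need a final adjustment so they still sum to the desired total.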
The average index of difficulty should range from 0.30 or 0.40 to a maximum of 0.60.

DF = (PUG + PLG) / 2

where:
PUG = proportion of the upper group who got the item right
PLG = proportion of the lower group who got the item right

Level of Difficulty of an Item

| Index Range | Difficulty Level |
| 0.00 – 0.20 | Very Difficult |
| 0.21 – 0.40 | Difficult |
| 0.41 – 0.60 | Moderately Difficult |
| 0.61 – 0.80 | Easy |
| 0.81 – 1.00 | Very Easy |

Index of Discrimination

The discrimination index is the difference between the proportion of high-performing students who got the item right and the proportion of low-performing students who got the item right. The high- and low-performing groups are usually defined as the upper 27% and the lower 27% of the students based on the total examination score. The discrimination index is the degree to which the item discriminates between the high-performing group and the low-performing group in relation to scores on the total test.

Indices of discrimination are classified into positive, negative, and zero discrimination:
Positive Discrimination – the proportion of students who got the item right in the upper performing group is greater than that in the lower performing group.
Negative Discrimination – the proportion of students who got the item right in the lower performing group is greater than that in the upper performing group.
Zero Discrimination – the proportions of students who got the item right in the upper and lower performing groups are equal.

| Discrimination Index | Item Evaluation |
| 0.40 and up | Very good item |
| 0.30 – 0.39 | Reasonably good item, but possibly subject to improvement |
| 0.20 – 0.29 | Marginal item, usually needing and being subject to improvement |
| Below 0.20 | Poor item, to be rejected or improved by revision |

Maximum Discrimination is the sum of the proportions of the upper and lower groups who answered the item correctly.
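The item statistics above can be sketched in Python. The counts used below follow the worked Algebra example later in this section (80 examinees, so the upper and lower groups each hold 27% of 80, about 22 students; 6 in the upper group and 4 in the lower group answered the item correctly); the code itself is illustrative, not part of the original material.

```python
# Item-analysis statistics for one test item.
upper_correct, lower_correct, group_size = 6, 4, 22

p_ug = upper_correct / group_size   # proportion of upper group who got it right
p_lg = lower_correct / group_size   # proportion of lower group who got it right

df = (p_ug + p_lg) / 2              # difficulty index
di = p_ug - p_lg                    # discrimination index
dm = p_ug + p_lg                    # maximum discrimination
de = di / dm                        # discriminating efficiency

def difficulty_level(df):
    """Classify DF per the index ranges in the table above."""
    if df <= 0.20: return "Very Difficult"
    if df <= 0.40: return "Difficult"
    if df <= 0.60: return "Moderately Difficult"
    if df <= 0.80: return "Easy"
    return "Very Easy"

print(round(di, 2), round(de, 2), difficulty_level(df))
```

Note that DE reduces to (6 − 4) / (6 + 4) = 0.20, agreeing with the 20% in the worked example even though the example rounds PUG and PLG to whole percentages first.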
The maximum possible discrimination occurs when half or less of the combined upper and lower groups answer an item correctly. Discriminating Efficiency is the index of discrimination divided by the maximum discrimination.

Notations:
PUG = proportion of the upper group who got the item right
PLG = proportion of the lower group who got the item right
Di = discrimination index
DM = maximum discrimination
DE = discriminating efficiency

Formulas:
Di = PUG − PLG
DM = PUG + PLG
DE = Di / DM

Example: Eighty students took an examination in Algebra. Six students in the upper group and four students in the lower group got the correct answer to item number 6. Find the discriminating efficiency.

Given:
Number of students who took the exam = 80
27% of 80 = 21.6 ≈ 22, which means there are 22 students in the upper performing group and 22 students in the lower performing group.

PUG = 6/22 = 27%
PLG = 4/22 = 18%
Di = PUG − PLG = 27% − 18% = 9%
DM = PUG + PLG = 27% + 18% = 45%
DE = Di / DM = 0.09 / 0.45 = 0.20, or 20%

VALIDITY OF A TEST

Validity refers to the appropriateness of score-based inferences, or decisions, made based on the students' test results; the extent to which a test measures what it is supposed to measure.

Important Things to Remember About Validity
1. Validity refers to the decisions we make, not to the test itself or to the measurement.
2. Like reliability, validity is not an all-or-nothing concept; it is never totally absent or absolutely perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It ranges between 0 and 1.
4. Validity can never be finally determined; it is specific to each administration of the test.

TYPES OF VALIDITY

1. Content Validity – a type of validation that refers to the relationship between a test and the instructional objectives; it establishes content so that the test measures what it is supposed to measure. Things to remember about content validity:
a.
The evidence of the content validity of your test is found in the Table of Specification.
b. This is the most important type of validity for you as a classroom teacher.
c. There is no coefficient for content validity. It is determined judgmentally, not empirically.

2. Criterion-related Validity – a type of validation that refers to the extent to which scores from a test relate to theoretically similar measures. It is a measure of how accurately a student's current test score can be used to estimate a score on a criterion measure, such as performance in courses, classes, or another measurement instrument.
a. Predictive Validity – a measure of the extent to which a person's current test results can be used to estimate accurately what that person's performance on another criterion, such as test scores, will be at a later time.
b. Concurrent Validity – a type of validation that requires correlating the predictor or concurrent measure with the criterion measure. Using this, we can determine whether a test is useful to us as a predictor or as a substitute (concurrent) measure. The higher the validity coefficient, the better the validity evidence of the test. In establishing concurrent validity evidence, no time interval is involved between the administration of the new test and the criterion or established test.

3. Construct Validity – a measure of the extent to which a test measures a hypothetical, unobservable variable or quality, such as intelligence, math achievement, or performance anxiety. It is established through intensive study of the test or measurement instrument.

Factors Affecting the Validity of a Test Item
1. The test itself.
2. The administration and scoring of the test.
3. Personal factors influencing how students respond to the test.
4. Validity is always specific to a particular group.

Ways to Reduce the Validity of Test Items
1. Poorly constructed test items
2.
Unclear directions
3. Ambiguous items
4. Reading vocabulary that is too difficult
5. Complicated syntax
6. Inadequate time limit
7. Inappropriate level of difficulty
8. Unintended clues
9. Improper arrangement of items

RELIABILITY OF A TEST

Reliability refers to the consistency of measurement; that is, how consistent test results or other assessment results are from one measurement to another. We can say that a test is reliable when it yields practically the same scores when administered twice to the same group of students, with a reliability index of 0.50 or above.

Factors Affecting the Reliability of a Test
1. Length of the test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student group
5. Limited time

Four Methods of Establishing Reliability

1. Test-Retest Method. Reliability is determined by administering the same test twice to the same group of students, with a time interval between tests. The test scores are correlated using the Pearson Product-Moment Correlation Coefficient (r), and this correlation coefficient provides a measure of stability. It indicates how stable the test results are over a period of time.

2. Equivalent-Form Method (Parallel or Alternate). Reliability is determined by administering two different but equivalent forms of the test to the same group of students in close succession. The equivalent forms are constructed from the same set of specifications, that is, similar in content, type of items, and difficulty. The test scores are correlated using the Pearson Product-Moment Correlation Coefficient (r), and the correlation coefficient provides a measure of the degree to which generalization about the performance of students from one assessment to another is justified. It measures the equivalence of the tests.

3. Split-half Method. Administer the test once and score two equivalent halves of the test.
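As a minimal sketch of the split-half procedure, the following Python code assumes the conventional even/odd item split and the standard Spearman-Brown correction, r_whole = 2r / (1 + r). The 0/1 response matrix is hypothetical, invented purely for illustration.

```python
# Split-half reliability: score odd- and even-numbered items separately,
# correlate the two half-scores, then apply the Spearman-Brown correction.

def pearson_r(x, y):
    # Pearson product-moment correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(responses):
    # responses: one list of 0/1 item scores per student
    odd = [sum(r[0::2]) for r in responses]   # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)          # Spearman-Brown correction

# Hypothetical scores of five students on a six-item test (1 = correct).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```

The Spearman-Brown step is needed because the raw half-test correlation underestimates the reliability of the full-length test.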
To split the test into halves that are equivalent, the usual procedure is to score the even-numbered and the odd-numbered items separately. This provides two scores for each student. The two sets of scores are correlated and the Spearman-Brown formula is applied; the resulting coefficient provides a measure of internal consistency. It indicates the degree to which consistent results are obtained from the two halves of the test.

4. Kuder-Richardson Formula. Administer the test once, score the total test, and apply the Kuder-Richardson formula. The Kuder-Richardson formula is applicable only in situations where students' responses are scored dichotomously, and is therefore most useful with traditional test items that are scored as right or wrong. KR-20 estimates reliability by providing information about the degree to which the items in the test measure the same characteristic. It assumes that all items are of equal difficulty.

RUBRICS

A rubric is a scoring scale and instructional tool used to assess the performance of students against a task-specific set of criteria. It contains two essential parts: the criteria for the task and the levels of performance for each criterion. It provides teachers an effective means of student-centered feedback and evaluation of students' work, and it enables teachers to provide detailed and informative evaluations of their performance. Rubrics are especially important when measuring the performance of students against a set standard or predetermined set of criteria. Through the use of scoring rubrics, teachers can determine the strengths and weaknesses of the students; hence, rubrics enable students to develop their skills.

Types of Rubrics

1. Holistic Rubric – does not list separate levels of performance for each criterion. Rather, a holistic rubric assigns a level of performance across multiple criteria as a whole; in other words, all the components are put together.
Advantages: quick scoring; provides an overview of student achievement.
Disadvantages: does not provide detailed information about student performance in specific areas of content and skills; it may be difficult to assign one overall score.

Example:
3 – Excellent Researcher
- includes 10-12 sources
- no apparent historical inaccuracies
- can easily tell where the information was drawn from
- all relevant information is included
2 – Good Researcher
- includes 5-9 sources
- few historical inaccuracies
- can tell with difficulty where information came from
- bibliography contains most relevant information
1 – Poor Researcher
- includes 1-4 sources
- lots of historical inaccuracies
- cannot tell from which source information came
- bibliography contains very little information

2. Analytic Rubric – the teacher or rater identifies and assesses components of a finished product. It breaks down the final product into component parts, and each part is scored independently; the total score is the sum of the ratings for all the parts that are assessed or evaluated. In analytic scoring, it is very important for the rater to treat each part as separate to avoid bias toward the whole product.

Advantages: more detailed feedback; scoring is more consistent across students and graders.
Disadvantage: time-consuming to score.

Example:

| Criteria | Limited (1) | Acceptable (2) | Proficient (3) |
| Made good observations | observations are absent or vague | most observations are clear and detailed | all observations are clear and detailed |
| Made good predictions | predictions are absent or irrelevant | most predictions are reasonable | all predictions are reasonable |
| Appropriate conclusion | conclusion is absent or inconsistent with observations | conclusion is consistent with most observations | conclusion is consistent with observations |

Advantages of Using Rubrics

When assessing the performance of students using performance-based assessment, it is very important to use scoring rubrics.
The advantages of using rubrics in assessing students' performance are:
1. Rubrics allow assessment to become more objective and consistent;
2. Rubrics clarify the criteria in specific terms;
3. Rubrics clearly show students how their work will be evaluated and what is expected;
4. Rubrics promote student awareness of the criteria to use in assessing peer performance;
5. Rubrics provide useful feedback regarding the effectiveness of the instruction; and
6. Rubrics provide benchmarks against which to measure and document progress.

Steps in Developing Rubrics
1. Identify your standards, objectives, and goals for your students. A standard is a statement of what the students should know or be able to perform, and it should indicate that your students are expected to meet it. Know also the goals for instruction: what are the learning outcomes?
2. Identify the characteristics of a good performance on that task, the criteria. When students perform or present their work, the criteria indicate whether they performed well in the task given to them and hence met the standard.
3. Identify the levels of performance for each criterion. There are no fixed guidelines for the number of levels of performance; it varies according to the task and needs. A rubric can have as few as two levels of performance or as many as the teacher can develop, provided the rater can sufficiently discriminate the performance of the students on each criterion. Through the levels of performance, the teacher or rater can provide more detailed feedback about the performance of the students, and it is easier for both teacher and students to identify the areas needing improvement.

PERFORMANCE-BASED ASSESSMENT

Performance-based assessment is the direct and systematic observation of actual student performances based on predetermined performance criteria.
It is an alternative form of assessing the performance of students, representing a set of strategies for the application of knowledge, skills, and work habits through the performance of tasks that are meaningful and engaging to students.

Framework of Assessment Approaches

| Selection Type | Supply Type | Product | Performance |
| True-False | Completion | Essay, story, or poem | Oral presentation of report |
| Multiple-Choice | Label a diagram | Writing portfolio | Musical, dance, or dramatic performance |
| Matching Type | Short answer | Research report | Typing test |
| | Concept map | Portfolio exhibit, Art exhibit | Diving |
| | | Writing journal | Laboratory demonstration |
| | | | Cooperation in group work |

Forms of Performance-Based Assessment
1. Extended-Response Tasks
a. Activities for a single assessment may be multiple and varied.
b. Activities may be extended over a period of time.
c. Products from different students may differ in focus.
2. Restricted-Response Tasks
a. Intended performances are more narrowly defined than on extended-response tasks.
b. Questions may begin like a multiple-choice or short-answer stem, but then ask for an explanation or justification.
c. May have introductory material like an interpretive exercise, but then ask for an explanation of the answer, not just the answer itself.
3. Portfolio – a purposeful collection of student work that exhibits the student's efforts, progress, and achievements in one or more areas.

Uses of Performance-Based Assessment
1. Assessing cognitively complex outcomes such as analysis, synthesis, and evaluation.
2. Assessing non-writing performances and products.
3. The learning outcomes must be carefully specified, and the activity or task constructed so that it actually calls them forth.

Focus of Performance-Based Assessment

Performance-based assessment can assess the process, the product, or both, depending on the learning outcomes. It also involves doing, rather than just knowing about, the activity or task.
The teacher will assess the effectiveness of the process or procedures and the product used in carrying out the instruction.

Assess the process when:
1. there is no product;
2. the process is orderly and directly observable;
3. correct procedures/steps are crucial to later success;
4. analysis of procedural steps can help in improving the product; and
5. learning is at an early stage.

Assess the product when:
1. different procedures result in an equally good product;
2. procedures are not available for observation;
3. the procedures have already been mastered; and
4. products have qualities that can be identified and judged.

Assessing the Performance

The final step in performance assessment is to assess and score the students' performance. To assess the performance of the students, the evaluator can use the checklist approach, the narrative or anecdotal approach, the rating scale approach, or the memory approach. The evaluator can give feedback on students' performance in the form of a narrative report or a grade. There are different ways to record the results of performance-based assessment:

1. Checklist Approach. Checklists are observation instruments on which the teacher indicates only whether or not certain elements are present in the performance.
2. Narrative/Anecdotal Approach. A continuous description of student behavior as it occurs, recorded without judgment or interpretation. The teacher writes narrative reports of what was done during each performance. From these reports, teachers can determine how well their students met the standards.
3. Rating Scale Approach. A checklist that allows the evaluator to record information on a scale, noting finer distinctions than the mere presence or absence of a behavior. The teacher indicates to what degree the standards were met; usually, teachers use a numerical scale.
4. Memory Approach.
The teacher observes the students performing the tasks without taking any notes, and uses the information from memory to determine whether or not the students were successful. This approach is not recommended for assessing the performance of students.

PORTFOLIO ASSESSMENT

Portfolio assessment is the systematic, longitudinal collection of student work created in response to specific, known instructional objectives and evaluated in relation to the same criteria. A student portfolio is a purposeful collection of student work that exhibits the student's efforts, progress, and achievements in one or more areas. The collection must include student participation in selecting the contents, the criteria for selection, the criteria for judging merit, and evidence of student self-reflection.

Comparison of Portfolio and Traditional Forms of Assessment

| Traditional Assessment | Portfolio Assessment |
| Measures student's ability at one time | Measures student's ability over time |
| Done by the teacher alone; students are not aware of the criteria | Done by the teacher and the students; the students are aware of the criteria |
| Conducted outside instruction | Embedded in instruction |
| Assigns student a grade | Involves student in own assessment |
| Does not capture the student's language ability | Captures many facets of language learning performance |
| Does not include the teacher's knowledge of the student as a learner | Allows for expression of the teacher's knowledge of the student as a learner |
| Does not give the student responsibility | Students learn how to take responsibility |

Three Types of Portfolio

1. Working Portfolio. Also known as a "teacher-student portfolio"; as the name implies, it is a project "in the works". It contains work in progress as well as finished samples of work used by students and teachers to reflect on process. It documents the stages of learning and provides a progressive record of student growth.
It is an interactive teacher-student portfolio that aids communication between teacher and student. The working portfolio may be used to diagnose student needs: both student and teacher have evidence of the student's strengths and weaknesses in achieving learning objectives, information extremely useful in designing future instruction.

2. Showcase Portfolio. Also known as a best-works portfolio or display portfolio. This kind of portfolio focuses on the student's best and most representative work; it exhibits the best performance of the student. A best-works portfolio may document student efforts with respect to curriculum objectives, and it may also include evidence of student activities beyond school. It is like an artist's portfolio, where a variety of work is selected to reflect breadth of talent; hence, the student selects what he or she thinks is representative work. The most rewarding use of student portfolios is the display of the students' best work, the work that makes them proud. It encourages self-assessment and builds self-esteem in students. The pride and sense of accomplishment that students feel make the effort well worthwhile and contribute to a culture for learning in the classroom.

3. Progress Portfolio. Also known as a Teacher Alternative Assessment Portfolio. It contains examples of students' work of the same types done over a period of time, and these are utilized to assess progress.

Uses of Portfolios
1. Portfolios can provide both formative and summative opportunities for monitoring progress toward identified outcomes.
2. Portfolios allow students to document aspects of their learning that do not show up well in traditional assessments.
3. Portfolios are useful for showcasing periodic or end-of-year accomplishments of students, such as poetry, reflections on growth, and samples of best work.
4.
Portfolios may also be used to facilitate communication between teachers and parents regarding their child's achievement and progress in a certain period of time.
5. Administrators may use portfolios for national competency testing, to grant high school credit, or to evaluate educational programs.
6. Portfolios may be assembled for a combination of purposes, such as instructional enhancement and progress documentation. A teacher reviews students' portfolios periodically and makes notes for revising instruction for next year's use.
According to Mueller (2010), there are seven steps in developing student portfolios, each framed as a guide question:
1. Purpose: What is the purpose(s) of the portfolio?
2. Audience: For what audience(s) will the portfolio be created?
3. Content: What samples of student work will be included?
4. Process: What processes (e.g., selection of work to be included, reflection on work, conferencing) will be engaged in during the development of the portfolio?
5. Management: How will time and materials be managed in the development of the portfolio?
6. Communication: How and when will the portfolio be shared with pertinent audiences?
7. Evaluation: If the portfolio is to be used for evaluation, when and how should it be evaluated?
Guidelines for Assessing Portfolios
1. Include enough documents (items) on which to base judgment.
2. Structure the contents to provide scorable information.
3. Develop judging criteria and a scoring scheme for raters to use in assessing the portfolios.
4. Use observation instruments such as checklists and rating scales when possible to facilitate scoring.
5. Use trained evaluators or assessors.
Traditional Assessment – refers to the use of pen-and-paper objective tests.
Alternative Assessment – refers to the use of methods other than pen-and-paper objective tests, including performance tests, projects, portfolios, journals, and the like.
Authentic Assessment – refers to the use of assessment methods that simulate true-to-life situations. These could be objective tests that reflect real-life situations, or alternative methods that parallel what we experience in real life.
Features of Authentic Assessment
1. Meaningful performance tasks
2. Clear standards and public criteria
3. Quality products and performances
4. Positive interaction between the assessee and assessor
5. Emphasis on meta-cognition and self-evaluation
6. Learning that transfers
Performance-based Assessment
Performance-based assessment is a process of gathering information about students' learning through actual demonstration of essential and observable skills and the creation of products that are grounded in real-world contexts and constraints. It is an assessment that is open to many possible answers and judged using multiple criteria or standards of excellence that are pre-specified and public.
Reasons for using Performance-based Assessment
1. Dissatisfaction with the limited information obtained from selected-response tests.
2. Influence of cognitive psychology, which demands not only the learning of declarative knowledge but also of procedural knowledge.
3. Negative impact of conventional tests, e.g., high-stakes assessment leading to teaching for the test.
4. It is appropriate in experiential, discovery-based, integrated and problem-based learning.
Types of Performance-based Task
1. Demonstration Type – a task that requires no product. Example: constructing a building, cooking demonstration, entertaining tourists, teamwork presentations
2. Creation Type – a task that requires tangible products. Example: project plan, research paper, project flyers
Methods of Performance-Based Assessment
1. Written Open-Ended – a written prompt is provided. Formats: essays, open-ended tests
2. Behavior-Based – utilizes direct observation of behaviors in real situations or simulated contexts. Formats: structured and unstructured
3.
Interview-Based – examinees respond in a one-to-one conference setting with the examiner to demonstrate mastery of the skills. Formats: structured and unstructured
4. Product-Based – examinees create a work sample or a product utilizing the skills/abilities. Formats: restricted and extended
5. Portfolio-Based – collections of work that are systematically gathered to serve many purposes.
How to Assess a Performance
1. Identify the competency that has to be demonstrated by the students, with or without a product.
2. Describe the task to be performed by the students, either individually or as a group; the resources needed; the time allotment; and other requirements, so as to be able to assess the focused competency.
Seven Criteria in Selecting a Good Performance Assessment Task
1. Generalizability – the likelihood that the students' performance on the task will generalize to comparable tasks.
2. Authenticity – the task is similar to what the students might encounter in the real world, as opposed to tasks encountered essentially only in school.
3. Multiple Foci – the task measures multiple instructional outcomes.
4. Teachability – the task allows one to master the skill that one should be proficient in.
5. Feasibility – the task is realistically implementable in relation to its cost, space, time and equipment requirements.
6. Scorability – the task can be reliably and accurately evaluated.
7. Fairness – the task is fair to all students regardless of their social status or gender.
Characteristics of Authentic Assessment
1. Authentic assessment requires the students to perform meaningful tasks in real-world situations.
2. It promotes the development of higher-order thinking skills, because evaluators (including self and peers) have to think wisely and precisely about the rating most appropriate to the student's performance, from excellent down to poor.
3. It tenders direct evidence of the application and construction of knowledge and skills acquired.
For instance, the student demonstrates and constructs paper mosaic projects made of waste paper or old newspapers.
4. It includes portfolio collections of entries.
5. It demonstrates application of particular knowledge and skills.
6. It fosters role-playing of the lessons learned by students, which serves as a show window for them.
7. It identifies students' acquired skills and expertise. For example, a student is identified for his expertise in milkfish deboning. Through this acquired skill, he could demonstrate the technology transfer of bangus deboning to the people of a community where milkfish is abundant. By doing so, the end-users can avail of this technology and put up small and medium enterprises (SMEs) as their livelihood projects. Hence, they could augment their income, alleviate poverty, and improve quality of life.
8. It directly assesses holistic projects through multiple human judgments, such as self, peer, subject teacher, and teacher-adviser.
9. It trains the students to evaluate their own work as well as that of their peers.
10. It is designed as a criterion-referenced rather than a norm-referenced measure: the strengths and weaknesses of the student are identified rather than the student's performance being compared with that of other students.
Distinctions between Authentic Assessment and Traditional Assessment
1. Authentic assessment is personalized, natural and flexible, relevant to the students' level of difficulty, skills and abilities. Traditional assessment is impersonal and absolute owing to the uniformity of tests without regard to the skills and abilities of the students.
2. Authentic assessment is fair because the skills and abilities assessed are appropriate to the learners. Traditional assessment is unfair, since learners are forced to accept the contexts of the tests even if these are inappropriate to them.
3. Authentic assessment gives the student (self) and peer (classmate) the chance to evaluate their own work and the work of their classmates, respectively.
In traditional assessment, only the subject teacher evaluates the performance of the students, and there is a tendency toward subjectivity.
4. Authentic assessment identifies the strengths and weaknesses of the students' skills and abilities. Traditional assessment compares the performance of the students to others.
5. Authentic assessment promotes good rapport or bonding between teacher and student due to their mutual understanding. Traditional assessment yields a poor relationship between teacher and students, caused by impersonal and absolute tests, since the purpose is to compare the test results of students to others.
6. Authentic assessment develops the students' responses because they are made to perform the learning task in a real-world situation. Traditional assessment requires the students to choose among options prepared by the teacher.
7. Authentic assessment gives the students freedom to choose evidence of good performance. In traditional assessment, the teacher prepares the test and students have to respond to what is asked on the test, leaving only the teacher able to showcase expertise.
8. Authentic assessment makes use of performance tests in real-world situations and portfolio assessment. Traditional assessment involves paper-and-pencil tests in which the students are required to choose the correct answer among options prepared by the teacher, for instance, alternative-response, multiple-choice and matching-type tests.
Samples of Traditional Assessment and Authentic Assessment
1. Traditional: Which of the following water is most acidic? a. Fresh Water b. Marine Water c. Brackish Water
Authentic: Place the fresh water, marine water, and brackish water separately in basins. Get a pH paper or pH meter and soak it in each basin of water, changing the pH paper for every basin of water. Then record. Ask: What is the pH of fresh water? Marine water? Brackish water? Which is most acidic? Why?
2. Traditional: How many milliliters (ml) are there in one liter? a. 1,150 ml b. 1,100 ml c. 1,000 ml
Authentic: Get a 100 ml graduated cylinder and a 1-liter empty soft drink bottle. Let the student fill the graduated cylinder with water and decant it into the empty bottle until it is filled. Ask: How many times did you fill the graduated cylinder?
3. Traditional: How many grams (g) are there in 1 kilogram (kg)? a. 1,000 g b. 1,050 g c. 1,100 g
Authentic: Get a table balance with sets of weights. Place 1 kg of mangoes on the table balance and 10 weights of 100 g each. Count the weights you put on the table balance and multiply (100 x 10). Ask: How many grams are there in 1 kilogram?
4. Traditional: How many cups are there in 1 gallon? a. 14 cups b. 15 cups c. 16 cups
Authentic: Get an empty 1-gallon ice cream container and a measuring cup. Let the student fill the cup with water and pour it into the empty container until it is filled up. Ask: How many cups of water did you pour into the gallon container of ice cream? How many cups are there in 1 gallon?
5. Traditional: How many teaspoons (tsp) are there in 1 tablespoon (tbsp)? a. 2 tsp b. 3 tsp c. 4 tsp
Authentic: Get a set of measuring spoons. Let the student fill the teaspoon with water and pour it into the tablespoon until filled. Ask: How many times did you fill the tablespoon? How many teaspoons are there in 1 tablespoon?
PERFORMANCE-BASED ASSESSMENT
Performance-based assessment is a form of testing that requires students to perform a task rather than select an answer from a ready-made list. It is a direct and systematic observation of the actual performances of the students based on pre-determined performance criteria. It is an alternative form of assessing student performance that represents a set of strategies for the application of knowledge, skills, and work habits through the performance of tasks that are meaningful and engaging to students.
Framework of Assessment Approaches
Selection Type: true-false, multiple-choice, matching type
Supply Type: completion, label a diagram, short answer, concept map
Product: essay, story or poem; writing portfolio; research report; portfolio exhibit; art exhibit; writing journal
Performance: oral presentation of report; musical, dance, or dramatic performance; typing test; diving; laboratory demonstration; cooperation in group work
Forms of Performance-Based Assessment
1. Extended-Response Tasks
a. Activities for a single assessment may be multiple and varied.
b. Activities may be extended over a period of time.
c. Products from different students may be different in focus.
2. Restricted-Response Tasks
a. Intended performances are more narrowly defined than on extended-response tasks.
b. Questions may begin like a multiple-choice or short-answer stem, but then ask for an explanation or justification.
c. May have introductory material like an interpretive exercise, but then ask for an explanation of the answer, not just the answer itself.
3. Portfolio – a purposeful collection of student work that exhibits the student's efforts, progress and achievements in one or more areas.
Uses of Performance-Based Assessment
1. Assessing cognitively complex outcomes such as analysis, synthesis and evaluation.
2. Assessing non-writing performances and products.
3. The teacher must carefully specify the learning outcomes and construct an activity or task that actually calls them forth.
Focus of Performance-Based Assessment
Performance-based assessment can assess the process, the product or both, depending on the learning outcomes. It also involves doing, rather than just knowing about, the activity or task. The teacher will assess the effectiveness of the process or procedures and the product used in carrying out the instruction.
Use the process when:
1. There is no product;
2. The process is orderly and directly observable;
3. Correct procedures/steps are crucial to later success;
4. Analysis of procedural steps can help in improving the product; and
5. Learning is at an early stage.
Use the product when:
1. Different procedures result in an equally good product;
2. Procedures are not available for observation;
3. The procedures have been mastered already; and
4. Products have qualities that can be identified and judged.
Assessing the Performance
The final step in performance assessment is to assess and score the students' performance. To assess the performance of the students, the evaluator can use the checklist approach, the narrative or anecdotal approach, the rating scale approach, or the memory approach. The evaluator can give feedback on students' performance in the form of a narrative report or a grade. There are different ways to record the results of performance-based assessment:
1. Checklist Approach. Checklists are observation instruments that divide a performance into elements that are either present or absent. The teacher has to indicate only whether or not certain elements are present in the performance.
2. Narrative/Anecdotal Approach. This is a continuous description of student behavior as it occurs, recorded without judgment or interpretation. The teacher writes narrative reports of what was done during each performance. From these reports, teachers can determine how well their students met the standards.
3. Rating Scale Approach. This is a checklist that allows the evaluator to record information on a scale, noting finer distinctions than the mere presence or absence of a behavior. The teacher indicates to what degree the standards were met, usually on a numerical scale.
4. Memory Approach. The teacher observes the students performing the tasks without taking any notes, using information from memory to determine whether or not the students were successful. This approach is not recommended for assessing the performance of the students.
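The checklist and rating-scale recording approaches described above can be captured with very simple data structures. A minimal Python sketch; the performance elements and the 1-to-4 scale are hypothetical examples, not anything prescribed by the text:

```python
# Two ways to record one student's performance assessment.
# The elements and the scale below are made-up illustrations.

# Checklist approach: each element is marked only as present (True) or absent (False).
checklist = {
    "states the problem clearly": True,
    "uses correct apparatus": True,
    "records observations": False,
}
elements_present = sum(checklist.values())
print(f"Checklist: {elements_present}/{len(checklist)} elements present")

# Rating-scale approach: each standard is rated by degree
# (here 1 = poor ... 4 = excellent), allowing finer distinctions.
ratings = {
    "states the problem clearly": 4,
    "uses correct apparatus": 3,
    "records observations": 2,
}
average = sum(ratings.values()) / len(ratings)
print(f"Rating scale: average rating {average:.2f} out of 4")
```

The checklist yields only a count of elements present, while the rating scale preserves how well each standard was met, which is why it supports richer feedback.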
Process-Oriented Performance-Based Assessment
- Learning is assessed through the tasks the students underwent in order to arrive at the products or outputs.
- It is about what students can do, not just what they know.
Competencies – groups or clusters of skills or abilities needed for a particular task.
- Learning objectives are stated in terms of directly observable behaviors of the students.
- Learning objectives focus on the behaviors which exemplify a "best practice" for the particular task.
- Objectives start with a general statement and are then broken down into easily observable behaviors.
- A complex competency contains two or more tasks.
Ex. Simple – draw a leaf; Complex – draw and color a leaf with a green crayon
Task Designing – the task should contribute to the overall understanding of the subject or course. Some standards for designing a task:
1. Identify an activity that would highlight the competencies to be evaluated.
2. Identify an activity that would entail more or less the same sets of competencies.
3. Find a task that would be interesting and enjoyable for the students.
Scoring Rubrics – a scoring scale used to assess student performance along a task-specific set of criteria. It contains the essential criteria for the task and appropriate levels of performance for each criterion.
Descriptors – tell students precisely what performance looks like at each level and how their work may be distinguished from the work of others for each criterion.
Example (criteria with weights, rated on performance levels 1 to 3):
- Number of appropriate hand gestures (x1): 1 = 1-4; 2 = 5-9; 3 = 10-12
- Appropriate facial expression (x1): 1 = lots of inappropriate facial expressions; 2 = few inappropriate expressions; 3 = no apparent inappropriate facial expression
- Voice inflection (x2): 1 = monotone voice used; 2 = can vary voice inflection with difficulty; 3 = can easily vary voice inflection
- Incorporates proper ambiance through feelings in the voice (x3): 1 = recitation contains very little feeling; 2 = recitation has some feelings
Why include Levels of Performance?
1.
Clearer expectations – students know what is expected of them, and teachers know what to look for in student performance.
2. More consistent and objective assessment – levels of performance help distinguish between good and poor performance when evaluating students' work.
3. Better feedback – teachers and students can clearly recognize areas that need improvement.
Product-Oriented Performance-Based Assessment
- Actual student performance is assessed through a product.
Competencies – linked with an assessment of the level of expertise. Three levels:
(1) Novice/Beginner Level – the finished product illustrates the minimum expected parts or functions.
(2) Skilled Level – the finished product contains additional parts and functions on top of the minimum requirements, which tend to enhance the final output.
(3) Expert Level – the finished product contains the basic minimum parts and functions, has additional features on top of the minimum, and is aesthetically pleasing.
Task Designing – the design of the task depends on what the teacher desires to observe as outputs of the students. Concepts that may be associated with task designing:
a. Complexity – the task needs to be within the range of the students' ability.
b. Appeal – the task should be interesting enough that students are encouraged to pursue it to completion. It should lead students to self-discovery of information.
c. Creativity – the task needs to encourage students to exercise creativity and divergent thinking.
d. Goal-based – projects are assigned to students not just for the sake of producing something but for the purpose of reinforcing learning.
Scoring Rubrics – descriptive scoring schemes developed by teachers to guide the analysis of the products or processes of students' efforts.
Criteria Setting – statements which identify "what really counts" in the final output. Some of the most often used major criteria for product assessment are Quality, Creativity, Comprehensiveness, Accuracy, and Aesthetics.
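A weighted analytic rubric of the kind shown in the recitation example earlier (criteria weighted x1, x1, x2 and x3, with performance levels 1 to 3) is typically scored by multiplying each criterion's rated level by its weight and summing. A minimal Python sketch; the criterion names echo that example, while the sample ratings are made up for illustration:

```python
# Scoring a weighted analytic rubric: each criterion is rated on a level
# from 1 to 3, multiplied by the criterion's weight, and the results summed.
# Weights mirror the recitation rubric example; the ratings are hypothetical.
weights = {
    "hand gestures": 1,
    "facial expression": 1,
    "voice inflection": 2,
    "feelings in the voice": 3,
}
ratings = {  # one student's rated level (1-3) per criterion
    "hand gestures": 3,
    "facial expression": 2,
    "voice inflection": 3,
    "feelings in the voice": 2,
}

score = sum(weights[c] * ratings[c] for c in weights)  # 3 + 2 + 6 + 6
max_score = 3 * sum(weights.values())                  # highest level is 3
print(f"Weighted rubric score: {score}/{max_score}")
```

Dividing the score by the maximum gives a proportion that can be converted to a grade, if that suits the teacher's grading scheme; the differential weighting is what a holistic rubric cannot provide.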
From the major criteria, the next task is to identify sub-statements that would make the major criteria more focused and objective. It will be noted that each score category describes the characteristics of a response that would receive that score. Describing the characteristics of responses within each score category increases the likelihood that two independent evaluators would assign the same score to a given response. In effect, this increases the objectivity of the assessment procedure using rubrics. In the language of tests and measurement, we are actually increasing the "inter-rater reliability".
When are scoring rubrics an appropriate evaluation tool? There are many instances in which scoring rubrics may be used successfully: to evaluate group activities, extended projects, oral presentations, etc. Where and when a scoring rubric is used does not depend on the grade level or subject, but rather on the purpose of the assessment.
Assessment in the Affective Domain
- Part of a system, published in 1965, for identifying, understanding and addressing how people learn.
- Describes learning objectives that emphasize a feeling, an emotion, or a degree of acceptance or rejection.
- A far more difficult domain to objectively analyze and assess, since affective objectives vary from simple attention to selected phenomena to complex but internally consistent qualities of character and conscience.
"Schooled" but not "Educated" – this simply refers to the fact that much of the process of education today is aimed at developing the cognitive aspects of development, and very little or no time is spent on the development of the affective domain.
The Taxonomy in the Affective Domain
- Contains a large number of objectives in the literature expressed as interests, attitudes, appreciations, values, and emotional sets or biases.
Krathwohl's Taxonomy of the Affective Domain:
1.
Receiving – being aware of or sensitive to the existence of certain ideas, material or phenomena and being willing to tolerate them.
2. Responding – being committed in some small measure to the ideas, materials, or phenomena involved by actively responding to them.
3. Valuing – being willing to be perceived by others as valuing certain ideas, materials, or phenomena.
4. Organization – relating the value to those already held and bringing it into a harmonious and internally consistent philosophy.
5. Characterization by value or value set – acting consistently in accordance with the values he or she has internalized.
Affective Learning Competencies
- Often stated in the form of instructional objectives.
Instructional Objectives
- Specific, measurable, short-term, observable student behaviors.
- The foundation upon which you can build lessons and assessments that you can prove meet your overall course or lesson goals.
- Tools you use to make sure you reach your goals.
- Used to ensure that learning is focused clearly enough that both students and teacher know what is going on, so that learning can be objectively measured.
Behavioral Verbs Appropriate for the Affective Domain
Receiving: accept, attend, develop, recognize
Responding: complete, comply, cooperate, discuss, examine, obey, respond
Valuing: accept, defend, devote, pursue, seek
Organization: codify, discriminate, display, order, organize, systematize, weigh
Characterization: internalize, verify
In the affective domain, in particular when we consider learning competencies, we also consider the following focal concepts:
Attitudes – a mental predisposition to act that is expressed by evaluating a particular entity with some degree of favor or disfavor. Attitudes are comprised of four components:
- Cognitions – statements of beliefs and expectations, which vary from one individual to the next.
- Affect – feelings with respect to the focal object, such as fear, liking, or anger.
- Behavioral Intentions – goals, aspirations, and our expected responses to the attitude object.
- Evaluation – the central component of attitudes: imputations of some degree of goodness or badness to an attitude object. Evaluations are functions of the cognitions, affect and behavioral intentions toward the object, and are stored in memory.
Attitudes influence the way a person acts and thinks in the social community to which we belong. They can function as frameworks and references for forming conclusions and for interpreting or acting for or against an individual, concept or idea. Attitude influences behavior.
Motivation – a reason or set of reasons for engaging in a particular behavior. The reasons include basic needs, or an object, goal, state of being, or ideal that is desirable. It also refers to the initiation, direction, intensity and persistence of human behavior. There are two kinds of motivation: intrinsic motivation and extrinsic motivation.
Intrinsic Motivation – brings pleasure or makes people feel that what they are learning is morally significant.
Extrinsic Motivation – comes when a student is compelled to do something because of factors external to him or her.
Theories that explain Human Motivation:
Hierarchy of Human Needs Theory by Abraham Maslow
- Humans have wants and desires which influence behavior; only unsatisfied needs can influence behavior, satisfied needs cannot.
- Needs are arranged in order of importance, from basic to complex (physiological, safety and security, social, self-esteem, self-actualization).
- The person advances to the next level of needs only after the lower need is at least minimally satisfied.
- The further the progress up the hierarchy, the more individuality, humanness and psychological health a person will show.
The Motivation-Hygiene Theory by Frederick Herzberg
- Concludes that certain factors in the workplace result in job satisfaction, while others do not motivate but, if absent, lead to dissatisfaction.
- Motivators – challenging work, recognition, responsibility – give positive satisfaction.
- Hygiene factors – status, job security, salary and fringe benefits – do not motivate if present, but if absent, they result in demotivation.
- Like hygiene, their presence will not by itself make one healthier, but their absence can cause health deterioration.
ERG Theory (Existence, Relatedness and Growth)
- The existence category (physiological and safety) comprises the lower-order needs.
- The relatedness category (love and self-esteem) comprises the middle-order needs.
- The growth category (self-actualization and self-esteem) comprises the higher-order needs.
Motivation in education can have several effects on how students learn and behave toward subject matter. It can direct behavior toward particular goals; lead to increased effort and energy; increase initiation of, and persistence in, activities; enhance cognitive processing; and determine which consequences are reinforcing, all of which lead to improved performance.
Self-efficacy – the impression that one is capable of performing in a certain manner or attaining certain goals. It is the belief that one has the capabilities to execute the courses of action required to manage prospective situations. It relates to a person's perception of their ability to reach a goal.
Development of Assessment Tools
Assessment tools in the affective domain are those used to assess attitudes, interests, motivation and self-efficacy. These include:
1. Self-Report – the most common measurement tool in the affective domain. It essentially requires an individual to provide an account of his attitude or feelings toward a concept, idea or people. It is also called "written reflections".
2. Rating Scales – refer to a set of categories designed to elicit information about a quantitative attribute in social science. Common examples are the Likert scale and 1-10 rating scales, in which a person selects the number considered to reflect the perceived quality of a product.
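The summated scoring behind Likert-type rating scales such as those mentioned above can be sketched in a few lines of Python. The five categories are mapped to the integers 1 to 5 and the values are added across items; the respondent's answers below are a hypothetical example:

```python
# Summated (Likert) rating: assign an integer to each response category
# and add the values across all items answered by one respondent.
# The response list is a made-up illustration.
SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "strongly agree", "undecided", "agree"]

total = sum(SCALE[r] for r in responses)
maximum = 5 * len(responses)
print(f"Summated rating: {total} out of {maximum}")
```

In practice, negatively worded items are commonly reverse-scored (strongly disagree = 5) so that higher totals consistently indicate a more favorable attitude; that refinement is not spelled out in the text above.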
The basic feature of any rating scale is that it consists of a number of categories, which are usually assigned integers.
3. Semantic Differential (SD) Scales – try to assess an individual's reaction to specific words, ideas or concepts in terms of ratings on bipolar scales defined with contrasting adjectives at each end.
Good ___ ___ ___ ___ ___ ___ ___ Bad
3 2 1 0 1 2 3 (3 – extreme; 2 – quite; 0 – neutral)
Thurstone Scale
Thurstone is considered the father of attitude measurement and addressed the issue of how favorable an individual is with regard to a given issue. He developed an attitude continuum to determine the position of favorability on the issue.
Example of a Thurstone Scale:
Directions: Put a check mark in the blank if you agree with the item.
____ 1. Blacks should be considered the lowest class of human beings. (scale value = 0.9)
____ 2. Blacks and whites must be kept apart in all social affairs where they might be taken as equals. (scale value = 3.2)
____ 3. I am not interested in how blacks rate socially. (scale value = 5.4)
Likert Scales
Likert developed the method of summated ratings (or Likert scale), which is widely used. It requires an individual to tick a box to report whether they "strongly agree", "agree", are "undecided", "disagree" or "strongly disagree" in response to a large number of items concerning an attitude object or stimulus. A Likert scale is derived as follows:
a. Pick the individual items to include. Choose individual items that you know correlate highly with the total score across items.
b. Choose how to scale each item, or construct labels for each scale value to represent the interpretation to be assigned to the number.
c. Ask your target audience to mark each item.
d. Derive a target's score by adding the values that the target identifies on each item.
Checklists
- The most common and perhaps the easiest instrument in the affective domain.
It consists of simple items that the student or teacher marks as "absent" or "present".
Here are the steps in the construction of a checklist:
a. Enumerate all the attributes and characteristics you wish to observe.
b. Arrange these attributes as a "shopping list" of characteristics.
c. Ask students to mark those attributes which are present and to leave blank those which are not.
Rubric – a measuring instrument used in rating performance-based tasks.
- It is the "key to corrections" for assessment tasks designed to measure the attainment of learning competencies that require demonstration of skills or creation of products of learning.
- It offers a set of guidelines or descriptions for scoring different levels of performance or qualities of products of learning.
- It can be used in scoring both the process and the products of learning.
Similarity of Rubrics with Other Scoring Instruments
A rubric is a modified checklist and rating scale.
1. A checklist presents the observed characteristics of a desirable performance or product; the rater checks the trait(s) that has/have been observed in one's performance or product.
2. A rating scale measures the extent or degree to which a trait has been satisfied by one's work or performance. It offers an overall description of the different levels of quality of a work or a performance, and uses 3 or more levels to describe the work or performance, although the most common rating scales have 4 or 5 performance levels.
Types of Rubrics
Holistic Rubric
- Description: It describes the overall quality of a performance or product. In this rubric, there is only one rating given to the entire work or performance.
- Advantages: It allows fast assessment. It provides one score to describe the overall performance or quality of work. It can indicate the general strengths and weaknesses of the work or performance.
- Disadvantages: It does not clearly describe the degree to which each criterion is satisfied or not by the performance or product. It does not permit differential weighting of the qualities of a product or a performance.
Analytic Rubric
- Description: It describes the quality of a performance or product in terms of identified dimensions and/or criteria, which are rated independently to give a better picture of the quality of the work or performance.
- Advantages: It clearly describes the degree to which each criterion is satisfied or not by the performance or product. It permits differential weighting of the qualities of a product or a performance. It helps raters pinpoint specific areas of strengths and weaknesses.
- Disadvantages: It is more time-consuming to use. It is more difficult to construct.
Important Elements of a Rubric
Whether the format is holistic or analytic, the following information should be made available in a rubric:
- Competency to be tested – this should be a behavior that requires either a demonstration or the creation of products of learning.
- Performance task – the task should be authentic, feasible and realistic.
- Evaluative criteria and their indicators – these should be measurable and observable.
- Performance levels – these could vary in number from 3 or more.
- Qualitative and quantitative descriptions of each performance level – these descriptions should be observable in order to be measurable.
Guidelines When Developing Rubrics
- Identify the important and observable features or criteria of an excellent performance or quality product.
- Clarify the meaning of each trait or criterion and of the performance levels.
- Describe the gradations of a quality product or excellent performance.
- Aim for an even number of levels to avoid the central tendency source of error.
- Keep the number of criteria reasonable enough to be observed or judged.
- Arrange the criteria in the order in which they are likely to be observed.
- Determine the weight/points of each criterion and of the whole work or performance in the final grade.
- Put the descriptions of a criterion or a performance level on the same page.
- Highlight the distinguishing traits of each performance level.
- Check if the rubric encompasses all possible traits of a work.
- Check again if the objectives of the assessment were captured in the rubric.

PORTFOLIO ASSESSMENT
Portfolio assessment is the systematic, longitudinal collection of student work created in response to specific, known instructional objectives and evaluated in relation to the same criteria. A student portfolio is a purposeful collection of student work that exhibits the student's efforts, progress and achievements in one or more areas. The collection must include student participation in selecting the contents, the criteria for selection, the criteria for judging merit, and evidence of student self-reflection.

Comparison of Portfolio and Traditional Forms of Assessment

Traditional Assessment | Portfolio Assessment
Measures the student's ability at one time | Measures the student's ability over time
Done by the teacher alone; students are not aware of the criteria | Done by the teacher and the students; the students are aware of the criteria
Conducted outside instruction | Embedded in instruction
Assigns the student a grade | Involves the student in his or her own assessment
Does not capture the student's language ability | Captures many facets of language learning performance
Does not include the teacher's knowledge of the student as a learner | Allows for expression of the teacher's knowledge of the student as a learner
Does not give the student responsibility | Students learn how to take responsibility

Three Types of Portfolio
1. Working Portfolio
It is also known as the "teacher-student portfolio"; as the name implies, it is a project "in the works". It contains work in progress as well as finished samples of work used by students and teachers to reflect on process. It documents the stages of learning and provides a progressive record of student growth. This is an interactive teacher-student portfolio that aids communication between teacher and student. The working portfolio may be used to diagnose student needs; with it, both student and teacher have evidence of the student's strengths and weaknesses in achieving learning objectives, information that is extremely useful in designing future instruction.
2. Showcase Portfolio
It is also known as the best works portfolio or display portfolio. This kind of portfolio focuses on the student's best and most representative work; it exhibits the best performance of the student. A best works portfolio may document student efforts with respect to curriculum objectives; it may also include evidence of student activities beyond school. It is just like an artist's portfolio, where a variety of work is selected to reflect breadth of talent. Hence, in this portfolio, the student selects what he or she thinks is representative work. The most rewarding use of student portfolios is the display of the students' best work, the work that makes them proud. In this way, it encourages self-assessment and builds self-esteem in students. The pride and sense of accomplishment that students feel make the effort well worthwhile and contribute to a culture for learning in the classroom.
3. Progress Portfolio
It is also known as the Teacher Alternative Assessment Portfolio. It contains examples of the same types of students' work done over a period of time, which are utilized to assess their progress.

Uses of Portfolios
1. Portfolios can provide both formative and summative opportunities for monitoring progress toward reaching identified outcomes.
2. Portfolios allow students to document aspects of their learning that do not show up well in traditional assessments.
3. Portfolios are useful to showcase periodic or end-of-the-year accomplishments of students, such as in poetry, reflections on growth, samples of best works, etc.
4. Portfolios may also be used to facilitate communication between teachers and parents regarding their child's achievement and progress in a certain period of time.
5. Administrators may use portfolios for national competency testing, to grant high school credit, and to evaluate educational programs.
6. Portfolios may be assembled for a combination of purposes, such as instructional enhancement and progress documentation. A teacher reviews students' portfolios periodically and makes notes for revising instruction for next year's use.

According to Mueller (2010), there are seven steps in developing student portfolios, each framed as a guiding question:
1. Purpose: What is the purpose(s) of the portfolio?
2. Audience: For what audience(s) will the portfolio be created?
3. Content: What samples of student work will be included?
4. Process: What processes (e.g., selection of work to be included, reflection on work, conferencing) will be engaged in during the development of the portfolio?
5. Management: How will time and materials be managed in the development of the portfolio?
6. Communication: How and when will the portfolio be shared with pertinent audiences?
7. Evaluation: If the portfolio is to be used for evaluation, when and how should it be evaluated?

Guidelines for Assessing Portfolios
1. Include enough documents (items) on which to base judgment.
2. Structure the contents to provide scorable information.
3. Develop judging criteria and a scoring scheme for raters to use in assessing the portfolios.
4. Use observation instruments such as checklists and rating scales when possible to facilitate scoring.
5. Use trained evaluators or assessors.

Prepared by: Graciane Joy De Guzman, MA Math