Assessment of Learning 2 Notes PDF
Summary
This document provides notes on item analysis and validation, grading systems, and performance-based assessment. It covers item difficulty and discrimination, validity and reliability evidence, norm-referenced and criterion-referenced grading, and rubrics for process-oriented and product-oriented assessment.
Full Transcript
ITEM ANALYSIS AND VALIDATION
Tests are subject to item analysis and validation to ensure that the final version of the examination will be functional, reliable, and useful. The difficulty of the test items is determined, allowing the teacher to evaluate whether an item is too easy or too hard and to decide whether to revise or replace items in a standardized test for a particular unit or grading period.

ITEM ANALYSIS
- Item analysis provides information that allows you to decide whether to revise or replace an item.

ITEM DIFFICULTY
- The number of learners who were able to answer the item correctly divided by the total number of students.
- Can be expressed as: Difficulty index = (number of students who answered the item correctly) / (total number of students), often expressed in percentage form.

RANGE OF DIFFICULTY INDEX | INTERPRETATION       | ACTION TO BE DONE ON THE ITEM
0 - 0.20                  | Very difficult       | Revise the item or discard
0.21 - 0.40               | Difficult            | Retain
0.41 - 0.60               | Moderately difficult | Retain
0.61 - 0.80               | Easy                 | Retain
0.81 - 1.00               | Very easy            | Revise the item or discard

- Very difficult items cannot discriminate between those who know and those who do not know the correct answer: even the brightest students cannot answer them.
- Very easy items likewise cannot discriminate between these two groups of students: even the poorest students can answer them.

INDEX OF DISCRIMINATION
- A measure that determines whether an item can tell the difference between sets of learners.
- The difference between the proportion of the upper group who got the item right and the proportion of the lower group who got the item right.
- The ability of an item to differentiate among students on the basis of how well they know the matter being tested.
- An easy way to derive such a measure is to compare how difficult the item was for those in the upper 25% of the class with how difficult it was for those in the lower 25%. If the upper 25% found the item easy while the lower 25% found it difficult, then the item can discriminate between the two groups.
- The index can range from -1.00 (when Dupper is 0 and Dlower is 1) to 1.00. An index of -1 implies that all of the lower 25% got the correct answer while all of the upper 25% got the wrong answer. A discrimination index of 1 tells you that the item is perfectly discriminating and is an ideal item that should be included in standardized and summative tests.

INDEX RANGE  | INTERPRETATION                            | ACTION FOR THE TEST ITEM
-1.0 - -0.50 | Can discriminate but item is questionable | Discard the item
-0.49 - 0.45 | Non-discriminating                        | Revise
0.46 - 1.0   | Discriminating item                       | Include

The item analysis procedure provides the difficulty of an item, its discriminating power, and the effectiveness of the other alternatives. A good item has good discriminating ability and a sufficient level of difficulty. The results give you data for class discussion of the assessment and for improving learning, and they provide insights and skills that lead to the preparation of better tests in the future.
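The two indices above can be computed with a short sketch. This is an illustrative example, not part of the notes: the function names, the class of 40 learners, and the particular response counts are assumed; the 25% upper/lower split and the interpretations follow the text and tables above.

# Illustrative sketch (assumed data): item difficulty and index of discrimination for one item.

def item_difficulty(num_correct, total_students):
    """Proportion of learners who answered the item correctly."""
    return num_correct / total_students

def discrimination_index(upper_correct, lower_correct, group_size):
    """Proportion correct in the upper 25% minus proportion correct in the lower 25%."""
    return upper_correct / group_size - lower_correct / group_size

# Hypothetical class of 40 learners; the upper and lower groups are the top and bottom 25% (10 learners each).
difficulty = item_difficulty(num_correct=28, total_students=40)                          # 0.70 -> "Easy": retain
discrimination = discrimination_index(upper_correct=9, lower_correct=4, group_size=10)   # 0.50 -> discriminating: include

print(f"Difficulty index: {difficulty:.2f}")
print(f"Index of discrimination: {discrimination:.2f}")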
VALIDATION AND RELIABILITY
Once the item analysis procedure and the resulting revisions are done, there is a need to validate the assessment instrument.

VALIDITY
- Determines whether a test measures what it purports to measure; it refers to the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
- Can be subdivided into two concerns: the test itself and the decisions made by the educator based on the results.
- A test will only be valid if it is aligned to the learning outcomes set prior to the teaching-learning process.

THREE ESSENTIAL VALIDATION EVIDENCES

Content-Related Evidence
- Refers to the content and format of the instrument.
  o How appropriate is the content?
  o How comprehensive is it?
  o Does it logically get at the intended variable?
  o How adequately does the sample of items or questions represent the content to be assessed?
- Content validation starts when the test is written based on the table of specifications; the teacher then asks at least two (2) experts to check whether the two are properly constructively aligned.
- The experts review whether the items in the test are aligned to the objectives and the table of specifications, then approve the items.
- If items are marked for revision, the teacher reconstructs or revises the test items to be approved again by the experts.

Criterion-Related Evidence
- Refers to the relationship between the scores obtained using the instrument and scores obtained using one or more other tests (the criterion).
  o How well do such scores estimate present performance or predict future performance of a certain type?
- To obtain evidence relevant to a certain criterion, the educator usually compares scores on the examination in question with scores on some other independent criterion test which presumably already has high validity (a hypothetical sketch of this comparison follows after this list).

Construct-Related Evidence
- Refers to the nature of the psychological construct or characteristic being measured by the test.
  o How well does a measure of the construct explain differences in the behavior of individuals or their performance on a certain task?
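As a hypothetical illustration of gathering criterion-related evidence, the sketch below relates scores on the examination in question to scores on an independent criterion test using a Pearson correlation coefficient. The use of a correlation and all of the score values are assumptions added here for illustration; the notes themselves only say that the two sets of scores are compared.

# Illustrative sketch (assumed scores; correlation chosen as one common way to compare the two sets).
from statistics import correlation  # requires Python 3.10+

teacher_test   = [78, 85, 62, 90, 71, 88, 64, 95]  # scores on the examination being validated (hypothetical)
criterion_test = [75, 82, 60, 93, 70, 85, 68, 97]  # scores on an independent, already-validated test (hypothetical)

r = correlation(teacher_test, criterion_test)
print(f"Criterion-related validity coefficient: r = {r:.2f}")
# A coefficient close to 1.0 suggests the examination ranks learners much as the criterion test does.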
RELIABILITY
- The consistency of the scores obtained: how consistent they are for every learner from one type of assessment tool to another and from one set of items to another.
- Reliability and validity are related concepts. If an instrument is unreliable, it cannot yield valid results. As reliability improves, validity may improve (or may not). On the contrary, if an assessment instrument is shown to be valid, then it is also reliable.

RELIABILITY    | INTERPRETATION
0.90 and above | Excellent reliability; at the level of the best standardized tests
0.80 - 0.90    | Very good for a classroom test
0.70 - 0.80    | Good for a classroom test; in the range of most. There are probably a few items which could be improved.
0.60 - 0.70    | Somewhat low. This test needs to be supplemented by other measures to determine grades. There are probably some items which could be improved.
0.50 - 0.60    | Suggests need for revision of the test, unless it is quite short. The test definitely needs to be supplemented by other measures for grading.
0.50 or below  | Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.

GRADING SYSTEM
- A grading system gives a quantitative reference to whether the desired learning outcome has been achieved. It also shows the effectivity of the teaching methods and pedagogy.
- It is the quantitative reference of the teacher and the stakeholders of the academe for knowing whether the learners are achieving what they are intended to achieve.
- Every educational institution around the world has its own grading system, and these systems have evolved over the years.
  o Higher education: numerical values such as 1, 1.25, 1.5, and so on.
  o Basic education curriculum: often expressed in percentage form, such as 99%, 80%, 70%, etc.
- As the Enhanced Basic Education (K to 12) Curriculum was implemented, the grading system transitioned to levels of proficiency.

NORM-REFERENCED GRADING SYSTEM
- Plots a student's grade in accordance with the norm of the class and not against a set standard.
- Measures a student's performance in comparison to the performance of learners in the same class.
- Compares a learner's knowledge or skills to the knowledge or skills of the norm group (the whole class). The composition of the norm depends on the assessment.
- Reports whether test takers performed better or worse than a hypothetical average student, which is determined by comparing scores against the performance results of a statistically selected group of test takers, typically of the same age or grade level, who have already taken the exam.
- An individual student's grade describes their performance in comparison to the performance of students in the norm group, but it does not indicate whether or not they met or exceeded a specific standard or criterion.
- It does not describe the individual's learning as such, only where he/she stands within the class.
- Promotes competition, as it ranks the class from the best performers down to the poorest ones.
- Students might not be motivated to help and collaborate with their co-learners, since doing so adjusts the norm of the whole class and makes obtaining higher grades harder.
- Grades are often distributed to get the norm and average of the whole class. A grade in one class might not have the same 'value' as a grade given by another teacher. The norms dictate who are the brightest and the poorest in the class, thus promoting a silent competition between learners.

Example: Juan Dela Cruz got a grade of 85% in Mathematics in his class, where his classmates are mostly students who have been inclined to science and mathematics since they were kids. John Smith also got a grade of 85% in Mathematics, but his classmates are considered 'average' students compared to Juan's class. The 85% grade is not the same between the two classes: Juan Dela Cruz might get a higher grade if he were enrolled in John's class, and John might get a lower mark if he were enrolled in Juan's class.

Advantages for teachers:
- Initially identifies students who need special services; it helps in planning how to organize the teaching methodology for the whole class based on a diagnostic examination.
- It points out a student's level of performance, which helps in planning individualized instruction and monitoring incremental progress.

CRITERION-REFERENCED GRADING SYSTEM
- Measures a student's performance based on mastery of a specific set of skills set by the academic institution (the criterion).
- Measures what the student knows and doesn't know at the time of the grading period.
- Designed to determine whether students have mastered the subject.
- Each student's performance is measured against the material presented (what the student knows and what the student doesn't know).
- All students can get 100% if they have fully mastered the subject, or they can fail if they did not meet the set standards.
- Gives data for comparing a student's current skill mastery to his/her skill mastery at previous points in the year.
  o Skill sequences are broken down into gradual steps along the skill continuum, so progress can be measured in very small increments.
- Each learner's performance is compared directly to the standard, without considering how other students perform in the subject.
- Criterion-referenced tests often use "cut scores" to place students into categories such as "basic," "proficient," and "advanced."
- Promotes collaborative effort in learning, as students are not concerned with how they perform relative to others but with whether they achieved the set criterion.

SETTING THE PERFORMANCE CRITERIA (CRITERION)
- Set collaboratively by the institution's stakeholders, not by the educator's own opinion and standard.
- Subject to public commentaries and amendments before it can be used.
- Teachers must discuss with fellow educators (teaching the same subject) what the standards are, and the standards must be approved by their superiors.

                                | Criterion-Referenced Grading System                                                  | Norm-Referenced Grading System
Definition and basis of grading | Student's learning is evaluated against a set of criteria                            | Student's learning is evaluated against the set of learners
Application in education        | End-of-unit examinations check whether the learners have mastered the specific unit | The National Achievement Test (NAT) or standardized examinations compare a student with other high school students taking the same subject
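The cut scores mentioned under the criterion-referenced system can be illustrated with a short sketch. The specific cut scores (75 and 90) and the function name are invented for illustration; the notes only say that cut scores are used to place students into categories such as "basic," "proficient," and "advanced."

# Illustrative sketch: placing scores into categories using hypothetical cut scores.
def proficiency_category(score, proficient_cut=75, advanced_cut=90):
    """Return 'basic', 'proficient', or 'advanced' for a criterion-referenced score."""
    if score >= advanced_cut:
        return "advanced"
    if score >= proficient_cut:
        return "proficient"
    return "basic"

for score in (68, 77, 92):
    print(score, "->", proficiency_category(score))   # basic, proficient, advanced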
DEPED ORDER NO. 10, S. 2024 (MATATAG CURRICULUM)
- Performance = 40%
- Long test = 20%
- Written works = 40%

CUMULATIVE AND AVERAGING SYSTEMS OF GRADES
The Philippine education system has used two types of grading computations.

CUMULATIVE GRADING SYSTEM
- The grade of a student in a grading period equals his current quarter grade, which is assumed to carry the cumulative effects of the previous quarters.

Example - grade in Mathematics:
  1st grading: 85
  2nd grading: 87 -> (85 x 0.25) + (87 x 0.75) = 86.5, or 87
  3rd grading: 86 -> (87 x 0.25) + (86 x 0.75) = 86.25, or 86
  4th grading: 90 -> (86 x 0.25) + (90 x 0.75) = 89, or 89

AVERAGING GRADING SYSTEM
- The grade of the learner in a particular grading period equals the average of the grades obtained in the prior grading periods and the current grading period.

Example - grade in Mathematics:
  1st grading: 85, 2nd grading: 87, 3rd grading: 86, 4th grading: 90
  (85 + 87 + 86 + 90) / 4 = 348 / 4 = 87
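The two worked examples above can be reproduced with a short sketch. The function names and the round-half-up step are illustrative assumptions; the 25%/75% weighting and the simple average follow the formulas shown in the notes.

# Illustrative sketch of the cumulative and averaging grade computations described above.
import math

def cumulative_grade(previous_grade, current_quarter):
    """Cumulative system: 25% of the previous grading-period grade plus 75% of the current quarter grade."""
    return math.floor(0.25 * previous_grade + 0.75 * current_quarter + 0.5)  # round half up

def averaging_grade(quarter_grades):
    """Averaging system: mean of the current and all prior quarter grades."""
    return math.floor(sum(quarter_grades) / len(quarter_grades) + 0.5)

print(cumulative_grade(85, 87))           # 86.5  -> 87 (2nd grading)
print(cumulative_grade(87, 86))           # 86.25 -> 86 (3rd grading)
print(cumulative_grade(86, 90))           # 89.0  -> 89 (4th grading)
print(averaging_grade([85, 87, 86, 90]))  # 348 / 4 = 87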
DEPED ORDER NO. 42, S. 2016
Policy Guidelines on Daily Lesson Preparation for the K to 12 Basic Education Curriculum.

LEARNING OUTCOMES

STUDENT LEARNING OUTCOMES (SLO)
- Skills, competencies, and values that the learner should be able to demonstrate at the end of a subject or program.
- Learning outcomes span the cognitive, motor, affective, and spiritual domains.
- Periodically assessed to know:
  o whether the teaching strategy of the facilitator of learning is effective; and
  o how much learning has been achieved by the students.

BLENDED LEARNING
- Related approaches and ideas: reflective teaching, integrated learning, the outcome-based approach, hybrid learning, and Bloom's Taxonomy of Learning.

STUDENT LEARNING STYLES
- VISUAL LEARNING: learners tend to remember through visual representations (graphs, charts, pictures, etc.).
- AUDITORY LEARNING: learners tend to remember through hearing.
- KINESTHETIC LEARNING: learners prefer a hands-on approach.

BLOOM'S TAXONOMY OF LEARNING
- Remembering, understanding, applying, analyzing, evaluating, creating.

EDGAR DALE'S CONE OF LEARNING

SIX WAYS OF UNDERSTANDING
- Explanation, interpretation, application, perspective, empathy, self-knowledge.

4 LEVELS OF KNOWLEDGE
- FACTUAL KNOWLEDGE: ideas or specific data.
- CONCEPTUAL KNOWLEDGE: common classifications and categories.
- PROCEDURAL KNOWLEDGE: how things work; step-by-step actions and methods of inquiry (e.g., a thesis).
- METACOGNITIVE KNOWLEDGE: thinking about thinking; awareness and knowledge of one's own cognition; self-checking and self-reporting; realization.

DIRECT DEMONSTRATION METHOD
- Guided exploratory / discovery approach, inquiry method, problem-based learning (PBL), project method.

COOPERATIVE LEARNING APPROACH
- Peer tutoring, learning action cells, think-pair-share.

DEDUCTIVE/INDUCTIVE APPROACH
- Project method, inquiry-based learning.

AUTHENTIC ASSESSMENT
- Authentic assessment is concerned not only with samples and evidence but with the actual competencies and skills learned by the students.
- It develops competencies in Higher Order Thinking Skills (HOTS) such as analyzing, interpreting, synthesizing, and decision-making.
- Learning is assessed using a set of rubrics on actual performances and outputs; demonstration of skills is measured against a set criterion.
- It makes use of the three modes of assessment.
- Performance assessment is classified as:
  o Process-oriented performance-based assessment
  o Product-oriented performance-based assessment

THREE MODES OF ASSESSMENT

OBSERVATION tool
- Includes the data and information that the teacher collects from the learners.
- To make observation-based assessment objective and systematic, the following guidelines are recommended:
  o Observation must be done for ALL the learners.
  o Observations must be as frequent and regular as possible.
  o Observations must be recorded.
  o Observations should cover both routine and exceptional occurrences; exceptional or unique observations should be emphasized and given immediate attention.
  o The reliability of observation records is enhanced if multiple observations are gathered and synthesized.
- A checklist is the most common tool for this; it is the easiest to craft and simple to manage.
- A developmental checklist is an observation tool that requires recording the traits and learning behaviors to be evaluated; if crafted properly, it will showcase the progress of the learners.
- An interview sheet is another example of an observation tool; it records qualitative and more comprehensive data and consists of a list of questions the teacher intends to ask.
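As a small, hypothetical illustration of how a developmental-checklist record could be kept in line with the guidelines above (observe all learners, record frequently, flag exceptional occurrences, synthesize multiple entries), here is a sketch; the field names and sample entries are assumptions, not a format prescribed by the notes.

# Illustrative sketch of a developmental-checklist record (format and fields are assumed).
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ChecklistEntry:
    learner: str
    trait: str           # trait or learning behavior being evaluated
    observed: bool       # whether the behavior was demonstrated
    exceptional: bool    # exceptional/unique observations get immediate attention
    noted_on: date = field(default_factory=date.today)

records = [
    ChecklistEntry("Juan Dela Cruz", "uses the microscope correctly", observed=True, exceptional=False),
    ChecklistEntry("John Smith", "recites the poem with expression", observed=True, exceptional=True),
]

# Synthesizing several dated entries per learner makes the record more reliable.
for entry in records:
    flag = " (exceptional)" if entry.exceptional else ""
    print(f"{entry.noted_on} - {entry.learner}: {entry.trait}: {entry.observed}{flag}")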
It competencies needed to be learned consist of a list of questions the teacher intends to asks They still require supervision in order to master the craft PERFORMANCE SAMPLES assessment tool Guidance is needed for consistency Evaluation tools that will help in knowing the level of THE SKILLES learning already achieved by the learners The learners who has acquired the necessary skills and Pieces of evidences and tangible manifestations of the cognitive abilities as set by the learning objectives teaching strategy They can work with minimal to no supervision at all Purpose of using this type of tools THE EXPERT o To assess the growth and development of the students at The learner who integrates all the learned behavior in their various levels outputs o Parents are informed of the process of their children in They often work without supervision and can create outputs school flawlessly o Evaluate the strength and challenges of the academic program or even the teaching strategy Provides the teachers with the learner’s proficiency level Portfolios are the most common sample as they provide at specific criterion not just a collective number, for pieces of evidences of an individual’s skills, ideas, interests, teachers to work on the areas of the students that need and accomplishments. Can be a simple folder or a fancy more attention scrapbook that may contain the following: IMPLEMENTING PROCESS-ORIENTED PERFROMANCE BASED o Essays ASSESSMENT o Audio tapes Considerations must be done before using this assessment. Things o Conferences Notes need to take note before the implementation of such evaluation o Pictures TIME o Graphs/Charts Long in duration compared to traditional pen-and-paper o Art works assessment o Group reports Need a longer time to observe the skills and competencies o Compact disk that need to assess o Field reports THE STAKEHOLDERS ACTUAL PERFORMANCE assessment tool Inform immediate supervisor to use this type of assessment Tests and measures of learner’s achievement at a specific place and time They should be notified ahead of time so that he/she can THE DISCRIPTORS give suggestions that would be of help Spell out what is expected of students at each level of Inform the parents of the students of the examination as performance for each criterion they are the one facilitating learning at home and practically Tells the students more precisely what performance looks –they are the one financing the students like at various levels and how their work may be o This is to avoid questions when remarks on the distinguished from the work of others for each criterion improvement of their child has been released SIGNIFICANCE OF LEVEL OF PERFORMANCE THE CAPABILITY OF THE LEARNERS TO PERFORM THE 1. Set the expectations of the students TASK o They what their target is in order to pass the activity or Task should be designed depending on the general and the course acceptable capabilities of the learners o They are able to know what should be done and how it The set learning objective is the standard for this –so it is should be done critical to set them smartly 2. More consistent and more objective Includes the tangibility of the learning objective o Allows to discriminate which among the learners needs How capable are my students to perform such task? additional attention, who are mediocre and those who THE CRITERION perform excellently Set the criterion of performance even before giving the 3. 
PRODUCT-ORIENTED PERFORMANCE-BASED ASSESSMENT

PRODUCT-ORIENTED LEARNING COMPETENCIES
- Student performances can be described as the target learning outcomes that lead to a specific product or competency.
- Products are a wide array of student works that target specific skills. These skills are demonstrated through reading, writing, speaking, and listening, or through psychomotor skills that require physical abilities to perform a given task.
- The product is assessed with an assessment linked to the level of expertise it manifests:
  o Level 1: Does the finished product illustrate the minimum expected parts and functions?
  o Level 2: Does the finished output contain additional parts and functions on top of the minimum requirements?
  o Level 3: Does the finished product contain the minimum requirements, have additional features, and is it aesthetically pleasing?
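The three expertise levels above can be expressed as a short, hypothetical check. The boolean inputs are assumptions standing in for whatever evidence the evaluator gathers about the finished product; the 0 return value is added here only to cover products that miss the minimum and is not a level named in the notes.

# Illustrative sketch of assigning the product expertise level described above.
def product_level(has_minimum_parts, has_additional_features, aesthetically_pleasing):
    """Return 1, 2, or 3 following the level descriptions (0 if the minimum is not met)."""
    if has_minimum_parts and has_additional_features and aesthetically_pleasing:
        return 3  # minimum requirements plus additional features, aesthetically pleasing
    if has_minimum_parts and has_additional_features:
        return 2  # additional parts and functions on top of the minimum requirements
    if has_minimum_parts:
        return 1  # minimum expected parts and functions only
    return 0      # below the minimum (not a level named in the notes)

print(product_level(True, True, False))   # 2
print(product_level(True, True, True))    # 3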
TASK DESIGNING
The design of the task depends on what the teacher desires to observe as outputs of the students. Things to consider when designing the assessment:
- COMPLEXITY: Are my students capable of producing such a complex output? Is the project age-appropriate?
- APPEAL: Is the project enjoyable to create? Are the students going to enjoy doing it while moving towards the desired learning outcomes?
- CREATIVITY: How does one best present the expected output?
- GOAL-BASED: Will the output project the desired learning outcome? Will the project showcase the overall desired learning outcome?

CRAFTING THE RUBRICS
- Scoring rubrics in product-based performance assessment require judgement from the evaluating body.
- They are typically employed in evaluating essays and projects.
- Judgement concerning the quality of a given writing sample may vary depending upon the criteria that were set.

SETTING THE CRITERIA
- The measures for scoring rubrics are statements that identify "what really matters" in the final outputs.
- The criteria most often used for evaluating outputs are:
  o Quality
  o Creativity
  o Comprehensiveness
  o Accuracy
  o Aesthetics

PROCESS OF DEVELOPING SCORING RUBRICS
The development of scoring rubrics goes through a process. First step in the process: