Assessment in Learning Handout PDF
Document Details
Uploaded by ElatedKansasCity
Summary
This handout provides an overview of assessment in learning, including basic concepts, purposes, standards, and principles of classroom assessment. It discusses different assessment methods and types of tests, and emphasizes the importance of a balanced assessment approach.
Full Transcript
ASSESSMENT IN LEARNING HANDOUT

PART I: BASIC CONCEPTS
a. TEST – a tool used to measure a characteristic.
b. TESTING – the process of using tests.
c. MEASUREMENT – the assigning of numbers or quantities (e.g. scores).
d. NON-MEASUREMENT – concerned with qualitative information (e.g. observations).
e. ASSESSMENT – the collection and interpretation of information; a prerequisite to evaluation.
f. EVALUATION – the process of making judgments.
g. TRADITIONAL ASSESSMENT – the use of pen-and-paper objective tests.
h. ALTERNATIVE ASSESSMENT – the use of methods apart from pen-and-paper tests (e.g. performance, project, portfolio, journals).
i. AUTHENTIC ASSESSMENT – an assessment method that simulates true-to-life situations.

PART II: PURPOSES OF ASSESSMENT
a. ASSESSMENT FOR LEARNING - conducted to gain an understanding of the students’ knowledge and skills in order to guide instruction. This includes:
- Placement (done prior to instruction; assesses students’ needs and places them into learning groups)
- Diagnostic (done during instruction; determines recurring difficulties and the underlying causes of students’ learning problems; helps in planning remedial instruction)
- Formative (done during instruction; continuously monitors students’ level of attainment of the learning objectives; ex. exit Q&A, short quiz)
b. ASSESSMENT OF LEARNING - conducted after instruction and usually referred to as summative assessment. It clarifies what the students know and can do, tells the students’ level of competency, and reveals whether or not instruction achieved the curriculum outcomes. Results are usually expressed as marks or letter grades. (ex. Performance Tasks, Written Works, Quarter Tests)
c. ASSESSMENT AS LEARNING - a self-assessment method for teachers and students. Teachers, as assessors, are required to undergo training to understand and perform their roles well in assessing learners.

PART III: STANDARDS IN ASSESSMENT OF STUDENTS
1. Teachers should be skilled in choosing assessment methods appropriate for instructional decisions.
2. Teachers should be skilled in developing assessment methods appropriate for instructional decisions.
3. Teachers should be skilled in administering, scoring, and interpreting the results of both externally-produced and teacher-produced assessment methods.
4. Teachers should be skilled in using assessment results when making decisions about individual students, planning teaching, developing curriculum, and school improvement.
5. Teachers should be skilled in developing valid pupil grading procedures which use pupil assessments.
6. Teachers should be skilled in communicating assessment results to students, parents, other lay audiences, and other educators.
7. Teachers should be skilled in recognizing unethical, illegal, and otherwise inappropriate assessment methods and uses of assessment information.

*LEGAL STANDARDS:
1. NCBTS - National Competency-Based Teacher Standards (CMO #52, s. 2007 & DO #32, s. 2009)
- Domain 5: PLANNING, ASSESSING, REPORTING
2. PPST - Philippine Professional Standards for Teachers (DO 42, s. 2017)
- Domain 5: ASSESSMENT AND REPORTING
PPST Strand 5.1: Design, selection, organization and utilization of assessment strategies
PPST Strand 5.2: Monitoring and evaluation of learner progress and achievement
PPST Strand 5.3: Feedback to improve learning
PPST Strand 5.4: Communication of learner needs, progress and achievement to key stakeholders
PPST Strand 5.5: Use of assessment data to enhance teaching and learning practices and programs

PART IV: PRINCIPLES OF CLASSROOM ASSESSMENT

Principle 1: Clarity & Appropriateness of Learning Target
S – Specific (includes the “who”, “what”, and “where”; use only one action verb to avoid issues with measuring success)
M – Measurable (focuses on “how much” change is expected)
A – Attainable/Achievable (realistic given program resources and planned implementation)
R – Relevant/Result-oriented (relates directly to curriculum/activity goals)
T – Time-bounded/Terminal (focuses on “when” the objective will be achieved)

LEARNING TARGETS
Skills – student ability to demonstrate achievement-related skills.
Knowledge – student mastery of substantive subject matter.
Reasoning – student ability to use knowledge to reason and solve problems.
Affective/Disposition – student attainment of affective states such as attitudes, values, interests, and self-efficacy.
Products – student ability to create achievement-related products.

Principle 2: Appropriateness of Method
METHODS OF ASSESSMENT:
MODES OF ASSESSMENT:
1. Traditional – uses pen-and-paper tests for assessment, such as standardized and teacher-made tests.
Advantages: objective & easily administered
Disadvantages: time-consuming to prepare & prone to guessing/cheating
2. Performance – requires actual demonstration of skills or creation of a product. (ex. practical exam, projects, oral test)
Advantages: easy preparation & free from fraud
Disadvantages: time-consuming & subjective
3. Portfolio – ongoing gathering of indicators of student progress. (ex. working, showcase, & documentary portfolios)
Advantages: measures growth & development
Disadvantages: time-consuming & subjective

Principle 3: Balance
A balanced assessment sets targets in all domains of learning and intelligences.
Multiple Intelligences - suggests that people have different kinds of intelligences and that traditional views of intelligence are too limited.

Principle 4: Validity
The degree to which the assessment measures what it intends to measure. It is the most important criterion of a good assessment.
Ways to Establish Validity:
1. Face Validity – examines the physical appearance of the instrument; makes the instrument readable and understandable.
2. Content Validity – examines whether the objectives of the assessment reflect the curricular objectives.
3. Criterion-related Validity – shows the correlation of a set of scores with another external predictor or measure.
a. Concurrent – two measures taken at a close interval
b. Predictive – two measures taken at a longer interval
4. Construct Validity – establishes the traits/factors that influence scores in a test.
a. Convergent – factors are correlated
b. Divergent – factors are not correlated

FACTORS AFFECTING VALIDITY:
1. Unclear directions
2. Vocabulary and sentence structure
3. Ambiguity
4. Inadequate time limits
5. Overemphasis of easy-to-assess aspects of the domain
6. Inappropriate test items
7. Poorly constructed test items
8. Test too short
9. Improper arrangement of items
10. Identifiable pattern of answers

Principle 5: Reliability
The consistency of scores obtained by the same person when retested using the same or a parallel instrument, or when compared to other students who took the same test.
Methods in Finding Reliability:
1. Test-retest (the same test + long interval)
2. Equivalent Forms (parallel tests + short interval)
3. Test-retest with Equivalent Forms (parallel tests + longer interval)
4. Split-half (one test split into 2 parts)
5. Kuder-Richardson (one test analyzed per item)

IMPROVING TEST RELIABILITY:
1. Test length
2. Spread of scores
3. Item difficulty
4. Item discrimination
5. Time limits
*Item difficulty formula: no. of students who got the item right / total number of students
*Item discrimination formula: (no. right in UPPER GROUP – no. right in LOWER GROUP) / ½ of total number of students

Principle 6: Fairness
A fair assessment provides all students with an equal opportunity to demonstrate achievement.
Key to Fairness:
1. Students have knowledge of learning targets and assessment.
2. Students are given equal opportunity to learn.
3. Students possess the prerequisite knowledge and skills.
4. Students are free from teacher stereotypes.
5. Students are free from biased assessment tasks and procedures.
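The two formulas listed under Improving Test Reliability in Principle 5 appear to correspond to item difficulty (item 3) and item discrimination (item 4). The short Python sketch below works through both under that reading. The function names and the sample class of 40 students are illustrative assumptions, not part of the handout.

```python
def item_difficulty(num_correct, total_students):
    """Proportion of students who answered the item correctly."""
    return num_correct / total_students


def item_discrimination(upper_correct, lower_correct, total_students):
    """(upper group correct - lower group correct) / half of all students."""
    return (upper_correct - lower_correct) / (total_students / 2)


# Hypothetical item: 40 students split into an upper and a lower group of 20.
# 15 students in the upper group and 5 in the lower group got the item right.
difficulty = item_difficulty(15 + 5, 40)
discrimination = item_discrimination(15, 5, 40)
print(difficulty, discrimination)
```

With these made-up counts, half the class answers the item correctly, and the upper group outperforms the lower group, so the discrimination value is positive.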
Principle 7: Practicality and Efficiency
When assessing learning, the information obtained should be worth the resources and time required to obtain it.
Factors to Consider:
1. Teacher Familiarity with the Method
2. Time Required
3. Complexity of the Administration
4. Ease of Scoring
5. Ease of Interpretation
6. Cost

Principle 8: Continuity
Assessment takes place in all phases of instruction. It can be done before, during, and after instruction.
Activities prior to instruction:
- understand students’ backgrounds
- understand their needs & interests
- present the expected outcomes
- plan group and individual instruction
Activities during instruction:
- monitor progress, gains, & difficulties
- adjust instruction when necessary
- provide praise and feedback
- judge students in line with the outcomes
Activities after instruction:
- describe each student’s goal attainment
- report to students and parents
- record assessment for evaluation
- evaluate instruction, curriculum, and IMs

Principle 9: Authenticity
Features of Authentic Assessment:
- meaningful performance task
- clear standards and public criteria
- quality products and performance
- positive interaction between assessee & assessor
- emphasis on meta-cognition and self-evaluation
- learning that transfers
Criteria of Authentic Achievement:
1. Disciplined Inquiry - in-depth understanding leading to the formulation of ideas
2. Integration of Knowledge - whole rather than fragments approach
3. Value Beyond Evaluation - assessment has value beyond the classroom

Principle 10: Communication
- Assessment targets and standards should be communicated.
- Assessment results should be communicated to important users.
- Assessment results should be communicated to students through direct interaction or regular ongoing feedback on their progress.

Principle 11: Positive Consequences
- Assessment should have a positive consequence to students; that is, it should motivate them to learn.
- Assessment should have a positive consequence to teachers; that is, it should help them improve the effectiveness of their instruction.

Principle 12: Ethics
- Teachers should free the students from harmful consequences of misuse or overuse of various assessment procedures, such as embarrassing students and violating students' right to confidentiality.
- Teachers should be guided by laws and policies that affect their classroom assessment.
- Administrators and teachers should understand that it is inappropriate to use standardized student achievement scores to measure teaching effectiveness.

PART V: ALTERNATIVE ASSESSMENTS

A. Performance-based Assessment - a process of gathering information about students’ learning through actual demonstration of essential and observable skills and creation of products that are grounded in real-world contexts and constraints.
Reasons for Selecting PBA:
- dissatisfaction with selected-response tests
- demand for procedural knowledge
- negative impact of conventional tests
- experiential, discovery-based, integrated, and problem-based learning
Types of Performance-based Assessment:
1. Demonstration Type – no product (ex. cooking demo, group presentations)
2. Creation Type – requires a tangible product (ex. project-making, research paper)
Methods of Performance-based Assessment:
1. Written Open-ended – a written prompt is given (ex. essays, open-ended tests)
2. Behavior-based – direct observation (ex. structured/formal, unstructured/informal)
3. Interview-based – one-on-one conference (ex. structured (set questions), unstructured (free))
Criteria in Selecting a Good Performance Assessment:
1. Generalizability – able to do similar tasks with ease
2. Authenticity – anchored to the real world
3. Multiple Foci – can achieve multiple outcomes
4. Teachability – can be taught
5. Feasibility – practicality in space, time, cost
6. Scorability – can be evaluated
7. Fairness – fair to all students

B. Portfolio Assessment - a purposeful, ongoing, dynamic, and collaborative process of gathering multiple indicators of the learner's growth and development. Portfolio assessment is also performance-based but more authentic than any performance-based task.
Reasons for Selecting Portfolio Assessment:
- it tests what really happens in the class
- it offers multiple indicators of progress
- it gives students responsibility to learn
- it allows students to document reflections
- teachers can reflect on their instruction
- teachers gain a new role in the assessment
Principles Underlying Portfolio Assessment:
1. Content Principle - the portfolio should reflect subject matter.
2. Learning Principle - the portfolio should create active learners.
3. Equity Principle - the portfolio should demonstrate learners' learning styles and multiple intelligences.
Types of Portfolios:
1. Working portfolio – shows day-to-day work
2. Showcase portfolio – shows best works
3. Documentary portfolio – shows progress

PART VI: DEVELOPING RUBRICS
A rubric is a measuring instrument used in rating performance-based tasks. It is the “key to corrections” for assessment tasks designed to measure the attainment of learning competencies that require demonstration of skills or creation of products of learning. It offers a set of guidelines or descriptions for scoring different levels of performance or qualities of products of learning. It can be used in scoring both the process and the products of learning.
Other Scoring Instruments:
1. Checklists
- present the observed characteristics of a desirable performance or product
- the rater checks the trait/s that has/have been observed in one’s performance or product.
2. Rating Scale
- measures the extent or degree to which a trait has been satisfied by one’s work or performance
- uses 3 or more levels to describe the work or performance
Two Types of Rubric:
1. Analytic - breaks down the assessment into specific criteria or skills; it addresses an activity with many components and offers a specific grade.
2. Holistic - provides a general overview of the quality of a student's work; it is much briefer, generally addresses fewer components, and gives no specific grades.
Elements of a Rubric:
1. Competencies
2. Performance Task
3. Criteria and Indicators
4. Performance Levels
5. Qualitative & Quantitative Descriptors
Guidelines in Developing Rubrics:
1. Identify the important and observable features or criteria.
2. Clarify the meaning of each trait or criterion.
3. Describe the gradations of a quality product or excellent performance.
4. Aim for an even number of levels.
5. Keep the number of criteria reasonable.
6. Arrange the criteria in the order in which they are likely to be observed.
7. Determine the weight/points of each criterion.
8. Put the descriptions of a criterion or a performance level on the same page.
9. Highlight the distinguishing traits of each performance level.
10. Check if the rubric encompasses all possible traits of a work.
11. Check again if the objectives of assessment were captured.

PART VII: TEST
A test is an instrument or systematic procedure which typically consists of a set of questions for measuring a sample of behavior. It is a systematic form of assessment that answers the question, "How well does the individual perform, either in comparison with others or in comparison with a domain of performance tasks?"
Purpose of Tests:
A. Instructional Use
- identify learners who need enrichment exercises
- measure class progress
- assign grades/marks
- guide activities for specific learners
B. Guidance Use
- assist learners to set goals
- improve understanding of children’s learning problems
- prepare data to guide parent-teacher conferences (PTC)
- determine career interests of learners
- predict success in the future
Classification of Tests:
A. Standardized Tests – made by experts
1. Ability Tests – combine the 3Rs plus reasoning (ex. Otis-Lennon School Ability Test)
2. Aptitude Tests – measure potential in a specific area (ex. Differential Aptitude Test)
B. Teacher-made Tests – made by teachers
1. Objective Type – has a specific answer
a. Limited Response Type - requires the student to select the answer from a given number of alternatives or choices.
ex. Multiple Choice Test - has a stem/question and 3-5 options/alternatives with only one key/correct answer; the remaining options are called distracters or decoys.
ex. True-False or Alternate Response - has declarative statements that one has to mark as true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, agree or disagree, and the like.
ex. Matching Type - consists of two parallel columns. The items in Column I or A, for which a match is sought, are called premises, and the items in Column II or B, from which the selection is made, are called responses.
ex. Rearrangement - ordering or assembling items on a predetermined basis. The items can be arranged in chronological or geographical order, or by magnitude, importance, or quality.
b. Free Response Type/Supply Test - requires the student to supply or give the correct answers.
ex. Simple Recall/Short Answer - uses a direct question that can be answered by a single word, phrase, number, or symbol.
ex. Completion Test (Fill-in the Blanks) - consists of an incomplete statement that can also be answered by a word, phrase, number, or symbol.
ex. Identification Test - presents a stimulus in the form of a definition, description, explanation, picture, diagram, or any illustration, for which the examinee supplies the appropriate answer.
ex. Labelling Test - asks the examinee to name the parts of an illustration, which can be given in the form of a diagram, picture, or drawing.
ex. Enumeration - requires multiple responses in the form of a list.
2. Essay Type/Subjective - provides the freedom of response that is needed to adequately assess students' ability to formulate, organize, integrate, and evaluate ideas and information or apply knowledge and skills.
Types of Essay Tests:
a. Restricted – limits content and response
b. Extended – provides relative freedom
Other Types of Test:
1. Psychological – measures behavior
2. Educational – measures results of instruction
3. Survey – measures general level of achievement
4. Mastery – measures degree of mastery of specific areas
5. Verbal – oral responses are highly needed
6. Non-verbal – responses are through drawings/images
7. Power – items have increasing difficulty
8. Speed – measures the number of finished items

PART VIII: SUGGESTIONS IN TEST CONSTRUCTION
General Suggestions:
1. Use assessment specifications as a guide to item/task writing.
2. Construct more items/tasks than needed.
3. Write the items/tasks ahead of the testing date.
4. Write each test item/task at an appropriate reading level and difficulty.
5. Write each test item/task in a way that it does not provide help in answering other test items or tasks.
6. Write each test item/task so that the task to be performed is clearly defined and it calls forth the performance described in the intended learning outcome.
7. Write a test item/task whose answer is one that would be agreed upon by the experts.
8. Whenever a test is revised, recheck its relevance.
Specific Suggestions:
A. Objective Type of Tests:
1. Alternate-Response/T or F
- avoid broad, trivial statements and the use of negative words, especially double negatives.
- avoid long and complex sentences.
- avoid multiple facts or including two ideas in one statement, unless a cause-effect relationship is being measured.
- if opinion is used, attribute it to some source unless the ability to identify opinion is being specifically measured.
- use a proportional number of true statements and false statements.
- true statements and false statements should be approximately equal in length.
2. Matching Type
- use only homogeneous material in a single matching exercise.
- include an unequal number of responses and premises, and instruct the pupil that responses may be used once, more than once, or not at all.
- keep the list of items to be matched brief, and place the shorter responses at the right.
- arrange the list of responses in logical order.
- indicate in the directions the basis for matching the responses and premises.
3. Multiple Choice Question
- the stem of the item should be meaningful by itself and should present a definite problem.
- the item stem should include as much of the item as possible and should be free of irrelevant material.
- use a negatively stated stem only when significant learning outcomes require it, and stress/highlight the negative words for emphasis.
- all the alternatives should be grammatically consistent with the stem of the item.
- an item should contain only one correct or clearly best answer.
- items used to measure understanding should contain some novelty, but not too much.
- all distracters should be plausible/attractive.
- verbal associations between the stem and the correct answer should be avoided.
- the relative length of the alternatives/options should not provide a clue to the answer.
- the alternatives should be arranged logically.
- the correct answer should appear in each of the alternative positions an approximately equal number of times, but in random order.
- use of special alternatives such as “none of the above” or “all of the above” should be done sparingly.
- always have the stem and alternatives on the same page.
- do not use multiple choice items when other types are more appropriate.
B. Supply Type of Tests:
- Word the item/s so that the required answer is both brief and specific.
- Do not take statements directly from textbooks.
- A direct question is generally more desirable than an incomplete statement.
- If the item is to be expressed in numerical units, indicate the type of answer wanted.
- Blanks for answers should be equal in length and, as much as possible, in a column to the right of the question.
- When completion items are to be used, do not include too many blanks.
C. Subjective Type of Tests:
1. Essay Type
- restrict the use of essay questions to those learning outcomes that cannot be satisfactorily measured by objective items.
- construct questions that will call forth the skills specified in the learning standards.
- phrase each question so that the student’s task is clearly defined or indicated.
- avoid the use of optional questions.
- indicate the approximate time limit or the number of points for each question.
- prepare an outline of the expected answer or a scoring rubric in advance.

PART IX: NON-COGNITIVE LEARNING OUTCOMES
Affective Assessment Tools:
A. Observational Techniques
1. Anecdotal Records - recording of factual descriptions of students’ behavior.
- determine in advance what to observe, but be alert for unusual behavior.
- analyze observational records for possible sources of bias.
- observe and record enough of the situation to make the behavior meaningful.
- make a record of the incident right after observation, as much as possible.
- limit each anecdote to a brief description of a single incident.
- keep the factual description of the incident and your interpretation of it separate.
- record both positive and negative behavioral incidents.
- collect a number of anecdotes on a student before drawing inferences concerning typical behavior.
- obtain practice in writing anecdotal records.
2. Peer Appraisal - useful in assessing personality characteristics, social relations skills, and other forms of typical behavior.
ex. Guess-who Technique - obtains peer judgment or peer ratings. Students name the classmates who best fit each of a series of behavior descriptions. The result indicates their reputation in the peer group.
ex. Sociometric Technique - calls for nominations, but students indicate their choice of companions for some group situation or activity. The result indicates their total social acceptance.
3. Self-report Technique - used to obtain information that is inaccessible by other means, including reports on the students’ attitudes, interests, and personal feelings.
4. Attitude Scales - used to determine what a student believes, perceives, or feels. Attitudes can be measured toward self, others, and a variety of other activities, institutions, or situations.
ex. Rating Scale – uses points/scale
ex. Semantic Differential – uses adjectives
ex. Likert Scale – uses points with values

PART X: DEVELOPING ASSESSMENT INSTRUMENTS
P – Planning
C – Creating
A – Administering
E – Evaluating

PART XI: QUALITIES OF ASSESSMENT INSTRUMENTS
A. Validity – measures what it intends to measure
B. Reliability – consistency of scores
C. Administrability – easy to administer
D. Scorability – easy to score
E. Interpretability – easy to describe
F. Economy – saves time and effort

PART XII: SHAPES, DISTRIBUTIONS, & DISPERSIONS OF DATA
A. SYMMETRICAL SHAPE SCORE DISTRIBUTION
1. Normal Distribution/Mesokurtic/Bell: MEAN = MEDIAN = MODE
B. SKEWED SHAPE SCORE DISTRIBUTION
1. P – positively skewed
   R – skewed to the right
   D – difficult test (pre-test)
   L – low scores
2. N – negatively skewed
   L – skewed to the left
   E – easy test (post-test)
   H – high scores
TYPES OF KURTOSIS:
1. Leptokurtic – more peaked than a normal distribution, with longer tails; there are more chances of outliers
2. Platykurtic – flatter than a normal distribution, with shorter tails; fewer extremes
3. Mesokurtic – indicates a normal “bell-shaped” distribution

PART XIII: DESCRIPTIVE STATISTICS
A. MEASURES OF CENTRAL TENDENCY
- numerical values which describe the average or typical performance of a given group in terms of certain attributes.
- a basis for determining whether the group is performing better or poorer than other groups.
1. MEAN – arithmetic average; easily affected by outliers/extreme scores
2. MEDIAN – centermost score; used when there are outliers/extreme scores
3. MODE – the score that occurs most frequently
B. MEASURES OF VARIABILITY
- describe how spread out the scores are. The larger the measure of variability, the more spread out the scores are and the group is said to be heterogeneous; the smaller the measure of variability, the less spread out the scores are and the group is said to be homogeneous.
1. Range
- the difference between the highest and lowest scores (formula: HS - LS = RANGE);
- the counterpart of the mode and, like it, unreliable/unstable;
- used as a quick, rough estimate of variability.
2. Standard Deviation
- the counterpart of the mean, also used when the distribution is normal or symmetrical;
- reliable/stable and so widely used.
3. Quartile Deviation
- defined as 1/2 of the difference between quartile 3 (75th percentile) and quartile 1 (25th percentile) in a distribution;
- the counterpart of the median, also used when the distribution is skewed.
COUNTERPARTS: mean and standard deviation; median and quartile deviation; mode and range.
C. MEASURES OF RELATIVE POSITION:
1. Percentile Ranks
- indicate the percentage of scores that fall below a given score;
- appropriate for data representing an ordinal scale, although frequently computed for interval data.
2. Standard Scores
- measures of relative position used when data represent an interval or ratio scale;
- a z score expresses how far a score is from the mean in terms of standard deviation units and allows scores from different tests to be compared.
3. Stanine Scores
- standard scores that tell the location of a raw score in a specific segment of a normal distribution which is divided into 9 segments, numbered from a low of 1 through a high of 9.
4. T-scores
- tell the location of a score in a normal distribution, on a scale with a mean of 50 and a standard deviation of 10.

PART XIV: GIVING GRADES
Grades are symbols that represent a value judgment concerning the relative quality of a student's achievement during a specified period of instruction.
Grades are important to:
- inform students and other audiences about the student's level of achievement.
- evaluate the success of an instructional program.
- provide students access to certain educational or vocational opportunities.
- reward students who excel.

Student Progress Reporting Method

Principles for Effective Grading:
- discuss your grading procedures with students at the very start of instruction.
- make clear to students that their grade will be based purely on achievement.
- explain how other elements, like effort or personal-social behaviors, will be reported.
- relate the grading procedures to the intended learning outcomes or goals/objectives.
- gather valid evidence, like test results, reports, presentations, projects, and other assessments, as bases for computing and assigning grades.
- take precautions to prevent cheating on tests and other assessment measures.
- return all tests and other assessment results as soon as possible.
- assign weights to the various types of achievement included in the grade.
- tardiness, weak effort, or misbehavior should not be charged against the achievement grade of a student.
- be judicious/fair and avoid bias, but when in doubt (as in the case of a borderline student), review the evidence. If still in doubt, assign the higher grade.
- grades are black and white; as a rule, do not change grades.
- keep pupils informed of their class standing or performance.

Types of Scores:

PART XV: CONDUCTING PARENT-TEACHER CONFERENCE
- Set the goals and objectives of the conference ahead of time.
- Begin the conference in a positive manner.
- Present the student's strong points before describing the areas needing improvement.
- Encourage parents to participate and share information.
- Plan a course of action cooperatively.
- End the conference with a positive comment.
- Use good human relations skills during the conference.
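The grading principle of assigning weight to the various types of achievement can be illustrated with a small Python sketch. The component names echo the examples given under Assessment of Learning in Part II. The weights and scores are hypothetical placeholders chosen only for illustration, and they are not taken from the handout or from any official grading policy.

```python
def weighted_grade(scores, weights):
    """Combine component scores (0-100) using weights that must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical weights and scores, for illustration only.
example_weights = {"written_works": 0.3, "performance_tasks": 0.5, "quarter_test": 0.2}
example_scores = {"written_works": 80, "performance_tasks": 90, "quarter_test": 70}

final_grade = weighted_grade(example_scores, example_weights)
print(round(final_grade, 2))
```

Keeping the weights in one place makes it easy to show students, at the start of instruction, how each type of achievement counts toward the grade.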
PART XVI: OTHERS
ABCD of Objectives (Audience, Behavior, Condition, Degree)
K-12 GRADING SYSTEM:
NOTE:
FAILURE IN 1-2 LEARNING AREAS = REMEDIAL
FAILURE IN 3 OR MORE LEARNING AREAS = RETAIN
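As a worked illustration of the measures described in Part XIII, the following Python sketch computes the central tendency, variability, and relative position of a small set of scores using the standard library `statistics` module. The sample scores are made up. The use of the population standard deviation (`pstdev`) and of the conventional T-score scale (50 + 10z) are assumptions, since the handout does not specify either.

```python
import statistics


def describe(scores):
    """Return the central tendency and variability measures for a list of scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),
        "range": max(scores) - min(scores),        # HS - LS
        "sd": statistics.pstdev(scores),           # assumption: population SD
        "quartile_deviation": (q3 - q1) / 2,       # half of (Q3 - Q1)
    }


def z_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd


def t_score(z):
    """Assumed conventional T scale: mean of 50, standard deviation of 10."""
    return 50 + 10 * z


sample_scores = [60, 70, 80, 80, 90, 100]  # hypothetical class scores
summary = describe(sample_scores)
z_top = z_score(100, summary["mean"], summary["sd"])
print(summary)
print(round(z_top, 2), round(t_score(z_top), 2))
```

Different textbooks compute quartiles slightly differently, so the quartile deviation from `statistics.quantiles` may not match a hand computation exactly for small data sets.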