University of Education, Winneba Assessment PDF

ASSESSMENT

Written by: Prof. Kojo Donkor Taale, Ernest Ngman-Wara (PhD)
Reviewed by: Ahmed Kobina Amihere
Institute for Distance and e-Learning
University of Education, Winneba

© IDeL - UEW
All rights reserved including translation. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or duplication in any information storage or retrieval system, without prior permission in writing from the Director, IEDE, University of Education, P. O. Box 25, Winneba.

Published 2015 by The Institute for Educational Development and Extension, University of Education, P O Box 25, Winneba
Tel: (0)3323 22 046
Fax: (0)3323 22 497
Email: [email protected]

© Institute for Educational Development and Extension, 2014
ISBN xxxxxxx

Credits
Graphic Design and Layout: E Owusu P-Appiah-Boateng P-Kwamena Tenteh S Kwesi Nyan
Printed in Ghana

CONTENTS

UNIT 1 PRINCIPLES OF ASSESSMENT
INTRODUCTION
SECTION 1 SIGNIFICANCE OF ASSESSMENT
  What is assessment?
  Why study assessment?
  Effective Assessment
SECTION 2 MODES OF ASSESSMENT
  Formative assessment
  Summative assessment
  Continuous assessment
  Characteristics of continuous assessment
SECTION 3 USES OF CRITERION-REFERENCED AND NORM-REFERENCED TESTING
  Norm-referenced tests
  Applications of norm-referenced tests
  Raw scores
  Rank ordering
  Advantages and disadvantages of rank ordering
  Percentile scores
  Criterion-referenced tests
  Definition of CRT
  Characteristics of CRT
  Setting standards
  Procedures for setting standards
  Usefulness of CRT
  Weaknesses of CRT
SECTION 4 PRINCIPLES AND PRACTICE OF CONTINUOUS ASSESSMENT
  Advantages of continuous assessment
  Disadvantages of continuous assessment
SECTION 5 IMPLEMENTATION OF CONTINUOUS ASSESSMENT
  History of continuous assessment in Ghana
  Implications of continuous assessment for the teacher
  Continuous Assessment Skills
  Teachers' activities in continuous assessment
SECTION 6 PROBLEMS AND IMPROVEMENT OF TEACHER'S ROLE IN CONTINUOUS ASSESSMENT
  Problems of implementing continuous assessment
  Overcoming the problems of continuous assessment

UNIT 2 PREPARING CLASSROOM ACHIEVEMENT TESTS
INTRODUCTION
SECTION 1 GENERAL PRINCIPLES OF TEST CONSTRUCTION
  Planning the test
  Listing instructional objectives of subject matter
  Criteria for selecting appropriate instructional objectives for testing
  Constructing the table of specification
  Determining appropriate test item types
  Item preparation stage
  Test item evaluation
SECTION 2 TEST RELIABILITY
  Meaning of reliability
  Types of reliability
  Equivalent forms reliability
  Test-retest reliability
  Test-retest with equivalent forms
  Internal consistency (split-half) reliability
  Interjudge (intrajudge) agreement
  Kuder-Richardson method of estimating reliability
  Standard error of measurement (SEM)
SECTION 3 TEST VALIDITY
  Types of validity
  Face validity
  Content validity
  Construct validity
  Concurrent validity
  Predictive validity
  Validity and test development
  Approaches to test validation
SECTION 4 ANALYSIS OF TEST ITEMS
  Meaning of test item analysis
  Some methods of test analysis
  Uses of item difficulty level
  Uses of the item discrimination index
SECTION 5 PREPARATION OF MARKING SCHEMES
  Scoring objective test items
  Scoring essay tests
  Analytical approach
  Preparing an analytical marking scheme
  Merits of the analytical marking scheme
  Impression or holistic approach

UNIT 3 TYPES OF TESTING INSTRUMENTS
INTRODUCTION
SECTION 1 SHORT ANSWER ITEMS
  The short-answer item
  Uses of the short-answer item
  Knowledge of terminology
  Knowledge of specific facts
  Knowledge of method or procedure
  Word the item so that the required answer is both brief and specific
  Advantages of short-answer items
  Disadvantages of short-answer items
SECTION 2 TRUE-FALSE AND MATCHING TEST ITEMS
  True-false test item
  Advantages
  Disadvantages of true-false tests
  Suggestions for constructing true-false items
  Matching test items
  Advantages
  Limitations of the matching item
SECTION 3 MULTIPLE-CHOICE TEST ITEMS
  Multiple-choice test item
  Deciding the number of alternatives in a multiple-choice item
  Advantages of multiple-choice items
  Limitations of multiple-choice items
  Preparation of questions or items
  Sampling of course content
  Control of students' response
  Influence on learning
  Reliability
SECTION 4 ESSAY TEST
  Forms and uses of essay question
  Approaches for the assessment of cognitive abilities
  Preparation of questions
  Sampling of course content
  Control of student's response
  Open-book testing
SECTION 5 ASSESSMENT OF BASIC PROCESS SKILLS
  Science process skills
  Skills and processes
  Basic skills
  Types of skill tasks
SECTION 6 ASSESSMENT OF PROCESS SKILLS - OBSERVATIONAL TECHNIQUES
  Distinction between inference and observation
  Observational techniques
  Checklists
  Rating scales

UNIT 4 ORGANISATION AND ANALYSIS OF TEST SCORES
INTRODUCTION
SECTION 1 PRESENTATION OF DATA
  Frequency distributions
  Procedure for compiling a frequency distribution
  Grouped representation of a frequency distribution
  Frequency polygon
  Histogram
  Frequency polygon - another method
  Shapes of distribution
  Kurtosis
  Positively skewed distribution
  Normal distribution
  Multiple bar chart
  Pie charts
SECTION 2 MEASURES OF CENTRAL TENDENCY
  Measures of central tendency
  The mean
  The mean of a frequency distribution
  Mean of a grouped frequency distribution using the assumed mean or short-cut method
  Characteristics of the arithmetic mean
  The mode
  The median
  How to find the position of the median
SECTION 3 MEASURES OF DISPERSION
  The range
  Quartiles and the five-number summary
  The mean deviation
  The standard deviation
  The standard deviation of grouped data
  The standard deviation of a normal distribution
SECTION 4 THE RELATIONSHIP BETWEEN TWO VARIABLES
  Scatter diagrams
  Regression analysis
  The least squares method
  Interpretation of the correlation coefficient
  Rank correlation
SECTION 5 USES OF TEST RESULTS
  Functions of measurement
  Placement and grouping of students
  Diagnosis of learning problems
  Curriculum development
  Reporting student progress to parents
  Use in school administration

COURSE INTRODUCTION

In science and other branches of education, different assessment methods and techniques are used. The purpose of this course is to present an overview of customary examination and assessment procedures as they are currently used in science education, and to describe their main characteristics.
An analysis of the strengths and weaknesses of the different assessment procedures and techniques will also be made. This, it is hoped, will help science teachers to choose the assessment procedures suitable for particular purposes.

As you go through this course, you should look for answers to the following questions:
• how would you define evaluation?
• what are the functions of the evaluation?
• what are the objectives of the evaluation?
• what kinds of information are collected?
• what criteria are to be used to judge the merit and worth of an evaluated object?
• who is served by an evaluation?
• what processes are used in carrying out an evaluation?
• what methods of enquiry are used?
• who does the evaluation?
• by what standards should the evaluation be judged?
• what is assessment?

This course comprises four (4) units. In Unit 1, we shall look at the Principles of Assessment. Units 2 and 3 will be devoted to Preparing Classroom Achievement Tests and Types of Testing Instruments respectively. Unit 4 will deal with Organisation and Analysis of Test Scores.

It is my hope that you will take the course seriously and devote much time to the assignments and activities so that, by the end of the course, you will have become a better teacher than before.

By the end of this course, you should be able to:
• describe the various principles of assessment
• distinguish between the various forms of assessment
• list the various types of testing instruments
• construct various forms of classroom tests
• organize and analyse test scores
• report information obtained from assessment objectively to the various stakeholders in education

COURSE PLANNER

You may use this page as your course planner. Write the dates that you expect to complete each unit in this course. When you actually complete a unit, write the date you completed it. This will help you to keep track of your work and monitor your progress throughout this course.

Unit 1: Principles of Assessment (planned completion date: ________; actual completion date: ________)
Unit 2: Preparing Classroom Achievement Tests (planned completion date: ________; actual completion date: ________)
Unit 3: Types of Testing Instruments (planned completion date: ________; actual completion date: ________)
Unit 4: Organisation and Analysis of Test Scores (planned completion date: ________; actual completion date: ________)

UNIT 1 PRINCIPLES OF ASSESSMENT

CONTENTS
SECTION 1 SIGNIFICANCE OF ASSESSMENT
SECTION 2 MODES OF ASSESSMENT
SECTION 3 USES OF CRITERION-REFERENCED AND NORM-REFERENCED TESTING
SECTION 4 PRINCIPLES AND PRACTICE OF CONTINUOUS ASSESSMENT
SECTION 5 IMPLEMENTATION OF CONTINUOUS ASSESSMENT
SECTION 6 PROBLEMS AND IMPROVEMENT OF TEACHER’S ROLE IN CONTINUOUS ASSESSMENT

[Image: Mate masie, the adinkra symbol used in the UEW crest: "I have heard what you have said"]

You are welcome to the first unit of the course on Assessment in Integrated Science, which deals with the fundamental basis of methods and assessment in Integrated Science. As a teacher you have incredible power to influence students' learning. Where does this power come from? From assessment! Do you know that you can:
• motivate students to learn?
• modify the study habits of students in the class?
• change students' attitudes and develop in them new interests and directions for learning?

As a teacher you can achieve these through the various modes of assessment. However, many teachers look at assessment as an unpleasant task. Because of that, assessment is frequently left until the end of a unit or lesson. Then some kind of test is hastily developed and given to the students. The test results are then often put into the form of grades and given to students, mainly for ranking them and providing reports on them to their parents. There is much more to assessment than this.
With careful planning and sufficient time to prepare the form of assessment, evaluation can be an integral part of the teaching and learning process. You will discover the need for making assessment an integral part of the teaching and learning process as you go through the sections of this unit.

After going through this unit successfully, you should be able to:
• explain why educators are concerned with assessment
• describe the various forms (or modes) of assessment
• state the usefulness of the various modes of assessment in the teaching and learning process
• state the uses of criterion- and norm-referenced testing
• explain the principles of continuous assessment
• suggest ways to improve the implementation of continuous assessment in classrooms.

UNIT 1 SECTION 1 SIGNIFICANCE OF ASSESSMENT

You are welcome to the first section of the course on Methods and Assessment. In this section we shall seek answers to the question, "What is assessment?" We shall also look at the purposes of assessment, or the significance of assessment in teaching and learning. You will need some writing materials (paper and pen) to go through this section successfully. Please relax and let us go through this section together.

By the end of this section, you should be able to:
• explain why educators are concerned with assessment
• state the purposes of assessment
• mention the various modes of assessment

Educators generally are concerned with the worth of such things as curricula, teaching methods and teaching materials, and one major source of information is the performance of the pupils who are being taught. This information is obtained through assessment of the teaching and learning process.

Activity 1.1
Take a piece of paper and write in not more than five sentences your answer to the question, 'What is assessment?'
Keep your write-up as we shall look at it at the end of this section.

What is assessment?

Assessment, like many other educational concepts, has various labels. Testing, appraisal, diagnosis, measurement, performance review and evaluation are some of the labels used in education to refer to assessment, the process that helps to determine pupils' skills and knowledge. This wide array of descriptors, however, often serves to confuse rather than clarify what occurs in the assessment process.

Assessment is the systematic process we use to gather data that serve as feedback to enable us to instruct students more effectively. At first glance, this definition might seem too simplistic and narrow to serve the purposes of educators. But it encompasses a wide array of information sources. Let us concentrate on three key ideas from the definition and their significance.

A systematic approach signifies the importance of a purposeful step-by-step plan in which one gathers information about a student. When assessment proceeds in a haphazard manner, the information collected is probably not useful.

Data implies information that is factual and can be communicated with accuracy for use in decision making. Using inaccurate data to make decisions creates educational problems for both the child and the teacher.

If educators are to instruct students effectively, they must link good assessment practices to effective instruction. Too often assessment is conducted only to place students in programmes or to label or classify them. When assessment data are instructionally relevant, we can determine successful interventions or evaluate teaching strategies.

Assessment covers a variety of data collection procedures that are used in evaluating educational outcomes. These procedures include observation, testing and interview.
These sources of information vary in their degree of intrusiveness, standardization and the level of inference they possess. Observations are non-interactive reports which are translated into numbers according to a rule; interviews are interactive dialogues with others in the environment; tests are structured interactions with correct and incorrect answers. A test has a set of specified, uniform tasks to be performed by students, these tasks being an appropriate sample from the knowledge or skills in a broader field of content.

The data collected through the various tools of assessment are used to make a value judgement on the educational aspect(s) assessed. This process is termed evaluation. Educators evaluate students' progress by comparing student performance to the criteria of success based on instructional objectives.

Activity 1.2
Now compare your write-up on assessment with the explanation above. Are there similarities and differences between the two?

There are various modes of assessment. These include formative, summative and continuous assessment. These modes of assessment will be discussed in Section 2 of this unit.

Why study assessment?

At this point you might very well ask why you should have any concerns about assessment in education. After all, assessment is not the primary function of the schools. Why, then, must one learn something about assessment as it applies to the educational setting?

In answering this question, three assumptions will be made. The first is that schools exist in order to accomplish certain aims or purposes and that these purposes can be expressed in terms of desired changes in pupil behaviour. The second is that instructional programmes in the schools are formulated in order to accomplish these objectives.
The third assumption is that the objectives are not likely to be accomplished successfully unless provision is made for continuing evaluation of the instructional programme.

Why is assessment in the teaching and learning process based on these assumptions? The answer to this question is provided by the views of various authors.

Mayer (1986) makes clear what the objectives of assessment should be. According to him, "... the ultimate objective of assessment is to ascertain the effectiveness of teaching ... not as many believe, to rank pupils in class and assign them grades. The purpose of education is not to prepare pupils to pass examinations, rather assessment is a way of evaluating the effectiveness of the entire educational enterprise. Ranking pupils is a subsidiary function for the assessment instrument itself."

Harlen (1978) identifies the following reasons for assessment of pupils as part of the evaluative process. These are, to:
• gather information about a wide range of pupil characteristics as feedback for making decisions;
• accumulate records of progress;
• provide information from which teachers can obtain insights into their own effectiveness;
• inform other teachers who have to make decisions about the pupils.

Deale (1975) advances other reasons which supplement those of Harlen. These are, to:
• allocate pupils to sets;
• compare progress of pupils under different teachers;
• compare new teaching materials with old;
• give incentive to learning and to remembering;
• inform parents about progress;
• inform employers or establishments of higher education about attainment;
• decide upon entering pupils of higher education about attainment.
Pidgeon and Yates (1968) provide the following statement of purposes:
• diagnosis of pupils' strengths and weaknesses;
• assessment of the extent to which pupils have benefited from a course of instruction;
• evaluation of the effectiveness of methods of teaching;
• prediction of pupils' performance;
• placement of pupils in the most beneficial educational situation.

Macintosh and Hale (1976) introduce guidance as another purpose, which is spelt out in the following:
• diagnosis - to assess progress and to find out how the pupils are assimilating what is being taught.
• evaluation - to assess the effectiveness of the teaching, which again can lead to specific action.
• guidance - to assist pupils in making decisions about the future, whether it concerns choice of a subject or a course, or whether it is to help in choosing a suitable career.
• prediction - to discover potential abilities and aptitudes and to predict probable future successes, whether in school or outside.
• selection - to determine which are the most suitable candidates for a course, a class or university.
• grading - to assign pupils to a particular group, to discriminate between the individuals in a group.

Effective Assessment

Not all assessment is useful and necessary. For example, over-analysing performance can have a negative effect on the student. In order to make the process a positive one, it is important to think about what you wish to achieve for yourself and the student during your evaluation.
Rayment (2006) claims that effective assessment
• Evaluates the extent of learning taking place
• Is an essential part of the learning process for both teachers and learners
• Acts as a systematic process of obtaining evidence
• Promotes effective learning when used in the correct way
• Determines the next steps needed to continue the process of teaching and learning
• Offers positive ways to reaffirm talent, ability and understanding
• Has many purposes: it allows both teachers and learners to see levels of achievement and areas for improvement
• Takes many forms: e.g., it can be verbal, written, collaborative, personal, and spontaneous among other variations, depending on the circumstances
• Assists teachers to deliver accurate and informative lessons and feedback
• Can be used to monitor social, academic, and behavioural progress

When is assessment most effective?

The following are essential golden rules for making sure that you get the best results out of evaluation and assessment techniques. Assessment is most effective when
• it is shared with the students involved and, where appropriate, with peers, other teachers, parents and carers
• it is focused on specific areas of learning and individual progress
• it is used to help students understand how to improve
• it is clearly linked to targets and achievements
• it is honest, positive, and appraising of achievement and effort
• it is concise, not dismissive
• it is relative to the subject
• it recognizes progress and clearly explains how to improve

The significance of assessment in the classroom can be summarised by the following statements:
• Assessment points the way for both teachers and students to direct their efforts. It provides information that helps teachers to direct and redirect their instruction.
  It identifies concepts on which the class is doing well and provides diagnostic clues to concepts requiring additional work or a need for change in instructional strategy.
• Parents use grades from assessment as guides to know how well their children are progressing.
• Colleges and universities want transcripts of course work before they make admission decisions.

The various modes of assessment are based on the purpose of assessment. In other words, the purpose of the assessment determines the mode of assessment to use. These modes of assessment include formative assessment, summative assessment and continuous assessment. They will be discussed in detail in the next section.

Learning exercise
1. List five reasons why assessment should form an integral part of the teaching and learning process.
2. Explain the term assessment as used in the section.

References
Deale, R N (1975). Assessment and Testing in the Secondary School, Schools Council Examinations Bulletin, 32, Evans Methuen Educational, London.
Harlen, W (1978). Evaluation and Teacher's Role, Schools Council Research Study. Routledge and Kegan Paul Ltd, London.
Macintosh, H G and Hale, D E (1976). Assessment and the Secondary School Teacher. Routledge and Kegan Paul Ltd, London.
Mayer, W V (1986). Assessment: practice guide to improving the quality and scope of assessment, Instruments, UNESCO.
Pidgeon, D and Yates, A (1968). An Introduction to Educational Measurement, Routledge and Kegan Paul Ltd, London.
Rayment, T (2006). 101 Essential lists on assessment. Continuum International Publishing Group, New York, USA.

UNIT 1 SECTION 2 MODES OF ASSESSMENT

Welcome to Section 2 of Unit 1. In Section 1 we attempted to explain the term assessment and discussed its importance in the classroom.
We concluded that assessment provides information for both pupil and teacher on their performance so as to direct them towards effective learning and teaching respectively. We also found out that assessment may be carried out by the teacher for different purposes. Hence, we have various modes of assessment, namely formative assessment, summative assessment and continuous assessment. Each of these modes of assessment is used to investigate specific issues in the teaching and learning process. We shall now discuss each of the modes of assessment in detail.

When you go through this section successfully, you should be able to:
• explain the different modes of assessment
• state the importance of each mode of assessment in the teaching and learning process
• state the difference between the various modes of assessment
• explain why continuous assessment is more embracing than formative and summative assessment

Formative assessment

Formative assessment is carried out during the instructional period to provide feedback to students and teachers on how well the material is being taught and learned. Feedback to students reinforces successful learning and identifies the learning errors that need correction. Feedback to the teacher provides information for modifying instruction and prescribing group and individual remedial work.

A classroom teacher uses formative assessment during instruction whenever he/she asks a question either to probe the students' understanding of a concept being taught or to find out what preconceptions students bring into the class. Formative assessment also takes place when the teacher asks questions on a learning segment before moving on to the next segment in the same teaching/learning period.

Teacher-made mastery tests for each segment of instruction (such as a unit or chapter) are used in formative assessment. The mastery tests directly measure the intended learning outcomes of the segment.
Observational techniques are also useful in monitoring students' progress and identifying learning errors. But because formative assessment is directed toward improving learning and instruction, the results are typically not used for assigning course grades.

Activity 2.1
List some of the activities you would do in your class during instruction which tend to improve your teaching. How do they compare with those described so far in this section?

Summative assessment

Another mode of assessment a classroom teacher carries out is summative assessment. It is like an end-of-the-road appraisal because it typically comes at the end of a course or unit of instruction. It is designed to determine the extent to which the instructional objectives have been achieved and is used primarily for assigning course grades or certifying student mastery of the intended learning outcomes. It also provides information for judging the appropriateness of the course objectives and the effectiveness of the instruction.

Summative assessment relies on a broad sampling of the relevant content, and focuses more generally on all objectives of the unit of instruction. Summative assessment is usually norm-referenced, where students are ranked based on the number of objectives in which they show proficiency. Depending on the approach to teaching, summative assessment could also be criterion-referenced, where students are to show listed skills or to display some specified abilities.

Techniques used usually include teacher-made achievement tests, ratings on various types of performance (e.g., laboratory, oral report) and evaluation of products (e.g., themes, drawings, research reports). Can you think of other techniques used in Ghana for summative assessment?
The terminal examinations organised by the West African Examinations Council (WAEC) can be classified as summative techniques for assessing the Junior and Senior High School programmes. These examinations are detached from the classroom situation and are carried out by an external body which is not involved in the development of the programmes or courses of study.

The information from summative assessment cannot be used immediately to rectify situations (or may not be used to correct anomalies as the programme unfolds) because it comes too late in a learner's career. Therefore, extreme care should be exercised with summative assessment procedures.

Continuous assessment

The new educational reforms in Ghana which began in 1987 contain specific guidelines for assessment at the basic school and senior high school levels. Continuous assessment marks and external examination scores are used to determine the final grades of the student at the end of the respective programmes.

The rationale for introducing continuous assessment in many countries is to minimise the element of risk associated with taking a single terminal examination. It is believed that any student who works conscientiously throughout the period of study should not fail the final certificate examination.

The emphasis on continuous assessment is not limited to Ghana. Several other African countries, notably Nigeria, Kenya, Liberia and Zambia, have adopted the same strategy. Despite differences in details, the policies in all these countries have common features.

In years past, a lot of emphasis was put on assessing the cognitive domain of learning only. Continuous assessment covers the cognitive, affective and psychomotor domains of learning, providing a more valid assessment of the learner's overall ability and performance. Continuous assessment is obviously superior to the one-shot terminal examination.
Continuous assessment brings about integration of assessment with the teaching and learning process and with feedback to improve the latter. If feedback is effective then meaningful remedial action can be taken and a wider scope for the realization of educational objectives would then be possible.

But one can ask: What is continuous assessment?

There is no one universal definition of continuous assessment. However, there is some agreement on the concept of continuous assessment. For our purpose we adopt the definition of continuous assessment as a method of evaluating the progress and achievement of students in educational institutions. It aims at getting the truest possible picture of each student's ability while, at the same time, helping each student to develop his or her abilities to the fullest. It is a method whereby the final grading of students takes account, in a systematic way, of their whole performance during a given period of schooling.

It can be deduced from the definition that continuous assessment embraces formative and summative assessment.

The need for a more effective way of assessing students is not limited to developing countries alone. According to Weston (1990), some countries organised a special conference in 1988 to deliberate on finding a more constructive role of assessment as an aid to purposeful learning. The conference expressed misgivings about many assessments similar to what pertains in Africa and supported a number of suggestions which were put forward as marks of a more positive approach to assessment. These suggestions include:
• More emphasis on formative feedback. This includes the recording of the pupil's success as well as the diagnosis of learning difficulties.
• Assessing more aspects of achievement. This will most certainly call for more variety in the modes of assessment.
• Better specification of learning targets for all aspects of achievement to assist pupils in understanding what they are expected to achieve.
• More individualised pacing of learning to enable pupils to reach goals when they are ready.
• Involving the pupil as a partner in assessment to help pupils take responsibility for their own learning.
• Forms of certification which ensure progression to worthwhile training after compulsory schooling. (This needs to be related to the existing system of education and training.)

The summary as presented by Weston (1990) shows that even advanced countries have realised that a single terminal assessment, as formerly organised, is not very satisfactory and that more effective ways of assessing and recording pupils' performance should be explored and implemented. The teacher, who knows his/her pupils far better than an external assessor, should be given a greater say in the assessment of the pupil. Moreover, every child, if properly guided, can succeed in his/her studies. It is also not proper to consider only the cognitive aspects of the pupils' success (or otherwise); the psychomotor and affective aspects, which are ignored in the traditional forms, must also be considered.

Characteristics of continuous assessment

A number of principles or characteristics of continuous assessment which make it appealing to educators can be derived from the discussion so far. These are well stated in a Workbook on Continuous Assessment (Federal Ministry of Education, 1979), which stressed that continuous assessment must be systematic, comprehensive, cumulative and guidance-oriented.

It must be systematic in the sense that it requires an operational plan, namely the measurements to be made, at what periods, the taking and filing of records and the determination of the evaluative instruments.

It should be comprehensive in that many types of assessment tools are used, thus covering all aspects of student behaviour.
These tools give comprehensive information on the student. It ought to be cumulative in that scores are built up as teaching goes on until the end, when they are aggregated into a single score. Also, any decision made at any time about a student takes into consideration previous decisions made about him/her. It is guidance-oriented because the comprehensive information obtained about the student serves as a basis from which to guide the development and growth of the student.

Continuous assessment may therefore be viewed as a better form of assessing students' performance, in which evaluation takes place as often as possible. It also caters for all aspects of learning and for the attitudinal and skills development of the student. We will, later in this book, discuss further the implementation of, and the problems associated with, continuous assessment.

Summary
The different modes of assessment discussed may be used variously by educators. They provide useful information on aspects of the student and the teaching and learning process for the benefit of both the student and the teacher. The information obtained can be used to improve the teaching/learning process and to form a sound basis for future innovations and reforms in the educational process.
Formative assessment of all kinds informs both the teacher and the student. It provides periodic feedback that enables teachers to modify the methods or pacing of teaching/learning activities, and it indicates where difficulties may be occurring. Summative assessment determines end-of-course achievement and is used for assigning grades or certifying mastery of objectives. The grading is done at the very end, with little or no chance for remedial work. Tests, term reports, summaries of portfolios, or a combination of any of these at the end of a unit (and end-of-year assessment) are typical sources of summative data. Summative assessment is frequently based upon cognitive gains and rarely takes into consideration other areas of the intellect.

Continuous assessment can be defined as a procedure concerned with finding out, in a systematic and comprehensive manner, the overall performance of a student after a given set of learning activities. Continuous assessment may be used formatively at the time it is taking place but may contribute subsequently to summative assessment. It has been widely viewed as the most efficient way of assessing the educational process because it can ascertain the effectiveness of the teaching/learning process. It makes use of various types of assessment tools such as teacher-made tests, standardised tests, discussions and observations.

Learning exercises
1. Explain how formative assessment and summative assessment differ.
2. In your own words describe continuous assessment.
3. State the importance of formative assessment in the classroom.

References
1. Federal Ministry of Education (1979). Workbook on Continuous Assessment. The Educational Evaluation Unit, Federal Ministry of Education: Lagos, Nigeria.
2. Gronlund, N E (1985). Measurement and Evaluation in Teaching (5th Edition). Macmillan: New York.
3. Ogunniyi, M B (1984). Educational Measurement and Evaluation. Longman Nigeria Ltd: Lagos.
4. Rowntree, D (1977). Assessing Students: How Shall We Know Them? Harper and Row Ltd: London.
5. Tindal, G A and Marston, D B (1990). Classroom-Based Assessment: Evaluating Instructional Outcomes. Macmillan Publishing Company: New York.
6. Weston, P B (1990). Assessment, Progression and Purposeful Learning in Europe. A study for the Commission of the European Communities. National Foundation for Educational Research in England and Wales (NFER).

UNIT 1 SECTION 3
USES OF CRITERION-REFERENCED AND NORM-REFERENCED TESTING

In Section 2 we discussed various modes of assessment which we can use to evaluate the effectiveness of instruction and other aspects of the educational process. The modes of assessment were classified on the basis of the nature of measurement and their use in classroom instruction. We can also classify assessment procedures according to how the results are interpreted.

There are two basic ways of interpreting student performance on tests and other evaluation instruments. One is to describe the performance in terms of the relative position held in some known group (eg, Kofi typed better than 90 percent of his classmates). The other is to describe directly the specific performance that was demonstrated (eg, Ama typed 40 words per minute without error). The first type of interpretation is associated with the norm-referenced test (NRT) and the second with the criterion-referenced test (CRT). Both types of interpretation are useful.

The purpose of this section is to address the place of NRT and CRT in classroom instruction and other aspects of the educational process. We shall look at the nature, purposes and some applications of these interpretations.
When you go through this section successfully you should be able to:
• state the purposes of norm-referenced and criterion-referenced tests
• distinguish between norm-referenced and criterion-referenced tests
• delineate the major types of decisions deduced from NRT
• set standards for CRT using judgement of test items
• set standards for CRT using judgement of testees
• determine when to use the various types of norm-referenced indices

Norm-referenced tests
Test development, around the turn of the twentieth century, emphasised the measurement of individual differences. This emphasis grew with the need to screen large numbers of army recruits into various military assignments during World Wars I and II. Psychologists used the formal testing techniques of Binet to measure individual differences among these recruits on psychological constructs such as memory and comprehension, and also on specialised vocational aptitudes. During this period, tests became known as norm-referenced: the performance of the testee was interpreted in relation to a specified normative group.

This idea was soon adopted in education. The same type of information gave educators an idea of a student's relative standing with respect to others in the class or age range. For example, a student may rank 10th in his or her classroom group on a test. The primary purpose of the NRT is to compare individuals in order to document their similarities and differences. Typically, an NRT measures a more general category of testees' competencies. It looks for the differences in performance that exist within the class.

A key feature in constructing an NRT is the selection of items of average difficulty and the elimination of items that all students are likely to answer correctly.
This procedure provides a wide spread of scores so that discrimination among pupils at various levels of achievement can be made more reliably. This is useful for decisions based on relative achievement, such as selection, grouping and relative grading.

Applications of norm-referenced tests
Let us now discuss in detail some of the uses or applications of norm-referenced tests. We must make three major decisions when using norm-referenced evaluations. First, the tester must be satisfied that the type of educational decision to be made involves screening and eligibility or programme certification. If a different type of decision is needed, then norm-referenced evaluations are not appropriate. Second, the tester must summarise student performance through one of many outcome metrics. The particular metric adopted should communicate information in the simplest and most effective manner. Third, the tester must display student performance as it relates to the norm group. Fig 3.1 depicts the three decisions that are necessary when using norm-referenced evaluation.

As in their early use to screen military recruits, norm-referenced tests are used to screen large numbers of students. Schools are faced with a critical question: Who should receive specialised services? Schools cannot individualise instruction for everyone, so every teacher in every classroom must decide who should be grouped together and which curriculum materials should be used. To make such decisions, they usually use norm-referenced tests.

Can you think of situations in our educational system where norm-referenced tests are usual? The results of the terminal examinations of the Junior and Senior High Schools, the BECE and the WASSCE (formerly the SSSCE) respectively, are used to select students from these levels into programmes of study at higher institutions.
These terminal examinations can therefore be regarded as norm-referenced tests.

For screening students and making decisions about eligibility for specialised programmes, norm-referenced testing is very appropriate. The screening process involves measuring a large group of students and determining who performs significantly low and warrants more intensive and focused assessment. Determining eligibility is simply an extension of this process. More measurements are administered to students to confirm significant departures from normal (normative) standards and to establish performance levels that are aligned with specific criteria for placement into a programme. The screening procedure we have just described may not be common in our schools. However, similar measures are taken in our special schools to place children properly into their ability groups.

Because such tests have a broad sampling plan, taking items from a wide range of content and representing a middle range of difficulty, they usually generate stable scores. They often include a wide range of items, varying in difficulty and content. Every student gets something correct, but no one scores 100%. Furthermore, students take the same test and can therefore be compared on the same scale.

Norm-referenced tests are well suited to the evaluation of educational programmes. These programmes may include a wide range of instructional and social interventions. They may take time to implement, ranging from several weeks to several months or years. Also, a programme is delivered to many individuals of varying backgrounds and skill levels. Finally, a programme is often implemented by more than one professional. All these considerations require a systematic data-collection process because many differences already exist. To be useful on a broad level, any measurement device (or tool) must encompass a range of items that are appropriate to the range of students over an extended period of time.
Not all items will be appropriate to all students at all times, but enough will be applicable to generate meaningful scores. Norm-referenced tests are well suited to this purpose because they broadly sample content and difficulty level.

How will you evaluate the performance of the students after administering a norm-referenced test? Compare your answer with the following:

Evaluating performance in a norm-referenced test
A number of outcome metrics are available for evaluating students' performance on programmes, including raw scores, rank ordering, percentiles and standard scores. We shall discuss the first three in detail.

Raw scores
A raw score is the number of items the student has answered correctly or incorrectly on a given test. There are two strategies for calculating raw scores. In the first, we count the number of test items answered correctly. In the second, we divide the number of correct items by the total number of test items to obtain the percentage correct. Such scores are typically not emphasised in norm-referenced tests because they provide no basis for comparison with a peer group. However, the raw score is the starting point for all norm-referenced scores. On its own, a raw score is not appropriate for comparison with other students' scores; used alone, it can only be interpreted in criterion-referenced evaluation, not in norm-referenced evaluation.

Rank ordering
Let us now look at rank ordering. I am sure you are familiar with it because you use it all the time after class examinations. The easiest way to compare students with each other is to rank them from the highest to the lowest. This is called rank ordering. Each score in the distribution is ordered from the highest to the lowest and is assigned an ordinal number, or rank, that expresses each student's position in the group.
Generally, we use raw scores to rank students. When ordering students, a rank of 1 is the highest. Tied scores are usually given the average of the ranks they occupy. The lowest score receives a rank equal to the number of students in the group (unless it is tied). An example of rank ordering appears in Table 3.1, which presents 30 students, their scores and their ranks. In this example, where scores range from 12 to 56, Student 9 is given a rank of 1.0 for having the highest score. Students 12 and 23 both received a score of 45. Because they are tied for the fourth highest score, the ranks of 4.0 and 5.0 are averaged, and each student is assigned a rank of 4.5. Student 27, who received the next highest score (42), is assigned a rank of 6.0 because counting resumes at the next available rank after the averaging process.

When do we use rank ordering? Many resources in education are limited and often only the individuals who most need special services can be served. Scholarships are often awarded to students through rank ordering. Can you think of other ways we use rank ordering in the classroom?

Advantages and disadvantages of rank ordering
Rank ordering is, however, not appropriate when the absolute score is more important than the relative score; when rankings are bunched (many scores fall on a few specific values on the scale); or when the scale is very limited and restricted, which causes the rankings to bunch at a few scores.

The main advantage of rank ordering is the ease with which it can be completed. A glance at the range from the highest to the lowest performer provides useful information about the distribution and clearly depicts those at the top and bottom. The problem with rank ordering is that real interpretations are difficult to make. The relative position in a ranking says nothing about the student's actual skill in terms of mastery or learning.
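As an illustration only, the rank-ordering rule described above (rank 1 for the highest score, tied scores sharing the average of the ranks they occupy) can be sketched in Python. The function name `average_ranks` and the seven scores are our own assumptions; they are not the data of Table 3.1.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest.

    Tied scores share the average of the ordinal positions they occupy.
    Returns the ranks in the same order as the input scores.
    """
    counts = Counter(scores)          # how many students obtained each score
    rank_of = {}
    next_rank = 1                     # next unused ordinal position
    for value in sorted(counts, reverse=True):
        n = counts[value]
        # Positions next_rank .. next_rank + n - 1 are averaged.
        rank_of[value] = next_rank + (n - 1) / 2
        next_rank += n                # counting resumes after the tied block
    return [rank_of[s] for s in scores]


# Hypothetical scores: two students tie on 45 for the fourth highest score.
scores = [56, 50, 48, 45, 45, 42, 12]
print(average_ranks(scores))  # [1.0, 2.0, 3.0, 4.5, 4.5, 6.0, 7.0]
```

As in the text, the two students who tie for the fourth highest score each receive 4.5, and the next score receives 6.0.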
Ranking is most informative for performers at the extremes; many individuals in the middle of the distribution are left out of the analysis and interpretation. Finally, the interpretation of a rank is always a function of the number of students who are ranked. For example, the 5th student from the top in a group of 10 has a different standing from the 5th student in a group of 100.

Let us now look at another way of presenting students' scores. We saw earlier that while the raw score is useful in communicating how many items the student has answered correctly, it does not tell how well the student has done in relation to other students. For example, one student, Dari, may have had 13 correct items out of 20, or 65% correct. But how well does this compare with other students at Dari's class level? If the test was easy, then 13 correct may not be very good. On the other hand, if the test was difficult, a score of 13 might be very high in relation to his/her peers. Further, the rank order tells us whether Dari performs better or worse than Kojo, but it does not consistently communicate Dari's relative standing. For example, a rank of 12 would be excellent in a group of 100 students but poor in a group of 15.

Percentile scores
One way to find how well Dari is doing in relation to a known group of students at his/her class level is to transform the raw score into a percentile or percentile rank. The term percentile refers to the percentage of individuals in a fixed standardization sample with equal or lower scores. If Dari's score of 13 translates to a percentile of 39, then 39% of his/her peers scored at or below 13 and 61% scored above 13. Similarly, if Dari's raw score of 13 has a percentile rank of 85, then 85% of the population on which the test is based scored at or below 13, while 15% of the standardization sample scored above 13.
Percentiles range from 1 to 99 and never include 0 or 100. The 50th percentile is equal to the median (half of the students score above and half below). Although percentiles communicate meaning by reference to a hypothetical group of 100 students, their actual interpretation must always be in reference to the group upon which the test was based.

Calculating percentile ranks
Percentile ranks are easy to calculate; the calculation involves the six steps listed below. In order to understand the procedure, I suggest you take a set of scores your students obtained in a recent test and present them in a table. Create six columns. Now follow the steps below and complete the table.

1. Rank all scores from high to low.
2. For each score value, count the number of students that obtained it (frequency).
3. Convert the number of students at each score value to the percentage of students at each score value by dividing the former by the total number of students in the group (percent at score).
4. Multiply the percentage of students at each score value by 0.5 (0.5 x percent at score).
5. Determine the percentage below each score value (percent below). For the highest score, subtract its percent at score from 100%. For each subsequent score, subtract its percent at score from the percent below obtained for the preceding score.
6. Add the values from the last two steps (0.5 x percent at score + percent below) to get the percentile rank.

Compare one of your calculations with this example (the class consists of 30 students):

Example
Step 1: Raw score is 56.
Step 2: Frequency at score = 1 (number of students who obtained the score).
Step 3: Percent at score = 1/30 x 100 = 3.33%
Step 4: 0.5 x percent at score = 0.5 x 3.33% = 1.665%
Step 5: Percent below = 100 - 3.33 = 96.67%
Step 6: Percentile rank = 1.665 + 96.67 = 98.335, or about 98

Now that you can calculate the percentile ranks of your students' scores, when do you use them?

Using percentile ranks
Percentiles and percentile ranks are most appropriate when we need to base relative performance on a consistent standard (a hypothetical group of 100). They are not appropriate, however, when we want to examine changes in performance or differences in performance among students over time. The group must include a sufficient number of students (generally 100 students are needed) to convert raw scores into percentiles. Furthermore, percentiles may not be appropriate when the distribution of scores is very non-normal (eg, where there are more high scores than low scores, or vice versa). In such cases, percentiles may be oversensitive where the scores bunch and undersensitive where the scores are few.

Percentiles and percentile ranks are probably the most appropriate summary metric for a norm-referenced evaluation. They are very easy to interpret, and they communicate relative standing with precision and consistency. Many non-educators understand their meaning or can be taught to understand it with little difficulty. The only disadvantage is that they are subject to incorrect manipulation or interpretation because they are ordinal, not interval.

Norm-referenced tests provide useful information for screening and eligibility and for programme certification. They arose from the study of individual differences. Essentially, a norm-referenced test addresses the degree to which individuals are similar or different, which is the heart of screening and eligibility/selection for educational purposes.
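The six-step percentile-rank calculation described earlier in this section can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name `percentile_ranks` and the class data are assumptions of ours, chosen so that the top score reproduces the worked example (one student out of 30 with a score of 56).

```python
from collections import Counter


def percentile_ranks(scores):
    """Return {score value: percentile rank}.

    percentile rank = 0.5 x percent at score + percent below
    """
    total = len(scores)
    counts = Counter(scores)              # Step 2: frequency at each score
    ranks = {}
    below = 0                             # students with a strictly lower score
    for value in sorted(counts):          # work upward from the lowest score
        pct_at = 100 * counts[value] / total   # Step 3: percent at score
        pct_below = 100 * below / total        # Step 5: percent below
        ranks[value] = 0.5 * pct_at + pct_below  # Steps 4 and 6
        below += counts[value]
    return ranks


# Hypothetical class of 30: one student scores 56, the rest score 40.
scores = [56] + [40] * 29
ranks = percentile_ranks(scores)
print(round(ranks[56]))  # 98
```

The sketch accumulates percent below from the lowest score upward, which gives the same values as subtracting percent at score from 100% downward from the highest score, as Step 5 describes.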
Because it addresses broad outcomes across many different individuals, programme certification generally incorporates norm-referenced tests. However, neither the skills acquired nor the competence of the individual is indicated.

Criterion-referenced tests
We shall now turn our attention to criterion-referenced tests (CRT), in which assessment is based on precisely defined skills and knowledge. Interpretation of performance is often stated in terms of standards. Usually such standards are translated into mastery versus non-mastery, which in turn is based on a cut-off: the level of performance above which there is mastery and below which there is non-mastery. Standard setting can be based on:
• judgement of test items
• judgement of individual testees
• analysis of applicable reference groups

Ultimately, any decision that puts students into a mastery or non-mastery category may be in error. A student can be incorrectly judged to have mastered (a false positive or Type I error) or not to have mastered (a false negative or Type II error). Now let us look at the definition, characteristics and purpose of CRT.

Definition of CRT
Criterion-referenced tests focus on specific knowledge and skills of students. Rather than comparing the student with a peer group, as in a norm-referenced test, student performance is evaluated with reference to a criterion body of knowledge or skill. Probably the most important difference between NRT and CRT is the manner in which tests are developed and scores are interpreted. In sum, an NRT concerns learning in general terms, whereas a CRT delineates the learning of specific skills and knowledge.

Characteristics of CRT
There are three characteristics of CRT:
• Test items are developed from specific performance objectives directly linked to an instructional domain.
• The score is based on an absolute, not a relative, standard.
• The test measures mastery by using specific standards.

In CRT, the focus of assessment is on what the student actually can or cannot do on specific skills and knowledge tasks. Two components must be addressed when constructing CRTs. These are:
• which specific tasks should be included
• how performance should be judged as mastered or not mastered.

Technically, CRT implies a well-defined domain. In practice, most CRTs also utilize established standards of acceptable performance. The three critical questions about standards to which we have to find answers are the following:
• How are standards established?
• What is mastery or non-mastery?
• What are the implications of making such judgements?

Setting standards
There has been a lot of debate on setting standards, but the consensus reached was that the standards are arbitrary. This is because efficiency cannot be discussed in absolute terms. So standards can be either absolute or relative. With absolute standards, interpretation does not depend on the performance of all other testees and no comparisons are made between individuals. With relative standards, comparisons are made between individuals. Nevertheless, any standard requires some judgements, which must be:
• meaningful to the people who make them
• made by qualified people, and
• sensitive to the purpose of the test.

Procedures for setting standards
A variety of procedures, most of which are based on judgements, are available for setting standards. Three general methods, based on judgement of test content, judgement of individual testees and judgement of groups of testees, have been reviewed. The first two methods will be discussed here.

Judgement of test content
The greatest attention has been on judgement of test content, in which the standard setters analyse test items to determine how many would be passed to reflect minimal proficiency.
Five basic steps are employed:
• Selecting judges: Judges should be experts in the knowledge or skills measured by the test, and they should be representative of other experts in the field.
• Defining borderline knowledge and skills: In defining borderline knowledge the judges must first understand what the test measures and how the test scores will be used. Then they should describe someone who represents the borderline between acceptable and unacceptable skill. Consensus on this definition must be reached.
• Training the judges in a particular judgement process: Training must focus on exposing the judges to the range of exemplars that are within, above or below the borderline definition.
• Collecting judgements: After adequate training, judgements can be collected independently to avoid creating arbitrary agreement.
• Aggregating judgements to determine a passing score.

Judgement of individual testees
Judgement of individual testees involves collecting data on their performance. Judgements should reflect the recommendations of qualified individuals, focus on the measures used to test the individual, and be established appropriately. Also, the information on the individual's performance should be accurate and current.

One of the methods used to establish cut-off points based on judgement of individual testees is the borderline-group method. In the borderline-group method, judges identify individuals who are borderline in their knowledge of the subject matter. The following five steps should be used in applying this system to define mastery scores:
• Select appropriate judges.
• Define adequate, inadequate and borderline levels of the skills and knowledge tested.
• Identify the borderline testees.
• Obtain the scores of the borderline testees.
• Set the cut-off score at the median of this group.

The borderline-group method can be used in the classroom.
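To make the last two steps of the borderline-group method concrete, here is a minimal Python sketch. The student names, scores and the judges' list of borderline testees are hypothetical, and treating a score equal to the cut-off as mastery is our own assumption, since the text speaks only of performance above and below the cut-off.

```python
from statistics import median


def borderline_cutoff(scores, judged_borderline):
    """Set the cut-off score at the median score of the borderline group.

    scores            -- dict mapping each testee to a test score
    judged_borderline -- testees the judges identified as borderline
    """
    borderline_scores = [scores[name] for name in judged_borderline]
    if not borderline_scores:
        raise ValueError("the judges identified no borderline testees")
    return median(borderline_scores)


# Hypothetical class scores (out of 20) and judges' borderline list.
scores = {"Ama": 14, "Kofi": 9, "Dari": 13, "Kojo": 11, "Esi": 17, "Yaw": 12}
borderline = ["Dari", "Kojo", "Yaw"]

cutoff = borderline_cutoff(scores, borderline)   # median of 13, 11, 12
# Assumption: a score equal to the cut-off counts as mastery.
masters = sorted(name for name, s in scores.items() if s >= cutoff)
print(cutoff, masters)  # 12 ['Ama', 'Dari', 'Esi', 'Yaw']
```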
The best use of this procedure is in corroborating the results of mastery tests and/or mastery levels. Minimum standards for many curriculum-embedded tests have been established without any empirical support, which may cause serious problems. Teachers therefore need to question seriously the levels that are suggested and adjust them upward or downward to minimise errors in judgement. Student performance on other tasks can be useful in corroborating the results of mastery tests.

All standard-setting methods result in a dichotomous decision that separates testees into successful (mastery) and unsuccessful (non-mastery) groups. There are two ways of labelling students incorrectly. First, a student may be incorrectly labelled a master, which is known as a false positive or Type I error. Second, a student may be incorrectly labelled a non-master, which is known as a false negative or Type II error.

These two error types are rarely equally serious. For example, it is a serious problem to award certification incorrectly to a pilot. A false positive error in this case is potentially far more tragic than a false negative one. False positive decisions are also serious in many other areas, including medicine. On some occasions, a false negative error may be quite serious. For example, denying a student a certificate may seriously jeopardise his/her ability to enter the work force. False negatives are particularly serious when performance levels are close to cut-off scores.

Therefore, when viewing mastery decisions we should be aware that all measurements contain error and that dichotomous decisions may be wrong. Cut-off scores can be stabilised by ensuring that an adequate number of items is included in the test and by considering the proximity of the student's score to the cut-off score. Furthermore, rather than thinking of a student's status as definitively master or non-master, it is better to consider such status in terms of probabilities.
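The two kinds of labelling error, and the advice to look at how close a score lies to the cut-off, can be illustrated with a short Python sketch. Everything here is hypothetical: the scores, the cut-off of 12, the margin of 1 mark, and the set `judged_masters`, which stands in for an independent judgement of mastery such as performance on other tasks. A score equal to the cut-off is again assumed to count as mastery.

```python
def decision_errors(scores, judged_masters, cutoff):
    """Compare test decisions with an independent judgement of mastery.

    Returns (false_positives, false_negatives) as sorted lists of names.
    """
    # Passed the test but judged not to have mastered: false positive.
    false_pos = sorted(name for name, s in scores.items()
                       if s >= cutoff and name not in judged_masters)
    # Failed the test but judged to have mastered: false negative.
    false_neg = sorted(name for name, s in scores.items()
                       if s < cutoff and name in judged_masters)
    return false_pos, false_neg


def near_cutoff(scores, cutoff, margin):
    """List testees whose scores lie within `margin` marks of the cut-off."""
    return sorted(name for name, s in scores.items()
                  if abs(s - cutoff) <= margin)


# Hypothetical data for illustration only.
scores = {"Ama": 14, "Kofi": 9, "Dari": 13, "Kojo": 11, "Esi": 17, "Yaw": 12}
judged_masters = {"Ama", "Esi", "Kojo"}

print(decision_errors(scores, judged_masters, cutoff=12))
# (['Dari', 'Yaw'], ['Kojo'])
print(near_cutoff(scores, cutoff=12, margin=1))
# ['Dari', 'Kojo', 'Yaw']
```

In this made-up class every mislabelled student also falls within one mark of the cut-off, which is the situation in which a mastery decision deserves corroboration from other evidence.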
Usefulness of CRT
You may by now be wondering why we have spent so much time discussing CRT. The principles of CRT, when applied in the classroom, can improve the educational process. We can already deduce that CRT forms the basis of certification of competence in skills, so that the educands will be well placed in the world of work. CRT can thus serve as the gatekeeper in some jobs.

CRT plays a diagnostic role for both the teacher and the student. The teacher will be able to identify problems within a specific domain and the student will be able to identify his/her weaknesses. These may call for remedial work.

The results of CRT will be useful in educational placement. An individual's CRT results in an area that he or she has mastered will help the authorities to place the student where s/he will perform optimally. Instructionally, problems are quickly identified. CRT results may also be useful for curriculum evaluation and development.

Weaknesses of CRT
Despite its usefulness, CRT has a number of weaknesses. These include the following:
• Some form of competition among students in a class is necessary as a source of motivation, but where CRT is not well executed, competition will be missing.
• The reliability and validity of CRT as a measure of students' learning outcomes are yet to be established. Reliability and validity of a testing instrument are crucial in educational testing.
• There is still much controversy over the criterion level or cut-off point. It is difficult to determine, for example, what the cut-off point will be for 100 items with 100 scores.

Summary
Norm-referenced tests (NRT) provide useful information for two major educational decisions: screening and eligibility, and programme certification. Essentially, NRT addresses the degree to which individuals are similar or different, which is the heart of screening and eligibility.
Because it addresses broad outcomes across many different individuals, programme certification generally incorporates NRT.

Criterion-referenced tests (CRT) measure students' mastery or non-mastery of specific knowledge and skills after instruction. The key feature of CRT is the selection of items that are directly relevant to the learning outcomes to be measured, without regard to the items' ability to discriminate among students.

Strictly speaking, NRT and CRT refer only to the method of interpreting results. Tests that are built to maximise each type of interpretation have much in common, and it is impossible to determine the type of test from examining the test itself. Rather, it is in the construction and use of the tests that the differences can be noticed. NRT discriminates among students at various levels of achievement. This is because the test includes more difficult items than easy ones, while the situation is the reverse in CRT. CRT and NRT are best regarded as the two ends of a continuum rather than as a clear-cut dichotomy (that is, direct opposites).

Learning exercises
1. Describe CRT and NRT as assessment instruments.
2. List similarities and differences between CRT and NRT.
3. State the uses of CRT and NRT.
4. How useful are the results from CRT and NRT for classroom instruction?

References
1. Gronlund, N. E. (1985). Measurement and Evaluation in Teaching (5th Edition). Macmillan Publishing Company: New York.
2. Kempa, R. (1986). Assessment in Science. Cambridge University Press: Cambridge, UK.
3. Tindal, G. A. and Marston, D. B. (1990). Classroom-Based Assessment: Evaluating Instructional Outcomes. Macmillan Publishing Company: New York.
UNIT 1 SECTION 4
PRINCIPLE AND PRACTICE OF CONTINUOUS ASSESSMENT

Dear reader, you are welcome to Section 4 of Unit 1. In Section 2 we learnt why continuous assessment may be viewed as a better form of assessing student performance. This is because continuous assessment is carried out in a comprehensive and systematic manner to provide information on all aspects of the student's performance. We also listed a number of characteristics of continuous assessment. In this section, we shall discuss some of the advantages and disadvantages of continuous assessment.

When you go through this section successfully, you should be able to:
• state the advantages of continuous assessment
• state the disadvantages of continuous assessment

Assessment, whether terminal or continuous, has its good and bad sides. However, many educationists argue that the advantages of continuous assessment outweigh its disadvantages and advocate its use as one of the major components of school assessment.

Activity 4.1
You have been practising continuous assessment in your classroom. Take a sheet of paper and state as many of its advantages as you can. Keep the list. We shall look at it at the end of the section.

Let us now take a look at some of the advantages of continuous assessment.

Advantages of continuous assessment

Continuous assessment minimises the element of risk associated with a single final examination. A single final examination could be affected by the testing situation, the health of the student, environmental conditions and the negative effects of testing on the examinee, among others.
In continuous assessment the effects of these possible conditions are minimised, since the performance of the student all the way through the course is taken into consideration in arriving at the final grade or score.

The comprehensive nature of the information provided by continuous assessment gives a better idea of the student's performance. In the usual one-shot final examination, test items are limited to the cognitive domain and do not cover the other domains, namely the affective and psychomotor domains, which focus on the vital objectives of science teaching and learning. Continuous assessment can assess and provide data on a wider range of a student's skills than traditional written examinations. It tests skills at the higher levels of the cognitive domain, such as analysis, synthesis and evaluation, by using various assessment tools such as projects, oral tests, practical tests and direct observations. These also reflect aspects of the psychomotor and affective domains.

These advantages may help to enhance the validity of the assessment. As Pennycuick (1990) sums up, the validity of pupils' results is increased in two ways when we use continuous assessment: first by gathering assessment data over a substantial period of time, and second by maximising the educational objectives which are assessed.

The formative aspect of continuous assessment is equally important. As the student is continually assessed, both the teacher and the student identify the weaknesses and difficulties in the teaching and learning process. This leads to redirection of instruction by the teacher and to early remediation to arrest the learning weaknesses and eliminate the learning difficulties of the student. Therefore, in continuous assessment both the teacher and the learner have the opportunity to improve upon their respective aspects of the programme.
Unlike the traditional end-of-course examination, continuous assessment makes teachers more concerned with the curriculum and less examination-focused. With continuous assessment the teacher is compelled to look at the other aspects of education, because the teacher's own assessment of the student forms part of the student's final grade.

Continuous assessment in a way reduces the anxiety of students over examinations. Examination malpractices such as cheating to pass an examination are minimised because the student is aware that she/he has already accumulated some marks through continuous assessment.

Continuous assessment can discourage truancy. Because the student is aware that all internal assessments are considered in awarding the final grade, she/he will rarely miss an assignment, homework or similar task voluntarily.

Now compare the answers you wrote at the beginning of the section with what we have discussed. You may revise your answer after the comparison. You can also add to the list if you have gained new insights from the discussion.

Let us now look at some of the disadvantages of continuous assessment. Remember we said at the beginning of this section that continuous assessment has both advantages and disadvantages.

Activity 4.2
Take a sheet of paper and state the disadvantages of continuous assessment which you may have discovered during your practice of continuous assessment. Keep your answer. We shall look at it later.
