
Topic 1: Introduction to Testing and Evaluation (PDF)


Summary

This document introduces testing and evaluation, including the differences between tests and assessments. It covers types of tests, the reliability and validity of tests, and the purposes of different assessment instruments. It also discusses the importance of authentic assessment in evaluating student learning.

Full Transcript


TOPIC 1: INTRODUCTION TO TESTING AND EVALUATION

Why do we need to test? Do we need to standardize tests? Find out from online resources the differences in definitions for these key words:
a. tests + testing
b. assess + assessment
c. evaluate + evaluation

What Is Assessment?
Not the same as testing! Assessment is an ongoing process to ensure that the course/class objectives and goals are met. It is a process, not a product. A test is a form of assessment (Brown, 2004, p. 5).

Informal and Formal Assessment
Informal assessment can take a number of forms: unplanned comments, verbal feedback to students, observing students perform a task or work in small groups, and so on. Formal assessments are systematic exercises or procedures, such as tests, which give students and teachers an appraisal of students' achievement.

Traditional Assessment
◦ Multiple-choice
◦ True-false
◦ Matching
◦ Norm-referenced and criterion-referenced tests

Norm-Referenced and Criterion-Referenced Tests
Norm-referenced tests are standardized tests (College Board, TOEFL, GRE) that place test-takers on a mathematical continuum in rank order. Criterion-referenced tests give test-takers feedback on the specific objectives ("criteria") of a course, which is known as "instructional value."

Authentic Assessment
Authentic assessment reflects student learning, achievement, motivation, and attitudes on instructionally relevant classroom activities (O'Malley & Valdez, 1996). Examples: performance assessment, portfolios, self-assessment.

Purposes for Assessment
◦ Diagnose students' strengths and needs
◦ Provide feedback on student learning
◦ Provide a basis for instructional placement
◦ Inform and guide instruction
◦ Communicate learning expectations
◦ Motivate and focus students' attention and effort
◦ Provide practice applying knowledge and skills
◦ Provide a basis for evaluation for the purposes of: grading; promotion/graduation; program admission/selection; accountability; gauging program effectiveness

Assessment Instruments
◦ Pre-assessment (diagnostic): pretests, observations, discussions, questionnaires, interviews, standardized tests
◦ Formative (ongoing): quizzes, observations, discussions, journals/logs, assignments, projects, portfolios
◦ Summative (final): teacher-made tests, projects, portfolios, journal logs, standardized tests

Discussion
◦ Which types of assessments noted in the chart could be considered authentic assessment?
◦ How would you document a student's performance during a discussion?

Principles of Assessment
◦ Practicality
◦ Reliability
◦ Validity
◦ Authenticity
◦ Washback

Practicality
An effective test is practical:
◦ Is not excessively expensive
◦ Stays within appropriate time constraints
◦ Is relatively easy to administer
◦ Has a scoring/evaluation procedure that is specific and time-efficient

Reliability
A reliable test is consistent and dependable. If you give the same test to the same students on two different occasions, the test should yield similar results. Reliability has four aspects:
◦ Student-related reliability
◦ Rater reliability
◦ Test administration reliability
◦ Test reliability

Student-Related Reliability
The most common issues in student-related reliability are caused by temporary illness, fatigue, a bad day, anxiety, and other physical and psychological factors which may make an "observed" score deviate from a "true" score.

Rater Reliability
Human error, subjectivity, and bias may enter into the scoring process. Inter-rater reliability problems occur when two or more scorers yield inconsistent scores for the same test, possibly for lack of attention to scoring criteria, inexperience, inattention, or even preconceived bias toward a particular "good" or "bad" student.
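Agreement between raters can also be quantified. Below is a minimal Python sketch, not part of the original slides, using invented scores from two hypothetical raters grading the same essays on a 1-5 scale. It computes simple percent agreement and Cohen's kappa, a standard statistic that corrects agreement for what two raters would match on by chance.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which the two raters gave the same score."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters assign the same category
    # if each picked categories at their own observed rates.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)
    return (observed - expected) / (1 - expected)

# Hypothetical essay scores (made-up data for illustration).
rater_1 = [4, 3, 5, 2, 4, 3, 4, 5]
rater_2 = [4, 3, 4, 2, 5, 3, 4, 4]

print(percent_agreement(rater_1, rater_2))        # 0.625
print(round(cohens_kappa(rater_1, rater_2), 2))   # ~0.47: moderate agreement
```

A kappa well below 1.0, as here, would prompt the kinds of remedies the slide implies: clearer scoring criteria and rater training.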
Test Administration Reliability
Test administration reliability deals with the conditions in which the test is administered:
◦ Street noise outside the building
◦ Bad equipment
◦ Room temperature
◦ The condition of chairs and tables
◦ Photocopying variation

Test Reliability
Problems with the test itself can also undermine reliability:
◦ The test is too long
◦ Poorly written or ambiguous test items
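The test-retest consistency described under Reliability can be quantified as well. Here is a minimal sketch, assuming the same hypothetical students sat the same test on two occasions (all scores invented), that correlates the two sets of results; a Pearson coefficient near 1.0 suggests a consistent, dependable test, while a low value signals a reliability problem.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Made-up scores (out of 100) for six students on two sittings of one test.
first_sitting  = [72, 85, 90, 65, 78, 88]
second_sitting = [70, 87, 92, 63, 75, 90]

print(round(pearson_r(first_sitting, second_sitting), 2))  # ~0.99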
Validity
A test is valid if it actually assesses the objectives and what has been taught. Types of validity:
◦ Content validity
◦ Criterion-related validity (tests objectives)
◦ Construct validity
◦ Consequential validity
◦ Face validity

Content Validity
A test has content validity if the teacher can clearly define the achievement that he or she is measuring. A test of tennis competency that asks someone to run a 100-yard dash lacks content validity. Likewise, if a teacher uses the communicative approach to teach speaking and then uses the audiolingual method to design test items, the test will lack content validity.

Criterion-Related Validity
Criterion-related validity is the extent to which the objectives of the test have been measured or assessed. For instance, if you are assessing reading skills such as scanning and skimming information, how are the exercises designed to test these objectives? In other words, the test is valid if the objectives taught are the objectives tested and the items actually test those objectives.

Construct Validity
A construct is an explanation or theory that attempts to explain observed phenomena. If you are testing vocabulary and the lexical objective is to use the lexical items for communication, having students write definitions on the test will not match the construct of communicative language use.

Consequential Validity
Consequential validity encompasses:
◦ Accuracy in measuring the intended criteria
◦ The test's impact on the preparation of test-takers
◦ Its effect on the learner
◦ The social consequences of a test's interpretation (e.g., an exit exam for pre-basic students at El Colegio, the College Board)

Face Validity
Face validity refers to the degree to which a test looks right and appears to measure the knowledge or ability it claims to measure:
◦ A well-constructed, expected format with familiar tasks
◦ A test that is clearly doable within the allotted time limit
◦ Directions that are crystal clear
◦ Tasks that relate to the course (content validity)
◦ A difficulty level that presents a reasonable challenge

Authenticity
◦ The language in the test is as natural as possible
◦ Items are contextualized rather than isolated
◦ Topics are relevant and meaningful for learners
◦ Some thematic organization to items is provided
◦ Tasks represent, or closely approximate, real-world tasks

Washback
Washback refers to the effects tests have on instruction in terms of how students prepare for the test. "Cram" courses and "teaching to the test" are examples of such washback. In some cases the student may learn while working on a test or assessment. Washback can be positive or negative.

Alternative Assessment Options
Self- and peer-assessments:
◦ Oral production: student self-checklist, peer checklist, offering and receiving a holistic rating of an oral presentation
◦ Listening comprehension: listening to TV or radio broadcasts and checking comprehension with a partner
◦ Writing: revising work on your own, peer editing
◦ Reading: reading textbook passages followed by self-check comprehension questions, self-assessment of reading habits (page 416, Brown, 2001)

Authentic Assessment
Performance assessment is any form of assessment in which the student constructs a response orally or in writing. It requires the learner to accomplish a complex and significant task, bringing to bear prior knowledge, recent learning, and relevant skills to solve realistic or authentic problems (O'Malley & Valdez, 1996; Herman et al., 1992).

Examples of Authentic Assessment
◦ Portfolio assessment
◦ Student self-assessment
◦ Peer assessment
◦ Student-teacher conferences
◦ Oral interviews
◦ Writing samples
◦ Projects or exhibitions
◦ Experiments or demonstrations

Forms of Language Testing
◦ Aptitude tests
◦ Diagnostic tests
◦ Placement tests
◦ Achievement tests
◦ Proficiency tests

Types of Language Skills
◦ Listening
◦ Speaking
◦ Reading
◦ Writing

REFLECTION
