Week 2 - Assessment Concepts and Issues PDF
CHAPTER 1
ASSESSMENT CONCEPTS AND ISSUES

Objectives: After reading this chapter, you will be able to:
* Understand differences between assessment and testing, along with other basic assessment concepts and terms
* Distinguish among five different types of language tests, cite examples of each, and apply them for different purposes and contexts
* Appreciate historical antecedents of present-day trends and research in language assessment
* Grasp some major current issues that assessment researchers are now addressing

Tests have a way of scaring students. How many times in your school days did you feel yourself tense up when your teacher mentioned a test? The anticipation of the upcoming “moment of truth” may have provoked feelings of anxiety and self-doubt along with a fervent hope that you would come out on the other end with at least a sense of worthiness. The fear of failure is perhaps one of the strongest negative emotions a student can experience, and the most common instrument inflicting such fear is the test. You are not likely to view a test as positive, pleasant, or affirming, and, like most ordinary mortals, you may intensely wish for a miraculous exemption from the ordeal.

And yet, tests seem as unavoidable as tomorrow’s sunrise in virtually all educational settings around the world. Courses of study in every discipline are marked by these periodic milestones of progress (or sometimes, in the perception of the learner, confirmations of inadequacy) that have become conventional methods of measurement. Using tests as gatekeepers—from classroom achievement tests to large-scale standardized tests—has become an acceptable norm.

Now, just for fun, take the following quiz. These five questions are sample items from the verbal section of the Graduate Record Examination (GRE®). All the words are found in standard English dictionaries, so you should be able to answer all five items easily, right? Okay, go for it.

Directions: In each of the five items below, select the definition that correctly defines the word. You have two minutes to complete this test!

1. onager
   a. large specialized bit used in the final stages of well-drilling
   b. in cultural anthropology, an adolescent approaching puberty
   c. an Asian wild ass with a broad dorsal stripe
   d. a phrase or word that quantifies a noun

2. shroff
   a. (Yiddish) a prayer shawl worn by Hassidic Jews
   b. a fragment of an ancient manuscript
   c. (Archaic) past tense form of the verb to shrive
   d. a banker or money changer who evaluates coin

3. hadal
   a. relating to the deepest parts of the ocean below 20,000 feet
   b. one of seven stations in the Islamic hajj (pilgrimage) to Mecca
   c. a traditional Romanian folk dance performed at Spring festivals
   d. pertaining to Hades

4. chary
   a. discreetly cautious and vigilant about dangers and risks
   b. pertaining to damp, humid weather before a rainstorm
   c. optimistic, positive, looking on the bright side
   d. expensive beyond one’s means

5. yabby
   a. overly talkative, obnoxiously loquacious
   b. any of various Australian burrowing crayfishes
   c. a small, two-person horse-drawn carriage used in Victorian England
   d. in clockwork mechanisms, a small latch for calibrating the correct time

Now, how did that make you feel? Probably just the same as many learners feel when they take multiple-choice (or shall we say multiple-guess?), timed, “tricky” tests. To add to the torment, if this were a commercially administered standardized test, you would probably get a score that, in your mind, demonstrates that you did worse than hundreds of people! If you’re curious about how you did on the GRE sample quiz, check your answers on page 23 at the end of this chapter.

Of course, this little quiz on infrequently used English words is not an appropriate example of classroom-based achievement testing, nor is it intended to be.
It’s simply an illustration of how tests make us feel much of the time.

Here’s the bottom line: Tests need not be degrading or threatening to your students. Can they instead build a person’s confidence and become learning experiences? Can they become an integral part of a student’s ongoing classroom development? Can they bring out the best in students? The answer is yes. That’s mostly what this book is about: helping you create more authentic, intrinsically motivating assessment procedures that are appropriate for their context and designed to offer constructive feedback to your students.

To reach this goal, it’s important to understand some basic concepts:
* What do we mean by assessment?
* What is the difference between assessment and a test?
* How do various categories of assessments and tests fit into the teaching–learning process?

ASSESSMENT AND TESTING

Assessment is a popular and sometimes misunderstood term in current educational practice. You might think of assessing and testing as synonymous terms, but they are not. Let’s differentiate the two concepts.

Assessment is “appraising or estimating the level or magnitude of some attribute of a person” (Mousavi, 2009, p. 35). In educational practice, assessment is an ongoing process that encompasses a wide range of methodological techniques. Whenever a student responds to a question, offers a comment, or tries a new word or structure, the teacher subconsciously appraises the student’s performance. Written work—from a jotted-down phrase to a formal essay—is a performance that ultimately is “judged” by self, teacher, and possibly other students. Reading and listening activities usually require some sort of productive performance that the teacher observes and then implicitly appraises, however peripheral that appraisal may be. A good teacher never ceases to assess students, whether those assessments are incidental or intended.

Tests, on the other hand, are a subset of assessment, a genre of assessment techniques. They are prepared administrative procedures that occur at identifiable times in a curriculum when learners muster all their faculties to offer peak performance, knowing that their responses are being measured and evaluated.

In scientific terms, a test is a method of measuring a person’s ability, knowledge, or performance in a given domain. Let’s look at the components of this definition. A test is first a method. It’s an instrument—a set of techniques, procedures, or items—that requires performance on the part of the test-taker. To qualify as a test, the method must be explicit and structured: multiple-choice questions with prescribed correct answers, a writing prompt with a scoring rubric, an oral interview based on a question script, or a checklist of expected responses to be completed by the administrator.

Second, a test must measure, which may be defined as a process of quantifying a test-taker’s performance according to explicit procedures or rules (Bachman, 1990, pp. 18-19). Some tests measure general ability, whereas others focus on specific competencies or objectives. A multiskill proficiency test determines a general ability level; a quiz on recognizing correct use of definite articles measures specific knowledge. The way the results or measurements are communicated may vary. Some tests, such as a classroom-based, short-answer essay test, may earn the test-taker a letter grade accompanied by marginal comments from the instructor. Others, particularly large-scale standardized tests, provide a total numerical score, a percentile rank, and perhaps some subscores. If an instrument does not specify a form of reporting measurement—a means to offer the test-taker some kind of result—then that technique cannot appropriately be defined as a test.

Next, a test measures an individual’s ability, knowledge, or performance.
Testers need to understand who the test-takers are. What are their previous experiences and backgrounds? Is the test appropriately matched to their abilities? How should test-takers interpret their scores?

A test measures performance, but the results imply the test-taker’s ability or, to use a concept common in the field of linguistics, competence. Most language tests measure one’s ability to perform language, that is, to speak, write, read, or listen to a subset of language. On the other hand, tests are occasionally designed to tap into a test-taker’s knowledge about language: defining a vocabulary item, reciting a grammatical rule, or identifying a rhetorical feature in written discourse. Performance-based tests sample the test-taker’s actual use of language, and from those samples the test administrator infers general competence. A test of reading comprehension, for example, may consist of several short reading passages each followed by a limited number of comprehension questions—a small sample of a second language learner’s total reading behavior. But the examiner may infer a certain level of general reading ability from the results of that test.

Finally, a test measures a given domain. For example, in the case of a proficiency test, even though the actual performance on the test involves only a sampling of skills, the domain is overall proficiency in a language—general competence in all skills of a language. Other tests may have more specific criteria. A test of pronunciation might well test only a limited set of phonemic minimal pairs. A vocabulary test may focus on only the set of words covered in a particular lesson or unit. One of the biggest obstacles to overcome in constructing adequate tests is to measure the desired criterion and not inadvertently include other factors, an issue that is addressed in Chapters 2 and 3.

A well-constructed test is an instrument that provides an accurate measure of the test-taker’s ability within a particular domain. The definition sounds fairly simple but, in fact, constructing a good test is a complex task involving both science and art.

Measurement and Evaluation

Two frequently occurring, yet potentially confusing, terms that often appear in discussions of assessment and testing are measurement and evaluation. Because the terms lie somewhere between assessment and testing, they are at times mistakenly used as synonyms of one or the other concept. Let’s take a brief look at these two processes.

Measurement is the process of quantifying the observed performance of classroom learners. Bachman (1990) cautioned us to distinguish between quantitative and qualitative descriptions of student performance. Simply put, the former involves assigning numbers (including rankings and letter grades) to observed performance, whereas the latter consists of written descriptions, oral feedback, and other nonquantifiable reports.

Quantification has clear advantages. Numbers allow us to provide exact descriptions of student performance and to compare one student with another more easily. They also can spur us to be explicit in our specifications for scoring student responses, thereby leading to greater objectivity. On the other hand, quantifying student performance can work against the teacher or tester, perhaps masking nuances of performance or giving an air of certainty when scoring rubrics may actually be quite vague. Verbal or qualitative descriptions may offer an opportunity for a teacher to individualize feedback for a student, such as in marginal comments on a student’s written work or oral feedback on pronunciation.

Yet another potentially ambiguous term that needs explanation is evaluation. Is evaluation the same as testing?
Evaluation does not necessarily entail testing; rather, evaluation is involved when the results of a test (or other assessment procedure) are used to make decisions (Bachman, 1990, pp. 22-23). Evaluation involves the interpretation of information. Simply recording numbers or making check marks on a chart does not constitute evaluation. You evaluate when you “value” the results in such a way that you convey the worth of the performance to the test-taker, usually with some reference to the consequences—good or bad—of the performance.

Test scores are an example of measurement, and conveying the “meaning” of those scores is evaluation. If a student achieves a score of 75% (measurement) on a final classroom examination, he or she may be told that the score resulted in a failure (evaluation) to pass the course. Evaluation can take place without measurement, as in, for example, a teacher’s appraisal of a student’s correct oral response with words like “excellent insight, Fernando!”

Assessment and Learning

Returning to our contrast between tests and assessment, we find that tests are a subset of assessment, but they are certainly not the only form of assessment that a teacher can apply. Although tests can be useful devices, they are only one among many procedures and tasks that teachers can ultimately use to assess (and measure) students. But now, you might be thinking, if you make assessments every time you teach something in the classroom, does all teaching involve assessment? Are teachers constantly assessing students, with no assessment-free interactions?

The answers depend on your perspective. For optimal learning to take place, students in the classroom must have the freedom to experiment, to try out their own hypotheses about language without feeling their overall competence is judged in terms of those trials and errors.
In the same way that tournament tennis players must, before a tournament, have the freedom to practice their skills with no implications for their final placement on that day of days, so also must learners have ample opportunities to “play” with language in a classroom without being formally graded. Teaching sets up the practice games of language learning: the opportunities for learners to listen, think, take risks, set goals, and process feedback from the “coach” and then incorporate their acquired skills into their performance.

At the same time, during these practice activities, teachers (and tennis coaches) are indeed observing students’ performance, possibly taking measurements, offering qualitative feedback, and suggesting strategies. For example:
* How did the performance compare with previous performance?
* Which aspects of the performance were better than others?
* Is the learner performing up to an expected potential?
* What can the learner do to improve performance the next time?
* How does the performance compare with that of others in the same learning community?

In the ideal classroom, all these observations feed into the way the teacher provides instruction to each student. (See Clapham, 2000; Cumming, 2009, for a discussion of the relationship among testing, assessment, and teaching.)

Figure 1.1 shows the interrelationships among testing, measurement, assessment, teaching, and evaluation. This diagram represents our discussion of all these overlapping concepts.

Informal and Formal Assessment

One way to begin untangling the lexical conundrum created by distinguishing among tests, assessment, teaching, and other related concepts is to understand the difference between informal and formal assessment. Informal assessment can take a number of forms, starting with incidental, unplanned comments and responses, along with coaching and other impromptu feedback to the student.
Examples include putting a smiley face on homework or saying “Nice job!”, “Good work!”, “Did you say can or can’t?”, or “I think you meant to say you broke the glass, not you break the glass.”

Informal assessment does not stop there. A good deal of a teacher’s informal assessment is embedded in classroom tasks designed to elicit performance without recording results and making fixed conclusions about a student’s competence. Informal assessment is virtually always nonjudgmental, in that you as a teacher are not making ultimate decisions about the student’s performance; you’re simply trying to be a good coach. Examples at this end of the continuum include making marginal comments on papers, responding to a draft of an essay, offering advice about how to better pronounce a word, suggesting a strategy to compensate for a reading difficulty, or showing a student how to modify his or her notetaking to better remember the content of a lecture.

[Figure 1.1 Tests, measurement, assessment, teaching, and evaluation]

On the other hand, formal assessments are exercises or procedures specifically designed to tap into a storehouse of skills and knowledge. They are systematic, planned sampling techniques constructed to give teacher and student an appraisal of student achievement. To extend the tennis analogy, formal assessments are the tournament games that occur periodically in the course of a regimen of practice.

Is formal assessment the same as a test? We can say that all tests are formal assessments, but not all formal assessment is testing. For example, you might use a student’s journal or portfolio of materials as a formal assessment of the attainment of certain course objectives, but calling those two procedures “tests” is problematic.
A systematic set of observations of the frequency of a student’s oral participation in class is certainly a formal assessment, but it too is hardly what anyone would call a test. Tests are usually relatively constrained by time (usually spanning a class period or at most several hours) and draw on a limited sample of behavior.

Formative and Summative Assessment

Another useful distinction to bear in mind is the function of an assessment: How is the procedure to be used? Two functions are commonly identified in the literature: formative and summative assessments. Most classroom assessment is formative assessment: evaluating students in the process of “forming” their competencies and skills with the goal of helping them to continue that growth process. The key to such formation is the delivery (by the teacher) and internalization (by the student) of appropriate feedback on performance, with an eye toward the future continuation (or formation) of learning.

For all practical purposes, virtually all kinds of informal assessment are (or should be) formative. They have as their primary focus the ongoing development of the learner’s language. So when you give a student a comment or a suggestion, or call attention to an error, you offer that feedback to improve the learner’s language ability. (See Andrade & Cizek, 2010, for an overview of formative assessment.)

Summative assessment aims to measure, or summarize, what a student has grasped and typically occurs at the end of a course or unit of instruction. A summation of what a student has learned implies looking back and taking stock of how well that student has accomplished objectives, but it does not necessarily point to future progress. Final exams in a course and general proficiency exams are examples of summative assessment. Summative assessment often, but not always, involves evaluation (decision making).

Ross (2005) cited research to show that the appeal of formative assessment is growing and that conventional summative testing of language-learning outcomes is gradually integrating formative modes of assessing language learning as an ongoing process. Also, Black and Wiliam’s (1998) analysis of 540 research studies found that formative assessment was superior to summative assessment in providing individualized crucial information to classroom teachers. (See also Bennett, 2011; Black & Wiliam, 2009.)

One of the problems with prevailing attitudes toward testing is the view that all tests (quizzes, periodic review tests, midterm exams, etc.) are summative. At various points in your past educational experiences, no doubt you’ve considered such tests summative. You may have thought, “Whew! I’m glad that’s over. Now I don’t have to remember that stuff anymore!” A challenge to you as a teacher is to change that attitude among your students: Can you instill a more formative quality to what your students might otherwise view as a summative test? Can you offer your students an opportunity to convert tests into “learning experiences”? We will take up this challenge in subsequent chapters in this book.

Norm-Referenced and Criterion-Referenced Tests

Another dichotomy that’s important to clarify and that aids in sorting out common terminology in assessment is the distinction between norm-referenced and criterion-referenced testing. In norm-referenced tests, each test-taker’s score is interpreted in relation to a mean (average score), median (middle score), standard deviation (extent of variance in scores), and/or percentile rank. The purpose of such tests is to place test-takers in rank order along a mathematical continuum. Scores are usually reported back to the test-taker in the form of a