Lecture 5: Stages of Test Development
Summary
This document details the stages of test development, including general procedures for test construction, writing clear test specifications, and considerations for trialling tests and analysing the results. It also discusses how the content of a language skill test can be specified.
Full Transcript
Stages of test development (Spring 2016)

General procedures for test construction:
1. Make a full and clear statement of the testing problem.
2. Write complete specifications for the test.
3. Write and moderate items.
4. Trial the items informally on native speakers and reject or modify problematic ones as necessary.
5. Trial the test on a group of non-native speakers similar to those for whom the test is intended.
6. Analyse the results of the trial and make any necessary changes.
7. Validate.
8. Write handbooks for test takers, test users and staff.
9. Train any necessary staff.

Test development is a task to be carried out by a team, which should have qualities such as:
1. willingness to accept justified criticism,
2. native or near-native command of the language,
3. intelligence and imagination (to create contexts in items and to foresee possible misinterpretations).

In what follows, the stages of test development are explained in more detail.

1. Stating the problem
In order to be clear about what you want to know and for what purpose, the following questions have to be answered:
1. What kind of test is it to be? (the four main types)
2. What is its precise purpose?
3. What abilities are to be tested?
4. How detailed must the results be?
5. How accurate must the results be?
6. How important is backwash?
7. What constraints are set by unavailability of expertise, facilities, and time (for construction, administration and scoring)?

2. Writing specifications for the test
The specifications will include information on: content, test structure, timing, medium/channel, techniques to be used, criterial levels of performance, and scoring procedures.

i. Content: the fuller the information on content, the less arbitrary the subsequent decisions about what to include in the test should be. However, only elements which clearly contribute to the language abilities should be included. The content of a test of a language skill may be specified along dimensions such as the following:
a) Operations: the tasks to be carried out by candidates; for a reading test, for example, these may include: scan a text to locate specific information, guess the meaning of unknown words from context.
b) Types of text: for a writing test, for example, these may include: letters, forms, essays.
c) Addressees of texts: the kind of people that the candidate is expected to be able to write or speak to, e.g. native speakers of the same age and status, native-speaker university students.
d) Length of texts: the length of passages to be read, of spoken texts to be listened to, or of pieces to be written.
e) Topics: topics selected should be suitable for the candidates and the type of test.
f) Readability: reading passages should be specified as falling within a certain range of readability.
g) Vocabulary range: this can be specified with the help of, for example, the handbook of the Cambridge Young Learners tests, where words are listed (a sketch illustrating dimensions f and g follows this list).
h) Dialect, accent, style: those that the test takers are meant to understand, or which they are expected to write or speak. Style may be formal, informal, conversational, etc.
i) Speed of processing: in reading, the number of words to be read per minute; in speaking, the rate of speech; etc.
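To make dimensions f and g more concrete, the sketch below shows one way a test writer might screen a draft reading passage against a readability band and a specified word list. It is only illustrative: the choice of the Flesch reading-ease formula, the rough syllable-counting heuristic, the example word list and the threshold values are assumptions for the sketch, not part of the lecture or of any particular test specification.

```python
import re

def count_syllables(word):
    # Rough heuristic: count groups of consecutive vowels (an approximation, not exact).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    # Flesch reading ease: higher scores indicate easier passages.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

def check_passage(text, word_list, ease_range=(60, 80), max_unknown_ratio=0.05):
    # Flag a passage whose readability or vocabulary falls outside the specification.
    problems = []
    ease = flesch_reading_ease(text)
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    unknown = [w for w in words if w not in word_list]
    if not (ease_range[0] <= ease <= ease_range[1]):
        problems.append(f"reading ease {ease:.1f} outside {ease_range}")
    if len(unknown) / len(words) > max_unknown_ratio:
        problems.append(f"words not on the specified list: {sorted(set(unknown))}")
    return problems

# Example use with an invented word list and passage.
word_list = {"the", "cat", "sat", "on", "a", "mat", "and", "looked", "at", "birds"}
passage = "The cat sat on a mat. The cat looked at the birds."
print(check_passage(passage, word_list))
```

In practice the word list would come from the specified handbook or syllabus, and the readability band would be stated in the specifications themselves; the point of the sketch is simply that both dimensions can be checked mechanically before a passage reaches moderation.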
ii. Structure, timing, medium/channel and techniques: the following should be specified:
a) Test structure: what sections will the test have, and what will be tested in each section?
b) Number of items: in total and in the various sections.
c) Number of passages: and the number of items associated with each.
d) Medium/channel: paper and pencil, tape, computer, face-to-face, telephone, etc.
e) Timing: for each section and for the entire test.
f) Techniques: what techniques will be used to measure which skills or sub-skills?

iii. Criterial levels of performance: the required levels of performance for (different levels of) success should be specified. This may involve a simple statement such as: to demonstrate 'mastery', 80 per cent of the items must be answered correctly.

iv. Scoring procedures: these are especially important when the scoring will be subjective. The specifications should state: What rating scale will be used? How many people will rate each piece of work? What happens if two or more raters disagree about a piece of work?

3. Writing and moderating items
Once specifications are in place, the writing of items can begin. This involves:
a) Sampling: not everything specified in the content will be covered by the items. Choices therefore have to be made, and they should cover the content area widely in order to ensure content validity and beneficial backwash.
b) Writing items: items should be written with the specifications in mind. However, the writing of successful items is extremely difficult, so some items will be rejected and others reworked.
c) Moderating items: moderation is the scrutiny of proposed items by at least two colleagues, neither of whom is the author of the items being examined. Their task is to find weaknesses in the items and to remedy or reject them.

4. Informal trialling of items on native speakers
After moderation, the items should be presented in the form of a test to a number of native speakers, twenty or more if possible. They should be similar to the intended candidates in terms of age, education and general background. Items that prove difficult for the native speakers, or that elicit inappropriate responses, should be revised or replaced.

5. Trialling of the test on a group of non-native speakers
After moderation and trialling on native speakers, the test should be administered under test conditions to a group similar to that for whom the test is intended. Problems in administration and scoring are then noted. However, trialling of this kind is often not possible. Why?

6. Analysis of results of the trial and making necessary changes
Two types of analysis should be carried out: statistical and qualitative. Statistical analysis will reveal how difficult the items are and how well they discriminate between stronger and weaker candidates, in addition to qualities of the test such as reliability (see the sketch below). Qualitative analysis will help to discover misinterpretations, unanticipated but possibly correct answers, and any other indicators of faulty items that should be modified or dropped from the test.
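As a concrete illustration of the statistical side of stage 6, the sketch below computes three standard classical test-theory quantities from trial data: the facility value (proportion correct) of each item, a simple discrimination index comparing the top- and bottom-scoring candidates, and Cronbach's alpha as one possible reliability estimate. The trial matrix, group size and the choice of alpha as the reliability measure are assumptions made for the example; the lecture does not prescribe a particular statistic.

```python
def facility(item_scores):
    # Facility value: proportion of candidates answering the item correctly.
    return sum(item_scores) / len(item_scores)

def discrimination(matrix, item, top_fraction=0.25):
    # Compare performance on one item between the top and bottom scorers on the whole test.
    totals = [sum(row) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda i: totals[i])
    n = max(1, int(len(matrix) * top_fraction))
    bottom, top = order[:n], order[-n:]
    p_top = sum(matrix[i][item] for i in top) / n
    p_bottom = sum(matrix[i][item] for i in bottom) / n
    return p_top - p_bottom

def cronbach_alpha(matrix):
    # Reliability estimate from item variances and total-score variance.
    k = len(matrix[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[j] for row in matrix]) for j in range(k)]
    total_var = var([sum(row) for row in matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented trial data: each row is one candidate, each column one item (1 = correct, 0 = incorrect).
trial = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
for j in range(len(trial[0])):
    print(f"item {j}: facility={facility([row[j] for row in trial]):.2f}, "
          f"discrimination={discrimination(trial, j):.2f}")
print(f"alpha={cronbach_alpha(trial):.2f}")
```

Items with very high or very low facility, or with low (or negative) discrimination, are the ones the qualitative analysis would then examine for misinterpretations or faulty wording.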
7. Validation
The final version of the test can then be validated. For a high-stakes or published test, this should be regarded as essential. However, for relatively low-stakes tests that are to be used within an institution, it may not be necessary.

8. Writing handbooks for test takers, test users and staff
Handbooks may be expected to contain the following:
1. The rationale for the test.
2. An account of how the test was developed and validated.
3. A description of the test (which may include a version of the specifications).
4. Sample items (or a complete sample test).
5. Advice on preparing for taking the test.
6. An explanation of how test scores are to be interpreted.
7. Training material (for interviewers, raters, etc.).
8. Details of test administration.

9. Training staff
Using the handbooks and other materials, all staff who will be involved in the test process should be trained. This may include interviewers, raters, scorers, computer operators, and invigilators (proctors).

References
Hughes, Arthur (2003). Testing for Language Teachers. Cambridge Language Teaching Library.
Brown, H. Douglas (2003). Language Assessment: Principles and Classroom Practices. Longman.