Module Assessment 3
University of Eastern Philippines
Summary
This document details Module Assessment 3 for a Bachelor of Secondary Education program at the University of Eastern Philippines. It covers the characteristics of quality assessment tools, different types of teacher-made tests, learning targets, and assessment methods.
Full Transcript
Module 3: Designing and Developing Assessments

Overview

Instructional objectives must be specific, measurable, attainable, relevant, and time-bound. Teachers must develop assessment tools, such as test items, that match the instructional objectives appropriately and accurately. Teachers should have the skills and knowledge to design and develop assessment tools used to guide the collection of quality evidence, including their application in formative and summative assessment. This module discusses the characteristics of quality assessment tools and the different types of teacher-made tests. It also includes discussions on learning targets, assessment methods, and assessment tool development.

Learning Outcomes

After learning this module, you should be able to:
- develop assessment tools that are learner-appropriate and target-matched; and
- improve assessment tools based on assessment data.

Lesson 1. Characteristics of Quality Assessment Tools

Assessment tools are techniques used to measure a student's academic abilities, skills, and/or fluency in a given subject or to measure one's progress toward academic proficiency in a specific subject area. An assessment tool is the instrument (form, test, rubric, etc.) used to collect data for each outcome; it is the actual product handed out to students for the purpose of assessing whether they have achieved a particular learning outcome or outcomes.

Assessments can be either formal or informal. Informal assessments are often inferences an educator draws from unsystematic observations of a student's performance in the subject matter under consideration. Formal assessments are objective measurements of a student's abilities, skills, and fluency using screening, progress monitoring, diagnosis, or evaluation. Both types of assessment are important; however, only formal assessments are research- or evidence-based. Educators use assessment tools to make informed decisions regarding strategies to enhance student learning.

General Principles of Testing (Ebel and Frisbie, 1999)

1. Measure all instructional objectives. When a teacher constructs test items to measure the learning progress of the students, the items should match all the learning objectives posed during instruction. That is why the first step in constructing a test is for the teacher to go back to the instructional objectives.
2. Cover all the learning tasks. The teacher should construct a test that contains a wide sampling of items. In this case, the teacher can determine the educational outcomes or abilities such that the resulting scores are representative of the total performance in the areas measured.
3. Use appropriate test items. The test items constructed must be appropriate for measuring the intended learning outcomes.
4. Make the test valid and reliable. The teacher must construct a test that is valid so that it can measure what it is supposed to measure. The test is reliable when the scores of the students remain the same or consistent when the teacher gives the same test a second time.
5. Use tests to improve learning.
The test scores should be properly utilized by the teacher to improve learning by discussing the skills or competencies in the items that have not been learned or mastered by the learners.

Appropriateness of Assessment Tools

The type of test used should match the instructional objectives or learning outcomes of the subject matter posed during the delivery of instruction. The following are the types of assessment tools:

1. Objective Test. It requires students to select the correct response or to supply a word or short phrase to answer a question or complete a statement. It includes true-false, matching type, and multiple-choice questions. The word objective refers to the scoring: it indicates that there is only one correct answer.
2. Subjective Test. It permits the student to organize and present an original answer. It includes either short-answer questions or long general questions. This type of test has no single specific answer; hence, it is usually scored on an opinion basis, although certain facts and understanding are expected in the answer.
3. Performance Assessment. It is an assessment in which students are asked to perform real-world tasks that demonstrate meaningful application of essential knowledge and skills. It can appropriately measure learning objectives that focus on the ability of the students to demonstrate skills or knowledge in real-life situations.
4. Portfolio Assessment. It is an assessment based on the systematic, longitudinal collection of student work created in response to specific, known instructional objectives and evaluated in relation to the same criteria. A portfolio is a purposeful collection of students' work that exhibits the students' efforts, progress, and achievements in one or more areas over a period of time. It measures the growth and development of students.
5. Oral Questioning. This method is used to collect assessment data by asking oral questions. It is the most commonly used of all forms of assessment in class, assuming that the learner hears and shares the use of a common language with the teacher during instruction. The ability of the students to communicate orally is very relevant to this type of assessment. This is also a form of formative assessment.
6. Observation Technique. This is a method of collecting assessment data in which the teacher observes how students carry out certain activities, looking at either the process or the product. There are two types of observation techniques: formal and informal. Formal observations are planned in advance, as when the teacher assesses an oral report or presentation in class, while informal observation is done spontaneously during instruction, such as observing the working behavior of students while performing a laboratory experiment.
7. Self-report. The responses of the students may be used to evaluate both performance and attitude. Assessment tools could include sentence completion, Likert scales, checklists, or holistic scales.

Different Qualities of Assessment Tools

1. Validity refers to the appropriateness of score-based inferences, or decisions made based on the students' test results; it is the extent to which a test measures what it is supposed to measure.
2. Reliability refers to the consistency of measurement; that is, how consistent test results or other assessment results are from one measurement to another. A test is reliable when it yields practically the same scores when administered twice to the same group of students, with a reliability index of 0.61 or above.
3. Fairness means the test items should be free from bias. A test item should not be offensive to any examinee subgroup; a test can only be good if it is fair to all examinees.
4. Objectivity refers to the agreement of two or more raters or test administrators concerning the score of a student. If two raters who assess the same student on the same test cannot agree on the score, the test lacks objectivity, and neither of the judges' scores is valid. Lack of objectivity reduces test validity in the same way that lack of reliability influences validity.
5. Scorability means that the test should be easy to score; directions for scoring should be clearly stated in the instructions. Provide the students with an answer sheet, and provide an answer key for the one who will check the test.
6. Adequacy means that the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
7. Administrability means the test should be administered uniformly to all students, so that the scores obtained will not vary due to factors other than differences in the students' knowledge and skills. There should be clear instructions for the students, the proctors, and even the one who will check the test.
8. Practicality and Efficiency refer to the teacher's familiarity with the methods used, the time required for the assessment, the complexity of the administration, the ease of scoring, the ease of interpretation of the test results, and the cost of the materials, which must be kept as low as possible.

Assessment Task 3.1

1. Why are assessment tools necessary in assessment?
2. What are the advantages and disadvantages of a subjective test over an objective type of test?

Lesson 2. Types of Teacher-made Tests

There are different ways of assessing the performance of the students, such as objective tests, subjective tests, performance-based assessment, oral questioning, portfolio assessment, self-assessment, and checklists. Each of these has its own function and use, and the type of assessment tool should always be appropriate to the objectives of the lesson. There are two general types of test items used in paper-and-pencil achievement tests: selection-type items and supply-type items.

Selection-type or Objective Test Items

A. Multiple-choice Test

A multiple-choice test is used to measure knowledge outcomes and other types of learning outcomes such as comprehension and application. It is the most commonly used format in measuring student achievement at different levels of learning. A multiple-choice item consists of three parts: the stem, the keyed option, and the incorrect options or alternatives. The stem presents the problem or question, usually expressed in completion form or question form. The keyed option is the correct answer. The incorrect options or alternatives are also called distracters or foils.

Guidelines in Constructing Multiple-choice Tests

1. Construct each item to assess a single written objective.
2. Base each item on a specific problem stated clearly in the stem.
3. Include as much of the item as possible in the stem, but do not include irrelevant material.
4. State the stem in positive form (in general).
5. Word the alternatives clearly and concisely.
6. Keep the alternatives mutually exclusive.
7. Keep the alternatives homogeneous in content.
8. Keep the alternatives free from clues as to which response is correct.
   8.1 Keep the grammar of each alternative consistent with the stem.
   8.2 Keep the alternatives parallel in form.
   8.3 Keep the alternatives similar in length.
   8.4 Avoid textbook, verbatim phrasing.
   8.5 Avoid the use of specific determiners.
   8.6 Avoid including keywords in the alternatives.
   8.7 Use plausible distracters.
9. Avoid the alternatives "all of the above" and "none of the above" (in general).
10. Use as many functional distracters as are feasible.
11. Include one and only one correct or clearly best answer in each item.
12. Present the answer in each of the alternative positions approximately an equal number of times, in random order.
13. Lay out the items in a clear and consistent manner.
14. Use proper grammar, punctuation, and spelling.
15. Avoid using unnecessarily difficult vocabulary.
16. Analyze the effectiveness of each item after each administration of the test.

Examples of Multiple-choice Items

Knowledge Level
The number of chromosomes in a cell produced by meiosis is ______
A. half as many as the original cell.
B. twice as many as the original cell.
C. the same number as the original cell.
D. not predictable.

Comprehension Level
Why did John B. Watson reject the structuralist study of mental events?
A. He believed that structuralism relied too heavily on scientific methods.
B. He rejected the concept that psychologists should study observable behavior.
C. He believed that scientists should focus on what is objectively observable.
D. He actually embraced both structuralism and functionalism.

Application Level
An envelope contains 140 bills consisting of P1000 and P500 bills. If the number of P500 bills is 20 more than the number of P1000 bills, how many P500 bills are there?
A. 40
B. 60
C. 70
D. 80

Analysis Level
What statistical test is used when you test the mean difference between a pre-test and a post-test?
A. Analysis of variance
B. t-test
C. Correlation
D. Regression analysis

Advantages of Multiple-choice Tests
1. They measure learning outcomes from the knowledge level to the evaluation level.
2. Scoring is highly objective, easy, and reliable.
3. Scores are more reliable than those of subjective types of test.
4. They measure broad samples of content within a short span of time.
5. Distracters can provide diagnostic information.
6. Item analysis can reveal the difficulty of an item and can discriminate between good and poor performing students.

Disadvantages of Multiple-choice Tests
1. It is time-consuming to construct a good item.
2. It is difficult to find effective and plausible distracters.
3. Scores can be influenced by the reading ability of the examinees.
4. In some cases, there is more than one justifiable correct answer.
5. They are ineffective in assessing the problem-solving skills of the students.
6. They are not applicable when assessing the students' ability to organize and express ideas.

B. Matching Type Test

A matching type item consists of two columns. Column A contains the descriptions and is placed at the left side, while Column B contains the options and is placed at the right side. The examinees are asked to match the options with the descriptions they are associated with.

Guidelines in Constructing Matching Type Tests

1. The descriptions and options must be short and homogeneous.
2. The descriptions must be written at the left side and labeled Column A, and the options must be written at the right side and labeled Column B, to save time for the examinees.
3. There should be more options than descriptions, or the directions should indicate that each option may be used more than once, to decrease the chance of guessing.
4. Matching directions should specify the basis for matching. Failure to indicate how matches should be marked can greatly increase the time the teacher spends on scoring.
5. Avoid too many correct answers.
6. When using names, always include the complete name (first name and surname) to avoid ambiguity.
7. Use numbers for the descriptions and capital letters for the options to avoid confusing students who have reading problems.
8. Arrange the options in chronological or alphabetical order.
9. The descriptions and options must be written on the same page.
10. Use a minimum of three items and a maximum of seven items for the elementary level, and a maximum of seventeen items for the secondary and tertiary levels.

Example of a Matching Type Test

Direction: Match each trigonometric term in Column A with the corresponding expression in Column B. Write only the letter of your choice on the space provided for each item.

COLUMN A
_____ 1. 1
_____ 2. csc θ
_____ 3. cos θ
_____ 4. tan θ
_____ 5. cot²θ
_____ 6. cos²θ
_____ 7. sec θ
_____ 8. sec²θ
_____ 9. sin²θ
_____ 10. cot θ

COLUMN B
B. sin θ csc θ
C. 1 - cos²θ
E. 1 - tan²θ
I. 1 - sin²θ
J. csc²θ - 1
K. 1 + tan²θ
L. 1 + cos²θ
(Options A, D, F, G, and H were fractional expressions that did not survive transcription.)

Advantages of Matching Type Tests
1. It is simpler to construct than a multiple-choice type of test.
2. It reduces the effect of guessing compared to multiple-choice and true or false tests.
3. It is appropriate for assessing the association between facts.
4. It provides easy, accurate, efficient, objective, and reliable test scores.
5. More content can be covered in a given set of test items.

Disadvantages of Matching Type Tests
1. It measures only simple recall or memorization of information.
2. It is difficult to construct due to problems in selecting the descriptions and options.
3. It assesses only low levels of the cognitive domain, such as knowledge and comprehension.

C. True or False Type

In this type of test, the examinees determine whether the statement presented is true or false. A true or false test item is an example of a "forced-choice test" because there are only two possible choices. The students are required to choose true or false in recognizing whether a statement is correct or incorrect. This type of test is appropriate for assessing behavioral objectives such as "identify," "select," or "recognize." It is also suited to assessing the knowledge and comprehension levels of the cognitive domain, and it is appropriate when there are only two plausible alternatives or distracters.

Guidelines in Constructing True or False Tests

1. Avoid writing very long statements. Eliminate unnecessary words in the statement.
2. Avoid trivial questions.
3. Each item should contain only one idea, except for statements showing a cause-and-effect relationship.
4. It can be used for establishing cause-and-effect relationships.
5. Avoid using negatives or double negatives. Construct statements positively. If this cannot be avoided, bold or underline the negative words to call the attention of the examinees.
6. Avoid opinion-based statements; if this cannot be avoided, the statement should be attributed to somebody.
7. Avoid specific determiners such as "never," "always," "all," and "none," for they tend to appear in statements that are false.
8. Avoid specific determiners such as "some," "sometimes," and "may," for they tend to appear in statements that are true.
9. The number of true items should be the same as the number of false items.
10. Avoid grammatical clues that lead to the correct answer, such as the articles (a, an, the).
11. Avoid statements directly taken from the textbook.
12. Avoid arranging the statements in a recognizable pattern (TTTTTFFFFF, TFTFTFTFTF, etc.).
13. Directions should indicate where or how the students should mark their answers.

Examples of a True or False Test

Direction: Determine whether the statement is true or false. Write true if the statement is true; otherwise, write false.
1. Swimming is the best fat-burning exercise for everyone.
2. The target heart rate must be maintained for 20 minutes or longer to be effective.
3. The effectiveness of an exercise program depends on the frequency, intensity, and duration of the exercise.
4. An indoor bike burns more calories than riding an outdoor bike.
5. When sleeping, the body relaxes the muscles, breathing and heart rate decrease, and the body repairs itself.

Advantages of a True or False Test
1. It covers a lot of content in a short span of time.
2. It is easier to prepare compared to multiple-choice and matching type tests.
3. It is easier to score because it can be scored objectively, compared to a test that depends on the judgment of the rater(s).
4. It is useful when there are only two alternatives.
5. The scores are more reliable than those of an essay test.

Disadvantages of a True or False Test
1. It is limited to low-level thinking skills such as knowledge and comprehension, or recognition and recall of information.
2. There is a high probability of guessing the correct answer compared to a multiple-choice test, which consists of more than two choices.

Supply-type or Subjective Test Items

A. Completion Type or Short Answer Test

A completion or short answer test is an alternative form of assessment because the examinee needs to supply or create the appropriate word(s), symbol(s), or number(s) to answer a question or complete a statement, rather than selecting the answer from given options. There are two ways of constructing a completion type of test: question form or completion-of-a-statement form.

Guidelines in Constructing Completion Type or Short Answer Tests

1. The item should require a single-word answer or a brief and definite statement. Do not use indefinite statements that allow several answers.
2. Be sure that the language used in the statement is precise and accurate in relation to the subject matter being tested.
3. Be sure to omit only key words; do not eliminate so many words that the meaning of the statement changes.
4. Do not leave the blank at the beginning of or within the statement; it should be at the end of the statement.
5. Use a direct question rather than an incomplete statement. The statement should pose the problem to the examinee.
6. Be sure to indicate the units in which the answer is to be expressed when the statement requires a numerical answer.
7. Be sure that the answer the student is required to produce is factually correct.
8. Avoid grammatical clues.
9. Do not select textbook sentences.

Examples of Completion and Short Answer Tests

Question Form
Direction: Write your answer on the space provided before each item.
_________________ 1. What do you call a polygon that has five sides?
_________________ 2. What is the sum of the measures of the interior angles of a nonagon?
_________________ 3. Which polygon is always regular?
_________________ 4. Which polygon has 5 diagonals?
_________________ 5. What do you call a polygon with equal sides and equal angles?

Completion Form
1. A polygon with 5 sides is called ________________.
2. The sum of the measures of the interior angles of a nonagon is ________________.
3. A polygon that is always regular is called ________________.
4. A polygon with 5 diagonals is called ________________.
5. A polygon with equal sides and equal angles is called ________________.

Advantages of a Completion or Short Answer Test
1. It covers a broad range of topics in a short span of time.
2. It is easier to prepare and less time-consuming compared to multiple-choice and matching type tests.
3. It can effectively assess the lower levels of Bloom's taxonomy, assessing recall of information rather than recognition.
4. It reduces the possibility of guessing the correct answer because it requires recall, compared to true or false items and multiple-choice items.
5. It covers a greater amount of content than a matching type test.

Disadvantages of a Completion or Short Answer Test
1. It is only appropriate for questions that can be answered with short responses.
2. There is difficulty in scoring when the questions are not prepared properly and clearly. The question should be clearly stated so that the answer of the student is clear.
3. It can assess only the knowledge, comprehension, and application levels of Bloom's taxonomy of the cognitive domain.
4. It is not adaptable to measuring complex learning outcomes.
5. Scoring is tedious and time-consuming.

B. Essay Items

An essay test is appropriate for assessing students' ability to organize and present original ideas. It consists of a small number of questions wherein the examinee is expected to demonstrate the ability to recall factual knowledge, organize this knowledge, and present it in a logical and integrated answer. Extended response essays and restricted response essays are the two types of essay test items.

An Extended Response Essay allows the students to determine the length and complexity of the response. It is very useful in assessing the synthesis and evaluation skills of the students. When the objective is to determine whether students can organize, integrate, and express ideas and evaluate information, it is best to use an extended response essay test.

A Restricted Response Essay is an essay item that places strict limits on both the content and the response given by the students. In this type of essay, the content is usually restricted by the scope of the topic to be discussed, and the limitations on the form of the response are indicated in the question.

Guidelines in Constructing Essay Test Items

1. Construct essay questions to measure complex learning outcomes only.
2. Essay questions should relate directly to the learning outcomes to be measured.
3. Formulate essay questions that present a clear task to be performed.
4. An item should be stated precisely, and it must clearly focus on the desired answer.
5. All students should be required to answer the same questions.
6. The number of points and the time to be spent in answering must be indicated in each item.
7. Specify the number of words, paragraphs, or sentences for the answer.
8. The scoring system must be discussed or presented to the students.

Examples of Essay Tests

Extended Response Essay
1. Present and describe the modern theory of evolution and discuss how it is supported by evidence from the areas of (a) comparative anatomy and (b) population genetics.
2. Consider the statement, "Mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true." What do you think is the reasoning behind the statement? Explain your answer.

Restricted Response Essay
1. Point out the advantages and disadvantages of an essay type of test. Limit your answer to five advantages and five disadvantages. Explain each answer in not more than two sentences.
2. Mr. Cruz, a science teacher, wants to measure his students' ability to interpret scientific data with a paper-and-pencil test. Describe the steps that Mr. Cruz should follow. Give reasons to justify each step in not more than three sentences.

Advantages of an Essay Test
1. It is easier to prepare and less time-consuming compared to other paper-and-pencil tests.
2. It measures higher-order thinking skills (analysis, synthesis, and evaluation).
3. It allows students the freedom to express their individuality in answering the given question.
4. The students have a chance to express their own ideas and to plan their own answers.
5. It reduces guessing compared to any objective type of test.
6. It presents a more realistic task to the students.
7. It emphasizes the integration and application of ideas.

Disadvantages of an Essay Test
1. It cannot provide an objective measure of the achievement of the students.
2. It needs much time to grade and to prepare scoring criteria.
3. The scores are usually not reliable, most especially without scoring criteria.
4. It measures a limited amount of content and objectives.
5. It yields low variation of scores.
6. It usually encourages bluffing.

Suggestions for Grading Essay Tests
1. Decide on a policy for dealing with incorrect, irrelevant, or illegible responses.
2. Keep the scores of previously read items out of sight.
3. The student's identity should remain anonymous while his/her paper is being graded.
4. Read and evaluate each student's answer to the same question before grading the next question.
5. Provide students, prior to the examination, with the general grading criteria by which they will be evaluated.
6. Use analytic scoring or holistic scoring.
7. Answer the test question yourself by writing the ideal answer to it, so that you can develop the scoring criteria from your answer.
8. Write your comments on their papers.

C. Problem-Solving Test

A problem-solving or computational test is a type of subjective test that presents a problem situation or task and requires a demonstration of work procedures and a correct solution, or just a correct solution. The teacher can assign full or partial credit to either correct or incorrect solutions, depending on the quality and kind of work procedures presented.

Guidelines for Writing Problem-solving Test Items

1. Clearly identify and explain the problem.
2. Provide directions that clearly inform the student of the type of response called for.
3. State in the directions whether or not the student must show his/her work procedures for full or partial credit.
4. Clearly separate item parts and indicate their point values.
5. Use figures, conditions, and situations that create a realistic problem.
6. Ask questions that elicit responses on which experts could agree that one solution and one or more work procedures are better than others.
7. Work through each problem before classroom administration to double-check accuracy.

Example of a Problem-solving Test

Direction: Analyze and solve each problem. Show your solution neatly and clearly by applying the strategy indicated in each item. Each item corresponds to 10 points.
1. Debbie begins a physical fitness program. Debbie's goal is to do 100 sit-ups. On the first day of the program, she does 20 sit-ups. Every 5th day of the program, she increases the number of sit-ups by 10. After how many days will she reach her goal? (Make a list or table.)

Advantages of Problem-solving Test Items
1. It minimizes guessing by requiring the students to provide an original response rather than to select from several alternatives.
2. It is easier to construct.
3. It can most appropriately measure learning objectives that focus on the ability to apply skills and knowledge to the solution of problems.
4. It can measure an extensive amount of content objectives.

Disadvantages of a Problem-solving Test
1. It generally provides low test and test-scorer reliability.
2. It requires an extensive amount of teacher time to read and grade the papers.
3. It does not provide an objective measure of student achievement or ability; it is subject to bias on the part of the grader when partial credit is given.

Assessment Task 3.2

1. Construct a multiple-choice test using Krathwohl's 2001 revised cognitive domain.
2. Formulate at least two examples of the different types of objective and subjective tests in your area of specialization.

Lesson 3. Learning Target and Assessment Method Match

Table of Specification

A table of specification (TOS) is a chart or table that details the content and level of cognitive domain assessed on a test, as well as the types and emphases of test items (Gareis and Grant, 2008). A TOS is very important in addressing the validity and reliability of the test items. The validity of the test means that the assessment can be used to draw appropriate conclusions because the assessment is guarded against systematic error. A TOS provides the test constructor a way to ensure that the assessment is based on the intended learning outcomes. It is also a way of ensuring that the number of questions on the test is adequate to produce dependable results that are not likely caused by chance. Finally, it is a useful guide in constructing a test and in determining the type of test items you need to construct.

Different Formats of a Table of Specification

A. Format 1 of a Table of Specification. This format is composed of the specific objectives, the cognitive level, the type of test used, the item number, and the total points for each item. Specific Objectives refer to the intended learning outcomes stated as specific instructional objectives covering a particular test topic.
Cognitive Level pertains to the intellectual skill or ability needed to correctly answer a test item, based on Bloom's taxonomy of educational objectives. We sometimes refer to this as the cognitive domain of a test item; thus, entries in this column could be knowledge, comprehension, application, analysis, synthesis, or evaluation. Type of Test Item identifies the type or kind of test a test item belongs to; examples of entries in this column could be multiple-choice, true or false, or essay. Item Number simply identifies the question number as it appears in the test. Total Points summarizes the score given to a particular item.

Specific Objectives | Cognitive Level | Type of Test | Item Number | Total Points
Solve worded problems on consecutive integers | Application | Multiple-choice | 1 and 2 | 4 points

B. Format 2 of a Table of Specification (One-way Table of Specification). This format also includes columns for the cognitive level (K-C, A, and HOTS) of the items for each content area.

Contents | Number of Class Sessions | Number of Items | Test Item Distribution
Basic Concepts of Fractions | 1 | 2 | 1-2
Addition of Fractions | 1 | 2 | 3-4
Subtraction of Fractions | 1 | 2 | 5-6
Multiplication and Division of Fractions | 3 | 6 | 7-12
Application/Problem Solving | 4 | 8 | 13-20
Total | 10 | 20 |

C. Format 3 of a Table of Specification (Two-way Table of Specification). This format adds a column for each of Krathwohl's cognitive levels (Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating), into which the items for each content area are distributed.

Content | Class Sessions | Total Points | Item Distribution
Concepts | 1 | 2 | 1-2
Z-score | 2 | 4 | 3-6
T-score | 2 | 4 | 7-10
Stanine | 3 | 6 | 11-16
Percentile Rank | 3 | 6 | 17-22
Application | 4 | 8 | 23-30
Total | 15 | 30 |

Preparing a Table of Specification

1. Select the learning outcomes to be measured. Identify the instructional objectives needed to answer the test items correctly. The list of instructional objectives will include the learning outcomes in the areas of knowledge, intellectual skills or abilities, general skills, attitudes, interests, and appreciation. Use Bloom's taxonomy or Krathwohl's 2001 revised taxonomy of the cognitive domain as a guide.
2. Make an outline of the subject matter to be covered in the test. The length of the test will depend on the areas covered in its content and the time needed to answer.
3. Decide on the number of items per subtopic. Use the formula below to determine the number of items to be constructed for each subtopic covered in the test, so that the number of items for each topic is proportional to the number of class sessions:

Number of items = (number of class sessions for the topic / total number of class sessions) x desired total number of items

4. Make the two-way chart, as shown in Format 2 and Format 3 of the Table of Specification.
5. Construct the test items. A classroom teacher should always follow the general principles of constructing test items. The test item should always correspond with the learning outcome so that it serves whatever purpose it may have.

Example:
Number of class sessions discussing the topic: 3
Desired total number of items: 10
Total number of class sessions for the unit: 10

Number of items = (3 x 10) / 10 = 30 / 10 = 3
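The allocation formula above is easy to automate. Below is a minimal Python sketch of the computation under the convention shown in the worked example; the topic names and session counts are hypothetical stand-ins, not data from the module.

```python
# Sketch of the TOS item-allocation formula:
# items = (sessions for topic / total sessions) x desired total items.
# Topic names and session counts are hypothetical.

def items_per_topic(sessions: int, total_sessions: int, desired_items: int) -> int:
    return round(sessions * desired_items / total_sessions)

topics = {
    "Basic concepts": 3,
    "Computation skills": 4,
    "Problem solving": 3,
}

total_sessions = sum(topics.values())   # 10 sessions in the unit
for topic, sessions in topics.items():
    n = items_per_topic(sessions, total_sessions, desired_items=20)
    print(f"{topic}: {n} items")
```

Note that rounding each topic separately can make the overall total drift by an item or two, so check the final count against the desired total number of items.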
Assessment Task 3.3

1. Why is it necessary to prepare a Table of Specification before constructing test items?
2. Choose a topic in your specialization and make a sample Table of Specification. Prepare a 20-item test and use it in your table of specification.

Lesson 4. Assessment Tools Development

Assessment Development Cycle

1. Planning Stage
- Determine who will use the assessment results and how they will use them.
- Identify the learning targets to be assessed.
- Select the appropriate assessment method or methods.
- Determine the sample size.
2. Development Stage
- Develop or select items, exercises, tasks, and scoring procedures.
- Review and critique the overall assessment for quality before use.
3. Use Stage
- Conduct and score the assessment.
- Revise as needed for future use.

Steps in Developing Assessment Tools

1. Examine the instructional objectives of the topics previously discussed. The first step in developing a test is to go back to the instructional objectives so that you can match them with the test items to be constructed.
2. Make a table of specification (TOS). A TOS ensures that the assessment is based on the intended learning outcomes.
3. Construct the test items. In constructing test items, it is necessary to follow the general guidelines for constructing test items. Kubiszyn and Borich (2007) suggested some guidelines for writing test items to help classroom teachers improve the quality of the test items they write:
- Begin writing items far enough in advance that you will have time to revise them.
- Match items to the intended outcomes at the appropriate level of difficulty to provide a valid measure of the instructional objectives.
- Limit the question to the skill being assessed.
- Be sure each item deals with an important aspect of the content area and not with trivia.
- Be sure the problem posed is clear and unambiguous.
- Be sure each item is independent of all other items. The answer to one item should not be required as a condition for answering the next item, and a hint to one answer should not be embedded in another item.
- Be sure the item has one correct or best answer on which experts would agree.
- Prevent unintended clues to the answer in the statement or question. Grammatical inconsistencies such as "a" or "an" give clues to the correct answer to those students who are not well prepared for the test.
- Avoid replicating the textbook in writing test items; do not quote directly from textual materials. You are usually not interested in how well students memorize the text. Besides, taken out of context, direct quotes from the text are often ambiguous.
- Avoid trick or catch questions in an achievement test. Do not waste time testing how well students can interpret your intentions.
- Try to write items that require higher-order thinking skills.
4. Assemble the test items. After constructing the test items following the different principles of test construction, the next step is to assemble the test. There are two steps in assembling the test: (1) packaging the test and (2) reproducing the test. In assembling the test, consider the following guidelines:
- Group all test items with similar formats.
- Arrange test items from easy to difficult.
- Space the test items for easy reading.
- Keep items and options on the same page.
- Place illustrations near the descriptions they accompany.
- Check the answer key.
- Decide where to record the answers.
5. Check the assembled test items. Before reproducing the test, it is very important to proofread the test items for typographical and grammatical errors and to make the necessary corrections. If possible, let others examine the test to validate its content. This can save time during the examination and avoid disrupting the concentration of the students.
6. Write directions. Check the test directions for each item format to be sure they are clear for the students to understand. The test directions should state the numbers of the items to which they apply, how to record the answers, the basis on which to select the answer, and the criteria for scoring or the scoring system.
7. Make the answer key. Be sure to check your answer key so that the correct answers follow a fairly random sequence.
8. Analyze and improve the test items. Analyzing and improving the test items should be done after checking, scoring, and recording the test.

Item Analysis

Item analysis is the process of examining the students' responses to each individual item in the test. It consists of different procedures for assessing the quality of the test items given to the students. Through item analysis, we can identify which of the given items are good and which are defective. Good items are retained, and defective items are improved, revised, or rejected.

Uses of Item Analysis
1. Item analysis data provide a basis for efficient class discussion of the test results.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general improvement of classroom instruction.
4. Item analysis data provide a basis for increased skill in test construction.
5. Item analysis procedures provide a basis for constructing a test bank.

Types of Quantitative Item Analysis

1. Difficulty Index

The difficulty index refers to the proportion of the students in the upper and lower groups who answered an item correctly. The larger the proportion, the more students have learned the subject matter measured by the item. To compute the difficulty index of an item, use the formula:

DF = n / N

where: DF = difficulty index; n = number of students selecting the correct answer to the item in the upper group and in the lower group; and N = total number of students who answered the test.

Level of Difficulty

To determine the level of difficulty of an item, first find the difficulty index using the formula, then identify the level of difficulty using the ranges given below. The higher the value of the index of difficulty, the easier the item is; hence, more students got the correct answer and more students mastered the content measured by that item.

Index Range | Difficulty Level
0.00 - 0.20 | Very Difficult
0.21 - 0.40 | Difficult
0.41 - 0.60 | Average/Moderately Difficult
0.61 - 0.80 | Easy
0.81 - 1.00 | Very Easy
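As a quick sketch, the difficulty index and its verbal level can be computed directly from the counts of correct answers, following the formula and table above. The function names are mine, not the module's, and the counts in the usage line are hypothetical.

```python
# Difficulty index: DF = n / N, where n is the number of correct answers
# in the upper and lower groups combined and N is the total number of
# examinees who answered the test.

def difficulty_index(correct_upper: int, correct_lower: int, n_examinees: int) -> float:
    return (correct_upper + correct_lower) / n_examinees

def difficulty_level(df: float) -> str:
    """Verbal interpretation following the index ranges above."""
    if df <= 0.20:
        return "Very Difficult"
    if df <= 0.40:
        return "Difficult"
    if df <= 0.60:
        return "Average/Moderately Difficult"
    if df <= 0.80:
        return "Easy"
    return "Very Easy"

df = difficulty_index(10, 4, 40)       # counts from a hypothetical item
print(df, "->", difficulty_level(df))  # 0.35 -> Difficult
```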
2. Discrimination Index

The discrimination index is the power of an item to discriminate between the students who know the lesson and those who do not. It is computed from the number of students in the upper group who got the item correct minus the number of students in the lower group who got the item correct, divided by the number of students in the upper group or the lower group (use the larger number if they are not equal). The discrimination index is a basis for measuring the validity of an item. It can be interpreted as an indication of the extent to which overall knowledge of the content area, or mastery of the skills, is related to the response on an item. The formula used to compute the discrimination index is:

DI = (CUG - CLG) / D

where: DI = discrimination index value; CUG = number of students selecting the correct answer in the upper group; CLG = number of students selecting the correct answer in the lower group; and D = number of students in either the lower group or the upper group.

Types of Discrimination Index

1. Positive discrimination happens when more students in the upper group than in the lower group get the item correct.
2. Negative discrimination occurs when more students in the lower group than in the upper group get the item correct.
3. Zero discrimination happens when equal numbers of students in the upper and lower groups answer the item correctly; hence, the item cannot distinguish between students who performed well on the overall test and students whose performance was very poor.

Level of Discrimination

Ebel and Frisbie (1986), as cited by Hetzel (1997), recommended the following levels of discrimination for easier interpretation:

Index Range | Discrimination Level
0.19 and below | Poor item; should be eliminated or revised
0.20 - 0.29 | Marginal item; needs some revision
0.30 - 0.39 | Reasonably good item, but possibly open to improvement
0.40 and above | Very good item

Steps in Solving the Difficulty Index and Discrimination Index

1. Arrange the scores from highest to lowest.
2. Separate the scores into an upper group and a lower group. There are different methods for doing this: (a) if a class of 30 students takes an exam, arrange their scores from highest to lowest and divide them into two groups, with the higher scores in the upper group and the lower scores in the lower group; (b) other literature suggests using 27%, 30%, or 33% of the students for the upper and lower groups; (c) in the Licensure Examination for Teachers (LET), the test developers always use 27% of the examinees for the upper and lower groups.
3. Count the number of students who chose each alternative in the upper and lower groups for each item, and record the information using this template (put an asterisk on the correct answer):

Options | A | B | C | D | E
Upper Group | | | | |
Lower Group | | | | |

4. Compute the difficulty index and the discrimination index, and analyze each response in the distracters.
5. Make an analysis for each item.
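The discrimination index lends itself to the same treatment. The sketch below implements DI = (CUG - CLG) / D together with the 27% upper/lower split from step 2; the helper names and the simplified tie handling (a plain sort of the scores) are my own, and the counts in the usage line are hypothetical.

```python
# Discrimination index: DI = (CUG - CLG) / D, where D is the size of
# the larger of the two groups when they are unequal.

def discrimination_index(correct_upper: int, correct_lower: int,
                         upper_size: int, lower_size: int) -> float:
    d = max(upper_size, lower_size)
    return (correct_upper - correct_lower) / d

def split_groups(scores, fraction=0.27):
    """Return (upper, lower) score groups after ranking high to low."""
    ranked = sorted(scores, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]

di = discrimination_index(10, 4, 20, 20)  # counts from a hypothetical item
print(di)  # 0.30 -> reasonably good item on the scale above
```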
3. Analysis of Response Options

Aside from the difficulty index and discrimination index, another way to evaluate the performance of an entire test item is through the analysis of its response options. It is very important to examine the performance of each option in a multiple-choice item. Through this, you can determine whether the distracters, or incorrect options, are effective and attractive to those who do not know the correct answer. An incorrect option is attractive when more students in the lower group than in the upper group choose it. Analyzing the incorrect options allows teachers to improve the test items so that they can be used again in the future.

Distracter Analysis

1. Distracter. This is the term used for the incorrect options in a multiple-choice test, while the correct answer represents the key.
2. Miskeyed item. The test item is a potential miskey if more students from the upper group choose one of the incorrect options than the key.
3. Guessing item. Students from the upper group have an equal spread of choices among the given alternatives.
4. Ambiguous item. This happens when students from the upper group choose an incorrect option and the keyed answer about equally often.

How to Improve a Test Item

Example 1. A class is composed of 40 students. Divide the group into two. Option B is the correct answer. Based on the data in the table, as a teacher, what would you do with the test item?

Options | A | B* | C | D | E
Upper Group | 3 | 10 | 4 | 0 | 3
Lower Group | 4 | 4 | 8 | 0 | 4

1. Compute the difficulty index: n = 10 + 4 = 14, N = 40, so DF = 14/40 = 0.35.
2. Compute the discrimination index: CUG = 10, CLG = 4, D = 20, so DI = (10 - 4)/20 = 0.30.
3. Make an analysis of the level of difficulty, discrimination, and distracters.
a. Only 35% of the examinees got the answer correct; hence, the item is difficult.
b. More students from the upper group got the answer correct; hence, the item has positive discrimination.
c. Retain options A, C, and E, because most of the students who selected them did not perform well on the overall examination; those options attract more students from the lower group.
4. Conclusion: Retain the test item, but change option D to make it more realistic and effective for the upper and lower groups. At least 5% of the examinees should choose an incorrect option for it to be effective.

Example 2. Below is the result of an item analysis for a test item in Mathematics. Are you going to reject, revise, or retain the test item?

Options | A | B | C* | D | E
Upper Group | 4 | 3 | 4 | 3 | 6
Lower Group | 3 | 4 | 3 | 4 | 5

1. Compute the difficulty index: n = 4 + 3 = 7, N = 39, so DF = 7/39 ≈ 0.18.
2. Compute the discrimination index: CUG = 4, CLG = 3, D = 20, so DI = (4 - 3)/20 = 0.05.
3. Make an analysis of the level of difficulty, discrimination, and distracters.
a. Only 18% of the examinees got the answer correct; hence, the item is very difficult.
b. More students from the upper group got the answer correct; hence, the item has a positive discrimination of 0.05.
c. Students respond about equally to all alternatives, an indication that they are guessing.
d. If a test item is well-written but too difficult, reteach the material to the class.
4. Conclusion: Reject the item because it is very difficult, the discrimination index is very poor, and options A and B are not effective distracters.

Example 3. A class is composed of 50 students. Use 27% to get the upper and lower groups. Analyze the item given the following results. Option D is the correct answer. What will you do with the test item?

Options | A | B | C | D* | E
Upper Group | 3 | 1 | 2 | 6 | 2
Lower Group | 5 | 0 | 4 | 4 | 1

1. Compute the difficulty index: n = 6 + 4 = 10, N = 28, so DF = 10/28 ≈ 0.36.
2. Compute the discrimination index: CUG = 6, CLG = 4, D = 14, so DI = (6 - 4)/14 ≈ 0.14.
3. Make an analysis of the level of difficulty, discrimination, and distracters.
a. Only 36% of the examinees got the answer correct; hence, the item is difficult.
b. More students from the upper group got the answer correct; hence, the item has a positive discrimination of 0.14.
c. Modify options B and E, because more students from the upper group than from the lower group chose them; they are not effective distracters, since most of the students who performed well on the overall examination selected them as their answers.
d. Retain options A and C, because most of the students who selected them did not perform well on the overall examination. Hence, options A and C are effective distracters.
4. Conclusion: Revise the item by modifying options B and E.
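A short script can rerun Example 1 end to end and apply the distracter rules mechanically. The flag wording is mine; the decision rules (effective distracters attract more lower-group students, untouched options need revision, the 5% rule of thumb) are the ones stated in the examples above.

```python
# Rerun of Example 1: option B is the key, 40 examinees, groups of 20.

upper = {"A": 3, "B": 10, "C": 4, "D": 0, "E": 3}
lower = {"A": 4, "B": 4,  "C": 8, "D": 0, "E": 4}
key, n_examinees, group_size = "B", 40, 20

df = (upper[key] + lower[key]) / n_examinees  # 14/40 = 0.35
di = (upper[key] - lower[key]) / group_size   # 6/20  = 0.30
print(f"difficulty = {df:.2f}, discrimination = {di:.2f}")

for option in upper:
    if option == key:
        continue
    share = (upper[option] + lower[option]) / n_examinees
    if share < 0.05:
        print(option, "-> chosen by under 5%: revise (not plausible enough)")
    elif upper[option] > lower[option]:
        print(option, "-> attracts the upper group: possible miskey or ambiguity")
    else:
        print(option, "-> effective distracter: retain")
```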
Test Reliability

Reliability refers to the consistency with which a test yields the same rank for individuals who take it more than once (Kubiszyn and Borich, 2007); that is, how consistent test results or other assessment results are from one measurement to another. A test is reliable when it yields practically the same scores when administered twice to the same group of students, with a reliability index of 0.60 or above. The reliability of a test can be determined by means of the Pearson Product-Moment Correlation, the Spearman-Brown formula, the Kuder-Richardson formulas, Cronbach's alpha, etc.

Factors Affecting the Reliability of a Test
1. Length of the test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student group
5. Limited time

Methods of Establishing the Reliability of a Test

1. Test-retest Method. A type of reliability determined by administering the same test twice to the same group of students, with a time interval between the two administrations. The two sets of scores are correlated using the Pearson Product-Moment Correlation Coefficient (r), and this correlation coefficient provides a measure of stability. It indicates how stable the test results are over a period of time.
2. Equivalent/Parallel/Alternate Forms. A type of reliability determined by administering two different but equivalent forms of the test to the same group of students in close succession. The equivalent forms are constructed from the same set of specifications, that is, similar in content, type of items, and difficulty. The two sets of scores are correlated using the Pearson Product-Moment Correlation Coefficient, and this correlation coefficient provides a measure of the degree to which generalization about the performance of students from one assessment to another is justified. It measures the equivalence of the tests.
3. Split-half Method. Administer the test once and score two equivalent halves of the test. To split the test into equivalent halves, the usual procedure is to score the even-numbered and odd-numbered test items separately. This provides two scores for each student. The two sets of scores are correlated, and the Spearman-Brown formula is applied; the result provides a measure of internal consistency. It indicates the degree to which consistent results are obtained from the two halves of the test.
4. Kuder-Richardson Formulas. Administer the test once, score the total test, and apply the Kuder-Richardson (KR) formula. The KR-20 formula is applicable only in situations where students' responses are scored dichotomously, and is therefore most useful with traditional test items that are scored as right or wrong, true or false, or yes or no. KR-20 estimates the degree to which the items in the test measure the same characteristic. Another formula for testing the internal consistency of a test is KR-21, a simplified formula that assumes all items are of equal difficulty.

Reliability Coefficient

The reliability coefficient is a measure of the amount of error associated with the test scores.
The reliability coefficient has the following properties: (a) it ranges from 0 to 1.0; (b) the acceptable value is 0.60 or higher; (c) the higher the value of the reliability coefficient, the more reliable the overall test scores; and (d) higher reliability indicates that the test items measure the same thing.

Interpreting the Reliability Coefficient

1. Group variability affects the size of the reliability coefficient. Higher coefficients result from heterogeneous groups than from homogeneous groups; as group variability increases, reliability goes up.
2. Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced, and this limits the reliability of the test scores.
3. Test length affects test score reliability. As the length of the test increases, reliability tends to go up.
4. Item difficulty affects test score reliability. As test items become very easy or very hard, the test's reliability goes down.

Levels of the Reliability Coefficient

Reliability Coefficient | Interpretation
Above 0.90 | Excellent reliability
0.81 - 0.90 | Very good for a classroom test
0.71 - 0.80 | Good for a classroom test; a few items probably need to be improved
0.61 - 0.70 | Somewhat low; the test needs to be supplemented by other measures (more tests) to determine grades
0.51 - 0.60 | Suggests a need for revision of the test, unless it is quite short (10 or fewer items); needs to be supplemented by other measures (more tests) for grading
0.50 and below | Questionable reliability; the test should not contribute heavily to the course grade, and it needs revision

Example 1. Prof. Joel administered a test to the 10 students in his Elementary Statistics class twice, with a one-day interval between administrations. The test given after one day was exactly the same test given the first time. The scores below were gathered in the first test (FT) and second test (ST). Using the test-retest method, is the test reliable? Show the complete solution.

Solution: Using the Pearson r formula, find Ʃx, Ʃy, Ʃxy, Ʃx², and Ʃy².

Student | FT (x) | ST (y) | xy | x² | y²
1 | 36 | 38 | 1368 | 1296 | 1444
2 | 26 | 34 | 884 | 676 | 1156
3 | 38 | 38 | 1444 | 1444 | 1444
4 | 15 | 27 | 405 | 225 | 729
5 | 17 | 25 | 425 | 289 | 625
6 | 28 | 26 | 728 | 784 | 676
7 | 32 | 35 | 1120 | 1024 | 1225
8 | 35 | 36 | 1260 | 1225 | 1296
9 | 12 | 19 | 228 | 144 | 361
10 | 35 | 38 | 1330 | 1225 | 1444
n = 10 | Ʃx = 274 | Ʃy = 316 | Ʃxy = 9192 | Ʃx² = 8332 | Ʃy² = 10400

r = [nƩxy - (Ʃx)(Ʃy)] / √{[nƩx² - (Ʃx)²][nƩy² - (Ʃy)²]}
  = [10(9192) - (274)(316)] / √{[10(8332) - 274²][10(10400) - 316²]}
  = 5336 / √(8244 × 4144)
  ≈ 0.91

Analysis: The reliability coefficient using the Pearson r is 0.91, which indicates very high reliability. The scores of the 10 students on the test administered twice with a one-day interval are consistent. Hence, the test has very high reliability.
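The same computation can be checked in code, assuming nothing beyond the raw-score Pearson formula used above; the data are the FT/ST scores from Example 1.

```python
from math import sqrt

def pearson_r(x, y):
    """Raw-score Pearson product-moment correlation."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

ft = [36, 26, 38, 15, 17, 28, 32, 35, 12, 35]  # first test
st = [38, 34, 38, 27, 25, 26, 35, 36, 19, 38]  # second test (retest)
print(round(pearson_r(ft, st), 2))             # 0.91
```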
Example 2. Prof. Glenn administered a test to the 10 students in his Chemistry class. The test was given only once. The students' scores on the odd items (O) and even items (E) were gathered. Using the split-half method, is the test reliable? Show the complete solution.

Solution: To find the reliability of the odd and even test items, find Ʃx, Ʃy, Ʃxy, Ʃx², and Ʃy², then use the Spearman-Brown formula to find the reliability of the whole test.

Odd (x) | Even (y) | xy | x² | y²
15 | 20 | 300 | 225 | 400
19 | 17 | 323 | 361 | 289
20 | 24 | 480 | 400 | 576
25 | 21 | 525 | 625 | 441
20 | 23 | 460 | 400 | 529
18 | 22 | 396 | 324 | 484
19 | 25 | 475 | 361 | 625
26 | 24 | 624 | 676 | 576
20 | 18 | 360 | 400 | 324
18 | 17 | 306 | 324 | 289
Ʃx = 200 | Ʃy = 211 | Ʃxy = 4249 | Ʃx² = 4096 | Ʃy² = 4533

Steps:

1. Use the Pearson Product-Moment Correlation Coefficient formula to solve for the half-test reliability:

r = [nƩxy - (Ʃx)(Ʃy)] / √{[nƩx² - (Ʃx)²][nƩy² - (Ʃy)²]}
  = [10(4249) - (200)(211)] / √{[10(4096) - 200²][10(4533) - 211²]}
  = 290 / √(960 × 809)
  ≈ 0.33

2. Find the reliability of the whole test using the Spearman-Brown formula:

r_whole = 2r / (1 + r) = 2(0.33) / (1 + 0.33) ≈ 0.50

3. Analysis: The reliability coefficient using the Spearman-Brown formula is 0.50, which is questionable reliability. Hence, the test items should be revised.

Example 3. Ms. Tan administered a 40-item test in English to her Grade VI pupils in UEPLES. Below are the scores of 15 pupils. Find the reliability using the Kuder-Richardson (KR-21) formula.

Steps:

1. Tabulate the scores and their squares.

Student | Score (x) | x²
1 | 16 | 256
2 | 25 | 625
3 | 35 | 1225
4 | 39 | 1521
5 | 25 | 625
6 | 18 | 324
7 | 19 | 361
8 | 22 | 484
9 | 33 | 1089
10 | 36 | 1296
11 | 20 | 400
12 | 17 | 289
13 | 26 | 676
14 | 35 | 1225
15 | 39 | 1521
n = 15 | Ʃx = 405 | Ʃx² = 11917

2. Solve for the standard deviation:

s = √{[Ʃx² - (Ʃx)²/n] / (n - 1)} = √{[11917 - 405²/15] / 14} = √(982/14) ≈ 8.38

3. Solve for the mean:

M = Ʃx / n = 405 / 15 = 27

4. Solve for the reliability coefficient using the KR-21 formula, where k is the number of items:

KR-21 = [k / (k - 1)] × [1 - M(k - M) / (k s²)]
      = [40 / 39] × [1 - 27(40 - 27) / (40 × 70.14)]
      ≈ 0.90

5. Analysis: The reliability coefficient using the KR-21 formula is 0.90, which means the test has very good reliability; the test is very good for a classroom test.
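Both estimates can be verified in a few lines. The sketch below reuses the raw-score Pearson formula for the split-half correlation, then applies Spearman-Brown and KR-21; the sample-variance (n - 1) convention matches the standard deviation computed in Example 3.

```python
from math import sqrt

def pearson_r(x, y):
    n, sx, sy = len(x), sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2, sy2 = sum(a * a for a in x), sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

def spearman_brown(r_half: float) -> float:
    """Whole-test reliability from the half-test correlation."""
    return 2 * r_half / (1 + r_half)

def kr21(k: int, scores) -> float:
    """KR-21 = (k/(k-1)) * (1 - M(k-M)/(k*s^2)), with sample variance."""
    n = len(scores)
    m = sum(scores) / n
    var = (sum(s * s for s in scores) - n * m * m) / (n - 1)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

odd  = [15, 19, 20, 25, 20, 18, 19, 26, 20, 18]  # Example 2, odd items
even = [20, 17, 24, 21, 23, 22, 25, 24, 18, 17]  # Example 2, even items
print(round(spearman_brown(pearson_r(odd, even)), 2))  # ~0.50

scores = [16, 25, 35, 39, 25, 18, 19, 22, 33, 36, 20, 17, 26, 35, 39]
print(round(kr21(40, scores), 2))                      # ~0.90
```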
Test Validity

Validity is concerned with whether the information obtained from an assessment permits the teacher to make a correct decision about a student's learning; that is, whether the score-based inferences or decisions made from the students' test results are appropriate. Validity is the extent to which a test measures what it is supposed to measure.

Types of Validity

1. Face Validity. The extent to which a measurement method appears "on its face" to measure the construct of interest. Face validity is at best a very weak kind of evidence that a measurement method measures what it is supposed to. One reason is that it is based on people's intuitions about human behaviour, which are frequently wrong. It is also the case that many established measures in psychology work quite well despite lacking face validity.

2. Content Validity. A type of validation that refers to the relationship between the test and the instructional objectives; it establishes that the test measures what it is supposed to measure. Things to remember about content validity:
a. The evidence of the content validity of a test is found in the table of specification.
b. This is the most important type of validity for a classroom teacher.
c. There is no coefficient for content validity; it is determined by experts judgmentally, not empirically.

3. Criterion-related Validity. A type of validation that refers to the extent to which scores from a test relate to theoretically similar measures. It is a measure of how accurately a student's current test score can be used to estimate a score on a criterion measure, such as performance in courses or classes, or on another measurement instrument. For example, classroom reading grades should indicate levels of performance similar to standardized reading test scores.
a. Concurrent validity. The criterion and the predictor data are collected at the same time. This type of validity is appropriate for tests designed to assess a student's criterion status, or when you want to diagnose a student's present status; it makes a good diagnostic screening test. It is established by correlating the criterion and the predictor using the Pearson Product-Moment Correlation Coefficient or other statistical correlation tools.
b. Predictive validity. A type of validation that refers to the extent to which a student's current test result can be used to estimate accurately the outcome of the student's performance at a later time. It is appropriate for tests designed to assess a student's future status on a criterion. Regression analysis can be used to predict the criterion from a single predictor or from multiple predictors.

4. Construct Validity. A type of validation that refers to the extent to which a test measures theoretical, unobservable qualities, such as intelligence, math achievement, or performance anxiety, over a period of time, on the basis of gathered evidence. It is established through intensive study of the test or measurement instrument, using convergent/divergent validation and factor analysis. There are other ways of assessing construct validity, such as the test's internal consistency, developmental change, and experimental intervention.
a. Convergent validity is a type of construct validation wherein a test has a high correlation with another test that measures the same construct.
b. Divergent validity is a type of construct validation wherein a test has a low correlation with a test that measures a different construct. In this case, high validity occurs only when there is a low correlation coefficient between the tests that measure different traits.
c. Factor analysis assesses the construct validity of a test using complex statistical procedures.

Important Things to Remember about Validity
1. Validity refers to the decisions we make, not to the test itself or to the measurement.
2. Like reliability, validity is not an all-or-nothing concept; it is never totally absent or absolutely perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It ranges between 0 and 1.
4. Validity can never be finally determined; it is specific to each administration of the test.

Factors Affecting the Validity of a Test Item
1. The test itself.
2. The administration and scoring of the test.
3. Personal factors influencing how students respond to the test.
4. Validity is always specific to a particular group.

Validity Coefficient

The validity coefficient is the computed value of rxy. In theory, the validity coefficient, like the correlation, ranges from 0 to 1. In practice, most validity coefficients are small, usually ranging from 0.3 to 0.5, and few exceed 0.6 to 0.7; hence, there is a lot of room for improvement in most of our psychological measurements. Another way of interpreting the findings is the squared correlation coefficient (rxy)², called the coefficient of determination, which indicates how much of the variation in the criterion can be accounted for by the predictor.
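Since the validity coefficient is simply the Pearson r between test scores and criterion scores, and the coefficient of determination is its square, both are quick to compute. A minimal sketch in Python, run here on the data of the worked example that follows; statistics.correlation requires Python 3.10 or later, and the variable names are illustrative.

from statistics import correlation  # Python 3.10+

test      = [12, 22, 23, 25, 28, 30, 33, 42, 41, 37, 26, 44, 36, 29, 37]  # teacher-made test
criterion = [16, 25, 31, 25, 29, 28, 35, 40, 45, 40, 33, 45, 40, 35, 41]  # criterion test

r = round(correlation(test, criterion), 2)  # validity coefficient
print(r)                                    # 0.94 -> high validity
print(f"{r ** 2:.2%}")                      # 88.36% -> coefficient of determination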
Example: Teacher James develops a 45-item test and wants to determine if his test is valid. He takes another test that is already acknowledged for its validity and uses it as the criterion. He administered the two tests to his 15 students. The following table shows the results of the two tests. Is the test valid? Find the validity coefficient using Pearson r and the coefficient of determination.

Teacher James Test (x)   Criterion Test (y)   xy     x²     y²
12                       16                   192    144    256
22                       25                   550    484    625
23                       31                   713    529    961
25                       25                   625    625    625
28                       29                   812    784    841
30                       28                   840    900    784
33                       35                   1155   1089   1225
42                       40                   1680   1764   1600
41                       45                   1845   1681   2025
37                       40                   1480   1369   1600
26                       33                   858    676    1089
44                       45                   1980   1936   2025
36                       40                   1440   1296   1600
29                       35                   1015   841    1225
37                       41                   1517   1369   1681
Ʃx = 465   Ʃy = 508   Ʃxy = 16702   Ʃx² = 15487   Ʃy² = 18162

r = [nƩxy − (Ʃx)(Ʃy)] / √{[nƩx² − (Ʃx)²][nƩy² − (Ʃy)²]}
  = [15(16702) − (465)(508)] / √{[15(15487) − (465)²][15(18162) − (508)²]}
  = 14310 / √[(16080)(14366)]
  = 14310 / 15198.86
  = 0.94

Coefficient of determination = (r)² = (0.94)² = 88.36%

Interpretation: The correlation coefficient is 0.94, which means that the validity of the test is high; 88.36% of the variance in the students' performance can be attributed to the test.

Assessment Task 3.4

1. A 25-item multiple-choice test in Physical Education with four options was recorded below for item number 10. Listed are the numbers of students in the upper and lower groups who answered A, B, C, and D.

Item 10             A   B   C   D*
Upper Group (27%)   4   5   2   9
Lower Group (27%)   6   4   5   5

Based on the given table, answer the following:
a. Give the difficulty index.
b. What is the level of difficulty?
c. Indicate the discrimination index.
d. What is the discrimination level?
e. Which of the option(s) is/are the most effective?
f. Which of the option(s) is/are ineffective?
g. What can you conclude? Interpret the result.

2. Teacher Luis administered a test to his 15 students in Science class twice, with a one-day interval …

Feedback

How was it working with this module? Were you exhausted seeing a lot of terms, numbers, and computations used in designing and developing assessments? I hope you were able to follow the discussion in this module. Remember that in assessment, numbers and computations are always involved. The results of the different formulas used for item analysis, reliability testing, and validation are used for the interpretation of the test items, so it is necessary that you know these computational processes. If you are having a hard time with some lessons, you can always go back to the different topics and examples.

Summary

To aid you in reviewing the concepts in this module, here are the highlights:

Assessment tools are techniques used to measure a student's academic abilities, skills, and/or fluency in a given subject or to measure one's progress toward academic proficiency in a specific subject area.

Objective tests, subjective tests, performance assessments, portfolio assessments, oral questioning, observation techniques, and self-reports are some of the common types of classroom assessment tools.

The different qualities of assessment tools include validity, reliability, fairness, objectivity, scorability, adequacy, administrability, practicality, and efficiency.

The two general types of test items to use in an achievement test using paper and pencil are classified as selection-type items (objective type) and supply-type items (subjective type).
A multiple-choice test is an objective type of test used to measure knowledge outcomes and other types of learning outcomes, such as comprehension and application. It is the most commonly used format for measuring student achievement at different levels of learning. It consists of three parts: the stem, the keyed option, and the incorrect options or alternatives.

A matching type test is an objective test that consists of two columns. Column A contains the descriptions and is placed at the left side, while Column B contains the options and is placed at the right side. The examinees are asked to match the options with the descriptions they are associated with.

A true or false test item is an objective type of test which requires the examinees to choose true or false in response to a correct or incorrect statement. This is an example of a "forced-choice test" because there are only two possible choices in this type of test.

A completion or short answer test is an alternative form of subjective assessment because the examinee needs to supply or create the appropriate word(s), symbol(s), or number(s) to answer the question or complete a statement, rather than selecting the answer from given options. There are two ways of constructing a completion type of test: question form or complete-the-statement form.

An essay type of test is appropriate when assessing students' ability to organize and present original ideas. It consists of a small number of questions wherein the examinee is expected to demonstrate the ability to recall factual knowledge, organize this knowledge, and present it in a logical and integrated answer. Extended response essays and restricted response essays are the two types of essay test items.

A problem-solving or computational test is a type of subjective test that presents a problem situation or task and requires a demonstration of work procedures and the correct solution, or just the correct solution. The teacher can assign full or partial credit to either correct or incorrect solutions, depending on the quality and kind of work procedures presented.

A table of specification (TOS) is a chart or table that details the content and level of cognitive domain assessed on a test as well as the types and emphases of test items (Gareis and Grant, 2008).

Item analysis is a process of examining the students' responses to individual items in the test. It consists of different procedures for assessing the quality of the test items given to the students. Through item analysis, we can identify which of the given test items are good and which are defective.

The difficulty index refers to the proportion of the number of students in the upper and lower groups who answered an item correctly.

The discrimination index is the power of an item to discriminate between the students who know the lesson and those who do not (a short computational sketch of both indices follows this summary).

Reliability refers to the consistency with which a test yields the same rank for individuals who take the test more than once; that is, how consistent test results or other assessment results are from one measurement to another.

Validity is the extent to which a test measures what it is supposed to measure.
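To make the two item-analysis indices concrete, here is a minimal sketch in Python of one common formulation, assuming equal-sized upper and lower groups (as in the 27% grouping used in this module); the item counts below are hypothetical and are not taken from the assessment task above.

def difficulty_index(correct_upper, correct_lower, group_size):
    # Proportion of students in the two groups combined who got the item right.
    return (correct_upper + correct_lower) / (2 * group_size)

def discrimination_index(correct_upper, correct_lower, group_size):
    # Difference between the proportions correct in the upper and lower groups.
    return (correct_upper - correct_lower) / group_size

# Hypothetical item: 20 students per group; 15 upper-group and 5 lower-group
# students answered the item correctly.
print(f"{difficulty_index(15, 5, 20):.2f}")      # 0.50 -> half the examinees got it right
print(f"{discrimination_index(15, 5, 20):.2f}")  # 0.50 -> upper group clearly outperformed the lower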
Suggested Readings

If you want to learn more about the topics in this module, you may log on to the following links:

https://content.schoolinsites.com/api/documents/a4734c1ff0b948828e25b66791054c3b.pdf
https://www.slideshare.net/RonaldQuileste/constructing-test-questions-and-the-table-of-specifications-tos
https://www.yourarticlelibrary.com/statistics-2/teacher-made-test-meaning-features-and-uses-statistics/92607
https://www.slideshare.net/tamlinares/sound20-design2028ch204-729
https://opentextbc.ca/researchmethods/chapter/reliability-and-validity-of-measurement/