Full Transcript

Item Analysis (continued) Chapter 10 01/18/24

Item Response Theory
Traditional item analysis tells us whether items are difficult or not, but its results are heavily influenced by the types of people who take the test.
Modern approaches to test development and test analysis do not emphasize the simple methods of item analysis. They take advantage of statistical and analytic methods.

Item Characteristic Curve (ICC)
An ICC is a graphic presentation of the probability of choosing the correct answer to an item as a function of the level of the attribute measured by the test.
For example, as verbal ability increases, the probability of choosing the correct answer increases.

Item Characteristic Curve (ICC)
This graph summarizes the key features of an item, such as its difficulty, its discriminating power, and the probability of answering correctly.
A very difficult item shows little discriminating power.

Item Characteristic Curve (ICC)
People who did well on the test are less likely to answer this particular item correctly; that is, the item shows negative discriminating power.

Item Characteristic Curve (ICC)
The items have similar discriminating power but differ in difficulty. Item C is the most difficult; item A is the least difficult.

Item Characteristic Curve (ICC)
Item A is useful only for testing individuals low on the relevant ability; at higher levels, everyone answers correctly and no information is gained.
Item D is useful only for individuals with high ability levels; at lower ability levels, everyone fails the item and no information is gained.

Item Response Theory
Item response theory (IRT) is a psychometric theory built on the mathematical relationship summarized by an item characteristic curve. It is used to analyze test items and tests.
Assumptions in IRT:
1. There should be a relationship between the attribute the test measures and examinees' responses to test items. E.g.
people who have good mathematical ability should do well on the test items.
2. There should be a simple mathematical relationship between a person's ability level and the likelihood of a correct answer. Item characteristic curves are used to describe this relationship.

Item Response Theory
Advantages compared to traditional psychometric item analysis:
1. Mathematical superiority: measures in IRT are sample invariant. The problem with traditional measures is that the same mathematics test that is difficult for a group of 4th graders may be less difficult for a group of 6th graders and extremely easy for 8th graders. With IRT, item characteristics can be analyzed without confounding them with the characteristics of the people taking the test.
2. Theoretical advantage: IRT encourages us to think about WHY people answer items correctly.

Item Response Theory
IRT provides parameters that represent an item's susceptibility to guessing, the role of ability, and the item's discriminating power.
IRT defines item difficulty in terms of the level of ability needed to answer the item correctly. The traditional definition of difficulty says nothing about what makes an item difficult or easy. In IRT, a difficult item is one that requires a high level of ability to achieve a correct answer.
IRT defines item discriminating power in terms of the relationship between item responses and the construct measured by the test. Discriminating power in IRT does not depend on total test scores.

Item Response Theory
When people who are very high on mathematical ability are more likely to answer an item on a mathematical ability test correctly than people who are low on that ability, the item shows high discriminating power.
The traditional approach links item responses to total test scores rather than to the construct the test is designed to measure.
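The guessing, difficulty, and discrimination parameters described above can be made concrete with a small sketch. It assumes the three-parameter logistic form of the ICC, which is one common choice and is not named in these slides. The function name `icc_3pl`, the symbols `theta` (ability), `a` (discrimination), `b` (difficulty), and `c` (guessing), and the item values are all illustrative assumptions.

```python
import math


def icc_3pl(theta, a, b, c=0.0):
    """Probability of a correct answer at ability level theta.

    Assumed three-parameter logistic form (not taken from the slides):
      a: discriminating power (steepness of the curve around b)
      b: difficulty (ability level at which the curve is halfway
         between the guessing floor c and 1)
      c: guessing (probability of a correct answer at very low ability)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items: same discriminating power, different difficulty.
easy_item = dict(a=1.5, b=-1.0, c=0.0)
hard_item = dict(a=1.5, b=1.5, c=0.0)

for theta in (-2.0, 0.0, 2.0):
    p_easy = icc_3pl(theta, **easy_item)
    p_hard = icc_3pl(theta, **hard_item)
    print(f"ability {theta:+.1f}: easy item {p_easy:.2f}, hard item {p_hard:.2f}")
```

Under this assumed form, the curve sits exactly halfway between `c` and 1 when `theta` equals `b`, so difficulty is expressed on the same scale as ability. A negative `a` produces a falling curve, like the item on which people who did well on the test are less likely to answer correctly.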

Applications of IRT
IRT can be used to study the same concepts as traditional item analysis (item difficulty, etc.), but it can also be used to solve problems that would be difficult for traditional item analysis:
1. An approach that uses distractors in estimating ability.
2. An approach that tailors the test to the individual examinee.
3. Approaches for analyzing and constructing specialized tests.

Applications of IRT: 1. Getting information about ability from distractors
In traditional item analysis, responses are scored either 0 (wrong answer) or 1 (correct answer); it makes no difference which distractor is chosen. But some distractors are probably better answers than others, whereas others will be chosen only if the person has little or no knowledge.
In IRT, all responses can be displayed using item characteristic curves. Rather than concentrating only on the correct answer, we can construct a separate ICC for each possible response to a multiple-choice item, distractors as well as the correct response.

Applications of IRT: 1. Getting information about ability from distractors
A is the correct answer. A and C show some discriminating power. C is about as good an answer as A, though it shows less discriminating power. D shows no discriminating power, and B shows negative discriminating power. B and D are bad choices.

Applications of IRT: 2. Adaptive testing
Rather than constructing a long test that contains some items appropriate for each ability level, we can construct a test tailored to the ability level of the person who is taking it.
Tailored testing is possible: the test adapts itself to an examinee's ability level, as in computerized adaptive tests.
IRT can be used to help tailor tests to individual examinees. When a person takes the test on a computer, it is possible to estimate his/her ability at each step of testing and then select the next item to correspond with the person's estimated level of ability.

Applications of IRT: 3.
Item analysis for specialized tests
For some tests, the 3 general principles of the traditional approach cannot be applied in evaluating items. Those 3 principles are:
1. Incorrect responses should be distributed among distractors.
2. p values (difficulty) should be around .50.
3. Item-total correlations should be positive.
Screening tests and criterion-keyed tests are outstanding examples.

Applications of IRT: 3. Item analysis for specialized tests
IRT can be used in assessing screening tests and criterion-keyed tests (which have no right or wrong answers).
Screening test: any testing procedure designed to separate out people with a given characteristic.
E.g., a medical school receiving a large number of applicants.
E.g., in memory research, anxiety might be a confound, so a screening test can be used to eliminate people with a tendency toward anxiety.

Applications of IRT: 3. Item analysis for specialized tests
In criterion-keyed tests, there are no right or wrong answers.
In these types of tests, a person's score is determined by the similarity of his/her responses to the responses of some known group.
E.g., the MMPI: examinees' responses can be compared with those of people diagnosed as schizophrenic, paranoid, etc.

Applications of IRT: 3. Item analysis for specialized tests
This test is designed to screen out the lower half of the applicants. The curve is very steep at the point that separates the lower group from the average group.

Applications of IRT
The item discriminates well for diagnoses of depression and schizophrenia. It is unrelated to paranoia, though.

Applications of IRT
IRT can be used in item bias analysis.
The item is more difficult for males, and it does not discriminate males with low levels of ability from males with high levels of ability.
For females, the item is less difficult and is a better discriminator.
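
The item bias comparison above can be illustrated by giving one item separate curve parameters for each group and comparing the probability of a correct answer at matched ability levels. This is only a sketch: the two-parameter logistic form, the function names `icc_2pl`, `group_gap`, and `rise`, and every parameter value are assumptions chosen for illustration, not estimates from the data behind the slides.

```python
import math


def icc_2pl(theta, a, b):
    """Assumed two-parameter logistic ICC: a = discrimination, b = difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def group_gap(theta, reference, focal):
    """Difference in P(correct) between two groups at the same ability."""
    return icc_2pl(theta, **reference) - icc_2pl(theta, **focal)


def rise(params, low=-2.0, high=2.0):
    """How much P(correct) increases from low to high ability."""
    return icc_2pl(high, **params) - icc_2pl(low, **params)


# Hypothetical parameters for one item, one set per group.
females = dict(a=1.5, b=0.0)  # less difficult, steeper curve
males = dict(a=0.3, b=1.0)    # more difficult, nearly flat curve

for theta in (0.0, 1.0, 2.0):
    gap = group_gap(theta, females, males)
    print(f"ability {theta:+.1f}: females minus males = {gap:+.2f}")

print(f"rise for females: {rise(females):.2f}, rise for males: {rise(males):.2f}")
```

A gap at matched ability is what suggests possible bias, since examinees with the same level of the attribute would otherwise be expected to have the same chance of answering correctly. The small rise for the hypothetical male curve corresponds to an item that barely separates low-ability from high-ability males.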