Psychological Testing and Measurement (PSY-P631) Lesson 09 PDF
Summary
This document explores different types of psychological tests, specifically focusing on norm-referenced tests and domain-referenced tests. It explains their characteristics, uses, and interpretations. The document illustrates the difference between norm-referenced tests, which measure performance relative to a group, and domain-referenced tests, which focus on whether a person has mastered specific skills or knowledge.
Psychological Testing and Measurement (PSY-P631) VU
Lesson 09
Domain Referenced Test Interpretation

Nature and Uses:
So far we have been discussing the concept, use, and significance of norms for the interpretation of psychological test results. But we should remember that norms are not the only way of assessing, measuring, and interpreting individuals’ abilities or traits. Tests that provide normative data for the sake of score interpretation are norm-referenced tests. A norm-referenced test is “a test that evaluates each individual relative to a normative group” (Kaplan & Saccuzzo, 2001). A norm-referenced test is administered to a representative sample of the population of interest, raw scores are gathered and analyzed, and in the light of the analysis norms are established for future test takers. In such tests, the norms or standards with which every individual’s performance is compared are the scores obtained by other persons. If a person attains a score equivalent to the normative sample’s average score, that performance is considered average. A score lower than the normative average is considered below average, and a score higher than the normative average is considered above average.

There is another type of test that employs a certain criterion or standard for describing a person’s performance. A person has to perform within a domain in order to be considered proficient in a skill, behavior, or ability. These are called domain-referenced tests. The terms criterion-referenced and domain-referenced are used interchangeably. The latter is used more commonly. A criterion-referenced test is “a test that describes the specific types of skills, tasks, knowledge of an individual relative to a well-defined mastery criterion. The content of criterion-referenced test is limited to certain well-defined objectives” (Kaplan & Saccuzzo, 2001). Glaser (1963) was the first to use the term ‘criterion-referenced testing’.
Alternative terms like ‘domain-referenced’ and ‘content-referenced’ testing were also proposed and used by writers. The interpretive frame of reference in a domain-referenced test is a specific content domain rather than a specified population of persons. The test results are reported in terms of what the test taker knows, how proficient he or she is, and to what extent he or she has command over a certain content domain. For example, according to Anastasi and Urbina (2007), test takers’ performance may be reported in terms of the specific kinds of arithmetic operations they have mastered, the estimated size of their vocabulary, or the difficulty level of reading matter that they can comprehend (from comic books to literary classics). It can also be expressed in terms of the chances of a person’s achieving a designated performance level on an external criterion (educational or occupational).

Who Decides and Determines the Domain or Criterion?
The domain is primarily derived from the values or standards of an individual or organization. There are many real-life situations and professional scenarios where evaluating a person’s ability, proficiency, or performance with reference to a norm is meaningless. What is required, and what is of value, is command over the domain. For example, in the training of surgeons, how much better than others one can perform a surgery, or how far above or below average one’s skills are, is not what matters. What is required is that one should have acquired the skill of surgery to a specified extent. Similarly, in the evaluation of a pilot, what is most important to assess is whether the pilot can fly a plane and how well he or she can do that. Whether he or she is equal to, above, or below others is not of prime importance. Norm-referenced tests assess and describe how well test takers have performed in relation to other people. Domain-referenced tests, on the other hand, tell us what test takers can do.
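
The contrast between the two frames of reference can be made concrete with a small sketch. It is only an illustration: the normative scores, the objective names, and the function names below are all invented here and do not come from any published test. The first function reads a score against other people (a percentile rank); the second reads item results against the content domain (proportion correct per objective).

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Norm-referenced reading: percentage of the normative sample scoring
    below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def domain_report(item_results):
    """Domain-referenced reading: proportion of items answered correctly
    within each objective (1 = correct, 0 = incorrect)."""
    return {objective: sum(marks) / len(marks)
            for objective, marks in item_results.items()}


# Hypothetical normative sample and item-level results, invented for illustration.
norm_scores = [10, 12, 14, 15, 15, 16, 18, 20, 22, 25]
item_results = {
    "addition": [1, 1, 1, 1],
    "long division": [1, 0, 0, 1],
}

print(percentile_rank(18, norm_scores))  # standing relative to other people
print(domain_report(item_results))       # what the test taker can do
```

The same test taker therefore gets two different kinds of statement: one about position in a group, and one about which parts of the content have been mastered.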
In other words, we can say that domain-referenced tests focus on the potential, and norm-referenced tests focus on the performance of a test taker in comparison to others. Can you think of a situation where you would prefer a norm-referenced test and a situation where you would need a domain-referenced test? What about a 4-year-old learning the table of 5, and being able to repair an electrical connection?

In some contexts, domain-referenced tests are also called “mastery tests”. These are the situations where the test is used to assess the mastery or achievement of certain skills or contents. The focus of attention is content rather than a specific population. Such tests, particularly in the educational context, became very popular in the 1970s. Where domain-referenced tests are used, the performance of test takers may be reported in terms of mastery or command over skills, e.g. the specific kinds of arithmetic operations they have mastered; the estimated size of their vocabulary; the difficulty level of reading matter that they can comprehend; or the chances of achieving a designated performance level on an external criterion that may be educational or occupational.

Can you think of situations where you would be interested in a person’s mastery of a skill rather than where he or she stands in comparison to others? What would you like to see in a cricketer: whether he can hit the ball very well, or what his position is in comparison to other cricketers? And if you were to travel by air, what would interest you: whether your pilot is a good pilot, or his or her relative position among pilots?

Domain-referenced tests are commonly used in education; such tests have been used in educational innovations. According to Anastasi and Urbina (2007), their major applications have been in computer-assisted, computer-managed, and other individualized, self-paced instructional systems.
Testing is closely integrated with instruction in these systems, being introduced before, during, and after completion of each instructional unit to check on prerequisite skills, diagnose possible learning difficulties, and prescribe subsequent instructional procedures. Broad surveys of educational accomplishment have also employed domain-referenced tests, e.g. the National Assessment of Educational Progress in the U.S. Another area of application is the assessment of mastery of a small number of clearly defined job skills for the sake of testing job proficiency, e.g. in military occupational specialties. A similar application is domain-referenced assessment of the attainment of minimum requirements, e.g. qualifying for a driver’s or pilot’s license. In addition to these areas of application, it is believed that domain-referenced tests can help improve traditional teacher-made tests if the test developers are familiar with the concept of, and philosophy behind, domain-referenced tests.

Content Meaning:
The most significant feature of domain-referenced tests is that test performance is interpreted in terms of content meaning. As previously said, the goal is not to find out the relative standing of a person, but to learn what he or she knows and what he or she can do. In developing and designing this type of test, the content has to be carefully chosen, treated, and presented. In constructing such a test, a clearly defined domain of knowledge or skills to be assessed is the primary requirement. The content to be tested is selected from a content domain. This domain should be an important one, and must be generally accepted as important. In order to develop items for assessing mastery of the content, the content has to be subdivided into smaller units. These small units are defined in performance terms, i.e.
what performance or behavior will indicate that a certain element or section of content has been learnt or mastered. When this approach is employed in educational settings and content is subdivided into smaller units, these units correspond to behaviorally defined instructional objectives. Mastery of each unit is described in terms of instructional objectives. For example, “divides numbers carrying zeros by numbers carrying one zero by canceling one zero at the end”, “can convert grams into ounces”, “can convert centigrade into Fahrenheit”. Instructional objectives not only specify the learning outcomes, they also affect the way a course is taught and the way assessments will be made.

After the instructional objectives have been finalized, the difficult task of item development for each objective follows. It may take quite long to prepare items for sampling individual objectives, because each item has to be a good representative of the domain to be assessed. Careful formulation of objectives and clear statement of concepts and methodologies are also very important in this regard. The test developer’s own expertise, experience, and judgment all matter a great deal at the various steps of item construction.

Domain-referenced tests can be most useful for testing basic skills like reading and arithmetic at elementary levels. According to Anastasi and Urbina (2007), an ordinal hierarchy is usually adopted for arranging instructional objectives for testing basic skills. Elementary skills must be acquired before higher-level skills can be acquired, so the objectives are arranged in that order. Another point to keep in mind is that the nature of the objectives will depend on the nature of the content or subject being considered.
It is not advisable to formulate highly specific objectives for advanced levels of knowledge in less highly structured subjects. At these advanced levels, the content as well as the sequence of learning will be mostly flexible.

Mastery Testing:
Mastery testing is an important characteristic of domain-referenced tests. The procedure of testing mastery provides an all-or-none score. This means that the test score yields information about the presence or absence of mastery. The score may indicate simple presence or absence, or whether a pre-established level of mastery has been attained. Complete mastery is the level generally expected, although the required level may be set at about 80% or 85%. Another way of reporting mastery is to use a three-way distinction. Under this system the report is in terms of mastery, non-mastery, and an intermediate, doubtful, or “review” interval.

In mastery tests, individual differences are not a matter of concern. A number of educators believe that individual differences become meaningless in these tests. They argue that if suitable instructional methods are used and enough time is given, then almost everyone will be able to exhibit complete mastery of the domain and achieve the instructional objectives. Individual differences are then seen in the time that learners take to learn the content according to the objectives, rather than in final achievement as in traditional educational tests. Therefore, it is said that the effect of individual differences can be reduced to a minimum after appropriate training.

Mastery testing is used in a number of individualized instructional programs. Published domain-referenced tests of basic skills for elementary school also employ this approach. Two issues need to be considered in the construction of such tests: how many items should be used, and what proportion of items has to be correct before a reliable assessment is made.
Initially these issues were tackled by test developers using their own judgment. But now a number of procedures and statistical techniques are available for resolving them.

Develop a test to assess whether your friend has learnt certain content.

Relation to Norm-Referenced Testing:
Although mastery testing has its advantages, it is most useful for basic skills at elementary levels. In areas where complete mastery is not the focus of interest, mastery testing is not the method of choice. As we go to higher levels and more advanced, less structured subjects, mastery testing is not the most suitable approach. In areas like understanding, critical thinking, appreciation, and originality there is no way of assessing complete attainment or mastery. In fact, there is no end, extent, limit, or direction of learning or progress, and there are no cutoff scores to show complete absence of these faculties or skills. In such cases norm-referenced tests are useful.

We also have some published tests that allow norm-referenced as well as domain-referenced applications, e.g. the Stanford diagnostic tests in reading and in mathematics. These tests provide both appropriate norms and a system for qualitative analysis of a child’s attainment of detailed instructional objectives. Some authors write that even when we are using domain-referenced tests, we cannot say that they have nothing to do with norms. The concept of norms is in any case operative in domain-referenced tests. There is an underlying realization that a continuum of abilities does exist.

Minimum Qualification and Cutoff Scores:
As we said earlier, mastery tests may adopt an all-or-none approach or make a three-way distinction. But there are situations where clear-cut cutoff scores need to be specified.
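
The all-or-none and three-way ways of reporting mastery can be sketched in a few lines. This is only a sketch under stated assumptions: the 80% and 85% levels echo the figures mentioned under Mastery Testing, but the 70% lower bound of the “review” interval and the function names are hypothetical choices made for this illustration, not values prescribed by any test.

```python
def all_or_none(n_correct, n_items, mastery_cut=0.80):
    """All-or-none report: True if the pre-established mastery level is reached."""
    if n_items <= 0 or not 0 <= n_correct <= n_items:
        raise ValueError("n_correct must lie between 0 and n_items, with n_items > 0")
    return n_correct / n_items >= mastery_cut


def three_way(n_correct, n_items, mastery_cut=0.85, review_cut=0.70):
    """Three-way report: 'mastery', 'review' (doubtful interval), or 'non-mastery'.

    review_cut is a hypothetical value chosen only for this example.
    """
    if n_items <= 0 or not 0 <= n_correct <= n_items:
        raise ValueError("n_correct must lie between 0 and n_items, with n_items > 0")
    proportion = n_correct / n_items
    if proportion >= mastery_cut:
        return "mastery"
    if proportion >= review_cut:
        return "review"
    return "non-mastery"


# Example: a 20-item test on one instructional objective.
for correct in (18, 15, 10):
    print(correct, all_or_none(correct, 20), three_way(correct, 20))
```

Note that the same raw score can be reported differently under the two schemes: 15 out of 20 fails the all-or-none cut of 80% but falls in the “review” interval of the three-way scheme.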
There are numerous situations where minimum qualifications have to be specified and implemented, e.g. when someone is to be granted a driving or flying license; when workers are being selected for a war zone where sharp learning and vision are required; when workers are being selected for a nuclear plant; when students graduating from college are to be chosen for a medical school. Different tests use their own cutoff scores. It is recommended that the following points be kept in mind when using cutoff scores for decision making.
a) One should not use the scores of a single test as cutoff. A band of scores (from more than one test) should be used.
b) Multiple sources of information should be used for decision making regarding test takers. Relevant performance on tests other than the one in question, whether from past or present, should be used.
c) If a panel of judges sets the cutoff points, then those judges should be experts in test construction as well as in the areas of task performance.
d) Whenever possible, cutoff scores should be established and verified with the support of empirical information.
e) Test scores should be obtained from groups that are clearly different from each other on the relevant criterion behavior (Anastasi & Urbina, 2007).

An empirical method for setting cutoff scores can take the form of expectancy tables.

Expectancy Tables:
An expectancy table contains the probability of different criterion outcomes for persons who obtain each test score. These tables are based upon statistical information regarding the relationship between tests or variables, as yielded by past administrations.
An expectancy table is something like this:

Relationship between scores on test XYZ and course grades:

Score on XYZ   Number of students   Percentage in each grade
                                    A     B     C     D
50-60          12                   -     1     8     4
60-70          15                   3     6     6     -
70-80          23                   12    9     2     -

©copyright Virtual University of Pakistan
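
How the entries of such a table turn into chances of a criterion outcome can be sketched as follows. The sketch rests on an assumption: although the table header reads “percentage”, the entries in the 60-70 and 70-80 rows add up to the number of students in those rows, so they are treated here as student counts. The function names are hypothetical and used only for this illustration.

```python
def expectancy_row(grade_counts):
    """Convert the grade counts of one score band into percentages."""
    total = sum(grade_counts.values())
    if total == 0:
        raise ValueError("no students in this score band")
    return {grade: round(100 * n / total, 1) for grade, n in grade_counts.items()}


def chance_of(grade_counts, target_grades):
    """Estimated probability that a person in this band earns one of target_grades."""
    total = sum(grade_counts.values())
    if total == 0:
        raise ValueError("no students in this score band")
    return sum(grade_counts.get(g, 0) for g in target_grades) / total


# Two rows of the XYZ table, read as counts (an assumption, see above).
band_60_70 = {"A": 3, "B": 6, "C": 6, "D": 0}
band_70_80 = {"A": 12, "B": 9, "C": 2, "D": 0}

print(expectancy_row(band_60_70))
print(expectancy_row(band_70_80))
print(chance_of(band_60_70, {"A", "B"}))  # chance of a B or better
```

Read this way, a future test taker scoring in a given band can be told the estimated chance of reaching a designated grade, and a cutoff score can be placed at the band where that chance becomes acceptable.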