Psych 312 Manual 2022 PDF


Document Code FM-STL-013 Saint Louis University Revision No. 01 School of Teacher Education and Liberal Arts Effectivity June 07, 2021 Page 1 of 198

Property of and for the exclusive use of SLU. Reproduction, storing in a retrieval system, distributing, uploading or posting online, or transmitting in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise of any part of this document, without the prior written permission of SLU, is strictly prohibited.

PSYCH 311 / PSYCH 312
PSYCHOLOGICAL ASSESSMENT 1

Module Developers
DR. MARY PAULINE E. NAMOCA, PhD, RPsy, LPT
NICOLE SABRINA L. DELA CRUZ, MS GC, RGC, RPm, LPT
JOMEL Q. VIADO, MS GC, RPm
MARIE JOY W. CHEONG, MS GC, RPm
RYAN MAE DASKEO, RPm

COURSE LEARNING OUTCOMES
At the end of the module, you should be able to:
1. demonstrate substantial understanding of psychological testing and the theoretical and statistical concepts and processes involved in the construction and standardization of tests.
2. identify the different types of tests and the current uses of psychological tests in the different fields of psychology as well as in other disciplines.
3. identify the importance, benefits, and limitations of psychological tests and assessment.
4. critically apply technical concepts and basic principles of psychometrics in test development, test purchase, psychological assessment, and research.
5. utilize statistical knowledge in demonstrating computational skills in item analysis, norming, and the establishment of reliability and validity of a psychometric tool.
6. exhibit an empirical and ethical attitude, as well as a scientific approach to tests, development of tests, psychological testing, and assessment.
7. uphold the welfare and well-being of the recipient/client of the testing or assessment service.
8. appreciate the value of testing in the practice of the different fields of psychology.
9. exemplify the values of honesty and integrity in the practice of psychological testing and assessment.
10. encourage further improvement and innovation in tests and psychological assessment.

COURSE INTRODUCTION
The basic education curriculum of the country was enhanced with the implementation of the K to 12 Curriculum. The K to 12 Program covers Kindergarten and 12 years of basic education (six years of primary education, four years of Junior High School, and two years of Senior High School [SHS]) to provide sufficient time for mastery of concepts and skills, develop lifelong learners, and prepare graduates for tertiary education, middle-level skills development, employment, and entrepreneurship. The implementation of the K to 12 Curriculum is expected to contribute to the country’s development in various forms because it is believed to be necessary for raising the quality of our education, which is critical to our progress as a nation. One of the features of the K to 12 curriculum is the requirement to equip every graduate with the following skills: information, media, and technology skills; learning and innovation skills; effective communication skills; and life and career skills. The development of these skills can be done with the aid of technologies for teaching and learning, which is the focus of this course.
It aims to present some activities that will prepare pre-service teachers to integrate ICTs in the teaching-learning processes for the various fields of specialization. It aims to help pre-service and in-service teachers expand the boundaries of their creativity and the creativity of their students beyond the four walls of the classroom. It aims to allow teachers to discover the power of computer technologies to serve as teaching tools that will motivate, captivate, and mobilize them to master their content and performance standards for greater learning.

Psychological Assessment 1 introduces you to a general orientation in psychological testing and the proper evaluation of tests as utilized in psychology. How behavior can be measured will be the focal point of this subject. It shall endeavor to deepen your learning about the essential features of psychological tests, focusing on such attributes as standardization of procedures in test administration and scoring, objectivity and item analysis, norms and interpretation of scores, and the measurement and evaluation of the reliability and validity of tests. An understanding of these major principles of test construction aids the test user toward a more effective utilization of psychological tests.
TABLE OF CONTENTS

MODULE 1: Introduction to Psychological Testing and Assessment

MODULE 2: Standardization and Objectivity
  UNIT 1: Standardization in Test Administration
  UNIT 2: Standardization in Scoring and the Concept of Guessing
  UNIT 3: Building Objectivity and Standardization through Item Analysis: Introduction to Item Analysis and the Analysis for Item Difficulty Index
  UNIT 4: Building Objectivity and Standardization through Item Analysis: Item Analysis for Index of Discrimination, and Distracter Analysis

MODULE 3: Norms and the Interpretation of Scores
  UNIT 1: Introduction to the Concept of Norms
  UNIT 2: Developmental Norms
  UNIT 3: Within Group Norms: Percentile Scores
  UNIT 4: Within Group Norms: Standard Scores and Standardized Scores, Inter-relationship Among Within Group Norms and Relativity of Norms

MODULE 4: Reliability
  UNIT 1: Introduction to the Concepts of Reliability, Definition of Reliability
  UNIT 2: Relative Reliability: Temporal Stability Approaches to Reliability
  UNIT 3: Relative Reliability: Internal Consistency Split-Half Approaches to Reliability
  UNIT 4: Relative Reliability: Internal Consistency Inter-item Approaches to Reliability
  UNIT 5: Application of Reliability Concepts
  UNIT 6: Absolute Reliability: Standard Error of Measurement as an Indicator of Reliability

MODULE 5: Test Validity
  UNIT 1: Introduction to Validity
  UNIT 2: Specific Procedures for Validation: Content Validation, Predictive Validity, Construct Validity, and Concurrent Validity

MODULE 1: Introduction to Psychological Testing and Assessment

Galton’s Poster from the Galton Institute. https://www.galtoninstitute.org.uk/sir-francis-galton/psychology-statistics-criminology/

Learning outcomes:
1. Define psychological testing and the concepts related to its history, including processes, research methods, and statistics used in test development and standardization.
2. Explain the steps in test construction and the types and current uses of psychological tests.
3. Identify the importance, benefits, and limitations of psychological assessment as used in different fields.

Engage: There’s always a first time for everything
Explore: Rooted in History: History of Psychological Testing and Assessment
Explain: What is a TEST?
Elaborate: Current Uses of Tests

This module shall discuss the significant historical developments, origins, and rationale in psychological testing as well as the meaning and the purpose of psychological tests.
Furthermore, the testing process and the specific steps in constructing a psychological test will be presented. The different types of psychological tests, how they differ from each other, and their uses will also be discussed.

Engage: Pre-assessment survey
Recall the very first time that you took a test. What was the test about? What exactly did you have to do? How were you feeling during that time?
________________________________________________________________________________

Explore: How did psychological assessment start?
Historians have noted that a rudimentary form of assessment existed in ancient China as early as 2200 B.C., when the Chinese emperor had his officials examined every third year to determine their fitness for office. Such testing was modified and refined over the centuries until written exams were introduced in the Han Dynasty (202 B.C. - A.D. 200). Five topics were tested: civil law, military affairs, agriculture, revenue, and geography.

In Europe, interest in physiognomy, the notion that we can judge the inner character of people from their outward appearance, especially the face, can be dated to the fourth century B.C., when the Greek philosopher Aristotle (384-322 B.C.) published a short treatise based on the premise that the soul and the body “sympathize” with each other. Physiognomy is essentially an attempt to infer the personality characteristics of a person from his or her outward appearance. Physiognomy laid the foundation for a more specialized form of quackery known as phrenology, which means reading bumps on the head.
The founding of phrenology is usually ascribed to the German physician Franz Joseph Gall (1758-1828). Gall argued that the brain is the organ of sentiments and faculties and that these capacities are localized. Because the skull conforms to the shape of the brain, he reasoned, a cranial “bump” would signify an enlargement of the underlying faculty. Thus, based on these plausible (but incorrect) assumptions, Gall and his followers claimed to be able to decide if an individual was amorous, secretive, hopeful, combative, benevolent, self-confident, happy, or imitative; in all, dozens of traits were discerned from cranial bumps.

In Great Britain, Sir Francis Galton (1822-1911) attempted to measure intellect by means of reaction time and sensory discrimination. He wrote a book entitled Inquiries into Human Faculty and its Development (1883), a series of essays that emphasized individual differences in mental faculties. Because of his efforts in devising practicable measures of individual differences, historians of psychological testing often regard Galton as the father of mental testing.

The French psychologist Alfred Binet invented the first modern intelligence test in 1905. In 1904, the Minister of Public Instruction in Paris appointed a commission to decide on the educational measures that should be undertaken with those children who could not profit from regular instruction.
The commission concluded that medical and educational examinations should be used to identify those children who could not learn by the ordinary methods. Furthermore, it was determined that these children should be removed from their regular classes and given special instruction suitable to their more limited intellectual prowess. This was the beginning of the special education classroom. It was evident that a means of selecting children for such special placement was needed, and Binet and his colleague Simon were called on to develop a practical tool for just this purpose. Thus arose the first formal scale for assessing the intelligence of children. The 30 tests on the 1905 scale ranged from utterly simple sensory tests to quite complex verbal abstractions. Thus, the scale was appropriate for assessing the entire gamut of intelligence, from severe mental retardation to high levels of giftedness.

Although it was Goddard who first translated the Binet scales in the United States, it was Stanford professor Lewis M. Terman (1877-1956) who popularized IQ testing with his revision of the Binet scales in 1916. The new Stanford-Binet, as it was called, was a substantial revision, not just an extension, of the earlier Binet scales. Among the many changes that led to the unquestioned prestige of the Stanford-Binet was the use of the now familiar IQ for expressing test results.

Given the American penchant for efficiency, it was only natural that researchers would seek group mental tests to supplement the relatively time-consuming individual intelligence tests imported from France. The slow-paced development of group testing picked up dramatically as the United States entered World War I in 1917. It was then that Robert M. Yerkes, a well-known psychology professor at Harvard, convinced the U.S. government and the Army that all of its 1.75 million recruits should be given intelligence tests for purposes of classification and assignment.
Immediately upon being commissioned into the Army as a colonel, Yerkes assembled a Committee on the Examination of Recruits, which met at the Vineland school in New Jersey to develop the new group tests for the assessment of Army recruits. Yerkes chaired the committee; other famous members included Goddard and Terman. Two group tests emerged from this collaboration: the Army Alpha and the Army Beta. The Alpha was based on the then unpublished work of Otis (1918) and consisted of eight verbally loaded tests for average and high-functioning recruits. The eight tests were (1) following oral directions, (2) arithmetical reasoning, (3) practical judgment, (4) synonym-antonym pairs, (5) disarranged sentences, (6) number series completion, (7) analogies, and (8) information. The Army Beta was a nonverbal group test designed for use with illiterates and recruits whose first language was not English. It consisted of various visual-perceptual and motor tests such as tracing a path through mazes and visualizing the correct number of blocks depicted in a three-dimensional drawing.

In 1939, David Wechsler, an American psychologist, tried to solve a general problem of psychological testing: its dependence on language. Thus, he developed new nonverbal parts to go along with the verbal parts of the test.

In the Philippines, formal assessment started as a mandate from the government to look into the educational status of the country.
Several surveys were conducted, such as: (1) the Monroe Survey in 1925, (2) the Prosser Survey in 1930, and (3) the UNESCO Survey in 1949. Several initiatives followed until the enactment of Republic Act (R.A.) 10029, also known as the Philippine Psychology Act of 2009, which is an act to regulate the practice of psychology. The state recognizes the importance of psychologists, their functions, and their services, and at the same time aims to protect the public by preventing inexperienced or untrained individuals from offering psychological services.

References:
https://www.youtube.com/watch?v=pmg2NEL7390
https://www.pap.org.ph/sites/default/files/ra10029.pdf
https://files.eric.ed.gov/fulltext/ED511798.pdf

Explain: What is a psychological test?
Psychological tests are tools. To reap the benefits that tests can provide, one must keep in mind that any tool can be an instrument of good or harm depending on how it is used. Effective test use requires some familiarity with test construction. Such information is needed to evaluate different tests, to choose tests appropriate for particular purposes and individual examinees, and to interpret test scores properly.

Psychology, the science which seeks to measure and explain facts of intellect, character, and other aspects of man’s personal life, has long recognized the complexity of human behavior. In the analysis of human behavior, it has become apparent that individual differences exist in two ways: intra- and inter-individual differences. Intra-individual differences refer to differences found within the same individual, while inter-individual differences are differences recognizable between two or more people. These differences were studied scientifically and subjected to measurement and objective evaluation only about a hundred years ago.

The idea of a test is thus a pervasive element of the culture. However, the layperson’s notion of a test does not necessarily coincide with the more restrictive view held by psychometricians. A psychometrician is a specialist in psychology or education who develops and evaluates psychological tests.

A test is a standardized procedure for sampling behavior and describing it with categories or scores. Another way to say it is: a psychological test is a set of standardized and objective occasions for response presented to an individual with the purpose of eliciting a reliable and valid sample of his behavior in comparison to others within the same target population.

Important keywords defining most psychological tests are presented in this concise definition. These are:

Standardization - this is an essential feature of any psychological test. A test is said to be standardized if the procedures for administering and scoring it are uniform from one examiner to another, and from one setting to another. If the scores obtained by different persons are to be comparable, testing conditions must obviously be the same for all. Such a requirement is only a special application of the need for controlled conditions in all scientific observations. In a test situation, the single independent variable is often the individual being tested. Standardization rests largely upon the directions for administration found in the instructional manual that typically accompanies the test.
Objectivity - the administration, scoring, and interpretation of scores are objective insofar as they are independent of the subjective judgment of the particular examiner. Any test taker theoretically should obtain the identical score on a test regardless of who happens to be the examiner. Psychological tests can also properly be described as objective in the determination of the difficulty level and discrimination value of an item or a whole test.

Reliability - in psychometrics, this term basically means consistency. Test reliability is the consistency of scores obtained by the same persons when re-tested with the identical test or with an equivalent form of the test. Reliability may be checked by comparing the scores obtained by the same test takers at different times, with different sets of items, with different examiners or scorers, or under any other relevant testing condition.

Validity - undoubtedly the most important question to be asked about any test concerns its validity, that is, the degree to which the test actually measures what it purports to measure. Validity likewise shows what the test is measuring. It provides a direct check on how well the test fulfills its function. The determination of validity usually requires independent, external criteria of whatever the test is designed to measure.

Norms - psychological tests do not have predefined standards of passing or failing; performance on each test is evaluated on the basis of empirical data.
An examinee’s score is usually interpreted by comparing it with the scores obtained by others on the same test. For this purpose, test developers typically provide norms, that is, the average performance of the standardization sample on the test.

STEPS IN TEST CONSTRUCTION
Test construction has become a highly technical matter, and the procedures vary for different types of tests. This section attempts to outline the most basic procedures of test construction that are common to most types of tests.

SPECIFICATION OF THE PURPOSE OF THE TEST AND THE TARGET TEST POPULATION. This begins the whole process. These test parameters can be described in terms of: (1) the specific area in which assessment is to be done; (2) the characteristics of the attribute or some theoretical conception of the psychological nature of the trait that the test aims to measure; and (3) the criterion that the test scores are to be related with for its validation.

ITEM WRITING. This is done by experts in the various areas to be tested. Items are composed to cover a wide range of difficulty, from quite easy to very difficult items, for the intended population. Many more items should be created than will be used in the final test.

ITEM EDITING. This stage is done by several persons, to increase the likelihood of spotting formal defects in the items. Each item is checked for clarity of wording, appropriateness of vocabulary level for the intended population, stylistic equivalence of multiple-choice distracters and the correct answer, and the “face validity” of the items. This latter characteristic is the property of an item that gives it the appearance of measuring what the test as a whole is supposed to measure. It may or may not be related to the actual validity of the item or of the test, but it can affect the “reasonableness,” “fairness,” and acceptability of the test in the eyes of those who are taking it. In item editing, questionable, weak, or defective items are either revamped or discarded.
Unless the remaining pool of acceptable items still contains a much larger number than will be needed for the final test, more items are composed and subjected to editing until the required number is obtained.

ITEM TRYOUT. This consists of administering the entire pool of items, presented in the format of an actual test, to a large, representative sample from the population for which the test is intended. The sole purpose of this tryout is to obtain enough data for the next stage in the process of test construction: item analysis.

ITEM ANALYSIS. This stage presents the most technical aspect of the whole process, involving a number of psychometric and statistical methods. The essential information provided by an item analysis is the following:
a. the difficulty level (percentage passing) of an item;
b. the discriminability of each item, that is, how clearly it differentiates (in terms of percentage passing the item) the highest from the lowest scorers on the test as a whole, or how highly each item correlates with the total score in the test;
c. analysis of incorrect responses to determine, for example, which multiple-choice alternatives are so rarely chosen as to be virtually non-functional.

STANDARDIZATION. The distribution of raw scores (number correct) in a large representative sample of the target population is converted to some meaningful, interpretable scale such as percentiles, IQs, or other forms of standardized scores.
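The item-analysis statistics and the raw-score conversion just described can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the response matrix is made-up tryout data, the function names are hypothetical, the upper and lower groups are taken as the top and bottom thirds of scorers (other cut-offs are also used in practice), and the percentile rank counts tied scores as one half. Distracter analysis (point c) is not covered.

```python
from statistics import mean, pstdev

# Hypothetical tryout data (not from the manual): one row per examinee,
# one column per item; 1 = item passed, 0 = item failed.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]  # raw score = number correct


def item_difficulty(rows, item):
    """Difficulty level: proportion of examinees passing the item."""
    return sum(row[item] for row in rows) / len(rows)


def item_discrimination(rows, item, fraction=1 / 3):
    """Proportion passing among the highest scorers minus the lowest scorers.

    The group size (one third of the sample here) is an assumed convention.
    """
    ranked = sorted(rows, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return item_difficulty(ranked[:k], item) - item_difficulty(ranked[-k:], item)


def percentile_rank(scores, x):
    """Percentage of the sample scoring below x, counting ties as one half."""
    below = sum(s < x for s in scores)
    ties = sum(s == x for s in scores)
    return 100 * (below + 0.5 * ties) / len(scores)


def z_score(scores, x):
    """Standard score: distance of x from the sample mean in SD units."""
    return (x - mean(scores)) / pstdev(scores)


# Report difficulty and discrimination for every item in the pool.
for i in range(len(responses[0])):
    print(i, round(item_difficulty(responses, i), 2),
          round(item_discrimination(responses, i), 2))

# Convert one raw score to a percentile rank and a standard score.
print(percentile_rank(totals, 3), round(z_score(totals, 3), 2))
```

In this toy sample, an item that nearly everyone passes has a high difficulty value (it is an easy item), while an item passed by the top scorers and failed by the bottom scorers has a discrimination value near 1. A real standardization sample would be far larger than six examinees.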
Such converted scores clearly indicate any individual’s relative standing in the standardization (or normative) population. Many tests are re-standardized every few years, or even more often, to take account of shifts in the target population. Periodic item analysis may lead to revamping or discarding items no longer suitable for the target population. Determination of the test’s reliability and standard error of measurement in the normative population is also a part of the standardization procedure.

VALIDATION. This is the final step. Test scores are correlated with the appropriate criterion performance (e.g., scholastic achievement, college grades, ratings of proficiency on the job). Often validity coefficients are determined for different subgroups or for different criteria. All standardization and validation methods and results should be reported in the test manual for the benefit of the test users.

Proper execution of all procedures involved in test construction and validation is a large-scale, extremely costly undertaking. Making and marketing a new test that would be competitive with the present, most widely used tests is beyond the resources of any individual psychometrician or small organization. Test production, therefore, is dominated by only a handful of large, well-funded firms.

TYPES OF TESTS
Tests can differ from each other in many ways. There are many variations that have been used to subdivide tests into a broad list of certain types.
The reader should note, however, that any typology of tests is a purely arbitrary determination.

Group tests are largely paper-and-pencil measures suitable for testing large groups of persons at the same time. The number of examinees can range into the hundreds if sufficient test proctors are available. Because of its simplicity and low cost, this type is far more popular than individual tests.

Individual tests are instruments which by their design and purpose must be administered one on one. An advantage of these tests is that the examiner can establish rapport, gauge the level of motivation of the examinee, and assess the relevance of other factors upon the test results.

Another broad classification of tests is those that seek to measure maximum performance (tests of abilities) versus those that gauge a typical response (tests of typical performance). The former are used when one wishes to know how well the person can perform at his best. The distinguishing feature of these tests is that the subject is encouraged to earn the best score he can. Within this category are often included:

Intelligence test - refers to a test which yields an overall summary score based on the results from a heterogeneous sample of items.

Aptitude test - measures one or more clearly defined and relatively homogeneous segments of ability. Such tests come in two varieties: single aptitude tests and multiple aptitude test batteries.
These are often used to predict success in an occupation, training course, or educational endeavor.

Achievement tests - measure a person’s degree of learning, success, or accomplishment in a subject matter that has been taught directly. The purpose of this test, then, is to determine how much of the material the examinee has absorbed or mastered.

The tests of typical performance are used to investigate not what the person can do best but what he usually does. They do not single out any particular response as “good”. This category includes:

Creativity tests - assess a subject’s ability to produce new ideas, insights, or artistic creations that are accepted as being of social, esthetic, or scientific value. These tests emphasize novelty and originality in the solution of problems or in the production of artistic works.

Personality tests - measure the traits, qualities, or behaviors that determine a person’s individuality; this information helps to predict behavior. These tests come in several different varieties, including checklists, inventories, and projective techniques.

Interest inventories - measure an individual’s preference for certain activities or topics and thereby help to determine occupational choices.

Behavioral procedures - assess the antecedents and consequences of behavior, and include checklists, rating scales, interviews, and structured observations.

Neuropsychological tests - are used in the assessment of persons with known or suspected brain dysfunction. Neuropsychology is the study of brain-behavior relationships. The primary purpose of these tests is to evaluate the sensory, motor, cognitive, and behavioral strengths and weaknesses of the neurologically impaired patient.

Another classification of tests pertains to speed and power tests. Speed tests are those in which a subject must, in a limited amount of time, answer a series of questions or tasks of a uniformly low level of difficulty.
Their intent is primarily to measure the rapidity with which examinees can do what is asked. Power tests, in contrast, have more difficult items, and their time limits are generous enough that a very large percentage of the subjects for whom the test is designed will have ample time to complete or attempt all of the items.

The distinction between verbal, nonverbal, and performance tests has also been noted. Verbal tests are those in which the respondent uses written or spoken language, either in the directions presented to the examinee, in the responses to the test items, or in both. Nonverbal tests are so constructed that the instructions are given orally and the persons tested respond without the use of language. In the strict sense, a nonverbal test does not require the use of language by either the examiner or the person taking the test; pantomime is used as a means of giving directions. Performance tests involve the manipulation of objects, with minimal use of paper and pencil. The task to be performed requires an overt motor response other than a verbal one.

Another way of distinguishing tests is by the interpretive framework used for the raw score. Norm-referenced tests interpret an individual’s score against the results of a typical group of subjects for whom the instrument is designed, whereas criterion-referenced tests are used to ascertain an examinee’s status with respect to some criterion, for instance, an established performance standard.
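The two interpretive frameworks can be contrasted with a small computation. The sketch below is only an illustration under stated assumptions: the norm-group scores, the cutoff of 24, and the function names are all hypothetical, and the percentile-rank convention (ties counted as half) and the use of the population standard deviation are one common choice among several, not a procedure prescribed by this module.

```python
from statistics import mean, pstdev


def percentile_rank(score, norm_scores):
    """Norm-referenced: percent of the norm group scoring below `score`.

    Ties are counted as half, which is one common convention (an assumption here).
    """
    below = sum(1 for s in norm_scores if s < score)
    tied = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(norm_scores)


def z_score(score, norm_scores):
    """Norm-referenced: distance from the norm-group mean in SD units."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)


def meets_criterion(score, cutoff):
    """Criterion-referenced: compare the score with a fixed standard only."""
    return score >= cutoff


# Hypothetical raw scores of a norm group and a hypothetical mastery cutoff.
norm_group = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]
cutoff = 24
raw_score = 23

print(percentile_rank(raw_score, norm_group))    # standing relative to others
print(round(z_score(raw_score, norm_group), 2))  # same idea, in SD units
print(meets_criterion(raw_score, cutoff))        # standing relative to the standard
```

With these made-up numbers, a raw score of 23 lies above the norm-group average yet falls short of the cutoff, which is why the two frameworks can lead to different conclusions about the same examinee.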
With criterion-referenced tests, the point is to determine what examinees can do, rather than how they compare with others.

The classification scheme presented above offers a convenient way of organizing the discussion on tests, but let it be reiterated that these are by no means absolute categories.

PURPOSE OF A PSYCHOLOGICAL TEST:

Early tests were used predominantly for two purposes: to measure intelligence and to detect personality disorders. From birth to old age, though, one encounters tests at almost every turning point of life: from the baby’s first test (the APGAR test) to the toddler’s developmental assessments and the school readiness test for the preschool child. Once a school career begins, academic tests are given, not to mention vocational tests and admission tests. After graduation, adults still face tests for job entry, personality functioning, marital compatibility, and so on; the list is nearly endless.

Elaborate: CURRENT USES OF PSYCHOLOGICAL TESTS

Traditionally, the purpose of a psychological test has been to measure differences between individuals or between the reactions of the same individual on different occasions. One of the first problems that stimulated the development of psychological tests was the identification of the mentally retarded. To this day, the detection of intellectual deficiencies remains an important application of certain types of psychological tests, alongside the examination of the emotionally disturbed, the delinquent, and other types of behavioral deviants.
By far the most common use of psychological tests is to make decisions about persons. For example, educational institutions frequently use tests to determine placement levels for students, and universities decide who should be admitted partly on the basis of test scores. State, federal, and local civil service systems also rely heavily upon tests for personnel selection. Even the individual practitioner uses tests for decision making. Examples include the consulting psychologist who uses a personality test to determine that a police department should hire one candidate and not another, and the neuropsychologist who employs a test to conclude that a client has suffered brain damage.

But simple decision making is not the only function of psychological testing. It is convenient to distinguish five uses of tests:

- Classification
- Diagnosis and Treatment Planning
- Self-knowledge
- Program Evaluation
- Research

These applications frequently overlap and, on occasion, are difficult to distinguish from one another. For example, a test that helps determine a psychiatric diagnosis might also provide a form of self-knowledge. Each application is examined in more detail below.

The term classification encompasses a variety of procedures that share a common purpose: assigning a person to one category rather than another. There are several variant forms of classification, each emphasizing a particular purpose in assigning persons to categories. These are:

- Placement
- Screening
- Certification
- Selection

Placement is the sorting of persons into different programs appropriate to their needs or skills. For example, universities often use a mathematics placement exam to determine whether students should enroll in calculus, algebra, or a remedial course.

Screening refers to quick and simple tests or procedures to identify persons who might have special characteristics or needs.
Ordinarily, psychometricians acknowledge that screening tests will result in many misclassifications. Examiners are therefore advised to do follow-up testing with additional instruments before making important decisions on the basis of screening tests.

Certification and selection both have a pass/fail quality. Passing a certification exam confers privileges; examples include the right to practice psychology or to drive a car. Thus, certification typically implies that a person has at least a minimum proficiency in some discipline or activity. Selection is similar to certification in that it confers privileges such as the opportunity to attend a university or to gain employment.

Another use of psychological tests is for diagnosis and treatment planning. Diagnosis consists of two intertwined tasks: determining the nature and source of a person’s abnormal behavior, and classifying the behavior pattern within an accepted diagnostic system. Diagnosis is usually a precursor to remediation or treatment of personal distress or impaired performance. Here, psychological tests often play an important role. For example, intelligence tests are absolutely essential in the diagnosis of mental retardation. Personality tests are helpful in diagnosing the nature and extent of emotional disturbance. In fact, some tests, such as the MMPI, were devised for the explicit purpose of increasing the efficiency of psychiatric diagnosis.

Psychological tests can also supply a potent source of self-knowledge.
In some cases, the feedback a person receives from psychological test results is so self-affirming that it can change the entire course of a life. Of course, not every instance of psychological testing provides self-knowledge. Perhaps in the majority of cases, the client already knows what the test results divulge. A high-functioning college student is seldom surprised to find that his IQ is in the superior range. An architect is hardly surprised to hear that she has excellent spatial reasoning skills. A student with meager reading capacity is usually not startled to receive a diagnosis of “learning disability.”

Another use for psychological tests is the systematic evaluation of educational and social programs. Social programs are designed to provide services which improve social conditions and community life.

So far, the discussion has covered particular applications of psychological tests to everyday problems such as job selection, diagnosis, or even program evaluation. In each of these instances, testing serves an immediate, pragmatic purpose: helping the tester make a decision about persons or programs. But tests also play a major role in both the applied and theoretical branches of behavioral research. As an example of testing in applied research, consider the problem faced by neuropsychologists who wish to investigate the supposition that low-level lead absorption causes deficits in children. The only feasible way to explore this supposition is by testing normal and lead-burdened children with a battery of psychological tests. Needleman, Gunnoe, Leviton, Reed, Peresie, Maher, and Barrett (1979) used an array of traditional and innovative tests to conclude that low-level lead absorption causes decrements in IQ, impairments in reaction time, and escalations of undesirable classroom behaviors. Academicians and public policy makers respect psychological tests. Why else would they engage in lengthy, acrimonious debates about the validity of testing-based research findings?
On occasion, tests serve a less worldly role by helping scientists investigate theoretical matters that have no immediate or obvious practical applications. For example, to analyze perceptual field dependence, Witkin (1949) invented the tilting-room tilting-chair test (TRTC), which inspired a lifetime of research on personality development but was seldom applied to any practical problems of testing.

Tests serve an important function in basic research. Nearly all problems in differential psychology rely on testing procedures as a means of gathering data. Most areas of research - studies on the nature and extent of individual differences, the organization of psychological traits, the measurement of group differences, and the identification of biological and cultural factors associated with behavioral differences - were made possible by well-constructed tests.

MODULE 2: STANDARDIZATION & OBJECTIVITY: UNIFORMITY OF PROCEDURES IN TEST ADMINISTRATION, UNIFORMITY OF PROCEDURES IN SCORING, AND THE CONCEPT OF GUESSING

This module discusses the meaning, purpose, and significance of standardization, as well as the ways of implementing it. The different sources of error in test administration and scoring should be understood in the light of the lessons in this module. Factors such as noise and the examinee’s motivation and emotional state (e.g., anxiety) will be looked into.

Learning Outcomes:

1. Explain the concept and importance of standardization and uniformity in the context of behavioral assessment and in test administration and scoring.
2. Describe the tasks of a test constructor and test examiner in relation to standardization.
3. Identify errors or problems in test administration and scoring and determine psychometrically appropriate ways to address or minimize these errors or problems.
4. Design item types and scoring guidelines that ensure quality standardization.
5. Explain the importance of item analysis in the context of test development.
6. Analyze a given distribution of test scores from an item pool and, based on the distribution, infer the difficulty of the test and the test’s capability to differentiate among test takers.
7. Accurately carry out the steps and statistical computations for item difficulty, item discrimination, distracter analysis, and total test difficulty, either with a calculator or by computer.
8. Correctly interpret obtained item difficulty, item discrimination, and total test difficulty indices.
9. Evaluate the merits of the individual items that make up an item pool and decide whether an item should be modified, retained in, or removed from the final form of the test.
10. Strategize ways to increase the standardization and objectivity of a test based on the distribution of scores and results from item analysis.
11. Advocate the regular and timely evaluation and possible update or revision of items of existing tests to ensure continuity of objectivity, reliability, and validity.

Unit 1: Standardization in Test Administration
Engage: “UNIFORMLY USEFUL”
Explore: “Two junior classes, Class A and Class B, will be taking an Aptitude Test today”
Explain: What may limit the generalizability of test results? Introduction to Standardization. Standardization and Uniformity of Procedures in Test Administration
Elaborate: Test Anxiety

Unit 2: Standardization in Scoring and the Concept of Guessing
Engage: Of scores obtained from essay questions.
Explore: Try your hand at scoring.
Explain: Uniformity of procedures in scoring different types of items
Elaborate: Guessing and its “effect” on the obtained test score

Unit 3: Building Objectivity and Standardization through Item Analysis: Introduction to Item Analysis and the Analysis for Item Difficulty Index
Engage: Identifying strengths and flaws, “What makes a good test?”
Explore: Quality versus Quantity
Explain: Distribution of Scores, Analysis of Items Via Item Difficulty Index
Elaborate: Computation and interpretation of Item Difficulty, Interval Scales to indicate difficulty of an item

Unit 4: Building Objectivity and Standardization through Item Analysis: Item Analysis for Index of Discrimination, and Distracter Analysis
Engage: What is a psychometrically sound item?
Explore: What?! Discrimination?!
Explain: Computation of Item Discrimination Index
Elaborate: Putting it all together, Distracter Analysis
UNIT 1: Standardization in Test Administration

Engage: “UNIFORMLY USEFUL”

Today, a second group of 40 college freshmen will be taking an Intelligence Test. Ana, a Psychometrician, makes sure to prepare the same materials, set up the testing room the same way, and distribute the materials in the same order as she did when she administered the same test to the first group of freshmen last week. Once the examinees are settled, she gives the same instructions as she did during last week’s test.

What do you think is the importance of:
- doing the same procedure when conducting a test?
- giving the same instructions to examinees?

Explore: Read the given scenario:

Two junior classes, Class A and Class B, will be taking an Aptitude Test today. The school is currently being renovated, so the Psychometricians-in-charge need to be flexible.

Class A will be taking the test in the mini gymnasium, where the windows are big enough to let sunlight in and the room is wide enough for proper ventilation. The chairs and tables are spread apart, making it comfortable for the examinees. The gym’s sound system is used, so the Psychometrician’s voice is audible and clear enough to be heard across the gym.

Class B, on the other hand, is assigned to take the test in a basement classroom that is cold and quite dark. The lights tend to flicker, as the classroom has not been used for a while.
For their test, armchairs will be used, and there is not much distance between examinees. Every now and then, the sounds of hammering and pouring cement disrupt the Psychometrician who is giving instructions.

Reflect on the scenario. What can you say about it? Will test performance be affected? How so?

To better understand these two activities, let us discuss the concept of “Standardization”.

Explain:

Do you still recall that a psychological test is a standardized measure? What exactly do we mean by “standardized”?

STANDARDIZATION refers to the uniformity of procedures in the administration and scoring of a psychological test. A prerequisite in psychological testing is the creation of a behavioral situation which is standardized in space as well as in time. When conditions are standardized, differences in test results can more confidently be attributed to differences among the persons tested, and not to differences in the stimuli or conditions affecting them.

Refer back to the scenario in the Explore part of this unit. Classes A and B will be taking the same test, but under different conditions. This goes against standardization. Class B will probably get lower scores, so the test will not accurately and reliably measure what it intends to measure for that class. The lower scores would be caused not by individual differences but by the test-taking conditions the class is in. What should have been done?
To apply standardization, the rooms should have been the same in terms of lighting, ventilation, seating arrangements, and the like. The condition of the examinees should also have been considered. The rationale of standardization requires the Psychometrician to try to standardize the state of the examinee as well as the test stimuli. Procedures must be designed to eliminate irrelevant individual differences or extraneous variables, so that only the factors that the test aims to measure are left. The Psychometrician should try to bring all examinees to a “standard state” of motivation, expectation, and interpretation of the task. It is important to keep in mind that the single independent variable is usually the individual being tested, that is, the examinee.

UNIFORMITY OF PROCEDURES IN TEST ADMINISTRATION

✓ Task of the Test Constructor

Psychological tests are administered under a prescribed set of procedures. The test constructor is the authority on matters regarding the test, from the construction of items to the final version of the test. He/she is responsible for providing a detailed set of procedures for administering the developed test. These procedures should be clearly and completely stated in the test manual, as the interpretation of a psychological test is most reliable when measurements are obtained under standardized conditions. Directions or instructions are a major part of the standardization of a new test, and it helps to have instructions that a test examiner can simply read to examinees. Standardization extends to the exact materials employed, time limits, preliminary demonstrations, and ways of handling examinees’ queries, as well as other details of the testing situation. Necessary test materials should be included in the test packet.

✓ Task of the Test Examiner/Psychometrician

Under RA 10029, there is a need for a qualified examiner (a licensed Psychometrician).
The examiner should be trained in the tasks he/she is expected to perform for every test he/she will be using. A prerequisite of test administration is familiarity with the test.

Advance preparation is the single most important requirement for good testing procedures. There can be no emergencies in testing, as every step should be planned and prepared for. Special efforts must be made to foresee and forestall emergencies. It is in this way that uniformity of procedures can be assured.

Advance preparation for the testing session can take various forms:

Memorizing the exact verbal instructions is most essential in individual testing. In group testing, instructions are read to the test takers, but familiarity with the statements to be read is still a must to prevent misreading and hesitation (“umm..”, “err..”). It also permits a more natural, informal manner during test administration, and allows the examiner to be more at ease. Instructions should be read word for word, adding nothing and changing nothing.

Test materials should be prepared before the test is administered. Familiarization with the test materials should be done prior to test administration. This includes knowing when the materials are to be used and how to use them. In individual testing, the actual layout of the necessary materials facilitates their subsequent use and decreases the need to search or fumble. Materials should be placed within easy reach of the examiner, and should not distract the test taker.
In group testing, all materials such as answer sheets, test questionnaires, pencils, and the like should be carefully counted, checked, and arranged in advance of the testing day.

Time limits for each test or subtest should be strictly observed. Sufficient time should be allotted for the entire testing process. This includes setting up, reading of instructions, and actual test taking.

Thorough familiarity with the specific testing procedure is an important prerequisite in both individual and group testing. For individual testing, supervised training in test administration is essential. For group testing, examiners and proctors should be briefed so that each is aware of the functions he or she should perform.

The Psychometrician should also be aware of the following test conditions:

Physical Conditions

Place: The selection of a suitable testing room should be given attention. The room should be free from undue noise and distraction, and should have adequate lighting and ventilation. Seating facilities and enough working space for test takers should be provided. The type of desks or chairs should be considered. A sign on the door indicating that testing is ongoing may be posted to avoid disruptions.

There are possible differences between paper-and-pencil and computer administration of the same test. Professional guidelines have been formulated to aid test users in assessing the comparability of test scores obtained under these two types of administration.
The nature of the test and the population of test takers are considered in relation to the effect of differences in test administration on norms, reliability, and validity.

Time: In administering a test, the time at which the test is taken should be considered. Alert subjects are more likely to give their best than subjects who are tired. Children, for instance, may be more active in the morning after a good breakfast than in the afternoon, when they may need a nap. Generally, though, equally good results can be produced at any hour of the day if the examinee really wants to do well. Occasionally, it is necessary to administer a test to a person at a psychologically inopportune time. If this is the case, the only correct procedure is to maintain an adequately critical attitude toward the results. The unfavorable testing conditions should be taken into account when interpreting results.

Motivational Conditions

It is but proper for the Psychometrician to remember that the “subject” being tested is a person, a human being. This makes testing a more complex psychological relationship. The traditional concern with motivation recognizes this fact.

Influence of the Examiner: Rapport is defined as “a close and harmonious relationship in which the people or groups concerned understand each other's feelings or ideas and communicate well.” Examiners may differ in their ability to establish rapport. Those who are unfriendly or unwelcoming will likely obtain less cooperation from their subjects, which may result in reduced performance on ability tests or defensive, distorted results on personality tests. On the other hand, those who are overly warm or affectionate may err in the opposite direction, and may even give subtle cues to correct answers. Examinees may feel uncomfortable with both extremes; thus, both should be avoided.

Examiners are urged to establish rapport with their subjects.
In psychometrics, rapport refers to the examiner’s efforts to arouse the examinee’s interest in the test, elicit cooperation, and ensure that the examinee follows the standard test instructions. A crucial aspect of valid testing is the ability to create a cordial testing environment.

Part of examiner training covers techniques for establishing rapport. These will vary depending on the nature of the test and on the age and other special characteristics of the person tested. In ability tests, for instance, the objective is careful concentration on the given tasks and one’s best effort to perform well. In personality inventories, the objective calls for frank and honest responses, while projective tests require full reporting of the associations evoked by the stimuli, without censoring or editing of content. In all instances, the examiner aims to motivate test takers to follow directions as conscientiously as they can.

In terms of age groups, there are special factors to be considered in establishing rapport and building motivation. In testing preschool children, a friendly, cheerful, and relaxed approach on the part of the examiner may help reassure the child. The test may be presented as a game, and should be intrinsically interesting. Early elementary school children may also respond to the game approach, while older school children can usually be motivated through an appeal to their competitive spirit and the desire to do well on tests.
Adult testing may present some additional problems, as adults are not likely to work hard at a task merely because it is assigned. It is therefore important to explain the purpose of the test to the adult so that he/she understands the need for it and is motivated as well.

For people of any age, the examiner should note that every test presents an implied threat to the individual’s prestige. Some reassurance should therefore be given from the very start.

Special motivational problems may be encountered in testing emotionally disturbed persons, prisoners, or juvenile delinquents. Such persons are likely to manifest unfavorable attitudes, such as suspicion, insecurity, fear, or cynical indifference. The examiner should make special efforts to establish rapport under these conditions. He or she must be sensitive to these special conditions and take them into account when interpreting and explaining test results and performance.

Background and Motivation of the Examinees: Examinees differ not only in personality and internal characteristics, but also in other extraneous ways that might affect or influence test results.
In some instances, test results may be inaccurate due to the filtering and distorting effects of certain characteristics, such as:

❖ Test Anxiety – This is a type of performance anxiety characterized by a combination of physiological over-arousal, tension, and somatic symptoms, along with worry, dread, fear of failure, and catastrophizing, that occur before or during test situations. Undoubtedly, subjects experience different levels of test anxiety, ranging from a carefree outlook to incapacitating dread at the prospect of being tested. Emotionality and worry are two important components of test anxiety. The emotionality component consists of feelings and physiological reactions such as
