The Tools of Psychological Assessment

The Test
A test is defined simply as a measuring device or procedure. When the word test is prefaced with a modifier, it refers to a device or procedure designed to measure a variable related to that modifier. Consider, for example, the term medical test, which refers to a device or procedure designed to measure some variable related to the practice of medicine (including a wide range of tools and procedures, such as X-rays, blood tests, and testing of reflexes). In a like manner, the term psychological test refers to a device or procedure designed to measure variables related to psychology (such as intelligence, personality, aptitude, interests, attitudes, or values). Whereas a medical test might involve analysis of a sample of blood, tissue, or the like, a psychological test almost always involves analysis of a sample of behavior. The behavior sample could range from responses to a pencil-and-paper questionnaire to verbal responses to questions related to the performance of some task. The behavior sample could be elicited by the stimulus of the test itself, or it could be naturally occurring behavior (observed by the assessor in real time as it occurs, or recorded and observed at a later time).
Psychological tests and other tools of assessment may differ with respect to a number of variables, such as content, format, administration procedures, scoring and interpretation procedures, and technical quality. The content (subject matter) of the test will, of course, vary with the focus of the particular test. But even two psychological tests purporting to measure the same thing (for example, personality) may differ widely in item content.
This difference is, in part, because two test developers might have entirely different views regarding what is important in measuring “personality”; different test developers employ different definitions of “personality.” Additionally, different test developers come to the test development process with different theoretical orientations. For example, items on a psychoanalytically oriented personality test may have little resemblance to those on a behaviorally oriented personality test, yet both are personality tests. A psychoanalytically oriented personality test might be chosen for use by a psychoanalytically oriented assessor, and an existentially oriented personality test might be chosen for use by an existentially oriented assessor.
The term format pertains to the form, plan, structure, arrangement, and layout of test items as well as to related considerations such as time limits. Format is also used to refer to the form in which a test is administered: computerized, pencil-and-paper, or some other form. When making specific reference to a computerized test, the format may also involve the form of the software: local or online/cloud-based software and storage. The term format is not confined to tests. Format is also used to denote the form or structure of other evaluative tools and processes, such as the guidelines for creating a portfolio work sample.
JUST THINK... Imagine you wanted to develop a test for a personality trait you termed “goth.” How would you define this trait? What kinds of items would you include in the test? Why would you include those kinds of items? How would you distinguish this personality trait from others?
Tests differ in their administration procedures. Some tests, particularly those designed for administration on a one-to-one basis, may require an active and knowledgeable test administrator.
The test administration may involve demonstration of various kinds of tasks demanded of the assessee, as well as trained observation of an assessee’s performance. Alternatively, some tests, particularly those designed for administration to groups, may not even require the test administrator to be present while the testtakers independently complete the required tasks.
8 Part 1: An Overview
Tests differ in their scoring and interpretation procedures. To better understand how and why, let’s define score and scoring. Sports enthusiasts are no strangers to these terms. For them, these terms refer to the number of points accumulated by competitors and the process of accumulating those points. In testing and assessment, we formally define score as a code or summary statement, usually but not necessarily numerical in nature, that reflects an evaluation of performance on a test, task, interview, or some other sample of behavior. Scoring is the process of assigning such evaluative codes or statements to performance on tests, tasks, interviews, or other behavior samples.
In the world of psychological assessment, many different types of scores exist. Some scores result from the simple summing of responses (such as the summing of correct/incorrect or agree/disagree responses), and some scores are derived from more elaborate procedures. Scores themselves can be described and categorized in many different ways. For example, one type of score is the cut score. A cut score (also referred to as a cutoff score or simply a cutoff) is a reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications. Some action will be taken or some inference will be made on the basis of these classifications. Cut scores on tests, usually in combination with other data, are used in schools in many contexts. For example, they may be used in grading, and in making decisions about the class or program to which children will be assigned.
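To make the ideas of a summed score and a cut score concrete, here is a minimal Python sketch. It is an illustration only, not a procedure taken from any published test: the function names, the ten-item response list, and the cut score of 65 percent are all hypothetical values chosen for the example.

```python
from typing import Sequence


def sum_score(responses: Sequence[bool]) -> int:
    """Raw score: the number of correct (or endorsed) responses."""
    return sum(1 for r in responses if r)


def classify(score: float, cut_score: float = 65) -> str:
    """Divide scores into two classifications at a cut score.

    The default of 65 is a hypothetical value used only for illustration.
    """
    return "pass" if score >= cut_score else "fail"


# Hypothetical 10-item test: True marks a correct response.
responses = [True, True, False, True, True, True, False, True, True, True]
raw = sum_score(responses)              # 8 correct responses
percent = 100 * raw / len(responses)    # 80.0
print(raw, percent, classify(percent))  # prints: 8 80.0 pass
```

Note that the code only applies a cut score; deciding where the cut should fall remains a matter of judgment, and in practice such classifications are usually weighed in combination with other data.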
Cut scores are used by employers as aids to decision making about personnel hiring, placement, and advancement. State agencies use cut scores as aids in licensing decisions. There are probably more than a dozen different methods that can be used to formally derive cut scores (Dwyer, 1996). If you’re curious about what some of those different methods are, stay tuned; we cover that in an upcoming chapter. Sometimes no formal method is used to arrive at a cut score. Some teachers use an informal “eyeball” method to proclaim, for example, that a score of 65 or more on a test means “pass” and a score of 64 or below means “fail.” Whether formally or informally derived, cut scores typically take into account, at least to some degree, the values of those who set them. Consider, for example, two professors who teach the same course at the same college. One professor might set a cut score for passing the course that is significantly higher (and more difficult for students to attain) than the one set by the other professor. There is also another side to the human equation as it relates to cut scores, one that is seldom written about in measurement texts. This phenomenon concerns the emotional consequences of “not making the cut” and “just making the cut” (see Figure 1–1).
Tests differ widely in terms of their guidelines for scoring and interpretation. Some tests are self-scored by the testtakers themselves, others are scored by computer, and others require scoring by trained examiners. Some tests, such as most tests of intelligence, come with test manuals that are explicit not only about scoring criteria but also about the nature of the interpretations that can be made from the scores. Other tests, such as the Rorschach Inkblot Test, are sold with no manual at all. The (presumably qualified) purchaser buys the stimulus materials and then selects and uses one of many available guides for administration, scoring, and interpretation.
JUST THINK... How might one test of intelligence have more utility than another test of intelligence in the same school setting?
Tests differ with respect to their psychometric soundness or technical quality. Synonymous with the antiquated term psychometry, psychometrics is defined as the science of psychological measurement. Variants of these words include the adjective psychometric (which refers to measurement that is psychological in nature) and the nouns psychometrist and psychometrician (both terms referring to a professional who uses, analyzes, and interprets psychological test data). One speaks of the psychometric soundness of a test when referring to how consistently and how accurately a psychological test measures what it purports to measure. Assessment professionals also speak of the psychometric utility of a particular test or assessment method. In this context, utility refers to the usefulness or practical value that a test or other tool of assessment has for a particular purpose. These concepts are elaborated on in subsequent chapters.
Now, returning to our discussion of tools of assessment, meet one well-known tool that, as they say, “needs no introduction.”
Chapter 1: Psychological Testing and Assessment 9
Figure 1–1 Emotion engendered by categorical cutoffs. People who just make some categorical cutoff may feel better about their accomplishment than those who make the cutoff by a substantial margin. But those who just miss the cutoff may feel worse than those who miss it by a substantial margin. Evidence consistent with this view was presented in research with Olympic athletes (Medvec et al., 1995; Medvec & Savitsky, 1997). Bronze medalists were, somewhat paradoxically, happier with the outcome than silver medalists. Bronze medalists might say to themselves “at least I won a medal” and be happy about it. By contrast, silver medalists might feel frustrated that they tried for the gold and missed winning it.
Jean Catuffe/Getty Images
The Interview
In everyday conversation, the word interview conjures images of face-to-face talk. But the interview as a tool of psychological assessment typically involves more than talk. If the interview is conducted face-to-face, then the interviewer is probably taking note of not only the content of what is said but also the way it is being said. More specifically, the interviewer is taking note of both verbal and nonverbal behavior. Nonverbal behavior may include the interviewee’s “body language,” movements, and facial expressions in response to the interviewer, the extent of eye contact, apparent willingness to cooperate, and general reaction to the demands of the interview. The interviewer may also take note of the way the interviewee is dressed. Here, variables such as neat versus sloppy, and appropriate versus inappropriate, may be noted.
Because of a potential wealth of nonverbal information to be gained, interviews are ideally conducted face-to-face. However, face-to-face contact is not always possible and interviews may be conducted in other formats. In an interview conducted by telephone, for example, the interviewer may still be able to gain information beyond the responses to questions by being sensitive to variables such as changes in the interviewee’s voice pitch or the extent to which particular questions precipitate long pauses or signs of emotion in response. Of course, interviews need not involve verbalized speech, as when they are conducted in sign language. Interviews may also be conducted by various electronic means, as would be the case with online interviews, e-mail interviews, and interviews conducted by means of text messaging. In its broadest sense, then, we can define an interview as a method of gathering information through direct communication involving reciprocal exchange.
JUST THINK... What type of interview situation would you envision as ideal for being carried out entirely through the medium of text messaging?
Interviews differ with regard to many variables, such as their purpose, length, and nature. Interviews may be used by psychologists in various specialty areas to help make diagnostic, treatment, selection, or other decisions. So, for example, school psychologists may use an interview to help make a decision about the appropriateness of various educational interventions or class placements. A court-appointed psychologist may use an interview to help guide the court in determining whether a defendant was insane at the time of the commission of a crime. A specialist in head injury may use an interview to help shed light on questions related to the extent of damage to the brain that was caused by the injury. A psychologist studying consumer behavior may use an interview to learn about the market for various products and services, as well as how best to advertise and promote them. A police psychologist may instruct eyewitnesses to serious crimes to close their eyes when they are interviewed about details related to the crime. They do so because there is suggestive evidence that the responses will have greater relevance to the questions posed if the witness’s eyes are closed (Vredeveldt et al., 2015).
An interview may be used to help professionals in human resources to make more informed recommendations about the hiring, firing, and advancement of personnel. In some instances, what is called a panel interview (also referred to as a board interview) is employed. Here, more than one interviewer participates in the assessment. A presumed advantage of this personnel assessment technique is that any idiosyncratic biases of a lone interviewer will be minimized (Dipboye, 1992). A disadvantage of the panel interview relates to its utility; the cost of using multiple interviewers may not be justified (Dixon et al., 2002).
Some interviewing, especially in the context of clinical and counseling settings, has as its objective not only the gathering of information from the interviewee, but a targeted change in the interviewee’s thinking and behavior. A therapeutic technique called motivational interviewing, for example, is used by counselors and clinicians to gather information about some problematic behavior, while simultaneously attempting to address it therapeutically (Bundy, 2004; Miller & Rollnick, 2002, 2012). Motivational interviewing may be defined as a therapeutic dialogue that combines person-centered listening skills such as openness and empathy, with the use of cognition-altering techniques designed to positively affect motivation and effect therapeutic change. Motivational interviewing has been employed to address a relatively wide range of problems (Hoy et al., 2016; Kistenmacher & Weiss, 2008; Miller & Rollnick, 2009; Pollak et al., 2016; Rothman & Wang, 2016; Shepard et al., 2016) and has been successfully employed in intervention by means of telephone (Lin et al., 2016), Internet chat (Skov-Ettrup et al., 2016), and text messaging (Shingleton et al., 2016).
The popularity of the interview as a method of gathering information extends far beyond psychology. Just try to think of one day when you were not exposed to an interview on television, radio, or the Internet!
JUST THINK... What types of interviewing skills must the host of a talk show possess to be considered an effective interviewer? Do these skills differ from those needed by a professional in the field of psychological assessment? If so, how?
Regardless of the medium through which it is conducted, an interview is a reciprocal affair in that the interviewee reacts to the interviewer and the interviewer reacts to the interviewee. The quality, if not the quantity, of useful information produced by an interview depends in no small part on the skills of the interviewer.
Interviewers differ in many ways: their pacing of interviews, their rapport with interviewees, and their ability to convey genuineness, empathy, and humor. Keeping these differences firmly in mind, consider Figure 1–2. How might the distinctive personality attributes of these two celebrities affect responses of interviewees? Which of these two interviewers do you think is better at interviewing? Why?
Figure 1–2 On interviewing and being interviewed. Different interviewers have different styles of interviewing. How would you characterize the interview style of Jimmy Fallon as compared to that of Howard Stern?
Theo Wargo/Getty Images
The Portfolio
Students and professionals in many different fields of endeavor ranging from art to architecture keep files of their work products. These work products (whether retained on paper, canvas, film, video, audio, or some other medium) constitute what is called a portfolio. As samples of one’s ability and accomplishment, a portfolio may be used as a tool of evaluation. Employers of commercial artists, for example, will make hiring decisions based, in part, on the impressiveness of an applicant’s portfolio of sample drawings. As another example, consider the employers of on-air radio talent. They, too, will make hiring decisions that are based partly upon their judgments of (audio) samples of the candidate’s previous work.
JUST THINK... If you were to prepare a portfolio representing “who you are” in terms of your educational career, your hobbies, and your values, what would you include in your portfolio?
The appeal of portfolio assessment as a tool of evaluation extends to many other fields, including education. Some have argued, for example, that the best evaluation of a student’s writing skills can be accomplished not by the administration of a test, but by asking the student to compile a selection of writing samples. Also in the field of education, portfolio assessment has been employed as a tool in the hiring of instructors. An instructor’s portfolio may consist of various documents such as lesson plans, published writings, and visual aids developed expressly for teaching certain subjects. All of these materials can be extremely useful to those who must make hiring decisions.
Case History Data
Case history data refers to records, transcripts, and other accounts in written, pictorial, or other form that preserve archival information, official and informal accounts, and other data and items relevant to an assessee. Case history data may include files or excerpts from files maintained at institutions and agencies such as schools, hospitals, employers, religious institutions, and criminal justice agencies. Other examples of case history data are letters and written correspondence including email, photos and family albums, newspaper and magazine clippings, home videos, movies, audiotapes, work samples, artwork, doodlings, and accounts and pictures pertaining to interests and hobbies. Postings on social media such as Facebook, Instagram, or Twitter may also serve as case history data. Employers, university admissions departments, healthcare providers, forensic investigators, and others may collect data from postings on social media to help inform inference and decision making (Lis et al., 2015; Pirelli et al., 2016).
Case history data is a useful tool in a wide variety of assessment contexts. In a clinical evaluation, for example, case history data can shed light on an individual’s past and current adjustment as well as on the events and circumstances that may have contributed to any changes in adjustment. Case history data can be of critical value in neuropsychological evaluations, where it often provides information about neuropsychological functioning prior to the occurrence of a trauma or other event that results in a deficit.
School psychologists rely on case history data for insight into a student’s current academic or behavioral standing. Case history data is also useful in making judgments concerning future class placements.
The assembly of case history data, as well as related data, into an illustrative account is referred to by terms such as case study or case history. We may formally define a case study (or case history) as a report or illustrative account concerning a person or an event that was compiled on the basis of case history data. A case study might, for example, shed light on how one individual’s personality and a particular set of environmental conditions combined to produce a successful world leader. A case study of an individual who attempted to assassinate a high-ranking political figure could shed light on what types of individuals and conditions might lead to similar attempts in the future. Work on a social psychological phenomenon referred to as groupthink contains rich case history material on collective decision making that did not always result in the best decisions (Janis, 1972). Groupthink arises as a result of the varied forces that drive decision-makers to reach a consensus (such as the motivation to reach a compromise in positions).
JUST THINK... What are the pros and cons of using case history data as a tool of assessment?
Case history data, usually in combination with other intelligence (informative data), also plays an important role in military or political threat assessment (Bolante & Dykeman, 2015; Borum, 2015; Dietz et al., 1991; Gardeazabal & Sandler, 2015; Malone, 2015; Mrad et al., 2015). The United States Secret Service has long relied on such information to help protect the President as well as its other protectees (Coggins et al., 1998; Institute of Medicine, 1984; Takeuchi et al., 1981; Vossekuil & Fein, 1997).
Behavioral Observation
If you want to know how someone behaves in a particular situation, observe the individual’s behavior in that situation. Such “down-home” wisdom underlies at least one approach to evaluation. Behavioral observation, as it is employed by assessment professionals, is defined as monitoring the actions of others or oneself by visual or electronic means while recording quantitative and/or qualitative information regarding those actions. Behavioral observation is often used as a diagnostic aid in various settings such as inpatient facilities, behavioral research laboratories, and classrooms. Behavioral observation may be used for purposes of selection or placement in corporate or organizational settings. In such instances, behavioral observation may be used as an aid in identifying personnel who best demonstrate the abilities required to perform a particular task or job.
Sometimes researchers venture outside of the confines of clinics, classrooms, workplaces, and research laboratories in order to observe behavior of humans in a natural setting, that is, the setting in which the behavior would typically be expected to occur. This variety of behavioral observation is referred to as naturalistic observation. So, for example, to study the socializing behavior of children with autism spectrum disorders with same-age peers, one research team opted for natural settings rather than a controlled, laboratory environment (Bellini et al., 2007; Dekker et al., 2016; Handen et al., 2018).
JUST THINK... What are the advantages and disadvantages of naturalistic observation as a tool of assessment?
Behavioral observation as an aid to designing therapeutic intervention is extremely useful in institutional settings such as schools, hospitals, prisons, and group homes. Using published or self-constructed lists of targeted behaviors, staff can observe firsthand the behavior of individuals and design interventions accordingly. In a school situation, for example, naturalistic observation on the playground of a culturally different child suspected of having linguistic problems might reveal that the child has the necessary English language skills but is unwilling (for reasons of shyness, cultural upbringing, or whatever) to demonstrate those abilities to adults.
In practice, behavioral observation, and especially naturalistic observation, tends to be used most frequently by researchers in settings such as classrooms, clinics, prisons, and other types of facilities where observers have ready access to assessees. For private practitioners, it is typically not practical or economically feasible to spend hours out of the consulting room observing clients as they go about their daily lives. Still, there are some mental health professionals, such as those in the field of assisted living, who find great value in behavioral observation of patients outside of their institutional environment. For them, it may be necessary to accompany a patient outside of the institution’s walls to learn if that patient is capable of independently performing activities of daily living. In this context, a tool of assessment that relies heavily on behavioral observation, such as the Test of Grocery Shopping Skills (see Figure 1–3), may be extremely useful.
Role-Play Tests
Role play may be defined as acting an improvised or partially improvised part in a simulated situation. A role-play test is a tool of assessment wherein assessees are directed to act as if they were in a particular situation. Assessees may then be evaluated with regard to their expressed thoughts, behaviors, abilities, and other variables. (Note that role play is hyphenated when used as an adjective or a verb but not as a noun.) Role play is useful in evaluating various skills. For example, grocery shopping skills (Figure 1–3) could conceivably be evaluated through role play.
Depending upon how the task is set up, an actual trip to the supermarket could or could not be required. Of course, role play may not be as useful as “the real thing” in all situations. Still, role play is used quite extensively, especially in situations where it is too time-consuming, too expensive, or simply too inconvenient to assess in a real situation. For example, astronauts in training may be required to role-play many situations “as if” in outer space. Such “as if” scenarios for training purposes result in truly “astronomical” savings.
JUST THINK... What are the pros and cons of role play as a tool of assessment? In your opinion, what type of presenting problem would be ideal for assessment by role play?
Figure 1–3 Price (and judgment) check in aisle 5. Designed primarily for use with persons with psychiatric disorders, the context-based Test of Grocery Shopping Skills (Brown et al., 2009; Hamera & Brown, 2000) may be very useful in evaluating a skill necessary for independent living.
Dave and Les Jacobs LLC/Blend Images
Individuals being evaluated in a corporate, industrial, organizational, or military context for managerial or leadership ability may routinely be placed in role-play situations. They may be asked, for example, to mediate a hypothetical dispute between personnel at a work site. The format of the role play could range from “live scenarios” with live actors to computer-generated simulations. Outcome measures for such an assessment might include ratings related to various aspects of the individual’s ability to resolve the conflict, such as effectiveness of approach, quality of resolution, and number of minutes to resolution.
Role play as a tool of assessment may also be used in various clinical contexts. For example, it is routinely employed in many interventions with substance abusers.
Clinicians may attempt to obtain a baseline measure of substance abuse, cravings, or coping skills by administering a role-play test prior to therapeutic intervention. The same test is then administered again subsequent to completion of treatment. Role play can thus be used as both a tool of assessment and a measure of outcome.
Computers as Tools
We have already made reference to the role computers play in contemporary assessment in the context of generating simulations. They may also help in the measurement of variables that in the past were quite difficult to quantify. But perhaps the more obvious role of computers as tools of assessment is in test administration, scoring, and interpretation.
As test administrators, computers do much more than replace the “equipment” that was so widely used in the past (e.g., a number 2 pencil). Computers can serve as test administrators (online or off) and as highly efficient test scorers. Within seconds they can derive not only test scores but patterns of test scores. Scoring may be done on-site (local processing) or conducted at some central location (central processing). If processing occurs at a central location, test-related data may be sent to and returned from this central facility by means of the Internet, phone lines (teleprocessing), mail, or courier. Whether processed locally or centrally, an account of a testtaker’s performance can range from a mere listing of a score or scores (a simple scoring report) to the more detailed extended scoring report, which includes statistical analyses of the testtaker’s performance. A step up from scoring reports is the interpretive report, which is distinguished by its inclusion of numerical or narrative interpretive statements in the report. Some interpretive reports contain relatively little interpretation and simply call attention to certain high, low, or unusual scores.
At the high end of interpretive reports is what is sometimes referred to as a consultative report. This type of report, usually written in language appropriate for communication between assessment professionals, may provide expert opinion concerning analysis of the data. Yet another type of computerized scoring report is designed to integrate data from sources other than the test itself into the interpretive report. Such an integrative report will incorporate previously collected data (such as medication records or behavioral observation data) into the test report.
An acronym you may come across is CAT, which stands for computer adaptive testing. The adaptive in this term is a reference to the computer’s ability to tailor the test to the testtaker’s ability or test-taking pattern. For example, on a computerized test of academic abilities, the computer might be programmed to switch from testing math skills to English skills after three consecutive failures on math items. Another way a computerized test could be programmed to adapt is by providing the testtaker with score feedback as the test proceeds. Score feedback in the context of CAT may, depending on factors such as intrinsic motivation and external incentives, positively affect testtaker engagement as well as performance (Arieli-Attali & Budescu, 2015).
Another acronym, CAPA, refers to the term computer-assisted psychological assessment. In this case, the word assisted typically refers to the assistance computers provide to the test user, not the testtaker. One specific brand of CAPA, for example, is Q-Interactive. Available from Pearson Assessments, this technology allows test users to administer tests by means of two iPads connected by Bluetooth (one for the test administrator and one for the testtaker). Test administrators may record testtakers’ verbal responses and may make written notes using a stylus with the iPad. Scoring is immediate. Sweeney (2014) reviewed Q-Interactive and was favorably impressed.
He liked the fact that it obviated the need for many essentials of paper-and-pencil test administration (including test kits and a stopwatch). However, he did point out that only a limited number of tests are available to administer, and that no Android or Windows edition of the software has been made available. Also, despite the publisher’s promise of freedom from test kits, the reviewer often found himself “going back to the manual” (Sweeney, 2014, p. 19). Since the time of the Sweeney (2014) review, a total of 20 assessment tools have been added to the Q-Interactive testing system, which continues to be available exclusively on iPads. Vrana and Vrana (2017) carefully examined the elements of the Wechsler individual intelligence tests, arguing for viability of completely computer-administered assessment in the near future.
JUST THINK... Describe a test that would be ideal for computer administration. Then describe a test that would not be ideal for computer administration.
CAPA opened a world of possibilities for test developers, enabling them to create psychometrically sound tests using mathematical procedures and calculations so complicated that they may have taken weeks or months to use in a bygone era. It opened a new world to test users, enabling the construction of tailor-made tests with built-in scoring and interpretive capabilities previously unheard of. For many test users, CAPA was a great advance over the past, when they had to personally administer tests and possibly even place the responses in some other form prior to analysis (such as by manually using a scoring template or other device). And even after doing all of that, they would then begin the often laborious tasks of scoring and interpreting the resulting data. Still, every rose has its thorns; some of the pros and cons of CAPA are summarized in Table 1–2.
Table 1–2
CAPA: Some Pros and Cons

Pro: CAPA saves professional time in test administration, scoring, and interpretation.
Con: Professionals must still spend significant time reading software and hardware documentation and even ancillary books on the test and its interpretation.

Pro: CAPA results in minimal scoring errors resulting from human error or lapses of attention or judgment.
Con: With CAPA, the possibility of software or hardware error is ever present, from difficult-to-pinpoint sources such as software glitches or hardware malfunction.

Pro: CAPA ensures standardized test administration to all testtakers with little, if any, variation in test administration procedures.
Con: CAPA leaves those testtakers who are unable to employ familiar test-taking strategies (previewing test, skipping questions, going back to previous question, etc.) at a disadvantage.

Pro: CAPA yields standardized interpretation of findings due to elimination of unreliability traceable to differing points of view in professional judgment.
Con: CAPA’s standardized interpretation of findings based on a set, unitary perspective may not be optimal; interpretation could profit from alternative viewpoints.

Pro: Computers’ capacity to combine data according to rules is more accurate than that of humans.
Con: Computers lack the flexibility of humans to recognize the exception to a rule in the context of the “big picture.”

Pro: Nonprofessional assistants can be used in the test administration process, and the test can typically be administered to groups of testtakers in one sitting.
Con: Use of nonprofessionals leaves diminished, if any, opportunity for the professional to observe the assessee’s test-taking behavior and note any unusual extra-test conditions that may have affected responses.

Pro: Professional groups such as APA develop guidelines and standards for use of CAPA products.
Con: Profit-driven nonprofessionals may also create and distribute tests with little regard for professional guidelines and standards.

Pro: Paper-and-pencil tests may be converted to CAPA products with consequential advantages, such as a shorter time between the administration of the test and its scoring and interpretation.
Con: The use of paper-and-pencil tests that have been converted for computer administration raises questions about the equivalence of the original test and its converted form.

Pro: Security of CAPA products can be maintained not only by traditional means (such as locked filing cabinets) but by high-tech electronic products (such as firewalls).
Con: Security of CAPA products can be breached by computer hackers, and integrity of data can be altered or destroyed by untoward events such as introduction of computer viruses.

Pro: Computers can automatically tailor test content and length based on responses of testtakers.
Con: Not all testtakers take the same test or have the same test-taking experience.

The number of tests in this format is burgeoning, and test users must take extra care in selecting the right test given factors such as the objective of the testing and the unique characteristics of the test user (Zygouris & Tsolaki, 2015).

The APA Committee on Psychological Tests and Assessment was convened to consider the pros and cons of computer-assisted assessment and assessment using the Internet (Naglieri et al., 2004). Among the advantages over paper-and-pencil tests cited were (1) test administrators have greater access to potential test users because of the global reach of the Internet, (2) scoring and interpretation of test data tend to be quicker than for paper-and-pencil tests, (3) costs associated with Internet testing tend to be lower than costs associated with paper-and-pencil tests, and (4) the Internet facilitates the testing of otherwise isolated populations, as well as people with disabilities for whom getting to a test center might prove a hardship. We might add that Internet testing tends to be “greener,” as it may conserve paper, shipping materials, and so forth.
Further, there is probably less chance for scoring errors with Internet-based tests as compared to paper-and-pencil tests.

Although Internet testing appears to have many advantages, it is not without potential pitfalls, problems, and issues. One basic issue has to do with what Naglieri et al. (2004) termed “test-client integrity.” In part this term refers to the verification of the identity of the testtaker when a test is administered online. It also refers, in more general terms, to the sometimes varying interests of the testtaker versus those of the test administrator. Depending upon the conditions of the administration, testtakers may have unrestricted access to notes, other Internet resources, and other aids in test-taking, despite the guidelines for the test administration. At least with regard to achievement tests, there is some evidence that unproctored Internet testing leads to “score inflation” as compared to more traditionally administered tests (Carstairs & Myors, 2009).

A related aspect of test-client integrity has to do with the procedure in place to ensure that the security of the Internet-administered test is not compromised. What will prevent other testtakers from previewing past (or even advance) copies of the test? Naglieri et al. (2004) reminded their readers of the distinction between testing and assessment, and the importance of recognizing that Internet testing is just that: testing, not assessment. As such, Internet test users should be aware of all of the possible limitations of the source of the test scores.

JUST THINK...
What cautions should Internet test users keep in mind regarding the source of their test data?

Other Tools

The next time you have occasion to stream a video, fire up that Blu-ray player, or even break out an old DVD, take a moment to consider the role that video can play in assessment. In fact, specially created videos are widely used in training and evaluation contexts.
For example, corporate personnel may be asked to respond to a variety of video-presented incidents of sexual harassment in the workplace. Police personnel may be asked how they would respond to various types of emergencies, which are presented either as reenactments or as video recordings of actual occurrences. Psychotherapists may be asked to respond with a diagnosis and a treatment plan for each of several clients presented to them on video. Graduate students in psychology programs may use interactive online programs like Theravue to develop their basic counseling skills. The list of video’s potential applications to assessment is endless. The next generation of video assessment is the assessment that employs virtual reality (VR) technology. Assessment using VR technology is fast finding its way into a number of psychological specialty areas (Anbro et al., 2020; Morina et al., 2015; Sharkey & Merrick, 2016).

Many items that you may not readily associate with psychological assessment may be pressed into service for just that purpose. For example, psychologists may use many of the tools traditionally associated with medical health, such as thermometers to measure body temperature and gauges to measure blood pressure. Biofeedback equipment is sometimes used to obtain measures of bodily reactions (such as muscular tension) to various sorts of stimuli. And then there are some less common instruments, such as the penile plethysmograph. This instrument, designed to measure male sexual arousal, may be helpful in the diagnosis and treatment of sexual predators. Impaired ability to identify odors is common in many disorders in which there is central nervous system involvement, and simple tests of smell may be administered to help determine if such impairment is present. In general, there has been no shortage of innovation on the part of psychologists in devising measurement tools, or adapting existing tools, for use in psychological assessment.

JUST THINK...
When is assessment using video a better approach than using a paper-and-pencil test? What are the pitfalls, if any, to using video in assessment?

To this point, our introduction has focused on some basic definitions, as well as a look at some of the “tools of the (assessment) trade.” We now raise some fundamental questions regarding the who, what, why, how, and where of testing and assessment.

Who, What, Why, How, and Where?

Who are the parties in the assessment enterprise? In what types of settings are assessments conducted? Why is assessment conducted? How are assessments conducted? Where does one go for authoritative information about tests? Think about the answer to each of these important questions before reading on. Then check your own ideas against those that follow.

Who Are the Parties?

Parties in the assessment enterprise include developers and publishers of tests, users of tests, and people who are evaluated by means of tests. Additionally, we may consider society at large as a party to the assessment enterprise.

The test developer
Test developers and publishers create tests or other methods of assessment. The American Psychological Association (APA) has estimated that more than 20,000 new psychological tests are developed each year. Among these new tests are some that were created for a specific research study, some that were created in the hope that they would be published, and some that represent refinements or modifications of existing tests. Test creators bring a wide array of backgrounds and interests to the test development process.

Test developers and publishers appreciate the significant influence that test results can have on people’s lives. Accordingly, a number of professional organizations have published standards of ethical behavior that specifically address aspects of responsible test development and use.
Perhaps the most detailed document addressing such issues is one jointly written by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (NCME). Referred to by many psychologists simply as “the Standards,” Standards for Educational and Psychological Testing covers issues related to test construction and evaluation, test administration and use, and special applications of tests, such as special considerations when testing linguistic minorities. Initially published in 1954, the Standards was revised in 1966, 1974, 1985, 1999, and 2014. The Standards is an indispensable reference work not only for test developers but for test users as well.

The test user
Psychological tests and assessment methodologies are used by a wide range of professionals, including clinicians, counselors, school psychologists, human resources personnel, consumer psychologists, industrial-organizational psychologists, experimental psychologists, and social psychologists. In fact, with respect to the job market, the demand for psychologists with measurement expertise far outweighs the supply (Dahlman & Geisinger, 2015). Still, questions remain as to who exactly is qualified to use psychological tests.

The Standards and other published guidelines from specialty professional organizations have had much to say in terms of identifying just who is a qualified test user and who should have access to (and be permitted to purchase) psychological tests and related tools of psychological assessment (American Psychological Association, 2017). Still, controversy exists about which professionals with what type of training should have access to which tests. Members of various professions, with little or no psychological training, have sought the right to obtain and use psychological tests. In many countries, no ethical or legal regulation of psychological test use exists (Leach & Oakland, 2007).
So who are (or should be) test users? Should occupational therapists, for example, be allowed to administer psychological tests? What about employers and human resources executives with no formal training in psychology? So far, we’ve listed a number of controversial Who? questions that knowledgeable assessment professionals still debate. Fortunately, there is at least one Who? question about which there is very little debate: the one regarding who the testtaker or assessee is.

JUST THINK...
In addition to psychologists, who should be permitted access to, as well as the privilege of using, psychological tests?

The testtaker
We have all been testtakers. However, we have not all approached tests in the same way. On the day a test is to be administered, testtakers may vary with respect to numerous variables, including these:

• The amount of test anxiety they are experiencing and the degree to which that test anxiety might significantly affect their test results
• The extent to which they understand and agree with the rationale for the assessment
• Their capacity and willingness to cooperate with the examiner or to comprehend written test instructions
• The amount of physical pain or emotional distress they are experiencing
• The amount of physical discomfort brought on by not having had enough to eat, having had too much to eat, or other physical conditions
• The extent to which they are alert and wide awake as opposed to nodding off
• The extent to which they are predisposed to agree or disagree when presented with stimulus statements
• The extent to which they have received prior coaching
• The importance they may attribute to portraying themselves in a good (or bad) light
• The extent to which they are, for lack of a better term, “lucky” and can “beat the odds” on a multiple-choice achievement test (even though they may not have learned the subject matter).
In the broad sense in which we are using the term “testtaker,” anyone who is the subject of an assessment or an evaluation can be a testtaker or an assessee. As amazing as it sounds, this means that even a deceased individual can be considered an assessee. True, a deceased person is the exception to the rule, but there is such a thing as a psychological autopsy. A psychological autopsy is defined as a reconstruction of a deceased individual’s psychological profile on the basis of archival records, artifacts, and interviews previously conducted with the deceased assessee or people who knew the person well. For example, using psychological autopsies, Townsend (2007) explored the question of whether suicide terrorists were indeed suicidal from a classical psychological perspective. She concluded that they were not. Other researchers have provided fascinating postmortem psychological evaluations of people from various walks of life in many different cultures (Bhatia et al., 2006; Chan et al., 2007; Dattilio, 2006; Fortune et al., 2007; Foster, 2011; Giner et al., 2007; Goldstein et al., 2008; Goodfellow et al., 2020; Heller et al., 2007; Knoll & Hatters Friedman, 2015; McGirr et al., 2007; Nock et al., 2017; Owens et al., 2008; Palacio et al., 2007; Phillips et al., 2007; Pouliot & De Leo, 2006; Ross et al., 2017; Rouse et al., 2015; Sanchez, 2006; Thoresen et al., 2006; Vento et al., 2011; Zonda, 2006).

JUST THINK...
What recently deceased public figure would you like to see a psychological autopsy done on? Why? What results might you expect?

Society at large

The uniqueness of individuals is one of the most fundamental characteristic facts of life.... At all periods of human history men have observed and described differences between individuals.... But educators, politicians, and administrators have felt a need for some way of organizing or systematizing the many-faceted complexity of individual differences. (Tyler, 1965, p. 3)

The societal need for “organizing” and “systematizing” has historically manifested itself in such varied questions as “Who is a witch?,” “Who is schizophrenic?,” and “Who is qualified?” The specific questions asked have shifted with societal concerns. The methods used to determine the answers have varied throughout history as a function of factors such as intellectual sophistication and religious preoccupation. Proponents of palmistry, podoscopy, astrology, and phrenology, among other pursuits, have argued that the best means of understanding and predicting human behavior was through the study of the palms of the hands, the feet, the stars, bumps on the head, tea leaves, and so on. Unlike such pursuits, the assessment enterprise has roots in science. Through systematic and replicable means that can produce compelling evidence, the assessment enterprise responds to what Tyler (1965, p. 3) described as society’s demand for “some way of organizing or systematizing the many-faceted complexity of individual differences.”

Society at large exerts its influence as a party to the assessment enterprise in many ways. As society evolves and as the need to measure different psychological variables emerges, test developers respond by devising new tests. Through elected representatives to the legislature, laws are enacted that govern aspects of test development, test administration, and test interpretation. Similarly, by means of court decisions, as well as less formal means (see Figure 1–4), society at large exerts its influence on various aspects of the testing and assessment enterprise.

Other parties
Beyond the four primary parties we have focused on here, let’s briefly make note of others who may participate in varied ways in the testing and assessment enterprise. Organizations, companies, and governmental agencies sponsor the development of tests for various reasons, such as to certify personnel.
Companies and services offer test-scoring or interpretation services. In some cases these companies and services are simply extensions of test publishers, and in other cases they are independent. There are people whose sole responsibility is the marketing and sales of tests. Sometimes these people are employed by the test publisher; sometimes they are not. There are academicians who review tests and evaluate their psychometric soundness. All of these people, as well as many others, are parties to a greater or lesser extent in the assessment enterprise.

Having introduced you to some of the parties involved in the Who? of psychological testing and assessment, let’s move on to tackle some of the What? and Why? questions.

In What Types of Settings Are Assessments Conducted, and Why?

Educational settings
You are probably no stranger to the many types of tests administered in the classroom. As mandated by law, tests are administered early in school life to help identify children who may have special needs. In addition to school ability tests, another type of test commonly given in schools is an achievement test, which evaluates accomplishment or the degree of learning that has taken place. Some of the achievement tests you have taken in school were constructed by your teacher. Other achievement tests were constructed for more widespread use by educators working with measurement professionals. In the latter category, initialisms such as SAT and GRE may ring a bell.

Figure 1–4
Public feedback regarding an educational testing program. In recent years there have been many public demonstrations against various educational testing programs. Strident voices have called for banishing such programs, or for parents to “opt out” of having their children tested. As you learn more about the art and science of testing, assessment, and measurement, you will no doubt develop an informed opinion about whether tests do more harm than good, or vice versa.
Eric Crama/Shutterstock

You know from your own experience that a diagnosis may be defined as a description or conclusion reached on the basis of evidence and opinion. Typically this conclusion is reached through a process of distinguishing the nature of something and ruling out alternative conclusions. Similarly, the term diagnostic test refers to a tool of assessment used to help narrow down and identify areas of deficit to be targeted for intervention. In educational settings, diagnostic tests of reading, mathematics, and other academic subjects may be administered to assess the need for educational intervention as well as to establish or rule out eligibility for special education programs.

Schoolchildren receive grades on their report cards that are not based on any formal assessment. For example, the grade next to “Works and plays well with others” is probably based more on the teacher’s informal evaluation in the classroom than on scores on any published measure of social interaction. We may define informal evaluation as a typically nonsystematic assessment that leads to the formation of an opinion or attitude.

JUST THINK...
What tools of assessment could be used to evaluate a student’s social skills?

Informal evaluation is, of course, not limited to educational settings; it is a part of everyday life. In fact, many of the tools of evaluation we have discussed in the context of educational settings (such as achievement tests, diagnostic tests, and informal evaluations) are also administered in various other settings. And some of the types of tests we discuss in the context of the settings described next are also administered in educational settings. So please keep in mind that the tools of evaluation and measurement techniques that we discuss in one context may well be used in other contexts.
Our objective at this early stage in our survey of the field is simply to introduce a sampling (not a comprehensive list) of the types of tests used in different settings.

Clinical settings
Tests and many other tools of assessment are widely used in clinical settings such as public, private, and military hospitals, inpatient and outpatient clinics, private-practice consulting rooms, schools, and other institutions. These tools are used to help screen for or diagnose behavior problems. What types of situations might prompt the employment of such tools? Here’s a small sample:

• A private psychotherapy client wishes to be evaluated to see if the assessment can provide any nonobvious clues regarding his maladjustment.
• A school psychologist clinically evaluates a child experiencing learning difficulties to determine what factors are primarily responsible for them.
• A psychotherapy researcher uses assessment procedures to determine if a particular method of psychotherapy is effective in treating a particular problem.
• A psychologist-consultant retained by an insurance company is called on to give an opinion as to the reality of a client’s psychological problems; is the client really experiencing such problems or just malingering?
• A court-appointed psychologist is asked to give an opinion as to a defendant’s competency to stand trial.
• A prison psychologist is called on to give an opinion regarding the extent of a convicted violent prisoner’s rehabilitation.

JUST THINK...
What kinds of issues do psychologists have to consider when assessing prisoners in contrast to assessing workplace managers?

The tests employed in clinical settings may be intelligence tests, personality tests, neuropsychological tests, or other specialized instruments, depending on the presenting or suspected problem area. The hallmark of testing in clinical settings is that the test or measurement technique is employed with only one individual at a time.
Group testing is used primarily for screening; that is, identifying those individuals who require further diagnostic evaluation.

Counseling settings
Assessment in a counseling context may occur in environments as diverse as schools, prisons, and governmental or privately owned institutions. Regardless of the particular tools used, the ultimate objective of many such assessments is the improvement of the assessee in terms of adjustment, productivity, or some related variable. Measures of social and academic skills and measures of personality, interest, attitudes, and values are among the many types of tests that a counselor might administer to a client. Referral questions to be answered range from “How can this child better focus on tasks?” to “For what career is the client best suited?” to “What activities are recommended for retirement?” Having mentioned retirement, let’s hasten to introduce another type of setting in which psychological tests are used extensively.

Geriatric settings
In the United States, more than 14.2 million adults are currently in the age range of 75 to 84; this is about 18 times more people in this age range than there were in 1900. More than six million adults in the United States are currently 85 years old or older, which is a 52-fold increase in the number of people of that age since 1900. People in the United States are living longer, and the population as a whole is getting older.

Older Americans may live at home, in special housing designed for independent living, in housing designed for assisted living, or in long-term care facilities such as hospitals and hospices. Wherever older individuals reside, they may at some point require psychological assessment to evaluate cognitive, psychological, adaptive, or other functioning. At issue in many such assessments is the extent to which assessees are enjoying as good a quality of life as possible. The definition of quality of life has varied as a function of perspective in different studies. In some research, for example, quality of life is defined from the perspective of an observer; in other research it is defined from the perspective of assessees themselves and refers to an individual’s own self-report regarding lifestyle-related variables. However defined, what is typically assessed in quality of life evaluations are variables related to perceived stress, loneliness, sources of satisfaction, personal values, quality of living conditions, and quality of friendships and other social support.

JUST THINK...
Tests are used in geriatric, counseling, and other settings to help improve quality of life. But are there some aspects of quality of life that a psychological test just can’t measure?

Generally speaking, from a clinical perspective, the assessment of older adults is more likely to include screening for cognitive decline and dementia than the assessment of younger adults (Gallo & Bogner, 2006; Gallo & Wittink, 2006). Dementia is a loss of cognitive functioning (which may affect memory, thinking, reasoning, psychomotor speed, attention, and related abilities, as well as personality) that occurs as the result of damage to or loss of brain cells. Perhaps the best known of the many forms of dementia that exist is Alzheimer’s disease. The road to diagnosis by the clinician is complicated by the fact that severe depression in the elderly can contribute to cognitive functioning that mimics dementia, a condition referred to as pseudodementia (Madden et al., 1952). It is also true that the majority of individuals suffering from dementia exhibit depressive symptoms (Strober & Arnett, 2009). Clinicians rely on a variety of different tools of assessment to make a diagnosis of dementia or pseudodementia.
Business and military settings
In business, as in the military, various tools of assessment are used in sundry ways, perhaps most notably in decision making about the careers of personnel. A wide range of achievement, aptitude, interest, motivational, and other tests may be employed in the decision to hire as well as in related decisions regarding promotions, transfer, job satisfaction, and eligibility for further training. For a prospective air traffic controller, successful performance on a test of sustained attention to detail may be one requirement of employment. For promotion to the rank of officer in the military, successful performance on a series of leadership tasks may be essential.

Another application of psychological tests involves the engineering and design of products and environments. Engineering psychologists employ a variety of existing and specially devised tests in research designed to help people at home, in the workplace, and in the military. Products ranging from home computers to office furniture to jet cockpit control panels benefit from the work of such research efforts.

Using tests, interviews, and other tools of assessment, psychologists who specialize in the marketing and sale of products are involved in taking the pulse of consumers. They help corporations predict the public’s receptivity to a new product, a new brand, or a new advertising or marketing campaign. Psychologists working in the area of marketing help “diagnose” what is wrong (and right) about brands, products, and campaigns. On the basis of such assessments, these psychologists might make recommendations regarding how new brands and products can be made appealing to consumers, and when it is time for older brands and products to be retired or revitalized.

JUST THINK...
Assume the role of a consumer psychologist. What ad campaign do you find particularly effective in terms of pushing consumer “buy” buttons? What ad campaign do you find particularly ineffective in this regard? Why?

Have you ever wondered about the variety of assessments conducted by a psychologist in the military? In this chapter’s Meet an Assessment Professional (MAP) feature, we meet U.S. Air Force psychologist Lt. Col. Alan Ogle, Ph.D., and learn about his wide range of professional duties. Note that each chapter of this book contains a “MAP” feature allowing readers unprecedented access to the “real world life” of a mental health professional who uses psychological tests and other tools of psychological assessment. Each of the featured assessment professionals was asked to write a brief essay sharing a thoughtful and educational perspective on their assessment-related activities.

Governmental and organizational credentialing
One of the many applications of measurement is in governmental licensing, certification, or general credentialing of professionals. Before they are legally entitled to practice medicine, physicians must pass an examination. Law school graduates cannot present themselves to the public as attorneys until they pass their state’s bar examination. Psychologists, too, must pass an examination before adopting the official title “psychologist.”

Members of some professions have formed organizations with requirements for membership that go beyond those of licensing or certification. For example, physicians can take further specialized training and a specialty examination to earn the distinction of being “board certified” in a particular area of medicine. Psychologists specializing in certain areas may be evaluated for a diploma from the American Board of Professional Psychology (ABPP) to recognize excellence in the practice of psychology.
Another organization, the American Board of Assessment Psychology (ABAP), awards its diploma on the basis of an examination to test users, test developers, and others who have distinguished themselves in the field of testing and assessment.

Academic research settings

Conducting any sort of research typically entails measurement of some kind, and any academician who ever hopes to publish research should ideally have a sound knowledge of measurement principles and tools of assessment. To emphasize this simple fact of research life, imagine the limitless number of questions that psychological researchers could conceivably raise, and the tools and methodologies that might be used to find answers to those questions. For example, Thrash et al. (2010) wondered about the role of inspiration in the writing process. Herbranson and Schroeder (2010) raised the question “Are pigeons smarter than mathematicians?” Milling et al. (2010) asked whether one’s level of hypnotizability predicts responses to pain-lessening hypnotic suggestions. Angie et al. (2011) explored whether the potential for violence of an ideological group can be assessed by studying the group’s website.

JUST THINK... What research question would you like to see studied? What tools of assessment might be used in that research?

MEET AN ASSESSMENT PROFESSIONAL

Meet Dr. Alan Ogle

Alan Ogle, Ph.D., Lieutenant Colonel, U.S. Air Force
Alan Ogle

I arrived at my first duty station on 8th September, 2001, having completed doctoral training at a civilian university followed by an internship at Wright-Patterson Air Force Medical Center. An amazing, challenging, and rewarding career has ensued, with assignments at various bases in the United States, the United Kingdom, and Afghanistan. As a clinical psychologist for the Air Force, I provide assessment and treatment to military personnel and their families, as well as consultation to military commanders regarding psychological health, substance abuse prevention, and combat and operational stress control. A postdoctoral fellowship and additional military coursework has qualified me to also support various other military activities such as high-risk survival, evasion, resistance, and escape (SERE) training, reintegration support services for military and civilians returning from isolation or captivity, human performance optimization, and the evaluation and selection of personnel for special assignments.

The use of clinical assessment measures in the military is comparable to civilian practice. Commonly used measures include brief symptom screeners (such as the Patient Health Questionnaire-9 and the Generalized Anxiety Disorder scale-7). We also administer, as indicated, measures of personality and cognitive functioning (such as the current versions of the MMPI and Wechsler tests) to identify treatment needs, monitor progress, and/or assess fitness for military service.

Unlike many other military selection assignments, assessment of military personnel for special missions may entail both “select-in” as well as “select-out” options. Here, the tools of assessment are used to identify psychological or psychosocial concerns that would indicate risk to job candidates (or their families) if selected for a challenging assignment as well as to identify areas that might make a candidate a liability to a mission. Beyond helping to “select out” candidates deemed to be at risk, psychologists assist in helping to “select in” candidates deemed to be the best for a particular unit and mission. Here, the “best fit” would be those candidates who not only are free of vulnerabilities in psychological health and psychosocial circumstances that might impair performance and possess the requisite qualifications but also excel in job-relevant skills and characteristics for success in a specific unit and mission set.

One example of psychological assessment for a special duty is the program developed and utilized for selection of Military Training Instructors (MTIs) for USAF Basic Military Training (BMT). Called drill instructors or drill sergeants in other services, these are noncommissioned officers (NCOs) with seven or more years of service in their primary career field (e.g., aircraft maintenance, security forces, intelligence) selected for this special duty assignment. This is a position of challenge and tremendous trust, tasked with engaging and transforming young civilian volunteers from diverse backgrounds and motivations through a highly intensive training regimen into capable military members. Training can devolve dangerously when not well managed by the instructor: intense training coupled with the power differential between MTI and recruits may lead to errors in decision making, overly affective responses, maltreatment, or maltraining. Assigning the right instructors, those best skilled and suited for this special duty, is paramount to the success and safety of the training.

I had the opportunity to serve on a working group of psychologists to develop an empirically derived, standardized psychological screening protocol of candidates for entry into MTI duty. Job analytic studies were conducted to identify knowledge, skills, abilities, and other characteristics (KSAOs) important to serving successfully in MTI duty, with emphasis on both identification of factors important to safe, effective performance, as well as potential “red flag” warning signs for this position of trust and power over a vulnerable population of trainees. An assessment protocol was developed including an interview by a mental health provider meeting with the MTI candidate and their significant other (if partnered). With awareness that a large body of research indicates clinicians are at risk to overestimate clinical judgment’s accuracy for predicting behavior and job success, the interview is structured by behaviorally anchored rating scales for each of the job-critical areas. Ratings for the domain of judgment/self-control, for instance, include consideration of history of childhood delinquency behaviors (such as skipping school, or fighting), adult discipline and legal issues, and interview questions such as “What are some choices or mistakes that you particularly regret?” Assessment of Family Stability/Support includes interview of the candidate and partner regarding questions such as “What would be the most challenging changes for your family in this assignment?” Cognitive screening is required and a brief screening tool is used for time efficiency.

An additional component of the assessment protocol we developed is the Multidimensional 360 Assessment (MD360), which collects input from a candidate’s coworkers regarding MTI-relevant work performance behaviors and potential “red flags.” As examples, subordinates, peers, and supervisors provide ratings about the candidate on items such as, “Remains focused, on task, and decisive in stressful situations,” “Leads others in a fair and consistent manner,” and, “Avoids inappropriate personal relationships (such as flirting or fraternization).” Responses are confidential and not released to the candidate or other coworkers. There is also a component of the MD360 completed by the candidate that includes self-assessment of relevant skills, personality and attitude scales, and a situational judgment test developed specific to types of challenges faced in MTI duty. A concurrent validation study of the self-assessment measures found significant relationships of several attitudes to performance in leadership, mentorship, and risk for maltreatment by MTIs. Based on results of the interview and MD360, a recommendation is made regarding strengths and any concerns regarding suitability for MTI duty, including nonrecommend (select out) as well as recommend with sufficient characterization of skills for prioritization of candidates.

At least equally important to “getting the right people” are efforts to sufficiently train, supervise, and support MTIs through their challenging duties. A team titled the USAF BMT Military Training Consult Service was established, providing ongoing assessment and support to serving MTIs, as well as training in appropriate use of stress inoculation training of recruits. Additionally, training and command consultation is provided to mitigate risks of behavioral drift inherent to the positional power dynamics of the instructor–recruit relationship. The goal is to support safe, effective training of new military members as well as excellence in instructor staff.

Students considering service in the military are encouraged to research opportunities, either in uniform or civilian positions. The U.S. Air Force, Army, and Navy each offer APA-approved internships at multiple sites, for those meeting medical and other requirements, then requiring completion of one assignment. I have been honored to remain in service beyond the initial obligation, thoroughly enjoying the opportunities for training, broad responsibilities from early on in my psychology career, and service with national purpose.

Used with permission of Dr. Alan Ogle.

Other settings

Many different kinds of measurement procedures find application in a wide variety of settings. For example, the courts rely on psychological test data and related expert testimony as one source of information to help answer important questions such as “Is this defendant competent to stand trial?” and “Did this defendant know right from wrong at the time the criminal act was committed?”

Measurement may play an important part in program evaluation, whether it is a large-scale government program or a small-scale, privately funded one. Is the program working? How can the program be improved? Are funds being spent in the areas where they ought to be spent? How sound is the theory on which the program is based? These are the types of general questions that tests and measurement procedures used in program evaluation are designed to answer.

Tools of assessment can be found in use in research and practice in every specialty area within psychology. For example, consider health psychology, a discipline that focuses on understanding the role of psychological variables in the onset, course, treatment, and prevention of illness, disease, and disability (Cohen, 1994). Health psychologists are involved in teaching, research, or direct-service activities designed to promote good health.
Individual interviews, surveys, and paper-and-pencil tests are some of the tools that may be employed to help assess current status with regard to some disease or condition, gauge treatment progress, and evaluate outcome of intervention. One general line of research in health psychology focuses on aspects of personality, behavior, or lifestyle as they relate to physical health. The methodology employed may entail reporting on measurable respondent variables as they change in response to some intervention, such as education, therapy, counseling, change in diet, or change in habits. Measurement tools may be used to compare one naturally occurring group of research subjects to another such group (such as smokers compared to nonsmokers) with regard to some other health-related variable (such as longevity). Many of the questions raised in health-related research have real, life-and-death consequences. All of these important questions, like the questions raised in other areas of psychology, require that sound techniques of evaluation be employed. How Are Assessments Conducted? If a need exists to measure a particular variable, a way to measure that variable will be devised. As Figure 1–5 just begins to illustrate, the ways in which measurements can be taken are limited only by imagination. Keep in mind that this figure illustrates only a small sample of the many methods used in psychological testing and assessment. The photos are not designed to illustrate the most typical kinds of assessment procedures. Rather, their purpose is to call attention to the wide range of measurement tools that have been created for varied uses. Responsible test users have obligations before, during, and after a test or any measurement procedure is administered. For purposes of illustration, consider the administration of a paper-and-pencil test. 
Before the test, ethical guidelines dictate that when test users have discretion with regard to the tests administered, they should select and use only the test or tests that are most appropriate for the individual being tested. Before a test is administered, the test should be stored in a way that reasonably ensures that its specific contents will not be made known to the testtaker in advance. Another obligation of the test user before the test’s administration is to ensure that a prepared and suitably trained person administers the test properly. The test administrator (or examiner) must be familiar with the test materials and procedures and must have at the test site all the materials needed to properly administer the test. Materials needed might include a stopwatch, a supply of pencils, and a sufficient number of test protocols.

By the way, in everyday, non-test-related conversation, protocol refers to diplomatic etiquette. A less common use of the word is a synonym for the first copy or rough draft of a treaty or other official document before its ratification. With reference to testing and assessment, protocol typically refers to the form, sheet, or booklet on which a testtaker’s responses are entered. The term may also be used to refer to a description of a set of test- or assessment-related procedures, as in the sentence, “The examiner dutifully followed the complete protocol for the stress interview.”

Figure 1–5 The wide world of measurement.

At least since the beginning of the nineteenth century, military units throughout the world have relied on psychological and other tests for personnel selection, program validation, and related reasons (Hartmann et al., 2003). In some cultures where military service is highly valued, students take preparatory courses with hopes of being accepted into elite military units. This is the case in Israel, where rigorous training such as that pictured here prepares high-school students for physical and related tests that only 1 in 60 military recruits will pass. Gil Cohen-Magen/AFP/Getty Images

Evidence suggests that some people with eating disorders may actually have a self-perception disorder; that is, they see themselves as heavier than they really are (Thompson & Smolak, 2001). Thompson and his associates devised the adjustable light-beam apparatus to measure body image distortion. Assessees adjust four beams of light to indicate what they believe is the width of their cheeks, waist, hips, and thighs. A measure of accuracy of these estimates is then obtained. Joel Thompson

Herman Witkin and his associates (Witkin & Goodenough, 1977) studied personality-related variables in some innovative ways. For example, they identified field (or context)-dependent and field-independent people by means of this specially constructed tilting room–tilting chair device. Assessees were asked questions designed to evaluate their dependence on or independence of visual cues. Source: Witkin, H. A., & Goodenough, D. R. (1977). Field dependence and interpersonal behavior. Psychological Bulletin, 84, 661–689.

Pictures such as these sample items from the Meier Art Judgment Test might be used to evaluate people’s aesthetic perception. Which of these two renderings do you find more aesthetically pleasing? The difference between the two pictures involves the positioning of the objects on the shelf. Norman C. Meier Papers, University of Iowa Libraries, Iowa City, Iowa.

Impairment of certain sensory functions can indicate neurological deficit. For purposes of diagnosis, as well as measuring progress in remediation, the neurodevelopment training ball can be useful in evaluating one’s sense of balance. Fotosearch/Getty Images

Some college admissions officers are evaluating the notebook doodles of applicants in their search for “authentic and imperfect” (as opposed to “ideal”) candidates for admission (Gray, 2016). As a result, profiles created on social media platforms such as ZeeMee may increasingly be used by applicants to convey “a side of themselves that might not come through in the typical mix of transcripts, essays and teacher recommendations” (Gray, 2016, p. 48).

Test users have the responsibility of ensuring that the room in which the test will be conducted is suitable and conducive to the testing. To the extent possible, distracting conditions such as excessive noise, heat, cold, interruptions, glaring sunlight, crowding, inadequate ventilation, and so forth should be avoided. Of course, creating an ideal testing environment is not always something every examiner can do (see Figure 1–6).

Figure 1–6 Less-than-optimal testing conditions. In 1917, new Army recruits sat on the floor as they were administered the first group tests of intelligence, not ideal testing conditions by current standards. Time Life Pictures/US Signal Corps/The LIFE Picture Collection/Getty Images

During test administration, and especially in one-on-one or small-group testing, rapport between the examiner and the examinee is critically important. In this context, rapport may be defined as a working relationship between the examiner and the examinee. Such a working relationship can sometimes be achieved with a few words of small talk when the examiner and examinee are introduced. If appropriate, some words about the nature of the test and why it is important for examinees to do their best may also be helpful. In other instances (for example, with a frightened child), the achievement of rapport might involve more elaborate techniques such as engaging the child in play or some other activity until the child has acclimated to the examiner and the surroundings.
It is important that attempts to establish rapport with the testtaker not compromise any rules of the test administration instructions.

After a test administration, test users have many obligations as well. These obligations range from safeguarding the test protocols to conveying the test results in a clearly understandable fashion. If third parties were present during testing or if anything else that might be considered out of the ordinary happened during testing, it is the test user’s responsibility to make a note of such events on the report of the testing. Test scorers have obligations as well. For example, if a test is to be scored by people, scoring needs to conform to pre-established scoring criteria. Test users who have responsibility for interpreting scores or other test results have an obligation to do so in accordance with established procedures and ethical guidelines.

JUST THINK... What unforeseen incidents could conceivably occur during a test session? Should such incidents be noted on the report of that session?

Assessment of people with disabilities

People with disabilities are assessed for exactly the same reasons people with no disabilities are assessed: to obtain employment, to earn a professional credential, to be screened for psychopathology, and so forth. A number of laws have been enacted that affect the conditions under which tests are administered to people with disabling conditions. For example, one law mandates the development and implementation of “alternate assessment” programs for children who, as a result of a disability, could not otherwise participate in state- and district-wide assessments. Defining exactly what “alternate assessment” meant was left to the individual states or their local school districts. These authorities define who requires alternate assessment, how such assessments are to be conducted, and how meaningful inferences are to be drawn from the assessment data.
In general, alternate assessment is typically accomplished by means of some accommodation made to the assessee. The verb to accommodate is defined as “to adapt, adjust, or make suitable.” In the context of psychological testing and assessment, accommodation is defined as the adaptation of a test, procedure, or situation, or the substitution of one test for another, to make the assessment more suitable for an assessee with exceptional needs. At first blush, the process of accommodating students, employees, or other testtakers with special needs might seem straightforward. For example, the individual who has difficulty reading the small print of a particular test may be accommodated with a large-print version of the same test or with a specially lit test environment. A student with a h
