Psychological Assessment Midterms PDF
Document Details
Uploaded by FabulousEveningPrimrose
National University - Fairview
Summary
This document is a set of midterm review notes for a Psychological Assessment course. It covers when psychological testing and assessment began and defines several types of assessment, including educational assessment, remote assessment, and ecological momentary assessment. It also covers the historical, cultural, and legal/ethical context of psychological assessment, a statistics refresher, and the basics of norms, reliability, and validity.
Full Transcript
NATIONAL UNIVERSITY – FAIRVIEW
2ND TERM OF 2024-2025
PSYCHOLOGICAL ASSESSMENT – MIDTERMS
˚˖𓍢ִ◌໋🌷͙֒✧˚.🎀◌༘⋆ TRANSES BY: TAROG, DANICA C. ˚˖𓍢ִ◌໋🌷͙֒✧.̊🎀◌༘⋆

🪷🌷🌸🌺🦩Chapter 1 🪷🌷🌸🌺🦩
Psychological Testing and Assessment

When Did Psychological Testing and Assessment Start?

Alfred Binet and a colleague published a test designed to help place Paris schoolchildren in appropriate classes.

Within a decade, an English-language version of Binet's test was prepared for use in schools in the United States.

During World War II, the military would depend even more on psychological tests to screen recruits for service.

Following the war, more and more tests purporting to measure an ever-widening array of psychological variables were developed and used.

Psychological Testing and Assessment Defined

Psychological Assessment – The gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures.

Psychological Testing – The process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior.

Varieties of Assessment

Educational Assessment – Broadly speaking, the use of tests and other tools to evaluate abilities and skills relevant to success or failure in a school or pre-school context.
- Intelligence tests, achievement tests, and reading comprehension tests are examples.

Retrospective Assessment – The use of evaluative tools to draw conclusions about psychological aspects of a person as they existed at some point in time prior to the assessment.
- "What happened? Why did they do it?"

Remote Assessment – The use of tools of psychological evaluation to gather data and draw conclusions about a subject who is not in physical proximity to the person or people conducting the evaluation.

Ecological Momentary Assessment (EMA) – "In the moment" evaluation of specific problems and related cognitive and behavioral variables at the very time and place that they occur.
- "Assessing your feelings: what state your emotions are in at that moment."

The Process of Assessment

Referral → Initial Contact → Selection of Tools → Formal Assessment → Report Writing → Feedback Sessions

Collaborative Psychological Assessment – The assessor and assessee may work as "partners" from initial contact through final feedback.

Therapeutic Psychological Assessment – A collaborative approach to assessment.
- Therapeutic self-discovery and new understandings are encouraged throughout the assessment process.

Dynamic Assessment – An interactive approach to psychological assessment that usually follows a model of evaluation, intervention of some sort, and evaluation.

The Tools of Psychological Assessment

Test

A measuring device or procedure.

Psychological Test – Designed to measure variables related to psychology (such as intelligence, personality, aptitude, interests, attitudes, or values).

Psychological tests and other tools of assessment may differ with respect to a number of variables, such as content, format, administration procedures, scoring and interpretation procedures, and technical quality.

Score – A code or summary statement, usually but not necessarily numerical in nature, that reflects an evaluation of performance on a test, task, interview, or some other sample of behavior.

Scoring – The process of assigning such evaluative codes or statements to performance on tests, tasks, interviews, or other behavior samples.

Cut Score (Cutoff Score) – A reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications.

Psychometric Soundness – The technical quality of a test or other tool of assessment.

Psychometrics – The science of psychological measurement.

Interview

A method of gathering information through direct communication involving reciprocal exchange.
- "It is a requirement that there be an exchange of responses or communication."

The interviewer takes note of both verbal and nonverbal behavior.

Interviews may be conducted in formats other than face-to-face.

Interviews differ with regard to many variables, such as their purpose, length, and nature.

Interviews may be used to help professionals in human resources make more informed recommendations about the hiring, firing, and advancement of personnel.

Panel Interview (Board Interview) – More than one interviewer participates in the assessment.

Motivational Interviewing – A therapeutic dialogue that combines person-centered listening skills, such as openness and empathy, with the use of cognition-altering techniques designed to positively affect motivation and effect therapeutic change.

Portfolio

Work products, whether retained on paper, canvas, film, video, audio, or some other medium, constitute a portfolio.

Case History Data

Records, transcripts, and other accounts in written, pictorial, or other form that preserve archival information.

Behavioral Observation

Monitoring the actions of others or oneself by visual or electronic means while recording quantitative and/or qualitative information regarding those actions.

Role-Play Tests

Role play – Acting an improvised or partially improvised part in a simulated situation.

Role-play test – A tool of assessment wherein assessees are directed to act as if they were in a particular situation.

Computers as Tools

Computers play a role in contemporary assessment in the context of generating simulations.

CAPA stands for computer-assisted psychological assessment.

Who Are the Parties in the Assessment Enterprise?

Test Developers/Publishers – Create tests or other methods of assessment.

Test Users – Tests are used by a wide range of professionals.

Test Takers – Anyone who is the subject of an assessment or an evaluation can be a test taker or an assessee.

Psychological Autopsy – A reconstruction of a deceased individual's psychological profile.

In What Types of Settings Are Assessments Conducted and Why?

Educational Settings – Achievement, aptitude, diagnostic, etc.

Clinical Settings – Psychotherapy, research, court-related cases, prisoners' rehabilitation, etc.

Counseling Settings – Adjustment and productivity (typical life problems of a person, not involving a disorder).

Geriatric Settings – Quality of life and diagnostic.

Business and Military Settings – In business, as in the military, various tools of assessment are used in sundry ways, perhaps most notably in decision making about the careers of personnel.

🪷🌷🌸🌺🦩Chapter 2 🪷🌷🌸🌺🦩
Historical, Cultural, and Legal/Ethical Considerations

A Historical Perspective: Antiquity to the Nineteenth Century

Tests and testing programs first came into being in China as early as 2200 B.C.E.

Chinese Imperial Examination – Music, Archery, Horsemanship, Writing and Arithmetic, Agriculture, Geography, Civil Law, and Military Strategy.

The Song (Sung) Dynasty

960 to 1279 C.E.
Tests emphasized knowledge of classical literature.

Testtakers who did well were seen as having acquired the wisdom of the past.

Privileges of the Song (Sung) Dynasty

Granted to those who passed the examinations:
- Entitlement to wear a special garb.
- Exemption from taxes.
- Exemption from government-sponsored interrogation by torture.

Who is in League with the Devil?

Ancient Greco-Roman writings are indicative of attempts to categorize people in terms of personality types.

Such categorizations included reference to an overabundance or deficiency in some bodily fluid.

On the Origin of Species by Means of Natural Selection

Published by Charles Darwin.

Chance variation in species would be selected or rejected by nature according to adaptivity and survival value.

On Individual Differences

History records that it was Darwin who spurred scientific interest in individual differences.

Francis Galton (1869)
- Half cousin of Charles Darwin.
- Aspired to classify people "according to their natural gifts" and to ascertain their "deviation from an average".
- Credited with devising or contributing to the development of many contemporary tools of psychological assessment, including questionnaires, rating scales, and self-report inventories.
- Galton's initial work on heredity was done with sweet peas, in part because there tended to be fewer variations among the peas in a single pod.
- Pioneered the coefficient of correlation.

Wilhelm Max Wundt (1832 – 1920)
- Founded the first experimental psychology laboratory at the University of Leipzig in Germany.
- Wundt and his students tried to formulate a general description of human abilities with respect to variables such as reaction time, perception, and attention span.

James McKeen Cattell (1860 – 1944)
- One of the students of Wundt.
- Completed a doctoral dissertation that dealt with individual differences in reaction time.
- Coined the term "mental test" in an 1890 publication.
- Was instrumental in founding the Psychological Corporation, which named 20 of the country's leading psychologists as its directors.
- The goal of the corporation was the "advancement of psychology and the promotion of the useful applications of psychology."

Charles Spearman
- Credited with originating the concept of test reliability as well as building the mathematical framework for the statistical technique of factor analysis.

Victor Henri
- Frenchman who would collaborate with Alfred Binet on papers suggesting how mental tests could be used to measure higher mental processes.

Emil Kraepelin
- An early experimenter with the word association technique as a formal test.

Lightner Witmer
- Succeeded Cattell as director of the psychology laboratory at the University of Pennsylvania.
- Founded the first psychological clinic in the United States at the University of Pennsylvania.

Edward B. Titchener
- Established the psychological school of thought known as structuralism.
- Coined the word "empathy", a translation of the German word "Einfühlung."

G. Stanley Hall
- The first president of the American Psychological Association (APA) in 1892.

The 20th Century: The Measurement of Intelligence

1895 – Alfred Binet (1857 – 1911) and his colleague Victor Henri published several articles in which they argued for the measurement of abilities such as memory and social comprehension.

1905 – Binet and collaborator Theodore Simon published a 30-item "measuring scale of intelligence" designed to help identify Paris schoolchildren with intellectual disability.

1939 – David Wechsler, a clinical psychologist at Bellevue Hospital in New York City, introduced a test designed to measure adult intelligence.
- Wechsler defined intelligence as "the aggregate or global capacity of the individual to act purposefully, to think rationally, and to deal effectively with his environment".
- Wechsler–Bellevue Intelligence Scale → Wechsler Adult Intelligence Scale (WAIS) → WAIS-Revised (WAIS-R, 1981) → WAIS-III (1997) → WAIS-IV (2008)

The Measurement of Personality

1914 – The Committee on Emotional Fitness, chaired by psychologist Robert S. Woodworth, was assigned the task of developing a measure of adjustment and emotional stability that could be administered quickly and efficiently to groups of recruits.

To disguise the true purpose of one such test, the questionnaire was labeled as a "Personal Data Sheet."

One of the test questions was "Are you troubled with the idea that people are watching you on the street?"

After the war, Woodworth developed a personality test for civilian use that was based on the Personal Data Sheet. He called it the Woodworth Psychoneurotic Inventory (WPI).

The WPI was the first widely used self-report measure of personality.

Self-report – A process whereby assessees themselves supply assessment-related information by responding to questions, keeping a diary, or self-monitoring thoughts or behaviors.

Projective Test – An individual is assumed to "project" onto some ambiguous stimulus his or her own unique needs, fears, hopes, and motivation.
- The ambiguous stimulus might be an inkblot, a drawing, a photograph, or something else.

Thematic Apperception Test (TAT) – The use of pictures as projective stimuli was popularized in the late 1930s by Henry A. Murray and Christiana D. Morgan.

Culture

"The socially transmitted behavior patterns, beliefs, and products of work of a particular population, community, or group of people."

Culture prescribes many behaviors and ways of thinking, including what is to be valued or prized as well as what is to be rejected or despised.

Evolving Interest in Culture-Related Issues

Henry H. Goddard raised questions about how meaningful such tests are when used with people from various cultural and language backgrounds.

Culture-specific Test – A test designed for use with people from one culture but not from another.

Legal and Ethical Considerations

Laws – Rules that individuals must obey for the good of the society as a whole, or rules thought to be for the good of society as a whole.

Ethics – A body of principles of right, proper, or good conduct.

Code of Professional Ethics – Recognized and accepted by members of a profession, it defines the standard of care expected of members of that profession.

Standard of Care – The level at which the average, reasonable, and prudent professional would provide diagnostic or therapeutic services under the same or similar conditions.

The Concerns of the Profession

Test-user qualifications

Testing people with disabilities
- Transforming the test into a form that can be taken by the test taker.

Computerized test administration, scoring, and interpretation

Guidelines with respect to certain populations

The Rights of Test Takers

The right of informed consent

The right to be informed of test findings

The right to privacy and confidentiality

The right to the least stigmatizing label

🪷🌷🌸🌺🦩Chapter 3 🪷🌷🌸🌺🦩
Statistics Refresher

What are Statistics?

According to the World Health Organization's director in March 2022, the global prevalence of anxiety and depression increased by more than 25% in the first year of the pandemic.

Statistics refers to a range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on data.

Math is a central component of it, but statistics is a broader way of organizing, interpreting, and communicating information in an objective manner.

Figures and facts.

What are Psychological Statistics?

Statistics refers to a range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on psychological data.

Psychological constructs or variables.

Why Study Psychological Statistics?

Statistics is how we communicate science and how we interpret information in the behavioral sciences.

Personal reasons.

It adds credibility to an argument.

Types of Data

Data
- A set of qualitative and/or quantitative values, made up of variables.

Variable
- Anything that can be measured.
- May take different values between individuals.
- May take different values in the same individual at different time points.

Types of Variables

Independent Variable
- The variable you control.
- Not affected by the state of any other variable in the experiment.
- May have different levels.

Dependent Variable
- The variable you measure.
- You assess how it responds to a change in the independent variable, so you can think of it as depending on the independent variable.

Qualitative Variable
- Describes qualities.
- Does not imply numerical ordering.
- Examples: religion, gender.

Quantitative Variable
- Measured in terms of numbers.
- Examples: test scores, height.

Discrete Variable
- Possible scores are discrete points on the scale.
- Example: number of students in a particular section.

Continuous Variable
- Possible scores are continuous.
- Examples: time, length.

Levels/Scales of Measurement

Nominal
- Differentiates between items or subjects based on categories.
- Classification.

Ordinal
- Allows ranking of objects or observations.
- Consists of a spectrum of values.

Interval
- Allows for degree of difference between observations.
- Has equal distances between scores.

Ratio
- Has fixed intervals between scores.
- Has a true zero point.

Collecting Data

Population
- The population of interest.
- "All people" who have the same characteristics in common.
- Example: all Filipino registered voters.

Sample
- Drawn from the population.
- A small subset of a larger population.
- Example: selected Filipino voters.

Sampling Techniques

Probability Sampling
- Utilizes random selection.
- Individuals in the population have equal chances of being selected.
- Simple Random Sampling
- Systematic Sampling
- Stratified Sampling
- Cluster Sampling
- Multi-stage Random Sampling

Non-Probability Sampling
- Selecting samples on the basis of accessibility or the personal judgment of the researcher.
- Individuals in the population do not have equal chances of being selected.
- Convenience or Haphazard Sampling
- Purposive or Judgmental Sampling
- Snowball Sampling

Types of Research Design

Experimental
- True experiment.
- Does one variable cause change in another variable?
- Manipulated IV, with random assignment and a control group.

Quasi-Experimental
- Same as experimental but without random assignment.

Non-Experimental
- Involves observing things as they occur naturally and recording the observations as data.
- Cannot establish causality.

Types of Statistical Analyses

Descriptive Statistics
- Involves procedures for summarizing, graphing, and describing quantitative information.
- What do our data look like?

Inferential Statistics
- Involves procedures that allow the drawing of conclusions and generalizations.
- Includes testing hypotheses and deriving estimates.
- How do our data behave?

Graphing Qualitative Variables
- Frequency Tables
- Pie Charts
- Bar Charts
- Comparing Distributions Using Bar Charts

Histograms: First, create a frequency table

Frequency Polygon

Cumulative Frequency Polygon

The Normal Distribution
- Two sides are roughly the same shape.
- Has a single peak, known as the center.
- Has 2 tails that extend out equally.
- Also known as the bell shape or bell curve.
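The note that a histogram starts from a frequency table can be made concrete with a short sketch. The Python below is illustrative only: the function name `frequency_table` and the quiz scores are made up for this example and do not come from the course material. It counts how often each distinct score occurs and keeps a running total, which gives the values a cumulative frequency polygon would plot.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (value, frequency, cumulative frequency, cumulative %)."""
    counts = Counter(scores)      # how many times each distinct score occurs
    total = len(scores)
    rows = []
    running = 0
    for value in sorted(counts):  # lowest score first
        running += counts[value]  # cumulative frequency up to this score
        cumulative_pct = round(100 * running / total, 1)
        rows.append((value, counts[value], running, cumulative_pct))
    return rows


# Hypothetical quiz scores for ten students (assumed data).
quiz_scores = [7, 8, 8, 9, 10, 8, 7, 9, 8, 10]
table = frequency_table(quiz_scores)

print("score  freq  cum_freq  cum_%")
for value, freq, cum_freq, cum_pct in table:
    print(f"{value:>5}  {freq:>4}  {cum_freq:>8}  {cum_pct:>5}")
```

The frequency column is what a histogram or frequency polygon would display, and the cumulative columns are what a cumulative frequency polygon would display.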
Line Graph

The Shape of the Distribution

Symmetrical
- Can be cut down the center to form 2 mirror images.
- In practice, a distribution is never perfectly symmetrical.

Asymmetrical/Skewed
- One of the 2 tails of the distribution is disproportionately longer than the other.
- Can be positive or negative.

Skewness
- Positively skewed: the longer tail extends toward the higher scores.
- Negatively skewed: the longer tail extends toward the lower scores.

Kurtosis
- The degree of flatness or peakedness of a distribution.
- A distribution has a higher peak when the data are clustered around the middle.
- A distribution is flatter when the data are spread out evenly.

Comparing an individual's score to a distribution of scores is fundamental. Two kinds of measures are used:
- Measures of Central Tendency
- Measures of Spread or Variability

Measures of Central Tendency

The Center
- Like a balanced scale, the point at which a distribution is in balance.

Mean
- The sum of the numbers divided by the number of numbers.

Median
- The midpoint of a distribution.
- Count all the numbers in the data set, arranged in order, to locate the middle.

Mode
- The most frequently occurring value in the data set.
- The only measure of central tendency that can be used with qualitative data.

The mean, median, and mode are identical in a bell-shaped normal distribution.

Measures of Variability or Spread

Refers to how "spread out" a group of scores is within a distribution.

Range
- Simplest measure of variability.
- Range = Highest Score – Lowest Score
- Extremely sensitive to outliers.

Interquartile Range (IQR)
- Quartiles divide a data set into 4 parts, with each containing 25% of the data.
- The IQR is the range of the middle 50% of the scores in a distribution.
- IQR = 75th Percentile – 25th Percentile

Sum of Squares
- Describes how close the scores in a distribution are to the middle of the distribution.
- SS = Σ(X – M)², where M is the mean.

Variance
- Refers to the average squared difference of the scores from the mean.
- Variance = SS / N

Standard Deviation
- Simply the square root of the variance.
- Puts the measure of spread back into the original units of the measure we used.

🪷🌷🌸🌺🦩Chapter 4 🪷🌷🌸🌺🦩
Of Tests and Testing

Assumption 1: Psychological Traits and States Exist

Traits – "Any distinguishable, relatively enduring way in which one individual varies from another"

States – Distinguish one person from another but are relatively less enduring.

Construct – Informed, scientific concept developed or constructed to describe or explain behavior.

Overt Behavior – An observable action or the product of an observable action, including test- or assessment-related responses.

Assumption 2: Psychological Traits and States Can Be Quantified and Measured

Specific traits and states to be measured and quantified need to be carefully defined.

A test developer considers the types of item content that would provide insight into the trait or state.

The test score is presumed to represent the strength of the targeted ability or trait or state and is frequently based on cumulative scoring.

In cumulative scoring, the more a testtaker responds in the direction keyed as correct or consistent with a particular trait, the higher that testtaker is presumed to be on the targeted ability or trait.

Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior

The tasks in some tests mimic the actual behaviors that the test user is attempting to understand.

The obtained sample of behavior is typically used to make predictions about future behavior.

Postdict – To aid in the understanding of behavior that has already taken place.

Assumption 4: Tests and Other Measurement Techniques Have Strengths and Weaknesses

Competent test users understand and appreciate the limitations of the tests they use as well as how those limitations might be compensated for by data from other sources.

Assumption 5: Various Sources of Error are Part of the Assessment Process

Error – A long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.

Error Variance – Component of a test score attributable to sources other than the trait or ability measured.

Classical Test Theory (CTT or True Score Theory) – Each testtaker has a true score on a test that would be obtained but for the action of measurement error.

Assumption 6: Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner

Today all major test publishers strive to develop instruments that are fair when used in strict accordance with guidelines in the test manual. However, despite the best efforts of many professionals, fairness-related questions and problems do occasionally arise.

Assumption 7: Testing and Assessment Benefit Society

Considering the many critical decisions that are based on testing and assessment procedures, we can readily appreciate the need for tests, especially good tests.

Reliability

A reliable test yields the same numerical measurement every time it measures the same thing under the same conditions.

Reliability is a necessary but not sufficient element of a good test.

Validity

A valid test measures what it purports to measure.

Other Considerations

Norms

Norm-Referenced Testing and Assessment – Evaluating an individual testtaker's score and comparing it to scores of a group of testtakers.

Criterion-Referenced Testing and Assessment – Evaluating an individual's score with reference to a set standard or criterion.

Norm – Behavior that is usual, average, normal, standard, expected, or typical.

In a psychometric context, norms are the test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores.

Normative Sample – Group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers.

Norming – The process of deriving norms.

Race Norming – The controversial practice of norming on the basis of race or ethnic background.

User Norms or Program Norms – "Consist of descriptive statistics based on a group of testtakers in a given period of time rather than norms obtained by formal sampling methods"

Sampling to Develop Norms

Standardization or Test Standardization – The process of administering a test to a representative sample of testtakers for the purpose of establishing norms.

Sampling – The process of selecting the portion of the universe deemed to be representative of the whole population.

Random Sampling – Sampling in which every member of the population has the same chance of being included in the sample.

Types of Norms

Percentiles – An expression of the percentage of people whose score on a test or measure falls below a particular raw score.

Percentage of Testtakers vs. Percentage Correct – A percentile is a converted score that refers to a percentage of testtakers. Percentage correct refers to the distribution of raw scores; more specifically, to the number of items that were answered correctly multiplied by 100 and divided by the total number of items.

Age Norms (Age-equivalent Scores) – Indicate the average performance of different samples of testtakers who were at various ages at the time the test was administered.

Grade Norms – Designed to indicate the average test performance of testtakers in a given school grade.

National Norms – Derived from a normative sample that was nationally representative of the population at the time the norming was conducted.

🪷🌷🌸🌺🦩Chapter 5 🪷🌷🌸🌺🦩
Reliability

Reliability – Consistency

Reliability Coefficient – An index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance.

The Concept of Reliability

If we use X to represent an observed score, T to represent a true score, and E to represent error, then the fact that an observed score equals the true score plus error may be expressed as follows:

X = T + E

Variance – A statistic useful in describing sources of test score variability.

True Variance – Variance from true differences.

Error Variance – Variance from irrelevant, random sources. (Variance from other sources)

Reliability – The proportion of the total variance attributed to true variance.

Measurement Error – Collectively, all of the factors associated with the process of measuring some variable, other than the variable being measured. (Something wrong with the measurement)

Random Error – A source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process. (Specific error)

Systematic Error – A source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured. (Applicable to all)

Sources of Error Variance
- Test Construction
- Test Administration
- Test Scoring and Interpretation
- Other Sources of Error

Reliability Estimates

Test-Retest Reliability Estimates – An estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.

Coefficient of Stability – The name given to the test-retest estimate when the interval between testing is greater than six months.

Internal Consistency Estimate of Reliability/Inter-Item Consistency (Homogeneity of Test)

Split-Half Reliability Estimates

Split-Half Reliability – Obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.
o Step 1: Divide the test into equivalent halves.
o Step 2: Calculate a Pearson r between scores on the two halves of the test.
o Step 3: Adjust the half-test reliability using the Spearman–Brown formula.

Spearman–Brown Formula – Used to adjust the split-half reliability, which is obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.

The general Spearman–Brown (rSB) formula is:

rSB = (n × rxy) / (1 + (n – 1) × rxy)

where rxy is the Pearson r in the original-length test and n is the number of items in the revised version divided by the number of items in the original version.

The symbol rhh stands for the Pearson r of scores in the two half tests:

rSB = (2 × rhh) / (1 + rhh)

Other Methods of Estimating Internal Consistency

Inter-Item Consistency – The degree of correlation among all the items on a scale. A measure of inter-item consistency is calculated from a single administration of a single form of a test.

Homogeneity – The extent to which items in a scale are unifactorial.

Heterogeneity – The degree to which a test measures different factors.

Kuder–Richardson Formula 20 (KR-20) – The most widely known of the many formulas that Kuder and Richardson collaborated on. So named because it was the 20th formula developed in a series. The statistic of choice for determining the inter-item consistency of dichotomous items.

Measures of Inter-Scorer Reliability

Variously referred to as scorer reliability, judge reliability, observer reliability, and inter-rater reliability.

Inter-Scorer Reliability – The degree of agreement or consistency between two or more scorers with regard to a particular measure.

Coefficient of Inter-Scorer Reliability – The correlation coefficient that expresses the degree of consistency among scorers.

The True Score Model of Measurement and Alternatives to It

Classical Test Theory (CTT) – The true score model. The most widely used and accepted model in the psychometric literature today.

Domain Sampling Theory – Seeks to estimate the extent to which specific sources of variation under defined conditions are contributing to the test score.

🪷🌷🌸🌺🦩Chapter 6 🪷🌷🌸🌺🦩
Validity

Validation – The process of gathering and evaluating evidence about validity.

Local Validation Studies – Absolutely necessary when the user plans to alter in some way the format, instructions, language, or content of the test.

Trinitarian Model of Validity

Content Validity – A measure of validity based on an evaluation of the subjects, topics, or content covered by items in the test.

Criterion-Related Validity – A measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures.

Construct Validity – A measure of validity that is arrived at by executing a comprehensive analysis of:
a. How scores on the test relate to other test scores and measures, and
b. How scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure.

Ecological Validity

A judgment regarding how well a test measures what it purports to measure at the time and place that the variable being measured (typically a behavior, cognition, or emotion) is actually emitted.

Face Validity

Relates more to what a test appears to measure to the person being tested than to what the test actually measures.

Stated another way, if a test definitely appears to measure what it purports to measure "on the face of it," then it could be said to be high in face validity.

Content Validity

A judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample.

Criterion-Related Validity

A judgment of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest, the measure of interest being the criterion.

Concurrent Validity – An index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently).

Predictive Validity – An index of the degree to which a test score predicts some criterion measure.

Base Rate – The extent to which a particular trait, behavior, characteristic, or attribute exists in the population (expressed as a proportion).

Hit Rate – The proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute.

Miss Rate – The proportion of people the test fails to identify as having, or not having, a particular characteristic or attribute.

False Positive – A miss wherein the test predicted that the testtaker did possess the particular characteristic or attribute being measured when in fact the testtaker did not.

False Negative – A miss wherein the test predicted that the testtaker did not possess the particular characteristic or attribute being measured when the testtaker actually did.

Validity Coefficient – A correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure.

Incremental Validity – The degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use.

Construct Validity

A judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct.

Construct – An informed, scientific idea developed or hypothesized to describe or explain behavior.

Evidence of Homogeneity – How uniform a test is in measuring a single concept.

Evidence of Changes with Age – If a test score purports to be a measure of a construct that could be expected to change over time, then the test score, too, should show the same progressive changes with age to be considered a valid measure of the construct.

Evidence of Pretest–Posttest Changes – Evidence that test scores change as a result of some experience between a pretest and a posttest can be evidence of construct validity.

Evidence From Distinct Groups – Also called the method of contrasted groups. One way of providing evidence for the validity of a test is to demonstrate that scores on the test vary in a predictable way as a function of membership in some group.

Convergent Evidence – Evidence for the construct validity of a particular test may converge from a number of sources, such as other tests or measures designed to assess the same (or a similar) construct.

Discriminant Evidence – A validity coefficient showing little (a statistically insignificant) relationship between test scores and/or other variables with which scores on the test being construct-validated should not theoretically be correlated provides discriminant evidence of construct validity (also known as discriminant validity).

Factor Analysis – Shorthand term for a class of mathematical procedures designed to identify factors or variables that are typically attributes, characteristics, or dimensions on which people may differ.

Exploratory Factor Analysis – Typically entails "estimating", or extracting, factors; deciding how many factors to retain; and rotating factors to an interpretable orientation.

Confirmatory Factor Analysis – Researchers test the degree to which a hypothetical model (which includes factors) fits the actual data.

Test Bias

Halo Effect – Describes the fact that, for some raters, some ratees can do no wrong. The rater fails to discriminate.

🪷🌷🌸🌺🦩Chapter 7 🪷🌷🌸🌺🦩
Utility

Utility – We may define utility in the context of testing and assessment as the usefulness or practical value of testing to improve efficiency.
Bias – A factor inherent in a test that Psychometric soundness – reliability and systematecally prevents accurate, impartial validity of a test. measurement. Implies systematic variation. Cost – in the context of test utility refers to Rating Error disadvantages, losses, or expenses in both economic and noneconomic terms. Rating Error - is a judgemenr resulting from the intentional or unintentional misuse of a Benefits – refers to profits, gains, or advantages. rating scale. As we did in discussing costs associated with testing (and not testing), we can view benefits in Leniency Error – (also know as generosity both economic and noneconomic terms. error) arises from the tendency on the part of the rater to be lenient in scoring, marking, Utility Analysis and/or grading. Utility Analysis – A family of techniques that Central Tendency Error entail a cost-benefit analysis designed to yield information relevant to a decision about the Central Tendency Error – The rataer for usefulness and/or practical value of a tool of whatever reason, exhibits a general and assessment. Utility analysis is an umbrella term systematic reluctance to giving rating at either covering various possible methods, each the positive or the negative extreme. requiring various kinds of data to be inputted and yielding various kinds of output. Restriction-of-range rating errors (central tendency, leniency, severity errors) is to use The Cut Score in use rankings, a procedure that requires the rater to measure individuals against one another instead Cutoff Score – we have previously defined a cut of against an absolute scale. score as a (usually numerical) reference point derived as a result of a judgement and used to Leniency Error – (also know as generosity divide a set of data into two or more error) arises from the tendency on the part of classifications, with some action to be taken or the rater to be lenient in scoring, marking, some inference to be made on the basis of theses and/or grading. 
classsifications. ˚˖𓍢ִ◌໋🌷͙֒✧˚.🎀◌༘⋆ TRANSES BY: TAROG, DANICA C. ˚˖𓍢ִ◌໋🌷͙֒✧.̊🎀◌༘⋆ NATIONAL UNIVERSITY – FAIRVIEW 2ND TERM OF 2024-2025 PSYCHOLOGICAL ASSESSMENT – MIDTERMS Relative Cut Score/Norm-referenced Cut Score – This type of cut score is set with reference to the performance of a group (or some target segment of a group). Fixed Cut Score – which we may define as a reference point—in a distribution of test scores used to divide a set of data into two or more classifications—that is typically set with reference to a judgement concerning a minimum level of profeciency required to be included in a particular classification. Multiple Cut Scores – Two or more cut scores with reference to one predictor for the purpose of categorizing test takers. Multiple Hurdle – may be thought of as one collective element of a multistage decision- making process in which the achievement of a particular cut score on one test is necessary in order to advance to the next stage of evaluation in the selection process. Compensatory model of selection – an assumption is made that high scores on one attribute can, in fact, “balance out” or compensate for low scores on another attribute. The Angoff Method Angoff Method – Devised by William Angoff (1971), the Angoff Method for setting fixed cut scores can be applied to personnel selection tasks as well as to questions regarding the presence or absence of a particular trait, attribute, or ability. The Known Groups Method Known Groups Method – entails collection of date on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest. ˚˖𓍢ִ◌໋🌷͙֒✧˚.🎀◌༘⋆ TRANSES BY: TAROG, DANICA C. ˚˖𓍢ִ◌໋🌷͙֒✧.̊🎀◌༘⋆
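Worked sketch for the three split-half steps under Split-Half Reliability Estimates. This is only an illustration, not a prescribed procedure: the odd-even split is one assumed way of forming "equivalent halves," and the function names and the small 0/1 score matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(r_half, n=2):
    """Spearman-Brown adjustment; n=2 projects a half-test r to full length."""
    return n * r_half / (1 + (n - 1) * r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per testtaker, one column per item."""
    # Step 1: divide the test into halves (assumed here: odd vs. even items).
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    # Step 2: Pearson r between scores on the two halves.
    r_half = pearson_r(odd_half, even_half)
    # Step 3: adjust the half-test reliability with Spearman-Brown.
    return r_half, spearman_brown(r_half)


# Hypothetical data: 5 testtakers, 4 items scored 1 (right) or 0 (wrong).
demo_scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
r_half, r_full = split_half_reliability(demo_scores)
print(round(r_half, 3), round(r_full, 3))
```

The adjusted value is higher than the half-test correlation because a longer test, all else equal, is expected to be more reliable.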
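For the KR-20 entry, a minimal sketch of how the statistic could be computed for dichotomous (0/1) items. The notes do not give the formula; the standard form is assumed here, KR-20 = (k / (k - 1)) × (1 − Σpq / variance of total scores), with the variance taken over the whole group (divided by N, not N − 1). The function name and data are hypothetical.

```python
def kr20(item_scores):
    """KR-20 for rows of 0/1 item scores (one row per testtaker)."""
    n_people = len(item_scores)
    k = len(item_scores[0])  # number of items

    # Variance of total test scores (assumption: population variance, / N).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of p * q over items, where p = proportion passing the item, q = 1 - p.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n_people
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 5 testtakers, 4 dichotomous items.
demo_scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
print(round(kr20(demo_scores), 3))
```

Some texts and software divide by N − 1 when computing the total-score variance, which gives a slightly different value for small groups.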
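The base rate, hit rate, miss rate, false positive, and false negative entries can be tied together with a small tally. This is a sketch under one stated assumption: a "hit" is counted as any correct classification (trait present or absent), so hit rate and miss rate add up to 1. The function name, dictionary keys, and data are hypothetical.

```python
def classification_rates(predicted, actual):
    """Compare test predictions with true status; both are lists of booleans."""
    n = len(actual)
    pairs = list(zip(predicted, actual))

    # False positive: test said "has the trait" but the person does not.
    false_pos = sum(1 for p, a in pairs if p and not a)
    # False negative: test said "does not have it" but the person does.
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Assumption: a hit is any correct classification.
    hits = sum(1 for p, a in pairs if bool(p) == bool(a))

    return {
        "base_rate": sum(1 for a in actual if a) / n,
        "hit_rate": hits / n,
        "miss_rate": (false_pos + false_neg) / n,
        "false_positive_proportion": false_pos / n,
        "false_negative_proportion": false_neg / n,
    }


# Hypothetical data for 10 testtakers.
predicted = [True, True, False, False, True, False, False, False, True, False]
actual = [True, False, False, False, True, True, False, False, True, False]
rates = classification_rates(predicted, actual)
print(rates)
```

Here 4 of 10 people actually have the trait (base rate 0.4), the test classifies 8 correctly, and the two misses are one false positive and one false negative.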
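To contrast the multiple hurdle and compensatory model entries, here is a small sketch. The stage names, cut scores, and weights are all hypothetical, and the weighted sum is only one assumed way of letting a high score "balance out" a low one.

```python
def multiple_hurdle(scores, hurdles):
    """hurdles: ordered list of (stage_name, cut_score).

    The applicant must reach each cut score to advance to the next stage.
    Returns (passed, name_of_failed_stage_or_None).
    """
    for stage, cut in hurdles:
        if scores[stage] < cut:
            return False, stage
    return True, None


def compensatory(scores, weights, overall_cut):
    """Weighted composite: a high score can offset a low score elsewhere.

    Returns (selected, composite_score).
    """
    composite = sum(weights[name] * scores[name] for name in weights)
    return composite >= overall_cut, composite


# Hypothetical applicant: strong written test, weak interview.
applicant = {"written": 85, "interview": 55}

hurdle_result = multiple_hurdle(applicant, [("written", 70), ("interview", 60)])
comp_result = compensatory(applicant, {"written": 0.5, "interview": 0.5}, 65)

print(hurdle_result)  # (False, 'interview')
print(comp_result)    # (True, 70.0)
```

The same applicant is screened out at the interview stage under the multiple hurdle approach but is selected under the compensatory model, because the written score makes up for the interview score.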