PSYC3204 Measurement Techniques Lecture 20/11/2024 PDF
Işık University
2024
Kardelen Yıldırım, PhD
Summary
Lecture notes on psychological testing and assessment from the PSYC3204 Measurement Techniques course held on 20/11/2024. The lecture provides an overview of various assessment tools, their applications, and their technical quality.
Full Transcript
PSYC3204 MEASUREMENT TECHNIQUES
Kardelen Yıldırım, PhD
LECTURE 08 - 20/11/2024
OVERVIEW OF TOPICS

INTRODUCTION TO PSYCHOLOGICAL TESTING AND ASSESSMENT
Psychological testing is the use of standardized tests to assess an individual's behavior, personality, and capabilities. Assessment refers to the process of using multiple methods, including tests, to gather information about an individual. Psychological testing and assessment have various applications, including clinical, educational, and organizational settings.

TEST AND ASSESSMENT COMPARISON
- Objective. Testing: obtain a numerical gauge of ability or attribute. Assessment: answer a referral question, solve a problem, or arrive at a decision.
- Process. Testing: individual or group; add up correct answers or certain types of responses. Assessment: individualized; focuses on how an individual processes.
- Role of evaluator. Testing: the tester is not key; one tester may be substituted for another. Assessment: the assessor is key; selects tests and draws conclusions.
- Skill of evaluator. Testing: technician-like skills in administering and scoring. Assessment: educated selection of tools, skill in evaluation, thoughtful organization.
- Outcome. Testing: a test score or series of test scores. Assessment: a logical problem-solving approach using many sources of data.

THE TEST: DEFINITION AND EXAMPLES
Definition of a Test
- A measuring device or procedure
- Modified by specific terms to measure related variables
Medical Test
- Measures variables related to medicine
- Examples: X-rays, blood tests, reflex testing
Psychological Test
- Measures variables related to psychology
- Examples: intelligence, personality, aptitude, interests, attitudes, values
Behavior Sample in Psychological Tests
- Involves analysis of behavior
- Can include responses to questionnaires, oral responses, task performance

TECHNICAL QUALITY
Psychometric Soundness
- Refers to the consistency and accuracy of psychological tests
- Related terms: psychometry, psychometric, psychometrist, psychometrician
Utility of Tests
- Refers to the practical value or usefulness of a test for a specific purpose
General Use of "Test"
- Used broadly to discuss principles applicable to various measurement procedures
- Includes paper-and-pencil exams and situational performance measures
Emotion and Categorical Cutoffs
- Research shows that people who just make a cutoff feel better than those who exceed it by a large margin
- Bronze medalists are happier than silver medalists due to different perspectives on their achievements

THE INTERVIEW: DEFINITION AND IMPORTANCE
Interview as a Psychological Tool
- Goes beyond mere conversation; involves more than just talking
Face-to-Face Interviews
- The interviewer notes content and delivery and observes how things are said
- Observation of verbal and nonverbal behavior

NONVERBAL BEHAVIOR
- Includes body language, movements, and facial expressions
- Extent of eye contact and willingness to cooperate
- Assessment of appearance: neat versus sloppy attire; appropriate versus inappropriate dress
Strengths of Interviews
- Provide an in-depth understanding of the interviewee
- Allow for observation of nonverbal cues
Weaknesses of Interviews
- Subject to interviewer bias
- May not be entirely objective

CASE HISTORY DATA: DEFINITION AND EXAMPLES
Definition of Case History Data
- Records, transcripts, and other accounts preserving archival information
- Includes official and informal accounts
Sources of Case History Data
- Institutions and agencies such as schools, hospitals, employers, religious institutions, and criminal justice agencies
Examples of Case History Data
- Letters and written correspondence
- Photos and family albums
- Newspaper and magazine clippings
- Home videos, movies, and audiotapes
- Work samples and artwork

USES IN CLINICAL AND EDUCATIONAL SETTINGS
Clinical Evaluation
- Sheds light on past and current adjustment
- Highlights events contributing to changes in adjustment
Neuropsychological Evaluations
- Provides information on functioning prior to trauma
School Psychology
- Offers insight into academic or behavioral standing
- Aids in future class placement decisions
Case Study Assembly
- Illustrates personality and environmental conditions
- Provides examples of successful or problematic outcomes

BEHAVIORAL OBSERVATION: DEFINITION AND USES
Definition of Behavioral Observation
- Monitoring actions visually or electronically
- Recording quantitative and/or qualitative information
Uses in Various Settings
- Inpatient facilities for diagnostic purposes
- Behavioral research laboratories for research
- Classrooms for educational assessments
- Corporate settings for selection purposes, to identify required abilities

NATURALISTIC OBSERVATION
Definition of Naturalistic Observation
- Observing behavior in natural settings
- Behavior is expected to occur in these settings
Example of Naturalistic Observation
- Study of the socializing behavior of autistic children: researchers chose natural settings over controlled environments
Strengths of Behavioral Observation
- Provides realistic insights into behavior
- Behavior is observed in its natural context
Weaknesses of Behavioral Observation
- Less control over variables
- Potential for observer bias

INSTITUTIONAL SETTINGS
Importance of Behavioral Observation
- Useful in schools, hospitals, prisons, and group homes
- Helps design targeted interventions
Application in School Settings
- Naturalistic observation can reveal hidden skills
- Example: a child with linguistic problems showing English skills on the playground
Limited Use Outside Research Facilities
- Economic constraints limit use in private practice
- More common in prisons, inpatient clinics, and similar settings
Value in Assisted Living
- Observation outside the institutional environment is crucial
- Helps assess daily living skills

ETHICAL GUIDELINES AND RESPONSIBILITIES
Common Ground in Test Procedures
- Preparation for assessment
- Administration of assessment
- Usage of scores or results
- Storage of assessment records
Obligations of Responsible Test Users
- Before the test: secure storage of test contents
- During the test: proper administration by trained personnel
- After the test: appropriate use and storage of results
Pretest Obligations
- Ensure test contents are not disclosed in advance

WUNDT'S LABORATORY
Wilhelm Max Wundt's Contributions
- Founded the first experimental psychology lab at the University of Leipzig
- Focused on human abilities like reaction time, perception, and attention span
Wundt's Approach to Assessment
- In contrast to Galton, he emphasized similarities among people
- Viewed individual differences as sources of error
- Aimed to control extraneous variables to minimize error
Standardization in Testing
- Ensures observed differences are due to testtakers, not conditions
- Manuals provide explicit instructions to standardize test conditions

BINET AND SIMON'S SCALE
Early Development of Intelligence Testing
- Alfred Binet and Victor Henri's early work in 1895
- Focus on measuring abilities like memory and social comprehension
Creation of the Binet-Simon Scale
- Published in 1905 with 30 items
- Aimed at identifying Paris schoolchildren with intellectual disabilities
Global Adoption and Diverse Applications
- Used worldwide beyond its original purpose
- Implemented in schools, hospitals, clinics, courts, reformatories, and prisons

GODDARD'S RESEARCH
Introduction of Intelligence Testing
- Alfred Binet introduced intelligence tests in France
- The U.S. Public Health Service used these tests for immigrants
Henry H. Goddard's Role
- Instrumental in adopting Binet's test in the U.S.
- Chief researcher for testing immigrants
Controversial Findings
- High percentages of immigrants found mentally deficient
- 83% of Jews, 80% of Hungarians, 79% of Italians, and 87% of Russians deemed "feebleminded"
Methodological Issues
- Use of the translated Binet test overestimated mental deficiency
- Nature vs. nurture debate

SOME ISSUES REGARDING CULTURE AND ASSESSMENT
Importance of Communication in Assessment
- A basic part of the assessment process
- Requires sensitivity to language and dialect differences
Sensitivity to Cultural Exposure
- Assessors must understand the assessee's exposure to the dominant culture
Verbal and Nonverbal Communication
- Assessment must consider cultural context
- Both verbal and nonverbal communication are important

ADVANTAGES AND DISADVANTAGES OF SELF-REPORT TESTS
- Respondents are arguably the best-qualified people to provide answers about themselves.
- However, respondents may have poor insight into themselves.
- Some respondents are unwilling to reveal anything about themselves that is very personal.
- Some respondents are unwilling to reveal anything that could put them in a negative light.

LEVELS OF TEST USE
Level A
- Tests or aids administered with manual guidance
- Requires general orientation to the institution or organization
- Examples: achievement or proficiency tests
Level B
- Requires technical knowledge of test construction and use
- Knowledge in psychological and educational fields is needed
- Examples: aptitude tests, adjustment inventories for normal populations
Level C
- Requires substantial understanding of testing and psychological fields
- Supervised experience in using these devices is necessary
- Examples: projective tests, individual mental tests

THE RIGHTS OF TESTTAKERS
Right of Informed Consent
- Testtakers should know why they are being evaluated
- Information on how test data will be used
- Details on what information will be released and to whom
Right to Be Informed of Test Findings
- Testtakers should be informed of the results
Right to Privacy and Confidentiality
- Test data should be kept confidential
Right to the Least Stigmatizing Label
- Labels used should minimize stigma

DEFINITION OF TRAITS AND STATES
Definition of Traits
- Distinguishable and relatively enduring ways in which individuals vary
- A trait has been defined as "any distinguishable, relatively enduring way in which one individual varies from another" (Guilford, 1959, p. 6).
States vs. Traits
- States are less enduring compared to traits
- Example: feeling nervous for an exam is a state; being introverted is a trait
Observation of Traits
- Based on observing a sample of behavior
- Methods include direct observation, self-reports, and tests
Range of Psychological Traits
- Thousands of trait terms in the English language

EXAMPLES OF PSYCHOLOGICAL TRAITS
Intelligence and Intellectual Abilities
- Specific intellectual abilities
- Cognitive style
Adjustment and Interests
- Attitudes and preferences
- Sexual orientation
Psychopathology and Personality
- General personality traits
- Specific personality traits

CONTEXT-DEPENDENT MANIFESTATION OF TRAITS
Traits and Behavior
- Traits are not always manifested 100% of the time
- Context or situation influences behavior
Influence of Situation
- Strength of the trait in the individual
- Nature of the situation
- For example, a violent person may be prone to behave in a subdued way with the police and much more violently in the presence of his family and friends.
Example of Contextual Influence
- John is perceived differently by his wife and by his business associates, perhaps because he wants to impress his co-workers.
Context and Trait Terms
- Behavior is labeled differently based on context

STANDARDIZATION PROCESS
Standardization of Tests
- Administering a test to a representative sample
- Establishing norms through standardized procedures
- Clearly specified administration and scoring procedures
- Test users may change, but test processes, conditions, and standards, including scoring and application, do not change
- Normative data are typically included in standardized tests; therefore, sampling is important
Understanding Sampling
- Targeting a defined group as the population
- The population has at least one common, observable characteristic
- Examples: high-school seniors or housewives with specific shopping habits
Definition and Importance of Standardization
- Replicable procedures for administering, scoring, and interpreting tests
Components of a Standardized Test
- Clearly specified procedures
- Manuals with detailed guidelines
- Training for test users
Scoring and Interpretation Guidelines
- Examples of correct, incorrect, and partially correct responses
- Guidelines for interpreting results

NORM-BASED INTERPRETATION
When a person's test score is interpreted by comparing that score with the scores of several other people, this comparison is referred to as a norm-based interpretation. The scores with which each individual is compared are referred to as norms, which provide standards for interpreting test scores. A norm-based score indicates where an individual stands in comparison with the particular normative group that defines the set of standards.
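To make norm-based interpretation concrete, here is a minimal sketch that expresses a raw score as a percentile rank within a normative sample. The normative scores and the examinee's raw score are hypothetical, chosen only for illustration.

```python
# Minimal sketch of norm-based interpretation: a raw score is given meaning
# by locating it within a normative sample (all numbers are hypothetical).

def percentile_rank(raw_score, norm_scores):
    """Percentage of the normative group scoring at or below raw_score."""
    at_or_below = sum(1 for s in norm_scores if s <= raw_score)
    return 100.0 * at_or_below / len(norm_scores)

# Hypothetical normative sample of 20 testtakers' raw scores
norms = [38, 41, 44, 45, 47, 48, 50, 50, 51, 52,
         53, 54, 55, 57, 58, 60, 62, 63, 66, 70]

# A raw score of 55 falls at the 65th percentile of this norm group
print(percentile_rank(55, norms))  # 65.0
```

The same raw score would receive a different percentile rank against a different normative group, which is exactly why the choice of normative sample matters.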
NORM-REFERENCED TESTING: DEFINITION AND PURPOSE
Definition and Purpose of Norm-Referenced Testing
- Evaluates individual scores by comparing them to a group
- Determines standing or ranking relative to others
Concept of Norms in Testing
- Norms refer to usual, average, or expected behavior
- Used as a reference for evaluating individual scores
Normative Sample
- The group whose performance is analyzed for reference
- Can be broad or narrow in scope
Norming Process
- Deriving norms from test performance data

TYPES OF NORMS
- Local norms
- Percentile norms
- Norms from a fixed reference group
- Age norms (classified based on age groups)
- Subgroup norms
- Grade norms (classified based on educational grades)
- National norms (standardized across a nation)
- National anchor norms (linked to a national standard)

CAUTIONS FOR INTERPRETING NORMS
- Normative information can be misinterpreted in many ways. With grade norms, for example, the examinee may be expected to perform at that level not only on that specific test but also in other areas and behaviors.
- Norms provided in a test manual or in articles may not be based on samples that adequately represent the type of population to which the examinee's scores are to be compared. Careful reading is necessary.
- Developing local norms on samples for which one has control over selection and testing may help.

TRANSFORMATION
- A transformation does not change the person's score; it simply expresses that score in different units.
- It takes into account information not contained in the raw score itself, here, the number of items on the test.
- It presents the person's score in units that are more informative or interpretable than the raw score.

Z SCORES
Definition of z Score
- Conversion of a raw score into standard deviation units
- Indicates how many standard deviations a score is from the mean
Example Calculation
- A raw score of 65 is converted to a z score using the formula z = (X - Mean) / Standard Deviation
- Example: z = (65 - 50) / 15 = 1
Contextual Meaning
- The z score provides context and meaning for raw scores
- Example: a z score of 1 indicates that about 16% scored higher
Comparison Across Tests
- Standard scores allow comparison of scores from different tests (see the code sketch at the end of this part)

CORRELATION AND INFERENCE
Importance of Inferences in Psychological Testing
- Inferences are deduced conclusions about relationships
- They link traits, abilities, or interests to behavior
Understanding Correlation
- Correlation measures the strength of relationships
- Expressed as a coefficient of correlation
Central Role in Tests and Measurement
- Essential for studying tests and measurement
- The ability to compute correlation coefficients is crucial

CORRELATION EXAMPLES

CLASSICAL TEST THEORY
Observed Score Components
- Reflects the testtaker's true score
- Includes an error component
Definition of Error
- The component unrelated to the testtaker's ability
Mathematical Representation
- X = observed score, T = true score, E = error
- Equation: X = T + E

SOURCES OF ERROR VARIANCE
Sources of error variance include test construction, administration, scoring, and/or interpretation.
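The z-score conversion from the Z SCORES material above lends itself to a short worked example in code. This is a minimal sketch: it reproduces the lecture's numbers (raw score 65, mean 50, standard deviation 15) and adds a second, hypothetical test scale to show how standard scores permit comparison across tests.

```python
# z = (X - mean) / SD: expresses a raw score in standard deviation units.

def z_score(x, mean, sd):
    return (x - mean) / sd

# Worked example from the lecture: raw score 65 on a test with mean 50, SD 15
print(z_score(65, 50, 15))     # 1.0 -> one standard deviation above the mean

# Hypothetical second test with a different scale (mean 500, SD 100):
# a raw score of 600 yields the same z score, so the two performances are comparable.
print(z_score(600, 500, 100))  # 1.0
```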
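The classical test theory equation X = T + E can also be illustrated with a small simulation. The true-score mean, error standard deviation, and sample size below are arbitrary choices, not values from the lecture; the point is only that observed-score variance decomposes into true-score variance plus error variance when the error is unrelated to ability.

```python
# Classical test theory: observed score = true score + random error (X = T + E).
# A small simulation with arbitrary numbers, just to show the decomposition.
import random

random.seed(1)
true_scores = [random.gauss(50, 10) for _ in range(1000)]   # T
errors      = [random.gauss(0, 5)  for _ in range(1000)]    # E, unrelated to ability
observed    = [t + e for t, e in zip(true_scores, errors)]  # X = T + E

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Observed-score variance is approximately true-score variance plus error variance,
# because T and E are uncorrelated in this simulation.
print(variance(observed), variance(true_scores) + variance(errors))
```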
RELIABILITY: CONSISTENCY AND PRECISION
Definition of Reliability
- Consistency of the measuring tool
- Precision in measurements
Importance of Reliability
- Ensures accurate and consistent results
- Minimizes measurement error

EXAMPLES OF RELIABLE AND UNRELIABLE TOOLS
Example: three digital scales, A, B, and C
- Scale A, consistent and accurate: repeated weighings register 1 kg every time
- Scale B, consistent but inaccurate: repeated weighings register 1.3 kg, consistently off by 0.3 kg
- Scale C, neither consistent nor accurate: the weights registered vary randomly; inconsistent and unreliable
Importance of Reliable Measurement
- Consistency in measurement tools is crucial

RELIABILITY
- Psychological tests are reliable to varying degrees
- Reliability is a necessary but not sufficient element of a good test
- Tests must also be reasonably accurate: tests must be valid

TEST-RETEST RELIABILITY ESTIMATES
- Test-retest reliability estimates the reliability of a test by correlating scores from two administrations.
- Used for tests measuring traits that are stable over time.
- Differences in scores are attributed to measurement error.
Factors Affecting Reliability
- The time interval between tests can affect reliability.
- Intervening factors such as learning, trauma, or counseling can impact results.

PARALLEL-FORMS AND ALTERNATE-FORMS RELIABILITY ESTIMATES
Parallel-Forms Reliability
- Equal means and variances of test scores
- Scores correlate equally with the true score
Alternate-Forms Reliability
- Different versions designed to be equivalent
- Equivalent in content and difficulty

SPLIT-HALF RELIABILITY ESTIMATES
Definition and Purpose
- Correlates two pairs of scores obtained from equivalent halves of a single test
- Useful when it is impractical to use two tests or to administer a test twice
Steps to Compute Split-Half Reliability
- Step 1: Divide the test into equivalent halves
- Step 2: Calculate a Pearson r between scores on the two halves
- Step 3: Adjust the reliability estimate using the Spearman-Brown formula (a code sketch follows at the end of this part)
Methods to Split a Test
- Randomly assign items to each half
- Assign odd-numbered items to one half and even-numbered items to the other
- Divide by content to ensure equivalent difficulty

ESTIMATING INTERNAL CONSISTENCY
Kuder-Richardson Formula
- Measures inter-item consistency; specifically used for dichotomous (e.g., true/false) items
Cronbach's Alpha
- Assesses the homogeneity of test items (ranges from 0 to 1)
Test Homogeneity
- A homogeneous test measures a single trait
- Leads to straightforward test-score interpretation

COEFFICIENT ALPHA
Coefficient alpha is the preferred statistic for obtaining an estimate of internal consistency reliability. Essentially, this formula yields an estimate of the mean of all possible split-half coefficients. Coefficient alpha is widely used as a measure of reliability, in part because it requires only one administration of the test.

Rules of Thumb for Coefficient Alpha
- α ≥ 0.9: excellent internal consistency
- α ≥ 0.8: good internal consistency
- α ≥ 0.7: acceptable internal consistency
- α ≥ 0.6: questionable internal consistency
- α ≥ 0.5: poor internal consistency
- α < 0.5: unacceptable internal consistency
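The three split-half steps described above can be walked through in a few lines of code. This is a sketch with made-up item responses, using the odd-even split; the Spearman-Brown correction for a half-test correlation r is 2r / (1 + r).

```python
# Split-half reliability sketch (hypothetical data): odd-even split, Pearson r
# between half scores, then the Spearman-Brown correction 2r / (1 + r).

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical responses: 6 testtakers x 6 items scored 0/1
items = [
    [1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
]

odd_half  = [sum(row[0::2]) for row in items]  # items 1, 3, 5
even_half = [sum(row[1::2]) for row in items]  # items 2, 4, 6

r_half = pearson_r(odd_half, even_half)
spearman_brown = (2 * r_half) / (1 + r_half)   # estimated reliability of the full-length test
print(r_half, spearman_brown)
```

The correction is needed because the half-test correlation underestimates the reliability of the full-length test.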
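Coefficient alpha can likewise be computed from a single administration. The sketch below uses hypothetical Likert-type responses and the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores); the data are invented for illustration only.

```python
# Cronbach's alpha sketch on hypothetical data:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical responses: 5 respondents x 4 Likert items (1-5)
data = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
]

k = len(data[0])
item_vars = [variance([row[i] for row in data]) for i in range(k)]
total_var = variance([sum(row) for row in data])

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))  # compare against the rule-of-thumb thresholds above
```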