Ch. 4 Psychological Measurement PDF
Karla Emeno
Summary
This document is lecture notes for a psychology course covering psychological measurement, focusing on reliability and validity of measures. It discusses different types of measurements and scales.
9/30/2024

Ch. 4 – Psychological Measurement
PSYC 2900U – RESEARCH METHODS
KARLA EMENO

Lecture Overview
  Understanding psychological measurement
  Reliability and validity
  Practical strategies for psychological measurement
  Assignment #2

Measurement
  The assignment of scores to individuals so that the scores represent some characteristic of the individuals
  Examples:
    Using a scale to “measure” one’s weight
    Using a thermometer to “measure” the internal temperature of a turkey roasting
  In Psychology, we often refer to measurement as Psychometrics
    Using a test to “measure” one’s working memory capacity

What do we measure in psychology?
  Constructs – Variables we want to measure that are seemingly not straightforward or simple to measure (e.g., personality, emotional states, attitudes, abilities)
    i.e., a concept we are trying to measure
  Must use a conceptual definition
    Describes the behavior and internal processes that make up that construct, along with how it relates to other variables

How do we measure our variables?
  Must use an operational definition – a definition of the variable in terms of precisely how it will be measured
  Types:
    Self-report measures – Participants report on their own thoughts, feelings, and actions
    Behavioral measures – Some other aspect of participants’ behaviour is observed and recorded
    Physiological measures – Involve recording any of a wide variety of physiological processes (e.g., heart rate, blood pressure, hormone levels)

Levels (or Scales) of Measurement
  A variable can be measured on one of four scales:
    Nominal – Categorical (qualitative)
    Ordinal – Rank order; discrete; the differences between ranks are NOT equal
    Interval – Numeric scale with NO true zero point; adjacent points are an equal distance apart
    Ratio – Numeric scale with TRUE zero (i.e., complete absence of variable)
  Important: Determines the type of statistics you can do and conclusions you can make

Scales of Measurement
  Nominal Scale
    Non-ordered categorical responses
    No specific continuum
    Ex: Mood, major, gender
  Ordinal Scale
    Ordered categorical responses
    On a continuum (low to high)
    Not equally spaced
    Ex: Anxiety ratings, rank order
  Interval Scale
    Numerical responses that are equally spaced
    Scores are not ratios
    No true zero point
    Ex: Temperature (C or F), IQ
  Ratio Scale
    Numeric responses that are ratios of each other
    Allows for ratio comparisons
    True zero point
    Ex: Reaction time, accuracy, height, weight

Scales of Measurement Flow Chart
  Does the scale involve categories or numbers?
    Category: Can the categories be ordered?
      No: Nominal
      Yes: Ordinal
    Numbers: Does the scale have a true zero point (absence of variable)?
      No: Interval
      Yes: Ratio

Reliability vs. Validity
  Reliability – Ability to obtain consistent scores (precision)
  Validity – Ability of a test to measure what it’s supposed to (accuracy)
  [Figure panels: Neither Valid nor Reliable; Valid (but not reliable); Valid AND Reliable; Reliable (but not valid)]
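
The scales-of-measurement flow chart can be sketched as a small function. This is only an illustration of the two-question decision path; the function name `classify_scale` and its parameter names are hypothetical and do not come from the lecture.

```python
def classify_scale(uses_numbers: bool,
                   categories_ordered: bool = False,
                   has_true_zero: bool = False) -> str:
    """Walk the flow chart: categories vs. numbers, then one follow-up question.

    All names here are illustrative assumptions, not part of the slides.
    """
    if not uses_numbers:
        # Category branch: can the categories be ordered?
        return "ordinal" if categories_ordered else "nominal"
    # Numbers branch: is there a true zero point (absence of the variable)?
    return "ratio" if has_true_zero else "interval"


# Examples taken from the scales-of-measurement slides.
major_scale = classify_scale(uses_numbers=False, categories_ordered=False)
rank_order_scale = classify_scale(uses_numbers=False, categories_ordered=True)
temperature_c_scale = classify_scale(uses_numbers=True, has_true_zero=False)
reaction_time_scale = classify_scale(uses_numbers=True, has_true_zero=True)
```

The order of the checks mirrors the flow chart: the true-zero question is only asked once the scale is known to be numeric.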
NOTE (scales of measurement): Likert scales with 4 or fewer response options tend to be used in analyses as ordinal variables. Likert scales with 5 or more response options tend to be used in analyses as interval variables.

Test-Retest Reliability
  A measure’s consistency over time.
  Assessing this requires using the measure on a group of people at one time, using it again on the same group of people at a later time, and then looking at the correlation between the two sets of scores.
  In general, a test-retest correlation of +.80 or greater is considered to indicate good reliability.
  [Figure: Test-retest correlation = +.95]

Internal Consistency
  The consistency of people’s responses across the items on a multiple-item measure.
  Split-Half Correlation – Split the items into two sets. Then a score is computed for each set of items and the relationship between the two sets of scores is examined.
  In general, a split-half correlation of +.80 or greater is considered to indicate good internal consistency.
  [Figure: Split-half correlation = +.88]

Inter-Rater Reliability
  Many behavioural measures involve significant judgment on the part of an observer or rater.
  Inter-Rater Reliability – The extent to which different observers (i.e., raters) are consistent in their judgments.
  Inter-rater reliability is often assessed using Cronbach’s alpha when the judgments are quantitative or Cohen’s Kappa when the judgments are categorical.

Face Validity
  How accurate a measure looks on the surface
  It can be assessed quantitatively, but it is usually assessed informally.
  Face validity is weak evidence that a measure is measuring what it is supposed to because:
    People’s intuitions about human behaviour are often wrong
    Many established scales in psychology work quite well despite lacking face validity

Criterion Validity
  The extent to which scores on a measure correlate with other variables (known as criteria) that one would expect them to be correlated with
    Ex: Test anxiety being negatively correlated with exam performance and course grades
  Concurrent validity – Criterion is measured at the same time as the construct
  Predictive validity – Criterion is measured at some point in the future (after the construct has been measured)
  Convergent validity – When new measures are correlated with existing established measures of the same construct.

Content and Discriminant Validity
  Content Validity:
    The extent to which your test accurately/actually measures the behavior you are trying to measure (conceptual definition)
    Typically assessed informally by carefully comparing the measurement method to the conceptual definition
  Discriminant Validity:
    The extent to which scores on a measure DO NOT correlate with other UNrelated variables
    Low correlations suggest that the measure is reflecting a conceptually distinct construct

Practical Strategies for Psychological Measurement
  1. Conceptually define the Construct
  2. Decide on operational definition
     Use an existing measure OR Create your own measure
  3. Implement the measure
     Caution: Look out for socially desirable responding and demand characteristics
  4. Evaluate the measure

Conceptually Defining the Construct
  You want to have a clear and complete conceptual definition of a construct. This is a prerequisite for good measurement.
  This allows you to make sound decisions about exactly how to measure the construct.
  Read the research literature on a construct and pay close attention to how others have defined it.

Operationally Defining the Construct
  Next, you must operationally define the construct.
  Most variables can be operationally defined in many different ways, but you must define them in a way that can be directly observed and measured.
  Operationally defining your variables may involve:
    Using an existing measure (usually the best option) OR
    Creating your own measure

Using an Existing Measure
  Some advantages of using an existing measure:
  1. You save the time and trouble of creating your own
  2. There is already some evidence that the measure is valid (if it has been used successfully)
  3. Your results can more easily be compared, and combined, with previous results
  You may have several existing measures to choose from – you could choose the most common one, the one with the best reliability and validity, the one that best measures the aspect of a construct you are most interested in, or even the easiest one to use.

Creating Your Own Measure
  You might want to create your own measure if: no measure exists, existing ones are difficult or time-consuming, or you want to compare a new measure to existing ones.
  Most new measures in psychology are really just variations of existing measures.
  Keep it simple and brief when possible.
  It is almost always better for a measure to include multiple items instead of just a single item.
  Pilot test your new measure to identify any potential issues early on.

Implementing the Measure
  Implement any measure in a way that maximizes its reliability and validity – this typically involves testing everyone under similar conditions that are (ideally) quiet and free of distractions
  People can react in a variety of ways to being measured that reduce reliability and validity of the scores
    Socially desirable responding – Doing or saying things because they think it is the socially appropriate thing
    Demand characteristics – Subtle cues that reveal how the researcher expects participants to behave

Implementing the Measure
  There are several precautions you can take to minimize participant reactivity:
  1. Make the procedure as clear and brief as possible
  2. Guarantee participants’ anonymity and make it clear to them that you are doing so
  3. Do not let participants know how you expect them to respond – Ex: Do not reveal your hypothesis in the informed consent
  4. If possible, have the measure administered by a helper who is unaware of its intent or any hypotheses. At a minimum, standardize all interactions.

Evaluating the Measure
  Assess test-retest reliability if possible (typically not possible)
  Assess criterion validity:
    Compare correlations among any measures assessing the same OR different constructs to see if they match expectations
    Successful experimental manipulation also provides evidence of criterion validity
  If your measure is not reliable or valid, then consider: (1) revising your measure, (2) revising the conceptual definition, or (3) trying a new manipulation

Assignment #2: Literature Review (20%)
  Due: Submit in Canvas by 11:59 pm on Wednesday, October 23rd
  For this assignment, you will write a Literature Review that incorporates a minimum of 8 peer-reviewed psychological papers (ideally from Assignment 1, but you can select new ones if you prefer).
  This literature review will summarize past research, make it clear how your work contributes/adds/builds upon what has already been done, and mention why this is an important area of study.
  You will end Assignment 2 with a Current Study section that outlines your research question and hypothesis (or hypotheses). Your variables and the general study idea should be briefly introduced here as well.
  Instructions for Assignment 2 (including the grading rubric), as well as an example template, have been added to Canvas.

Questions about Chapter 4?
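
As a rough sketch of the "Evaluating the Measure" step, the reliability checks described in this chapter (test-retest correlation, split-half correlation, and Cohen's Kappa for categorical ratings) could be computed as below. Everything here is an assumption for illustration: the function names are hypothetical, the data are made up, and splitting items into odd and even positions is just one of several possible ways to form the two halves.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def split_half_correlation(responses):
    """Correlate each person's total on odd-position vs. even-position items.

    `responses` is a list with one list of item scores per person.
    The odd/even split is an assumption, not the lecture's prescription.
    """
    first_half = [sum(person[0::2]) for person in responses]
    second_half = [sum(person[1::2]) for person in responses]
    return pearson_r(first_half, second_half)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters' categorical judgments, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of judgments")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def meets_rule_of_thumb(r, threshold=0.80):
    """Apply the +.80 guideline from the slides to a correlation."""
    return r >= threshold


# Made-up scores for five people measured at two points in time.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 13, 14, 19, 21]
test_retest_r = pearson_r(time_1, time_2)

# Made-up categorical judgments from two observers.
observer_a = ["yes", "yes", "no", "no"]
observer_b = ["yes", "no", "no", "no"]
kappa = cohens_kappa(observer_a, observer_b)
```

The +.80 threshold in `meets_rule_of_thumb` is the guideline the slides give for test-retest and split-half correlations; the slides state no cutoff for Cohen's Kappa, so none is applied to it here.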