Psychology 240 Lecture--Good Measurement PDF
Summary
This psychology lecture covers measurement in research: self-report, observational, and physiological measures; levels of measurement; reliability; and measurement validity, with an emphasis on precise operational definitions.
Full Transcript
Good Measurement

Operational Definitions & Measurement
Your measurement technique will be an important part of your operational definition. The more precise your conceptual and operational definitions, the better your measurement scale will be.

Common Types of Measures
Self-report measures: people answer direct questions about themselves.
Observational measures: recorded observations of behavior.
Physiological measures: biological processes measured with some type of equipment.
Many variables can be measured by all three types. The best studies use all three types.

Measurement Techniques
You can make your own measure, or use one of the hundreds that other researchers have used. If you make your own, you’ll want to test it against the established ones to make sure it works. All measures, even physiological ones, should be validated in this way.

Levels of Measurement
There are two broad levels of measurement.
Categorical (or nominal) variables: variables whose values are categories. Examples: gender/sex, race, religion, geographical region, species.
Quantitative variables: variables with meaningful numbers. Examples: height, weight, age, dosage, time, IQ, amount.

Quantitative Variables
There are three kinds of quantitative scales.
Ordinal: values have a rank order, but no fixed distance between the ranks. Examples: 1st, 2nd, 3rd.
Interval: a scale with equal distances between the numbers, and no true zero. Examples: IQ scores, temperature (in Celsius or Fahrenheit).
Ratio: an interval scale with a true zero. Examples: income, brain activity, GPA. It is possible to have no GPA, no income, or no brain activity.
Ratio scales are the only ones that allow us to say things are “twice as”, “three times as”, or “one fourth of” something else.

Our Favorite Scale
The most common scale in psychology is the Likert scale: the “rate something on a scale of 1 to x” kind. It is an interval scale, since we rarely use zero. Common number choices are 1 to 5, 1 to 7, and 1 to 9. Fewer than 5 or more than 9 points gives bad results.
We prefer odd numbers so that there is a true neutral.

Likert Scale
When making a scale, each question must be written to be answerable by the same scale. For example, if you choose “1 is Always, 3 is Sometimes, 5 is Never”, then all your questions must be written as frequency questions (how often something happens). If you want to use a different scale, you have to start over with a new questionnaire.

Reliability
As we discussed, reliability is a component of construct validity.
Reliability: a measure gives the same result every time you give it.

Types of Reliability
Test-retest reliability: a test gives the same results when given a second time to the same people.
Interrater reliability: two independent observers agree on their reports.
Internal reliability: all the questions or items on a scale are measuring the same conceptual thing.

Testing Reliability
You can measure reliability a number of different ways. You can put the results on a scatterplot. You then draw a line through what looks like the middle of the points. The correlation coefficient tells you how close to the line the points are on average.

Correlation Coefficient
We designate the correlation coefficient with r. This r is always between -1 and 1; the closer to either end, the stronger the relationship, and an r near 0 means little or no linear relationship. Test-retest reliability can easily be measured with r.

Testing Reliability
Interrater reliability can be measured with r, but only when dealing with quantitative variables.
Qualitative variables: variables that try to turn subjective judgments into numbers.
For qualitative variables, we measure interrater reliability with a statistic called kappa. Kappa should be at least 0.70.
Internal reliability usually refers to self-report scales. You want to be sure that each question relates to the concept you’re studying. We use a statistic called Cronbach’s alpha to check internal reliability. It reflects how strongly each item correlates with every other item. Like kappa, Cronbach’s alpha should be at least 0.70.
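
The three reliability statistics named above can be sketched in a few lines of Python. This is only an illustration, not the lecture's own procedure: the lecture says just "kappa", and the sketch assumes Cohen's kappa for two raters; alpha is computed with the usual variance-based formula; and every function name and data value below is made up for the example.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Correlation coefficient r between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters' category codes, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: both raters happen to pick the same category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


def cronbach_alpha(items):
    """Internal reliability; `items` holds one list of scores per question."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each person's total
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical data: five people tested twice (test-retest reliability).
first_test = [10, 12, 15, 18, 20]
second_test = [11, 12, 14, 19, 21]
print(round(pearson_r(first_test, second_test), 2))  # 0.98

# Hypothetical data: two observers coding the same ten behaviors.
observer_1 = ["aggressive", "calm", "calm", "aggressive", "calm",
              "calm", "aggressive", "calm", "aggressive", "calm"]
observer_2 = ["aggressive", "calm", "calm", "aggressive", "calm",
              "aggressive", "aggressive", "calm", "calm", "calm"]
print(round(cohens_kappa(observer_1, observer_2), 2))  # 0.58

# Hypothetical data: three Likert items answered by five people.
likert_items = [[4, 5, 2, 3, 1], [5, 4, 2, 3, 2], [4, 4, 1, 3, 2]]
print(round(cronbach_alpha(likert_items), 2))  # 0.95
```

In this made-up data the two observers agree on 8 of 10 behaviors, yet kappa comes out near 0.58, below the 0.70 guideline, because a good share of that agreement would be expected by chance alone.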
Measurement Validity
Like all validity, this has to do with applicability to the real world. In this case, does your scale actually measure what it’s supposed to measure? Unlike reliability, there are no easy statistics for measurement validity. Nevertheless, we use data to assess it.
Face validity: Does it sound plausible on its face?
Construct validity: Does it capture the entirety of the concept?
Predictive validity: Over time, does the scale correlate with observed results? In other words, if our scale says people with high IQs will make more money, do they in fact eventually make more money?
Concurrent validity: Does the scale correlate with observed results at this moment?

Known-Groups Paradigm
How can you test your scale’s predictive or concurrent validity? You can measure it against a group whose characteristics are already known. For instance, if you make a new anxiety scale, you can give it to people who have been diagnosed with an anxiety disorder. If they score high on your scale, then

Convergent & Discriminant Validity
Convergent validity: Does your scale give similar results to other scales measuring the same thing?
Discriminant validity: Does your scale give completely different results from scales that measure other concepts?

Reliability vs. Validity
It is possible to be reliable but not valid. It is not really possible to be valid but not reliable. Reliability is like internal validity; it deals with the scale itself as opposed to the real world. Ideally, you want both as high as possible. Sometimes, particularly with a new