Psychology 240 Lecture--Good Measurement PDF

Good Measurement

Operational Definitions & Measurement
- Your measurement technique will be an important part of your operational definition.
- The more precise your conceptual and operational definitions, the better your measurement scale will be.

Common Types of Measures
- Self-report measures: people answer direct questions about themselves
- Observational measures: recorded observations of behavior
- Physiological measures: biological processes measured with some type of equipment
- Many variables can be measured by all three types.
- The best studies use all three types.

Measurement Techniques
- You can make your own measure, or use one of the hundreds that other researchers have used.
- If you make your own, you’ll want to test it against the established ones to make sure it works.
- All measures, even physiological ones, should be validated in this way.

Levels of Measurement
- There are two broad types of variables.
- Categorical (or nominal) variables: variables whose values are categories
  - Things like gender/sex, race, religion, geographical region, species, etc.
- Quantitative variables: variables with meaningful numbers
  - Height, weight, age, dosage, time, IQ, amount, etc.

Quantitative Variables
- There are three kinds of quantitative scales.
- Ordinal: things with a rank order, but with no fixed distance between the ranks
  - 1st, 2nd, 3rd, etc.
- Interval: a scale with equal distances between the numbers, and no true zero
  - IQ scores, temperature in °C or °F, etc.

Quantitative Variables
- Ratio: an interval scale with a true zero
  - Income, brain activity, GPA, etc.
  - It is possible to have no GPA, no income, or no brain activity.
- Ratio scales are the only ones that allow us to say things are “twice as” or “three times as” or “one fourth of”, etc.

Our Favorite Scale
- The most common scale in psychology is the Likert scale.
- The “rate something on a scale of 1 to x” one.
- It is usually treated as an interval scale, since the scale rarely includes a zero.
- Common number choices are 1 to 5, 1 to 7, and 1 to 9.
- Fewer than 5 or more than 9 points tends to give poor results.
- We prefer odd numbers so that there is a true neutral point.

Likert Scale
- When making a scale, each question must be written to be answerable by the same scale.
- For example, if you choose “1 is Always, 3 is Sometimes, 5 is Never”, then all your questions must be written as frequency questions.
- If you want to use a different scale, you have to start over with a new questionnaire.

Reliability
- As we discussed, reliability is a component of construct validity.
- Reliability: a measure gives consistent results each time you use it

Types of Reliability
- Test-retest reliability: a test gives the same results when given a second time to the same people
- Interrater reliability: two independent observers agree on their reports
- Internal reliability: all the questions or items on a scale are measuring the same conceptual thing

Testing Reliability
- You can measure reliability a number of different ways.
- You can put the results on a scatterplot.
- You then draw a line through what looks like the middle.
- The correlation coefficient tells you how close to the line the points are on average.

Correlation Coefficient
- We designate the correlation coefficient with r.
- This r is always between -1 and 1; the closer to either end, the stronger the relationship.
- Test-retest reliability can easily be measured with r.

Testing Reliability
- Interrater reliability can be measured with r, but only when dealing with quantitative variables.
- Qualitative variables: variables that try to turn subjective judgments into numbers
- For qualitative variables, we measure interrater reliability with a statistic called kappa.
- Kappa should be at least 0.70.

Testing Reliability
- Internal reliability usually refers to self-report scales.
- You want to be sure that each question relates to the concept you’re studying.
- We use a statistic called Cronbach’s alpha to check internal reliability.
- It is based on how strongly all the items correlate with one another.
- Like kappa, Cronbach’s alpha should be at least 0.70.

Measurement Validity
- Like all validity, this has to do with applicability to the real world.
- In this case, does your scale actually measure what it’s supposed to measure?
- Unlike reliability, there are no easy statistics for measurement validity.
- Nevertheless, we use data to assess it.

Measurement Validity
- Face validity: Does it sound plausible on its face?
- Content validity: Does it capture the entirety of the concept?

Measurement Validity
- Predictive validity: Over time, does the scale correlate with observed results?
- In other words, if our scale says people with high IQs will make more money, do they in fact eventually make more money?
- Concurrent validity: Is the scale correlating with observed results at this moment?

Known-Groups Paradigm
- How can you test your scale’s predictive or concurrent validity?
- You can measure it against a group whose characteristics are already known.
- For instance, if you make a new anxiety scale, you can give it to people who have been diagnosed with anxiety disorder.
- If they score high on your scale, then that is evidence the scale is valid.

Convergent & Discriminant Validity
- Convergent validity: Does your scale give similar results to other scales measuring the same thing?
- Discriminant validity: Does your scale correlate only weakly with scales that measure other concepts?

Reliability vs. Validity
- It is possible to be reliable but not valid.
- It is not really possible to be valid but not reliable.
- Reliability is like internal validity; it deals with the scale itself as opposed to the real world.
- Ideally, you want both as high as possible.
- Sometimes, particularly with a new
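
The three reliability statistics from the Testing Reliability slides (r, kappa, and Cronbach’s alpha) can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not a method taken from the lecture; the function names and all the example scores are made up for demonstration. Here kappa is taken to be Cohen’s kappa for two raters, which is an assumption since the slides only say “kappa”.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def cohens_kappa(rater1, rater2):
    """Agreement between two raters on categorical codes, corrected for chance."""
    n = len(rater1)
    # Proportion of cases where the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's category proportions.
    categories = set(rater1) | set(rater2)
    expected = sum(
        (rater1.count(c) / n) * (rater2.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per question."""
    k = len(items)
    # Each respondent's total score across all questions.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Hypothetical data, invented for illustration only.
    time1 = [10, 12, 15, 18, 20]   # same five people, first testing
    time2 = [11, 12, 14, 19, 21]   # same five people, second testing
    print("test-retest r:", round(pearson_r(time1, time2), 2))

    coder_a = ["play", "play", "fight", "fight", "play", "fight"]
    coder_b = ["play", "play", "fight", "play", "play", "fight"]
    print("interrater kappa:", round(cohens_kappa(coder_a, coder_b), 2))

    # Three questionnaire items answered by five respondents (1 to 5 scale).
    q1 = [4, 5, 2, 3, 1]
    q2 = [4, 4, 2, 3, 2]
    q3 = [5, 5, 1, 3, 2]
    print("Cronbach's alpha:", round(cronbach_alpha([q1, q2, q3]), 2))
```

With real data, values of kappa and alpha would be compared against the 0.70 rule of thumb given above. In practice a statistics package would normally be used instead of hand-written functions.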
