Week 6 Reliability and Validity PDF

Revision: Data Collection Methods in Quantitative Research
• Self-report: participants’ responses to questions posed by the researcher (e.g. questionnaire).
• Observation: direct observation of people’s behaviors, characteristics and other circumstances through visual, auditory, tactile and other senses.
• Bio-physiologic measures: methods used to assess clinical variables (e.g. BP, body temperature, blood glucose). They are relatively accurate, precise and objective.
© Copyright National University of Singapore. All Rights Reserved.

Introduction
An ideal data collection procedure is one that captures a construct in a way that is relevant, credible, accurate, truthful, and sensitive. For most concepts of interest to nurse researchers, there are few data collection procedures that match this ideal. Biophysiologic methods have a higher chance of success in attaining these goals than self-report or observational methods. (Polit & Beck 2014)

Data Quality
How do you really know if you measured what you thought you did?
• Anxiety
• Pain
• Stress level
• Quality of life

Learning Outcomes
• Describe the major characteristics of measurement.
• Identify the major sources of measurement error.
• Define validity and reliability.
• Describe the dimensions of reliability and validity.
• Discuss common methods for evaluating reliability and validity.
• Interpret the meaning of reliability and validity information.

Measurement: Definition
The assignment of numbers to represent the amount of an attribute present in an object or person using specific rules.
Example:
Suppose we were studying attitudes toward distributing contraceptives in school-based clinics and asked parents to express their extent of agreement with the following statement:
Teenagers should have access to contraceptives in school clinics:
(1) Strongly agree
(2) Agree
(3) Slightly agree
(4) Slightly disagree
(5) Disagree
(6) Strongly disagree
The rule links numeric values to reality.

Measurement: Advantages
• Removes guesswork in gathering and communicating information
• Obtains more precise information
• Provides a language for communication and analysis
Illustration: "Peter is tall, John is short" versus measured heights of 180 cm and 130 cm.
(Polit & Beck 2014)

Measurement Error
Distortion in measurement related to the effects of extraneous factors.
• IV: muscle relaxation therapy, which acts on DV: anxiety level
• EV: family loss, which also acts on DV: anxiety level

Measurement Error (Cont’d)
Obtained score = True score ± Error
• Obtained score: an actual data value for a participant (e.g. participant’s score on an anxiety scale)
• True score: the score that would be obtained with an infallible measure (caused by the independent variable, IV)
• Error: caused by extraneous factors (EV) that distort measurement (e.g. family loss)
(Polit & Beck 2014)

Error of measurement (Cont’d)
Common factors:
• Situational contaminants
• Response-set biases
• Transitory personal factors

Error of measurement (Cont’d)
Situational contaminants: scores can be affected by the conditions under which they are produced.
• the friendliness of researchers
• the location of the data gathering
• environmental factors, e.g. temperature, lighting, time of day
(Polit & Beck 2014)

Error of measurement (Cont’d)
Response-set biases:
• are potential problems in self-report measures, particularly in psychological scales
• are a temporary reaction to a situational demand
Types:
• Social desirability, e.g. expected public disclosure
• Acquiescence response, e.g. time pressure
• Extreme responses, e.g. selecting extreme responses almost exclusively
(Polit & Beck, 2006; 2014)

Error of measurement (Cont’d)
Transitory personal factors: a person’s score can be influenced by such temporary personal states as fatigue, hunger, anxiety, or mood. In some cases, such factors directly affect the measurement, as when anxiety affects a pulse rate measurement.
(Polit & Beck 2014)

Error of measurement (Cont’d)
Ways to reduce error:
• Train interviewers thoroughly so that they aren’t inadvertently introducing errors.
• Collect data at a similar place and time.
• Assure participants that the questionnaires will be anonymous.
• Assure participants that they will be given enough time.
• Ensure the participants are mentally and physically ready for the assessment.
• The interviewer has to check whether the participants are in an unusual mood or state.

Data Quality
How do you really know if you measured what you thought you did?
Instrument?? Reliability & Validity
Data measurement must be valid and reliable in order to obtain a trustworthy answer!

Key Criteria for Evaluating Quantitative Measurement
Reliability: how CONSISTENTLY a data collection instrument measures the variable that it is supposed to measure. The consistency with which an instrument measures the target attribute.
Illustration: a scale that reads 150 pounds and then 120 pounds 5 minutes later is unreliable.
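The model "Obtained score = True score ± Error" and the weighing-scale illustration can be tied together in a small simulation. This is only a sketch under stated assumptions: the error is modelled as random Gaussian noise (the slides do not specify its form, and extraneous factors can also be systematic), and the function name `simulate_obtained_scores`, the true weight, and the error sizes are all hypothetical.

```python
import random
import statistics


def simulate_obtained_scores(true_score, error_sd, n_readings, seed=0):
    """Return obtained scores, each equal to the true score plus random error.

    Assumption: error is Gaussian with mean 0 and standard deviation error_sd.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_readings)]


# Hypothetical true weight in pounds, weighed five times on two scales.
true_weight = 150.0
reliable_scale = simulate_obtained_scores(true_weight, error_sd=0.5, n_readings=5)
unreliable_scale = simulate_obtained_scores(true_weight, error_sd=15.0, n_readings=5)

# The spread of repeated readings grows with the size of the error term.
print("reliable scale spread:  ", statistics.stdev(reliable_scale))
print("unreliable scale spread:", statistics.stdev(unreliable_scale))
```

The larger the error component, the less consistent the repeated readings, which is what an unreliable instrument looks like in practice.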
Key Criteria for Evaluating Quantitative Measurement
Validity: does the data collection instrument measure the variable that it is supposed to measure? The degree to which an instrument measures what it is supposed to be measuring.
Illustration: pain level is measured with a pain scale, not with body temperature, BP, or anxiety level.
(Polit & Beck 2014)

Reliability
Three Aspects of Reliability:
• Stability
• Internal Consistency
• Equivalence
(Polit & Beck 2014)

Reliability: Stability
The extent to which scores are similar on two separate administrations of an instrument.
Example: basal body temperature (BT). Readings of 36.4 °C and 36.2 °C (both normal BT) are stable; a reading of 40.4 °C is unstable.

Reliability: Stability
Assessed through the test-retest reliability procedure:
• The same instrument is given twice to the same group of people.
• The reliability is the correlation between the scores on the two tests.
• The scores on the two tests are not identical, but most differences are small.
• A reliability coefficient (r), a numeric index that quantifies an instrument’s reliability, can be computed. It ranges from 0 to 1:
  below 0.7: unsatisfactory
  0.7-0.8: acceptable
  above 0.80: desirable
• More appropriate for fairly enduring characteristics (e.g. self-esteem, personality, IQ test)
(Polit & Beck 2014)

SPSS demonstration: test-retest reliability (ICC: Intraclass Correlation Coefficient)
ICC: the most popular statistical procedure used.

SPSS demonstration: test-retest reliability (ICC)
[Screenshot: data columns Time_1 and Time_2]

SPSS demonstration: test-retest reliability (ICC)
Test-retest reliability shows that its stability is satisfactory (ICC = 0.975).
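Outside SPSS, the test-retest procedure above can be sketched by hand. The scores below are made up for illustration and are not the data from the demonstration. SPSS offers several ICC models and the slides do not say which one was used, so this sketch assumes the two-way, consistency, single-measure form; the function names `pearson_r` and `icc_consistency` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two administrations of the same instrument."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def icc_consistency(x, y):
    """Two-way, consistency, single-measure ICC for two administrations.

    Assumption: this may not be the ICC model chosen in the SPSS demonstration.
    """
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    col_means = (sum(x) / n, sum(y) / n)
    row_means = [(a + b) / k for a, b in zip(x, y)]
    # Between-participant sum of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    # Residual sum of squares after removing participant and occasion effects.
    ss_error = sum(
        (score - row_means[i] - col_means[j] + grand) ** 2
        for i, pair in enumerate(zip(x, y))
        for j, score in enumerate(pair)
    )
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical scores for six participants at two time points.
time_1 = [10, 12, 15, 18, 20, 25]
time_2 = [11, 12, 14, 19, 21, 24]

print("Pearson r:", round(pearson_r(time_1, time_2), 3))
print("ICC (consistency):", round(icc_consistency(time_1, time_2), 3))
```

Either coefficient can then be read against the thresholds listed for the reliability coefficient.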
Reliability: Internal consistency (homogeneity)
• The extent to which all the subparts of the instrument measure the same trait.
• Appropriate for most multi-item instruments.
• Evaluated by administering the instrument on one occasion.
• The most widely used reliability approach.
(Polit & Beck 2014)

Reliability: Internal consistency
Evaluated by Cronbach’s alpha (coefficient alpha). Alpha indicates how well a group of items together measure the trait of interest. If all items on a test measure the same underlying dimension, then the items will be highly correlated with all other items.
• Range of score = 0.00-1.00
• Acceptable level = 0.7-0.9
• Cronbach’s alpha too low: items are measuring different traits.
• Cronbach’s alpha too high: redundancy of items.

Reliability: Equivalence
Concerns the degree to which two or more independent observers or coders agree about the scoring on an instrument. Assessed by comparing observations or ratings of two or more observers. A high level of agreement between the raters indicates good equivalence of the instrument.
(Polit & Beck 2014)

Reliability: Equivalence
Assessed through the inter-rater (interobserver) reliability procedure: two or more trained observers/coders watch an event simultaneously and independently record data according to the instrument’s instructions. An index of agreement is calculated.
• Cohen’s Kappa is used to measure inter-rater reliability for categorical outcomes (0.6 or above).
• Intraclass Correlation Coefficient (ICC) is used to measure inter-rater reliability for continuous measures (0.7 or above).
(Polit & Beck 2014)

SPSS demonstration: Inter-rater reliability
[Screenshot: data columns Rater_1 and Rater_2]
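Returning to internal consistency: Cronbach’s alpha, described in the slides above, can be computed from one administration of a multi-item instrument. The sketch below uses the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents answering a four-item scale (1 to 5).
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

alpha = cronbach_alpha(responses)
print("Cronbach's alpha:", round(alpha, 3))
```

When the items rise and fall together across respondents, the variance of the total score is large relative to the summed item variances, and alpha moves toward 1.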
SPSS demonstration (Cont’d)
Menu path: Analyze > Descriptive Statistics > Crosstabs (variables: Rater_1, Rater_2)

SPSS demonstration (Cont’d)
Kappa statistic is 0.800 (p
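For comparison with the Crosstabs output, Cohen’s kappa can also be computed by hand as (observed agreement − chance agreement) / (1 − chance agreement). This is a minimal sketch: the function name `cohen_kappa` and the ratings are hypothetical and are not the data from the SPSS demonstration.

```python
from collections import Counter


def cohen_kappa(rater_1, rater_2):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_1)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Agreement expected by chance, from each rater's category proportions.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten cases by two raters.
rater_1 = ["Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N", "N"]
rater_2 = ["Y", "Y", "Y", "Y", "N", "N", "N", "N", "N", "Y"]

kappa = cohen_kappa(rater_1, rater_2)
print("Cohen's kappa:", round(kappa, 3))
```

Because kappa subtracts chance agreement, it is lower than the raw percentage of agreement whenever some agreement would be expected by chance alone.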
