Validity and reliability of measurement

Having defined any variables of interest, another part of Step 3 of our quantitative [research process](https://canvas.uts.edu.au/courses/35018/pages/research-process?module_item_id=2016473) is to determine how to measure those variables. Two general criteria employed for evaluating any measurement procedure are *validity* and *reliability*. The attitudes toward research survey (R-ATR: Papanastasiou, 2014) that you completed during your [get to know your classmates](https://canvas.uts.edu.au/courses/35018/discussion_topics/555414?module_item_id=2016455) activity is considered to be a valid and reliable measurement (Loayza-Rivas & Icaza, 2023; Papanastasiou, 2014). So, what do we mean by validity and reliability?

Validity

The validity of a measurement procedure relates to the degree to which it measures the variable it claims to measure. For example, to measure someone's height we would use a ruler or a tape measure (valid measures) rather than a weighing scale (invalid measure). This might seem obvious when dealing with directly observable variables, but whether a measure is measuring what it purports to measure becomes muddier when dealing with variables that we observe indirectly. For example, we might consider an elevated heart rate as a sign that a person is attracted to another person, but without some other measurements we could not be sure that the elevated heart rate was related to attraction rather than to, say, anxiety about whether they were doing the "right" thing while participating in a research study. Similarly, when using IQ tests to measure intelligence, if participants cannot understand the language used in the test, or the test is not appropriate to the cultural background of those being assessed, then it seems unlikely that such tests are measuring those participants' intelligence. From this, it is clear that validity is not an inherent property of a measure or test.

Types of validity

There are numerous ways to assess the validity of a measure. Some of the more common types are listed here.

- **Definition:** Face validity refers to the face value of the measure. That is, does the measure give the appearance of measuring what it claims to measure? **Example:** The following items on the Depression Anxiety Stress Scales (DASS-21: Lovibond & Lovibond, 1995) assessing stress would be considered as having a high level of face validity: "I found it hard to wind down" and "I found it difficult to relax." Face validity is a minimum requirement for any measurement process. However, it can be subjective. For example, a researcher's intended meaning of questionnaire items may be different to that of participants.
- **Definition:** Concurrent validity refers to how well a measure of a construct compares with another known criterion or "gold standard" when the two are assessed at the same time. This other criterion can be another measure. **Example:** Developing a new 12-item measure of clinical depression, then administering this new measure followed by the more established Hamilton Depression Rating Scale (HDRS: Hamilton, 1960) to patients. When scores on the two tests (which claim to measure the same construct) are consistent, and one measure is known to be valid, we infer that the other measure must also be valid.
- **Definition:** Predictive validity is the ability of a measure to predict future events. This validity is measured by the correlation between the test and some future event.
  **Example:** Asking people charged with a criminal offence to complete a pretrial risk assessment (e.g., the U.S. Federal PTRA [Lowenkamp & Whetzel, 2009]) and then checking that assessment against the charged offender's later behaviour; for example, whether the person attended court or perpetrated a new crime during the pretrial period.
- **Definition:** Content validity is based on comparing the content of the measure with the existing content which defines the construct under investigation. **Example:** One construct of intelligence is that it contains multiple components, including the ability to "reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience" (Gottfredson, 1997, p. 13). To have high content validity when using this construct as a basis for investigating intelligence means including a measure that assesses each of these seven components of intelligence.
- **Definition:** Convergent validity is demonstrated by a strong relationship between two measures of the same construct (see the sketch following this list). **Example:** Krefetz et al. (2002) found that assessments of depression among adolescents were similar across two measures: the Beck Depression Inventory-II (BDI-II: Beck et al., 1996) and the Reynolds Adolescent Depression Scale (RADS: Reynolds, 1987).
- **Definition:** Divergent validity (or discriminant validity) is demonstrated by little or no relationship between the measurements of two different constructs. **Example:** In support of social anxiety and depression being different constructs, Ranta et al. (2002) found a weak correlation when measuring scores on the Beck Depression Inventory-II (BDI-II: Beck et al., 1996) and scores on the Social Anxiety Scale for Adolescents (SAS-A: La Greca & Lopez, 1998).
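Several of the validity types above (concurrent, predictive, convergent, and divergent) are examined by looking at the strength of the relationship between two sets of scores, usually with a correlation coefficient. The following Python sketch illustrates that logic for convergent and divergent validity. The instrument names are borrowed from the examples above purely as labels, and the scores are simulated for illustration; they are not data from the cited studies.

```python
# A minimal sketch of using correlations to gauge convergent and divergent
# validity. Scores are simulated, not real questionnaire data.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n = 200  # hypothetical sample of 200 adolescents

# Simulate a latent "depression" level and a separate "social anxiety" level.
depression = rng.normal(0, 1, n)
social_anxiety = rng.normal(0, 1, n)  # generated independently of depression

# Two depression instruments (labelled BDI-II and RADS here) are built from
# the same latent depression scores plus measurement noise.
bdi_ii = depression + rng.normal(0, 0.4, n)
rads = depression + rng.normal(0, 0.4, n)

# A social anxiety instrument (labelled SAS-A) tracks a different construct.
sas_a = social_anxiety + rng.normal(0, 0.4, n)

# Convergent validity: two measures of the same construct correlate strongly.
r_conv, p_conv = pearsonr(bdi_ii, rads)
print(f"BDI-II vs RADS  (same construct):       r = {r_conv:.2f}")

# Divergent validity: measures of different constructs correlate weakly.
r_div, p_div = pearsonr(bdi_ii, sas_a)
print(f"BDI-II vs SAS-A (different constructs): r = {r_div:.2f}")
```

Running the sketch prints a correlation close to 1 for the two depression measures and a correlation close to 0 for the depression and social anxiety measures, mirroring the pattern of results described in the convergent and divergent validity examples.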