Part I: Introduction to Goniometry and Muscle Length Testing

Chapter 3: Validity and Reliability of Goniometric Measurement

*David A. Scalzitti, PT, PhD*
*D. Joyce White, PT, DSc*

Validity

For goniometry to provide meaningful information, measurements must be valid. **Validity** is "the degree to which a useful (meaningful) interpretation can be inferred from a measurement."1 Stated another way, the validity of a measurement refers to how well the measurement represents the true value of the variable of interest and how well this measurement can be used for a specific purpose. The purpose of goniometry is to measure the angle created at a joint by the adjacent bones of the body. Therefore, a valid goniometric measurement is one that represents the actual joint angle and one that can provide data for use in clinical decision-making. The joint angle obtained from a goniometric measurement is used to describe a specific joint position or, if a beginning and ending joint position are compared, a range of motion (ROM). In this section, the four main types of validity (face validity, content validity, criterion-related validity, and construct validity) are discussed as they relate to the measurement of joint motion.

Face Validity

**Face validity** indicates that the instrument generally appears to measure what it proposes to measure; that is, it is plausible to those using the test.2--4 Much of the literature on goniometric measurement does not specifically address face validity because this type of validity is not generally tested. Rather, an assumption is made that the angle created by aligning the arms of a universal goniometer with bony landmarks truly represents the angle created by the proximal and distal bones composing the joint. One infers that changes in goniometer alignment reflect changes in joint angle and represent a range of joint motion. Portney and Watkins3 report that face validity is easily established for some tests, such as the measurement of ROM, because the instrument measures the variable of interest through direct observation.

Content Validity

**Content validity** is determined by judging whether an instrument adequately measures and represents the domain of content (the substance) of the variable of interest.1--4 Both content and face validity are based on opinion. However, face validity is the most basic and elementary form of validity, whereas content validity involves more rigorous and careful consideration by experts familiar with the content of interest. Gajdosik and Bohannon5 state, "Physical therapists judge the validity of most ROM measurements based on their anatomical knowledge and their applied skills of visual inspection, palpation of bony landmarks, and accurate alignment of the goniometer. Generally, the accurate application of knowledge and skills, combined with interpreting the results as measurement of ROM only, provide sufficient evidence to ensure content validity."

Criterion-Related Validity

**Criterion-related validity** justifies the validity of the measuring instrument by comparing measurements made with the instrument to a well-established gold standard of measurement (the criterion).1--4 If the measurements made with the instrument and criterion are taken at approximately the same time, **concurrent validity** is tested. Concurrent validity is a type of criterion-related validity and is the type of validity most frequently reported for goniometry. Criterion-related validity can be assessed using statistical methods such as correlation.

In terms of goniometry, an examiner may question the construction of a particular goniometer on a very basic level and consider whether the degree units of the goniometer accurately represent the degree units of a circle. The angles of the goniometer can be compared with known angles of a protractor, which serves as the criterion. Usually the construction of goniometers is adequate, and concurrent validity may then focus on whether the measurement of joint position with a goniometer reflects the true joint angle. In this case, a measure of joint position obtained by radiography may serve as the criterion measure to represent the true joint angle.

**Criterion-Related Validity Studies of Extremity Joints**

Some of the classic studies that have examined the concurrent validity of goniometric and radiographic measurements for the extremity joints are summarized here. As appropriate, summaries of additional studies comparing goniometry to radiographs and/or photographs are included in the Research Findings sections of Chapters 4 through 13. Furthermore, recent systematic reviews have also reported strong concurrent validity between universal goniometers and radiographs for knee joint position6 and between smartphone goniometer applications (apps) and radiographs.7

Gogia and associates8 measured the knee position of 30 healthy individuals with radiography and with a universal goniometer. Knee positions ranged from 0 to 120 degrees. High correlation (correlation coefficient \[*r*\] = 0.97) and agreement (intraclass correlation coefficient \[ICC\] = 0.98) were found between the two types of measurements. These authors concluded that the measurement of knee joint position as obtained in their study was valid to reflect the actual joint position.

Enwemeka9 studied the validity of measuring knee ROM with a universal goniometer by comparing the goniometric measurements of 10 individuals with radiographs. No significant differences were found between the two types of measurements when ROM was within 30 to 90 degrees of flexion (mean difference between the two measurements ranged from 0.5 to 3.8 degrees). However, a significant difference was found when ROM was within 0 to 15 degrees of flexion (mean difference = 4.6 degrees).
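The validity studies summarized in this section report two kinds of statistics: a correlation coefficient, which describes how closely goniometric and radiographic values rise and fall together, and a mean difference, which describes systematic offset between the two methods. The following is a minimal sketch of both calculations. The paired angles are hypothetical values invented for illustration and are not taken from any cited study; the function names are likewise assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_difference(x, y):
    """Average of the paired differences x - y (a simple measure of offset)."""
    return sum(a - b for a, b in zip(x, y)) / len(x)

# Hypothetical knee positions in degrees (illustration only)
goniometer = [2.0, 31.0, 58.0, 92.0, 118.0]
radiograph = [5.0, 33.0, 61.0, 95.0, 121.0]

r = pearson_r(goniometer, radiograph)
offset = mean_difference(goniometer, radiograph)
print(f"r = {r:.3f}, mean difference = {offset:.1f} degrees")
```

In this invented data set the goniometer reads about 3 degrees low at every position, yet the correlation is still close to 1. This is one reason studies often report agreement statistics or mean differences alongside *r*.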
Ahlback and Lindahl10 found that a joint-specific goniometer used to measure total hip flexion and extension of 14 hips closely agreed with radiographic measurements.

Kato and coworkers11 compared the accuracy of three types of goniometers aligned on the lateral and dorsal surfaces of the proximal interphalangeal joints of 16 fixated fingers with radiographs. Mean differences between the goniometers and radiographs ranged from 0.5 to 3.3 degrees.

**Criterion-Related Validity Studies of the Spine**

Various instruments used to measure ROM of the spine have also been compared with a radiographic criterion, although some authors question the use of radiographs as the gold standard given the variability of total ROM measurements derived from summed segmental motions on spinal radiographs.12

Three cross-sectional studies that contrasted cervical ROM measurements taken with gravity-dependent goniometers with those recorded on radiographs found concurrent validity to be high. Herrmann,13 in a study of 11 adults, noted a high correlation (*r* = 0.97) and agreement (ICC = 0.98) between radiographic measures and pendulum goniometer measures of head and neck flexion--extension. Ordway and colleagues14 simultaneously measured cervical flexion and extension in 20 healthy persons with a cervical ROM (CROM) device, a computerized tracking system, and radiographs. There were no significant differences between measurements taken with the CROM device and radiographic angles determined by an occipital line and a vertical line, although there were differences between the CROM device and the radiographic angles between the occiput and C7. Tousignant and coworkers15 measured cervical flexion and extension in 31 volunteers with a CROM goniometer and radiographs that included cervical and upper thoracic motion. They found a high correlation between the two measurements for flexion (*r* = 0.97) and extension (*r* = 0.98).
An additional study by Tousignant and colleagues16 reported a high correlation for concurrent validity between cervical rotational and lateral flexion movements and an optoelectronic gold standard. Studies that compared clinical ROM measurement methods for the lumbar spine with radiographic results have reported high to low values for validity. Macrae and Wright17 measured lumbar flexion in 342 individuals by using a tape measure according to the Schober and modified Schober methods and compared these results with those shown in radiographs. Their findings support the validity of these measures: correlation coefficient values between the Schober method and the radiographic evidence were 0.90 (standard error = 6.2 degrees) and between the modified Schober and the radiographs were 0.97 (standard error = 3.3 degrees). Portek and associates,18 in a study of 11 men, reported low correlations (0.42 to 0.57) for lumbar flexion and extension ROM measurements taken with a skin distraction method and with a single inclinometer as compared with radiographic evidence. Limitations of this study include the following: measurements were made sequentially rather than concurrently, and different test positions were used. Radiographs and skin distraction methods were performed with subjects standing, whereas inclinometer measurements were performed with subjects sitting for flexion and prone for extension. Burdett, Brown, and Fall,19 in a study of 27 healthy participants, found a fair correlation between measurements taken with a single inclinometer and radiographs for lumbar flexion (*r* = 0.73) and a very poor correlation for lumbar extension (*r* = 0.15). Mayer and coworkers20 compared total lumbar flexion and extension motion in 12 persons with chronic low back pain as measured with a double inclinometer technique and radiographs. No significant difference in group means was observed between the two methods. 
Saur and colleagues,21 in a study of 54 persons with chronic low back pain, found lumbar flexion ROM measurement taken with two inclinometers correlated highly with radiographs (*r* = 0.98). Extension ROM measurement correlated with radiographs to a fair degree (*r* = 0.75). Samo and associates22 used double inclinometers and radiographs to measure 30 volunteers held in a position of flexion and extension. Radiographs resulted in flexion values that were 11 to 15 degrees greater than those found with inclinometers and extension values that were 4 to 5 degrees less than those found with inclinometers.

Construct Validity

**Construct validity** is the ability of an instrument to measure an abstract concept (construct) or to be used to make an inferred interpretation.2,3 Rehabilitation professionals may use ROM measurements to make inferences about the function of a person. In Chapters 4 through 13 on measurement procedures, the results of research studies that report joint ROM observed during functional tasks are included. These findings begin to quantify the joint motion needed to avoid functional limitations. Several researchers have artificially restricted joint motion with splints or braces and examined the effect on function.23--25 These studies have demonstrated that many functional tasks can be completed with severely restricted elbow or wrist ROM, provided other adjacent joints are able to compensate.

Some studies have measured the correlation between ROM values and the ability to perform functional tasks in patient populations. A study by Hermann and Reese26 examined the relationship among impairments, functional limitations, and disability in 80 persons with cervical spine disorders.
The highest correlation (*r* = 0.82) occurred between impairment measures and functional limitation measures, with ROM contributing more to the relationship than the other two impairment measures of cervical muscle force and pain. Triffitt27 found significant correlations between the amount of shoulder ROM and the ability to perform nine functional activities in 125 persons with shoulder conditions. Wagner and colleagues28 measured passive ROM of wrist flexion, extension, radial and ulnar deviation, and the strength of the wrist extensor and flexor muscles in 18 boys with Duchenne muscular dystrophy. A highly significant negative correlation was found between difficulty performing functional hand tasks and radial deviation ROM (*r* = −0.76 to −0.86) and between difficulty performing functional hand tasks and wrist extensor strength (*r* = −0.61 to −0.83).

Other studies, however, have demonstrated weaker associations between ROM and function. For example, Waddell and colleagues29 measured lumbosacral motion with inclinometers and compared the results with the Roland-Morris Low Back Pain Disability Questionnaire (*r* = −0.47 for lumbosacral flexion and *r* = −0.33 for lumbosacral extension). A less-than-perfect correlation between ROM and function is not surprising because function is a multidimensional construct, and an impairment of one factor related to body functions and structure, such as joint motion, may be responsible for only a small component.30,31

Reliability

For a measurement to be valid, not only should the measurement represent the true variable of interest, but the same value should also be obtained when the measurement is repeated under the same conditions.
**Reliability** refers to the amount of consistency between successive measurements of the same variable on the same individual under the same conditions.1--3 A goniometric measurement is highly reliable if successive measurements of a joint angle or ROM on the same individual and under the same conditions yield the same results. A highly reliable measurement contains little measurement error. Assuming that a measurement is both highly reliable and valid, an examiner can confidently use its results to determine a true absence, presence, or change in dysfunction. For example, a highly reliable and valid goniometric measurement could be used to determine the presence of limited joint ROM, to evaluate progress toward rehabilitative goals, and to assess the effectiveness of therapeutic interventions. Consistency is necessary for a measurement to be considered valid, although one can obtain a highly consistent measurement that is devoid of meaning and therefore is still not valid.

An unreliable measurement is inconsistent, does not produce the same results when the same variable is repeatedly measured on the same individual under the same conditions, and contains a large amount of measurement error. This lack of consistency and heightened error will make validity poor as well. A measurement that has poor reliability and validity is not dependable and should not be used to make clinical decisions.

Summary of Goniometric Reliability Studies

The reliability of goniometric measurement has been the focus of many research studies. Given the variety of study designs and measurement techniques, it is difficult to compare the results of many of these studies. However, some findings noted in several studies can be summarized. An overview of such findings is presented here. More information on reliability studies that pertain to the featured joint is reviewed in Chapters 4 through 13.
Readers may also wish to refer to several review articles and book chapters on this topic.32--37

The measurement of joint position and ROM of the extremities with a universal goniometer has generally been found to have good-to-excellent reliability. Numerous reliability studies have been conducted on joints of the upper and lower extremities. Some studies have examined the reliability of measuring joints held in a fixed position, whereas others have examined the reliability of measuring passive or active ROM. Studies that measured a fixed joint position usually have reported higher reliability values than studies that measured ROM.8,13,38,39 This finding is expected because more potential sources of error are present in measuring ROM than in measuring a fixed joint position. Additional sources of error in measuring ROM include movement of the joint axis, variations in manual force applied by the examiner during passive ROM, and variations in an individual's effort during active ROM.

The reliability of goniometric ROM measurements varies somewhat depending on the joint and motion. Range of motion measurements of upper-extremity joints have been found by several researchers to be more reliable than ROM measurements of lower-extremity joints,36,37,40,41 although opposing results have also been reported.42 Differences in reliability have also been reported for different joints and for different motions of the same joint. For example, Hellebrandt, Duvall, and Moore,43 in a study of upper-extremity joints, noted that measurements of wrist flexion, medial rotation of the shoulder, and abduction of the shoulder were less reliable than measurements of other motions of the upper extremity. Low44 found ROM measurements of wrist extension to be less reliable than measurements of elbow flexion.
Greene and Wolf45 reported ROM measurements of shoulder rotation and wrist motions to be more variable than elbow motion and other shoulder motions. Reliability studies on ROM measurement of the cervical and thoracic spine in which a universal goniometer was used have generally reported lower reliability values than studies of the extremity joints.19,46--49 Many devices and techniques have been developed to try to improve the reliability of measuring spinal motions. Gajdosik and Bohannon5 suggested that the reliability of measuring certain joints and motions might be adversely affected by the complexity of the joint. Measurement of motions that are influenced by movement of adjacent joints or multi-joint muscles may be less reliable than measurement of motions of simple hinge joints. Difficulty palpating bony landmarks and passively moving heavy body parts may also play a role in reducing the reliability of measuring ROM of the lower extremity and spine.5,37,40

Many studies of joint measurement methods have found intratester reliability to be higher than intertester reliability.19,38--44,46,47,49--68 Reliability was higher when successive measurements were taken by the same examiner than when successive measurements were taken by different examiners. This is true for studies that measured joint position and ROM of the extremities and spine with universal goniometers and other devices such as joint-specific goniometers, inclinometers, tape measures, and flexible rulers. Only a few studies found intertester reliability to be higher than intratester reliability.69--72 In most of these studies, the time interval between repeated measurements by the same examiner was considerably greater than the time interval between measurements by different examiners.
Boone et al40 reported mean standard deviations of repeated measurements taken of six extremity joints by one examiner using a universal goniometer to range from 3.7 to 4.0 degrees, whereas Bovens et al42 examined nine joint motions and reported mean standard deviations of repeated measurements of one examiner from 2.5 to 8.1 degrees. The mean of the mean standard deviations reported in these studies was 3.9 degrees and 4.8 degrees, respectively. One interpretation of these findings is that a difference of at least 5 degrees (1 standard deviation) to 10 degrees (2 standard deviations) may be necessary to show improvement or worsening of a joint motion measured by the same examiner. This is somewhat consistent with a recent study of 30 joint motions in 12 adult women that reported intratester standard error of measurement (SEM) values ranging from 1 to 7 degrees (mean SEM = 3.5 degrees), and minimal detectable change (MDC) values at the 95% confidence level ranging from 4 to 21 degrees (mean MDC95 = 9.6 degrees).73

When more than one examiner took repeated goniometric measurements, the mean of the mean standard deviations increased to 4.7 degrees in the study by Boone et al40 and to 5.9 degrees in the study by Bovens et al.42 This implies a difference of at least 6 to 12 degrees (1 to 2 standard deviations) may be necessary to show true change when repeated measurements are taken by more than one examiner. These values should serve only as a very general guideline of the measurement error of goniometry of extremity joints. Readers are referred to the Research Findings sections of Chapters 4 through 13 for more joint-specific information on intratester and intertester reliability.

The reliability of goniometric measurements is affected by the measurement procedure.
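As a side note on the SEM and MDC95 values quoted earlier in this section, the two statistics are usually linked by a simple formula. The sketch below assumes the commonly used relationship MDC95 = 1.96 × √2 × SEM; the cited study may have computed its values per joint motion and then averaged them, so the correspondence with the reported figures is only approximate. The function name `mdc95` is hypothetical.

```python
from math import sqrt

def mdc95(sem):
    """Minimal detectable change at the 95% confidence level, in the units of sem.

    Assumes the commonly used formula MDC95 = 1.96 * sqrt(2) * SEM, where
    1.96 is the z value for 95% confidence and sqrt(2) accounts for error
    in both of the two measurements being compared.
    """
    return 1.96 * sqrt(2) * sem

# SEM values spanning the range reported in the text (1 to 7 degrees)
for sem in (1.0, 3.5, 7.0):
    print(f"SEM = {sem:.1f} degrees -> MDC95 = {mdc95(sem):.1f} degrees")
```

Under this assumption, the mean SEM of 3.5 degrees corresponds to an MDC95 of roughly 9.7 degrees, which is close to the reported mean MDC95 of 9.6 degrees.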
Several studies found that intertester reliability improved when all the examiners used consistent, well-defined testing positions and measurement methods.51,53,54,74 Intertester reliability was lower if examiners used a variety of positions and measurement methods. Several investigators have examined the reliability of using the mean of several goniometric measurements compared with using one measurement. Low44 recommends using the mean of several measurements made with the goniometer to increase reliability over one measurement. Early studies by Cobe75 and Hewitt76 also used the mean of several measurements. However, Boone and associates40 found no significant difference between repeated measurements made by the same examiner during one session and suggested that one measurement taken by an examiner is as reliable as the mean of repeated measurements. Rothstein, Miller, and Roettger,54 in a study on knee and elbow ROM, found that intertester reliability determined from the means of two measurements improved only slightly from the intertester reliability determined from single measurements. The authors of some texts on goniometric methods suggest the use of universal goniometers with longer arms to measure joints with large body segments such as the hip and shoulder.33,77,78 Goniometers with shorter arms are recommended to measure joints with small body segments such as the wrist and fingers. Robson,79 using a mathematical model, determined that goniometers with longer arms are more accurate in measuring an angle than goniometers with shorter arms. Goniometers with longer arms reduce the effects of errors in the placement of the goniometer axis. However, Rothstein, Miller, and Roettger54 found no difference in reliability among large plastic, large metal, and small plastic universal goniometers used to measure knee and elbow ROM. 
Riddle, Rothstein, and Lamb52 also reported no difference in reliability between large and small plastic universal goniometers used to measure shoulder ROM. Numerous studies have compared the measurement values and reliability of different types of devices used to measure joint ROM. Universal and gravity-dependent (pendulum and fluid) goniometers; joint-specific devices; tape measures; and wire tracing are some of the devices that have been compared. Studies comparing clinical measurement devices have been conducted on the shoulder,43,45,80 elbow,38,43,44,62,81,82 wrist,38,45 hand,39,65,83,84 hip,85,86 knee,54,85,87 ankle,87,88 cervical spine,46,47,70 and thoracolumbar spine.19,22,48,68,89--95 Many studies have found differences in values and reliability between measurement devices, whereas some studies have reported no differences. A recent systematic review reported that measurements of ROM of upper-extremity joints using instruments, including goniometers, were more reliable than measurements using visual estimation.36 In conclusion, on the basis of the literature and practical experience, several procedures are recommended to improve the reliability of goniometric measurements (Table 3.1). Examiners should use consistent, well-defined testing positions, stabilize the proximal body segment, and carefully palpate anatomical landmarks to align the arms of the goniometer. During successive measurements of passive ROM, examiners should strive to apply the same amount of manual force to move the limb segment. During successive measurements of active ROM, the individual should be urged to exert the same effort to perform a motion. To reduce measurement variability, it is prudent to take repeated measurements on an individual using the same type of measurement device. 
For example, an examiner should take all repeated measurements of a ROM with a universal goniometer, rather than taking the first measurement with a universal goniometer and the second measurement with an inclinometer. Most examiners should find it easier and more accurate to use a large universal goniometer when measuring joints with large body segments and a small goniometer when measuring joints with small body segments. Inexperienced examiners may wish to take several measurements and record the mean of those measurements to improve reliability, but one measurement is usually sufficient for more experienced examiners using good technique. Clinicians should also remember that successive measurements are more reliable if taken by the same examiner using the same methods than measurements obtained by different examiners. A final recommendation is to calibrate the device at regular intervals by checking the angles obtained with known standards. This recommendation is provided to ensure the measurements obtained reflect the true angle and is especially relevant for devices such as inclinometers and smartphone apps.

Statistical Methods of Evaluating Measurement Reliability

Clinical measurements may be affected by three main sources of variation: (1) true biological variation, (2) temporal variation, and (3) measurement error.96 **True biological variation** refers to variation in measurements from one individual to another, caused by factors such as age, sex, race, genetics, medical history, and condition. **Temporal variation** refers to variation in measurements made on the same individual at different times, caused by changes in factors such as a person's health status, activity level, emotional state, and circadian rhythms. **Measurement error** refers to variation in measurements made on the same individual under the same conditions at different times, caused by factors such as the examiners (testers), measuring instruments, and procedural methods.
For example, the skill level and experience of the examiners, the accuracy of the measurement instruments, and the standardization of the measurement methods all may affect the amount of measurement error. Reliability reflects the degree to which a measurement is free of measurement error; therefore, highly reliable measurements have little measurement error.

Statistics can be used to assess variation in numerical data and hence to assess measurement reliability.3,96 A brief digression into statistical methods of expressing reliability is included to assist the examiner in correctly interpreting goniometric measurements and in understanding the literature on joint measurement. This discussion starts with presenting measures of variability, including the standard deviation and the coefficient of variation. This is followed by a discussion of measures of relative reliability including the Pearson product-moment correlation coefficient and the intraclass correlation coefficient. Examples that show the calculation of these statistical tests are presented. This section finishes with a discussion of absolute measures of reliability that provide values for the amount of error associated with the measurement in the original units of the measurement. The measures discussed include the standard error of measurement and the minimal detectable change. For additional information, including the assumptions underlying the use of all of these statistical tests, the reader is referred to the cited references.

At the end of this chapter, four exercises are included for examiners to assess their consistency in obtaining goniometric measurements and performing the calculations for the measures presented. Clinicians are also encouraged to collect data from their staff and patient population to determine reliability of their own measurements. Miller33 has presented a step-by-step procedure for conducting such studies.

TABLE 3.1 Recommendations for Improving the Reliability of Goniometric Measurements

- Use consistent, well-defined testing positions.
- Stabilize the part of the body that is proximal to the joint being examined to prevent unwanted movements.
- Use consistent, well-defined, and carefully palpated anatomical landmarks to align the goniometer.
- Use the same amount of manual force to move the body part during successive measurements of passive ROM.
- Provide consistent direction, including asking that an individual exerts the same effort to move the body part during successive measurements of active ROM.
- Use the same device to take successive measurements.
- Use a goniometer that is suitable in size to the joint being measured.
- If the examiner is less experienced, record the mean of several measurements rather than a single measurement.
- Have the same examiner, rather than a different examiner, take successive measurements.
- Calibrate the measurement instrument at regular intervals.

**Measures of Variation**

***Standard Deviation***

In the biomedical literature, the statistic most frequently used to indicate variation in a sample is the standard deviation.3,96,97 The **standard deviation** is the square root of the mean of the squares of the deviations from the mean. The standard deviation is symbolized in the literature as **SD, s,** or **sd.** The sample mean is generally denoted as $\bar{x}$ and is calculated by dividing the sum of the data observations (*x*) by the number of observations in the sample (*n*). The equation for the standard deviation of the distribution of the data around a mean is:

$$\mathrm{SD} = \sqrt{\frac{\sum (x - \bar{x})^{2}}{n - 1}}$$

The standard deviation is expressed in the same units as the original data observations. For example, in goniometry this will be in degrees.
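The standard deviation calculation described above can be sketched in a few lines of code. This is an illustrative sketch that assumes the sample form of the formula, with *n* − 1 in the denominator. The three measurements are hypothetical values chosen so the arithmetic is easy to follow, and the function name `sample_sd` is an assumption.

```python
from math import sqrt

def sample_sd(observations):
    """Standard deviation of a sample, using n - 1 in the denominator."""
    n = len(observations)
    mean = sum(observations) / n
    # Sum of squared deviations of each observation from the sample mean
    sum_sq = sum((x - mean) ** 2 for x in observations)
    return sqrt(sum_sq / (n - 1))

# Hypothetical repeated ROM measurements on one individual, in degrees
measurements = [55.0, 60.0, 65.0]
print(f"SD = {sample_sd(measurements):.1f} degrees")
```

Here the mean is 60 degrees, the squared deviations sum to 50, and dividing by *n* − 1 = 2 gives 25, so the standard deviation is 5 degrees.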
If the data observations have a normal (bell-shaped) distribution, one standard deviation above and below the mean includes about 68% of all the observations, and two standard deviations above and below the mean include about 95% of the observations. A large standard deviation indicates large variability in a series of measurements.

Several standard deviations may be determined from a single measurement study.96 These standard deviations represent the dispersion of data around different means. Two of these standard deviations are discussed here. One standard deviation that can be determined represents mainly ***inter*subject variation** around the mean of measurements taken of a group of individuals, indicating biological variation. This standard deviation may be of interest in deciding whether an individual has an abnormal ROM in comparison with other people of the same age and gender. Another standard deviation that can be determined represents ***intra*subject variation** around the mean of repeated measurements taken of an individual, indicating measurement error. Assuming the individual's joint was in the same position for each measurement, this is the standard deviation of interest to indicate that the examiner was consistent in obtaining the measurement and was reliable.

An example of how to determine these two standard deviations is provided. Table 3.2 presents ROM measurements taken on five subjects.\* Three repeated measurements (observations) were taken on each subject by the same examiner. The **standard deviation indicating biological variation** (intersubject variation) is determined by first calculating the mean ROM measurement for each subject. The mean ROM measurement for each of the five subjects is found in the last column of Table 3.2. The grand mean of the mean ROM measurement for each of the five subjects equals 56 degrees. The grand mean is symbolized by $\bar{X}$.
The standard deviation is determined by finding the differences between each of the five subjects' means and the grand mean. The differences are squared to ensure having positive numbers, and added together. The sum is used in the formula for the standard deviation. Calculation of the standard deviation indicating biological variation is found in Table 3.3. In the example, the standard deviation indicating biological variation equals 13.6 degrees. This standard deviation denotes primarily intersubject variation. Knowledge of intersubject variation may be helpful in deciding whether a subject has an abnormal ROM in comparison with other people of the same age and gender. If a normal distribution of the measurements is assumed, one way of interpreting this standard deviation from the example is to predict that about 68% of all subjects' mean ROM measurements would fall between 42.4 degrees and 69.6 degrees (plus or minus 1 standard deviation around the grand mean of 56 degrees). One would expect that about 95% of all subjects' mean ROM measurements would fall between 28.8 degrees and 83.2 degrees (plus or minus 2 standard deviations around the grand mean of 56 degrees). The **standard deviation indicating measurement error** (intrasubject variation) also is determined by first calculating the mean ROM measurement for each subject. However, this standard deviation is determined by finding the differences between each of the three repeated measurements taken on a subject and the mean of that subject's measurements. The differences are squared to ensure positive numbers and added together. The sum of these squared differences is then used in the formula for the standard deviation. Using the information on subject 1 in the example, the calculation of the standard deviation indicating measurement error is shown in Table 3.4. 
Referring to Table 3.2 for information on each of the other subjects and using the same procedure as shown in Table 3.4, the standard deviation for subject 1 = 5.3 degrees, the standard deviation for subject 2 = 2.6 degrees, the standard deviation for subject 3 = 4.0 degrees, the standard deviation for subject 4 = 3.6 degrees, and the standard deviation for subject 5 = 3.0 degrees. The mean standard deviation for all of the subjects combined is determined by summing the five subjects' standard deviations and dividing by the number of subjects:

$$\mathrm{SD} = \frac{5.3 + 2.6 + 4.0 + 3.6 + 3.0}{5} = \frac{18.5}{5} = 3.7 \text{ degrees}$$

In the example, the standard deviation indicating intrasubject variation equals 3.7 degrees. This standard deviation is appropriate for indicating measurement error, especially if the repeated measurements on each subject were taken within a short period of time. Note that in this example the standard deviation indicating measurement error (3.7 degrees) is much smaller than the standard deviation indicating biological variation (13.6 degrees).

One way of interpreting the standard deviation for measurement error is to predict that about 68% of the repeated measurements on a subject would fall within 3.7 degrees (1 standard deviation) above and below the mean of the repeated measurements of a subject because of measurement error (assuming a normal distribution). We would expect that about 95% of the repeated measurements on a subject would fall within 7.4 degrees (2 standard deviations) above and below the mean of the repeated measurements of a subject, again because of measurement error.

\* Five subjects are included in the example to illustrate the calculations. Ideally, a reliability study would include more than five individuals to ensure adequate statistical power.
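The two standard deviations worked through above can be checked with a short script. This is a minimal sketch using the Table 3.2 data and Python's standard library; the variable names are illustrative and not from the text. It mirrors the text's approach of averaging the five per-subject standard deviations, rather than pooling variances.

```python
from statistics import mean, stdev

# ROM measurements in degrees from Table 3.2: three repeated
# measurements on each of five subjects, all by the same examiner.
rom = {
    1: [57, 55, 65],
    2: [66, 65, 70],
    3: [66, 70, 74],
    4: [35, 40, 42],
    5: [45, 48, 42],
}

# Intersubject (biological) variation: sample SD of the five subject
# means around the grand mean. statistics.stdev divides by n - 1.
subject_means = [mean(values) for values in rom.values()]
grand_mean = mean(subject_means)
sd_biological = stdev(subject_means)

# Intrasubject variation (measurement error): sample SD of each
# subject's repeated measurements, then the simple average of those SDs.
subject_sds = [stdev(values) for values in rom.values()]
sd_measurement_error = mean(subject_sds)

print(f"grand mean = {grand_mean:.1f} degrees")
print(f"SD, biological variation = {sd_biological:.1f} degrees")
print("per-subject SDs =", [round(sd, 1) for sd in subject_sds])
print(f"mean SD, measurement error = {sd_measurement_error:.1f} degrees")
```

Rounded to one decimal place, the script gives the same 13.6 degrees and 3.7 degrees reported in the example.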
A smaller value for the standard deviation of a series of measurements is indicative of less measurement error and therefore a more consistent and reliable measurement.

**TABLE 3.2 Three Repeated ROM Measurements in Degrees Taken on Five Subjects**

| Subject | First Measurement | Second Measurement | Third Measurement | Total | Mean of Three Measurements ($\bar{x}$) |
|---|---|---|---|---|---|
| 1 | 57 | 55 | 65 | 177 | 59 |
| 2 | 66 | 65 | 70 | 201 | 67 |
| 3 | 66 | 70 | 74 | 210 | 70 |
| 4 | 35 | 40 | 42 | 117 | 39 |
| 5 | 45 | 48 | 42 | 135 | 45 |

$$\text{Grand mean } (\bar{X}) = \frac{59 + 67 + 70 + 39 + 45}{5} = 56 \text{ degrees}$$

**TABLE 3.3 Calculation of the Standard Deviation Indicating Biological Variation in Degrees**

| Subject | Mean of Three Measurements ($\bar{x}$) | Grand Mean ($\bar{X}$) | ($\bar{x} - \bar{X}$) | ($\bar{x} - \bar{X}$)² |
|---|---|---|---|---|
| 1 | 59 | 56 | 3 | 9 |
| 2 | 67 | 56 | 11 | 121 |
| 3 | 70 | 56 | 14 | 196 |
| 4 | 39 | 56 | −17 | 289 |
| 5 | 45 | 56 | −11 | 121 |

$$\sum (\bar{x} - \bar{X})^2 = 9 + 121 + 196 + 289 + 121 = 736 \text{ degrees}^2$$

$$\mathrm{SD} = \sqrt{\frac{\sum (\bar{x} - \bar{X})^2}{n - 1}} = \sqrt{\frac{736}{5 - 1}} = \sqrt{184} = 13.6 \text{ degrees}$$

**TABLE 3.4 Calculation of the Standard Deviation Indicating Measurement Error in Degrees for Subject 1**

| Measurements (*x*) | Mean ($\bar{x}$) | ($x - \bar{x}$) | ($x - \bar{x}$)² |
|---|---|---|---|
| 57 | 59 | −2 | 4 |
| 55 | 59 | −4 | 16 |
| 65 | 59 | 6 | 36 |

$$\bar{x} = \frac{57 + 55 + 65}{3} = 59 \text{ degrees}$$

$$\mathrm{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{56}{3 - 1}} = \sqrt{28} = 5.3 \text{ degrees}$$

***Coefficient of Variation***

Sometimes it is helpful to consider the percentage of variation rather than the standard deviation, which is expressed in the units of the data observation (measurement). The **coefficient of variation (CV)** is a measure of variation that is relative to the mean and standardized so that the variations of different variables can be compared. The CV is the ratio of the standard deviation to the mean and is expressed as a percentage. The formula is:

$$\mathrm{CV} = \frac{\mathrm{SD}}{\bar{x}}(100\%)$$

For the example presented in Table 3.2, the coefficient of variation indicating biological variation uses the standard deviation for biological variation (standard deviation = 13.6 degrees).
$$\mathrm{CV} = \frac{13.6}{56}(100\%) = 24.3\%$$

The coefficient of variation indicating measurement error uses the standard deviation for measurement error (standard deviation = 3.7 degrees).

$$\mathrm{CV} = \frac{3.7}{56}(100\%) = 6.6\%$$

In this example the coefficient of variation for measurement error (6.6%) is less than the coefficient of variation for biological variation (24.3%). A lower value for the coefficient of variation represents less measurement error and therefore a more consistent measurement. This statistic is especially useful in comparing the variability of two or more variables that have different units of measurement (for example, comparing ROM measurement methods recorded in inches versus degrees). However, the coefficient of variation is markedly influenced by the value of the mean. For example, a standard deviation indicating a measurement error of 5 degrees would result in a CV of about 3% if the mean ROM was 150 degrees, whereas the same standard deviation of 5 degrees would result in a CV of 25% if the mean ROM was 20 degrees.

**Relative Measures of Reliability: Correlation Coefficients**

Correlation coefficients are traditionally used to measure the relationship between two variables. They result in a number from −1.0 to +1.0, which indicates how closely one variable is related to another variable.3,97,98 A value of +1.0 describes a perfect positive relationship between the two variables, whereas a value of −1.0 describes a perfect negative relationship. A correlation coefficient of 0 indicates that there is no relationship between the two variables. Correlation coefficients may be used to indicate measurement reliability because it is assumed that two repeated measurements should be highly correlated and approach +1.0. As discussed earlier in this chapter, correlation coefficients may also be used to demonstrate concurrent validity between two devices for measuring joint motion.
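Before turning to specific correlation coefficients, the coefficient-of-variation arithmetic earlier in this section can be checked with a few lines of code. This is a minimal sketch; the function name `coefficient_of_variation` is hypothetical, and the inputs are the rounded values quoted in the text.

```python
def coefficient_of_variation(sd, mean_value):
    """Return the CV as a percentage: (SD / mean) * 100."""
    return sd / mean_value * 100


grand_mean = 56.0  # degrees, grand mean from Table 3.2

# CV for biological variation and for measurement error in the example.
cv_biological = coefficient_of_variation(13.6, grand_mean)
cv_measurement_error = coefficient_of_variation(3.7, grand_mean)

# The same 5-degree SD gives very different CVs at different mean ROMs.
cv_large_mean = coefficient_of_variation(5, 150)
cv_small_mean = coefficient_of_variation(5, 20)

print(f"CV, biological variation = {cv_biological:.1f}%")
print(f"CV, measurement error = {cv_measurement_error:.1f}%")
print(f"CV at a mean of 150 degrees = {cv_large_mean:.1f}%")
print(f"CV at a mean of 20 degrees = {cv_small_mean:.1f}%")
```

The last two lines illustrate why the CV should be compared only across measurements with similar means.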
Several different cut-off values to interpret reliability using correlation coefficients have been described.3,97,98 For example, Portney and Watkins3 provide a general guideline in which coefficients below 0.50 represent poor reliability, 0.50 to 0.75 suggest moderate reliability, and values greater than 0.75 indicate good reliability. They caution, however, that these values should be interpreted in the context of the data and should not be used as strict cut-off points.

***Pearson Product-Moment Correlation Coefficient***

Because goniometric measurements produce ratio level data, and provided the other criteria for the use of parametric statistics are met, the **Pearson product-moment correlation coefficient** may be calculated to compare the association between pairs of goniometric measurements. The Pearson product-moment correlation coefficient is symbolized by the lowercase letter ***r*.** The formula to calculate *r* is expressed in the following equation. In the case where *r* is used to indicate reliability of two measurements, *x* symbolizes the first measurement and *y* symbolizes the second measurement.

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}}$$

Referring to the example in Table 3.2, the Pearson correlation coefficient can be used to determine the relationship between the first and the second ROM measurements on the five subjects. Calculation of the Pearson product-moment correlation coefficient for this example is found in Table 3.5. The resulting value of *r* = 0.98 indicates a strong positive linear relationship between the first and the second measurements. In other words, the two measurements are highly correlated.

The Pearson product-moment correlation coefficient indicates association between the pairs of measurements rather than agreement. Therefore, to decide whether the two measurements are identical, the equation of the straight line best representing the relationship should be determined.
If the equation of the straight line representing the relationship includes a slope equal to 1 *and* an intercept equal to 0, then an *r* value that approaches +1.0 indicates that the two measurements are identical. However, in cases where the slope is not equal to 1 or the intercept is not equal to 0, the value of *r* only indicates association of the two measures and does not represent agreement. Given the equation of a straight line *y* = *a* + *bx,* where *x* represents the first measurement, *y* the second measurement, *a* the intercept, and *b* the slope, the equation for the slope is:

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

and the equation for the intercept is:

$$a = \bar{y} - b\bar{x}$$

For the example using the data from Table 3.5, the calculation of the slope and intercept is:

$$b = \frac{648.6}{738.8} = 0.88$$

$$a = 55.6 - (0.88 \cdot 53.8) = 55.6 - 47.34 = 8.26 \text{ degrees}$$

The equation of the straight line best representing the relationship between the first and the second measurements in this example is *y* = 8.26 + 0.88*x.* Although the *r* value represents a high correlation, the two measurements are *not* identical given this linear equation.

One concern in interpreting correlation coefficients is that the value of the correlation coefficient is markedly influenced by the range of the measurements.3,99 The greater the biological variation between individuals for the measurement is, the more extreme the *r* value will be, so that *r* is closer to −1.0 or +1.0. Another limitation is the fact that the Pearson product-moment correlation coefficient can evaluate the relationship between only two variables or two measurements at one time. An additional limitation to remember is that the value of *r* is a point-estimate of a population parameter and one should consider the confidence interval around *r* as an estimate of the true population value.
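The Pearson coefficient, slope, and intercept from the worked example can be reproduced directly from the defining sums. The following is a minimal sketch using the first and second measurements from Table 3.5; the variable names (`sum_xy`, `sum_xx`, `sum_yy`) are illustrative only.

```python
from math import sqrt

x = [57, 66, 66, 35, 45]  # first ROM measurement (degrees), Table 3.5
y = [55, 65, 70, 40, 48]  # second ROM measurement (degrees), Table 3.5

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squared deviations (column totals of Table 3.5).
sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sum_xx = sum((xi - x_bar) ** 2 for xi in x)
sum_yy = sum((yi - y_bar) ** 2 for yi in y)

r = sum_xy / sqrt(sum_xx * sum_yy)  # Pearson product-moment correlation
b = sum_xy / sum_xx                 # slope of the line y = a + b*x
a = y_bar - b * x_bar               # intercept of the line y = a + b*x

print(f"r = {r:.2f}, slope = {b:.2f}, intercept = {a:.2f} degrees")
```

Because the script carries the slope at full precision, it yields an intercept of about 8.37 degrees; the value of 8.26 degrees in the text results from rounding the slope to 0.88 before substituting it. Either way, the slope differs from 1 and the intercept differs from 0, so the conclusion that the two measurements are not identical is unchanged.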
***Intraclass Correlation Coefficient***

To avoid the need for calculating and interpreting both the correlation coefficient and a linear equation, the **intraclass correlation coefficient (ICC)** is frequently used to evaluate reliability of goniometric measurements. The ICC also allows the comparison of two or more measurements at a time; one can think of it as an average correlation among all possible pairs of measurements.99 This statistic is determined from an analysis of variance model, which compares different sources of variation. The ICC is conceptually expressed as the ratio of the variance associated with the subjects, divided by the sum of the variance associated with the subjects plus error variance.100 The theoretical limits of the ICC are between 0.0 and +1.0; +1.0 indicates perfect agreement (no error variance), whereas 0.0 indicates no agreement (large amount of error variance).

There are six different formulas for determining ICC values based on the design of the study, the purpose of the study, and the type of measurement.3,100--102 Three models have been described, each with two different forms. In Model 1, each subject is tested by a different set of testers (examiners), and the testers are considered representative of a larger population of testers, which allows the results to be generalized to other testers. In Model 2, each subject is tested by the same set of testers, and again the testers are considered representative of a larger population of testers. In Model 3, each subject is tested by the same set of testers, but the testers are the only testers of interest; the results are not intended to be generalized to other testers. The first form of all three models is used when single measurements (1) are compared, whereas the second form is used when the means of multiple measurements (k) are compared. The different formulas for the ICC are identified by two numbers enclosed by parentheses.
The first number indicates the model, and the second number indicates the form. For further discussion, examples, and formulas, the reader is urged to refer to the referenced texts3 and articles.100--102

**TABLE 3.5 Calculation of the Pearson Product-Moment Correlation Coefficient for the First (*x*) and Second (*y*) ROM Measurements in Degrees**

| Subject | *x* | *y* | ($x - \bar{x}$) | ($y - \bar{y}$) | ($x - \bar{x}$)($y - \bar{y}$) | ($x - \bar{x}$)² | ($y - \bar{y}$)² |
|---|---|---|---|---|---|---|---|
| 1 | 57 | 55 | 3.2 | −0.6 | −1.92 | 10.24 | 0.36 |
| 2 | 66 | 65 | 12.2 | 9.4 | 114.68 | 148.84 | 88.36 |
| 3 | 66 | 70 | 12.2 | 14.4 | 175.68 | 148.84 | 207.36 |
| 4 | 35 | 40 | −18.8 | −15.6 | 293.28 | 353.44 | 243.36 |
| 5 | 45 | 48 | −8.8 | −7.6 | 66.88 | 77.44 | 57.76 |
| | | | | | Σ = 648.60 | Σ = 738.80 | Σ = 597.20 |

$$\bar{x} = \frac{57 + 66 + 66 + 35 + 45}{5} = 53.8 \text{ degrees}; \quad \bar{y} = \frac{55 + 65 + 70 + 40 + 48}{5} = 55.6 \text{ degrees}$$

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \frac{648.6}{\sqrt{738.8}\,\sqrt{597.2}} = \frac{648.6}{(27.2)(24.4)} = 0.98$$

In the example of the ROM measurements from five subjects (Table 3.2), a repeated measures analysis of variance was conducted and the ICC (3,1) was calculated as 0.94. This ICC model was selected because each measurement was taken by the same tester, there was only an interest in applying the results to this tester, and three separate single measurements were compared rather than the means of several measurements. This ICC value indicates high reliability among the three repeated measurements. However, this value is slightly lower than the Pearson product-moment correlation coefficient because the calculation of the ICC considers both association and agreement. This calculation of the ICC also differed from the calculation of the Pearson product-moment correlation coefficient in that it incorporated the three repeated measurements rather than a pair of repeated measurements.
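The ICC (3,1) reported above can be reproduced with a short sketch. The text does not state the formula, so this example assumes the commonly cited form for single measurements with fixed testers, ICC(3,1) = (BMS − EMS) / (BMS + (k − 1)EMS), where BMS is the between-subjects mean square, EMS is the error mean square from a two-way repeated measures ANOVA, and k is the number of repeated measurements. All variable names are illustrative.

```python
# Rows are subjects, columns are the three repeated measurements (Table 3.2).
data = [
    [57, 55, 65],
    [66, 65, 70],
    [66, 70, 74],
    [35, 40, 42],
    [45, 48, 42],
]

n = len(data)      # number of subjects
k = len(data[0])   # number of repeated measurements per subject

grand_mean = sum(sum(row) for row in data) / (n * k)
subject_means = [sum(row) / k for row in data]
trial_means = [sum(row[j] for row in data) / n for j in range(k)]

# Partition the total sum of squares into subjects, trials, and error.
ss_total = sum((v - grand_mean) ** 2 for row in data for v in row)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_trials = n * sum((m - grand_mean) ** 2 for m in trial_means)
ss_error = ss_total - ss_subjects - ss_trials

ms_subjects = ss_subjects / (n - 1)            # between-subjects mean square
ms_error = ss_error / ((n - 1) * (k - 1))      # error variance (mean square error)

# Assumed ICC(3,1) form: single measurements, fixed set of testers.
icc_3_1 = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)

print(f"error variance = {ms_error:.1f}")
print(f"ICC(3,1) = {icc_3_1:.2f}")
```

Under these assumptions the sketch returns an ICC of 0.94, matching the value reported for the example. In practice, statistical software is typically used for this calculation and also reports a confidence interval around the estimate.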
For recommendations to interpret ICC values, please refer to textbooks on clinical research.3,98 Keep in mind that these values need to be interpreted in the context of the data and are not strict cut-offs. Like the Pearson product-moment correlation coefficient, the ICC is also influenced by the range of measurements between the subjects. As the group of subjects becomes more homogeneous, the ability of the ICC to detect agreement is reduced and the ICC can erroneously indicate poor reliability.3,100,102,103 Because correlation coefficients are sensitive to the range of the measurements and do not provide an index of reliability in the units of the measurement, some experts prefer the use of the standard deviation of the repeated measurements (intrasubject standard deviation) or the standard error of measurement to assess reliability.102--105 Furthermore, like the correlation coefficient, the value of the ICC is a point-estimate of a population parameter and one should consider the confidence interval around this point-estimate.

**Absolute Measures of Reliability**

Earlier in this chapter, standard deviations were discussed as a measure of variability. A standard deviation is an absolute measure of reliability as it is reported in the same units as the original measurement. Absolute measures, such as the standard deviation, provide the clinician with a sense of the magnitude of the consistency of the measurement in units that are logical to understand and may be easily explained by the clinician to the person whose joint angle or ROM is being measured.

***Standard Error of Measurement***

The standard error of measurement is another absolute measure of measurement reliability that is expressed in the same units as the original measurement.3,102,106,107 According to DuBois,106 "The **standard error of measurement** is the likely standard deviation of the error made in predicting true scores when we have knowledge only of the obtained scores."
The true scores are forever unknown, but several formulas have been developed to estimate this statistic. The standard error of measurement is generally symbolized as **SEM.**†

One method to estimate the SEM considers the differences between the scores from two repeated measurements such as in a test-retest reliability study.102,107 In other words, the difference between two repeated measurements of a joint motion is determined and a standard deviation from all of the difference scores is calculated. This standard deviation of the test-retest differences (SDdiff) is then divided by the square root of 2 to obtain the SEM.

$$\mathrm{SEM} = \frac{\mathrm{SD}_{\mathrm{diff}}}{\sqrt{2}}$$

The SEM can also be estimated from a repeated measures analysis of variance (ANOVA) model.107,109,110 This formula may be helpful when more than two repeated measurements are taken. In this case, the SEM is equivalent to the square root of the error variance. The error variance may also be referred to as the mean square error or within-subjects mean square. The value for the error variance is frequently available from the ANOVA summary table.

$$\mathrm{SEM} = \sqrt{\text{error variance}}$$

A third method to estimate the SEM incorporates information from the variation of repeated measurements and the reliability coefficient. If the pooled standard deviation from a series of repeated measurements is denoted SDp, a correlation coefficient such as the intraclass correlation coefficient is denoted ICC, and the Pearson product-moment correlation coefficient is denoted *r*, the formulas for the SEM are as follows:

$$\mathrm{SEM} = \mathrm{SD}_{p}\sqrt{1 - \mathrm{ICC}}$$

or, if the Pearson product-moment correlation is used for reliability,

$$\mathrm{SEM} = \mathrm{SD}_{p}\sqrt{1 - r}$$

Returning to the example in Table 3.2, the SEM can be estimated using these three methods. First, the calculation of the SEM using the standard deviation of the differences of the first and second measurements is shown in Table 3.6. The resulting value for the SEM of 2.2 degrees is an indication of the stability of the observed scores.
Because the SEM is a special case of the standard deviation, about 68% of the time the true measurement would be within 2.2 degrees of the observed measurement.

We can also use all three measurements from the five subjects in Table 3.2 and the results of a repeated measures analysis of variance to estimate the SEM. Given that the error variance in the ANOVA is equal to 10.9, the SEM is equal to the square root of the error variance or 3.3 degrees. Note in this case the value of the SEM is larger than when only the first two measurements were used because of the increased variation added by the third measurement in this example.

† Note that another statistic, the standard error of the mean, is often confused with the standard error of measurement.3,96 The standard error of the mean may also be symbolized with the same abbreviation as the standard error of measurement, which may contribute to the confusion. These two statistics are not equivalent, nor do they have the same interpretation. The standard error of the mean is the standard deviation of a distribution of means taken from samples of a population.3 The standard error of the mean describes how much variation can be expected in the means from future samples of the same size. Because we are interested in the variation of individual measurements when evaluating reliability rather than the variation of means, the standard deviation of the repeated measurements or the standard error of measurement are the appropriate statistical tests to use.108

Likewise, we can also use the value of the ICC, which was also obtained from a repeated measures ANOVA, to estimate the SEM. As you recall, in the example the value for the ICC is 0.94 and the value for the pooled standard deviation (SDp) among the five subjects is 13.6 degrees (in this example the SDp is also equal to the value of the standard deviation, indicating biological variation).
$$\mathrm{SEM} = 13.6\sqrt{1 - 0.94} = 13.6\sqrt{0.06} = 3.3 \text{ degrees}$$

Both of these analyses using the three repeated measurements obtained a value of 3.3 degrees for the SEM, which informs us that about 68% of the time the true measurement would be within 3.3 degrees of the observed measurement or about 95% of the time the true measurement would be within 6.6 degrees of the observed measurement (i.e., within two SEM).

***Minimal Detectable Change***

A final absolute measure to discuss is the concept of **minimal detectable change (MDC),** which is the smallest amount of change in a measurement in excess of the measurement error.3,97,107,111,112 The MDC uses information regarding the reliability of the measurement in order to provide a minimal value to determine whether a change has occurred. In the literature the MDC has also been referred to as the **minimal detectable difference (MDD),** the **minimal important difference,** and the **smallest detectable difference (SDD).**3,112 The MDC at the 90% confidence level is calculated from the standard error of measurement using the following equation, with the value of 1.65 representing the z-score at the 90% confidence level.‡ Like the SEM, the MDC is expressed in the same units as the original measurement.

$$\mathrm{MDC}_{90} = \mathrm{SEM} \cdot \sqrt{2} \cdot 1.65$$

Because the SEM may be calculated from the standard deviation of the test-retest differences divided by the square root of 2, the MDC may also be calculated as:

$$\mathrm{MDC}_{90} = \mathrm{SD}_{\mathrm{diff}} \cdot 1.65$$

One may also see MDC values in the literature reported at other confidence levels. For example, equations for the MDC at the 95% confidence level are as follows and result in a larger value for the minimal change than the MDC90.

$$\mathrm{MDC}_{95} = \mathrm{SEM} \cdot \sqrt{2} \cdot 1.96$$

or

$$\mathrm{MDC}_{95} = \mathrm{SD}_{\mathrm{diff}} \cdot 1.96$$

Returning to our example of three repeated measurements of ROM and using a value for the SEM of 3.3 degrees, the MDC90 is calculated as 7.7 degrees.
$$\mathrm{MDC}_{90} = 3.3 \cdot \sqrt{2} \cdot 1.65 = 7.7 \text{ degrees}$$

**TABLE 3.6 Calculation of the Standard Error of Measurement (SEM) for the First (*x*) and Second (*y*) ROM Measurements in Degrees Using the Standard Deviation of the Differences (SDdiff)**

| Subject | *x* | *y* | ($x - y$) | ($x - y$) − $\bar{X}_{\mathrm{diff}}$ | [($x - y$) − $\bar{X}_{\mathrm{diff}}$]² |
|---|---|---|---|---|---|
| 1 | 57 | 55 | 2 | 3.8 | 14.44 |
| 2 | 66 | 65 | 1 | 2.8 | 7.84 |
| 3 | 66 | 70 | −4 | −2.2 | 4.84 |
| 4 | 35 | 40 | −5 | −3.2 | 10.24 |
| 5 | 45 | 48 | −3 | −1.2 | 1.44 |
| | | | Σ = −9 | | Σ = 38.80 |

$$\text{Mean of differences: } \bar{X}_{\mathrm{diff}} = \frac{\sum (x - y)}{n} = \frac{-9}{5} = -1.8 \text{ degrees}$$

$$\text{Standard deviation of differences: } \mathrm{SD}_{\mathrm{diff}} = \sqrt{\frac{\sum [(x - y) - \bar{X}_{\mathrm{diff}}]^2}{n - 1}} = \sqrt{\frac{38.80}{4}} = \sqrt{9.7} = 3.11 \text{ degrees}$$

$$\text{Standard error of measurement: } \mathrm{SEM} = \frac{\mathrm{SD}_{\mathrm{diff}}}{\sqrt{2}} = \frac{3.11}{1.41} = 2.2 \text{ degrees}$$

‡ A z-score is the difference between an observation and the mean, divided by the standard deviation $[(x - \bar{x})/\mathrm{SD}]$. The z-score, which is in standard deviation units and applied to a standard normal curve distribution in which the mean is 0 and the SD = 1, can be used to determine the probability of an observation.

**Exercise 8 Intratester Reliability**

1\. Select a subject and a universal goniometer.
2\. Measure elbow flexion ROM on your subject three times, following the steps outlined in Chapter 2, Exercise 7.
3\. Record each measurement on the recording form (see opposite page) in the column labeled *x*.
4\. Compare the measurements. If a discrepancy of more than 5 degrees exists between measurements, recheck each step in the procedure to make sure that you are performing the steps correctly, and then repeat this exercise.
5\. Continue practicing until you have obtained three successive measurements that are within 5 degrees of each other.
6\. To gain an understanding of several of the statistics used to evaluate intratester reliability, calculate the standard deviation and coefficient of variation by completing the following steps.
a\.
Add the three measurements together to determine the sum of the measurements. The symbol for summation is Σ. Record the sum at the bottom of the column labeled *x.*
b\. To determine the **mean,** divide this sum by 3, which is the number of measurements. The number of measurements is denoted by *n.* The mean is denoted by $\bar{x}$. Space to calculate the mean is provided on the recording form.
c\. To continue the process of calculating the **standard deviation,** subtract the mean from each of the three measurements and record the results in the column labeled ($x - \bar{x}$). Space to calculate the standard deviation is provided on the recording form.
d\. Square each of the numbers in the column labeled ($x - \bar{x}$) and record the results in the column labeled ($x - \bar{x}$)².
e\. Add the three numbers in column ($x - \bar{x}$)² to determine the sum of the squares. Record the results at the bottom of the column labeled ($x - \bar{x}$)².
f\. Divide this sum by 2, which is the number of measurements minus 1 (*n* − 1). Then find the square root of this number. The units will be in degrees.
g\. To determine the **coefficient of variation,** divide the standard deviation by the mean. Multiply this number by 100%. Space to calculate the coefficient of variation is provided on the recording form.
7\. Repeat this procedure with other joints and motions after you have learned the testing procedures.

The interpretation of the MDC90 of 7.7 degrees is that 90% of individuals whose ROM has not changed will display random fluctuations of up to 7.7 degrees between measurements because of measurement error. Expressed another way, differences greater than 7.7 degrees between repeated measurements would likely represent a real change in ROM 90% of the time. Even though we had obtained a fairly high correlation coefficient in this example (ICC = 0.94), the variability within the data resulted in an MDC of 7.7 degrees.
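The SEM and MDC values for the three-measurement example can be checked with a brief script. This is a minimal sketch built from the formulas given in this section; the function names are hypothetical, and the inputs (error variance of 10.9, SDp of 13.6 degrees, ICC of 0.94, SEM of 3.3 degrees) are the rounded values quoted in the text.

```python
from math import sqrt


def sem_from_error_variance(error_variance):
    """SEM as the square root of the ANOVA error variance."""
    return sqrt(error_variance)


def sem_from_reliability(sd_pooled, reliability):
    """SEM from a pooled SD and a reliability coefficient (ICC or r)."""
    return sd_pooled * sqrt(1 - reliability)


def minimal_detectable_change(sem, z=1.65):
    """MDC = SEM * sqrt(2) * z; z = 1.65 for MDC90, 1.96 for MDC95."""
    return sem * sqrt(2) * z


sem_anova = sem_from_error_variance(10.9)
sem_icc = sem_from_reliability(13.6, 0.94)
mdc90 = minimal_detectable_change(3.3)
mdc95 = minimal_detectable_change(3.3, z=1.96)

print(f"SEM from error variance = {sem_anova:.1f} degrees")
print(f"SEM from SDp and ICC = {sem_icc:.1f} degrees")
print(f"MDC90 = {mdc90:.1f} degrees, MDC95 = {mdc95:.1f} degrees")
```

Both SEM estimates round to 3.3 degrees and the MDC90 rounds to 7.7 degrees, as in the text. The MDC95 for the same SEM is about 9.1 degrees, which shows how a stricter confidence level raises the threshold for calling a change real.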
Please refer to the Research Findings sections of Chapters 4 through 13 for more joint-specific information on measures of absolute error. Please keep in mind these measures of absolute reliability will be specific for the population in which the measure was obtained and specific to the procedures used to obtain the measurement.

**Exercises to Evaluate Reliability**

Exercises 8 and 9 have been included to help examiners assess their reliability in obtaining goniometric measurements. Calculations of the standard deviation and coefficient of variation are included in the belief that understanding is reinforced by practical application.

**Exercise 8** examines **intratester reliability.** Intratester reliability refers to the amount of agreement between repeated measurements of the same joint position or ROM by the same examiner (tester). An intratester reliability study answers the question: How accurately can an examiner reproduce his or her own measurements?

**Exercise 9** examines **intertester reliability.** Intertester reliability refers to the amount of agreement between repeated measurements of the same joint position or ROM by different examiners (testers). An intertester reliability study answers the question: How accurately can one examiner reproduce measurements taken by other examiners?

**Exercises 10 and 11** provide practice using different methods to obtain the **standard error of measurement** and the **minimal detectable change** from measurements repeated at two time points. In addition, Exercise 11 provides practice in calculating the Pearson product-moment correlation coefficient. Each of these four exercises provides instructions to calculate these values by hand, although the learner may also use calculators, spreadsheets, or computer applications to obtain the values for the different statistics.
**RECORDING FORM FOR EXERCISE 8: INTRATESTER RELIABILITY**

Follow the steps outlined in Exercise 8. Use this form to record your measurements and the results of your calculations.

Subject's Name ____________ Date ____________
Examiner's Name ____________
Joint and Motion ____________ Right or Left Side ____________
Passive or Active Motion ____________ Type of Goniometer ____________

| Measurement | *x* | ($x - \bar{x}$) | ($x - \bar{x}$)² |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | | |
| *n* = 3 | Σ*x* = | | Σ($x - \bar{x}$)² = |

$$\text{Mean of the three measurements: } \bar{x} = \frac{\sum x}{n} =$$

$$\text{Standard deviation: } \mathrm{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} =$$

$$\text{Coefficient of variation: } \mathrm{CV} = \frac{\mathrm{SD}}{\bar{x}}(100\%) =$$

**Exercise 9 Intertester Reliability**

1\. Select a subject and a universal goniometer.
2\. Measure elbow flexion ROM on your subject once, following the steps outlined in Chapter 2, Exercise 7.
3\. Ask two other examiners to measure the same elbow flexion ROM on your subject, using your goniometer and following the steps outlined in Chapter 2, Exercise 5.
4\. Record each measurement on the recording form (see opposite page) in the column labeled *x.*
5\. Compare the measurements. If a discrepancy of more than 5 degrees exists between measurements, repeat this exercise. The examiners should observe one another's measurements to discover differences in technique that might account for variability, such as faulty alignment, lack of stabilization, or reading the wrong scale.
6\. To gain an understanding of several of the statistics used to evaluate intertester reliability, calculate the mean, standard deviation, and coefficient of variation by completing the following steps.
a\. Add the three measurements together to determine the sum of the measurements. The symbol for summation is Σ. Record the sum at the bottom of the column labeled *x.*
b\. To determine the **mean,** divide this sum by 3, which is the number of measurements.
The number of measurements is denoted by *n.* The mean is denoted by $\bar{x}$. Space to calculate the mean is provided on the recording form.
c\. To continue the process of calculating the **standard deviation,** subtract the mean from each of the three measurements and record the results in the column labeled ($x - \bar{x}$). Space to calculate the standard deviation is provided on the recording form.
d\. Square each of the numbers in the column labeled ($x - \bar{x}$) and record the results in the column labeled ($x - \bar{x}$)².
e\. Add the three numbers in column ($x - \bar{x}$)² to determine the sum of the squares. Record the results at the bottom of the column labeled ($x - \bar{x}$)².
f\. Divide this sum by 2, which is the number of measurements minus 1 (*n* − 1). Then find the square root of this number.
g\. To determine the **coefficient of variation,** divide the standard deviation by the mean. Multiply this number by 100%. Space to calculate the coefficient of variation is provided on the recording form.
7\. Repeat this exercise with other joints and motions after you have learned the testing procedures.

**RECORDING FORM FOR EXERCISE 9: INTERTESTER RELIABILITY**

Follow the steps outlined in Exercise 9. Use this form to record your measurements and the results of your calculations.

Subject's Name ____________ Date ____________
Examiner 1. Name ____________ Examiner 2. Name ____________
Joint and Motion ____________ Examiner 3. Name ____________
Right or Left Side ____________ Passive or Active Motion ____________
Type of Goniometer ____________

| Measurement | *x* | ($x - \bar{x}$) | ($x - \bar{x}$)² |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | | |
| *n* = 3 | Σ*x* = | | Σ($x - \bar{x}$)² = |

$$\text{Mean of the three measurements: } \bar{x} = \frac{\sum x}{n} =$$

$$\text{Standard deviation: } \mathrm{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} =$$

$$\text{Coefficient of variation: } \mathrm{CV} = \frac{\mathrm{SD}}{\bar{x}}(100\%) =$$

**Exercise 10 Calculation of the Standard Error of Measurement and Minimal Detectable Change**

This exercise describes the calculation of the standard error of measurement (SEM) and minimal detectable change (MDC) from two repeated measurements of five subjects.

1\. Select five subjects and a universal goniometer.
2\. Measure elbow flexion ROM on each subject once, following the steps outlined in Chapter 2, Exercise 7.
3\. After a short rest, repeat the measurement of the same elbow flexion ROM on the five subjects, using the same goniometer and following the steps outlined in Chapter 2, Exercise 7. Avoid referring to the value for the first measurement when obtaining the second measurement.
4\. Record each measurement on the recording form (see opposite page) in the column labeled *x* for the first measurement with each subject, and in the column labeled *y* for the second measurement with each subject.
5\. To gain an understanding of the statistics used to evaluate absolute reliability, calculate the **SEM** and **MDC** by completing the following steps.
a\. To calculate the difference between the two measurements, subtract *y* from *x* for each of the five subjects, and record the results in the column labeled ($x - y$). Add these differences together to determine the sum of the measurements in the ($x - y$) column. The symbol for summation is Σ. Record the sum at the bottom of the column labeled ($x - y$).
b\. To determine the **mean of the summed test-retest differences,** divide this sum by 5, which is the number of measurements.
The number of measurements is denoted by *n*. The mean is denoted in the example by $\bar{X}_{\text{diff}}$. Record this value in the space provided on the recording form.
c\. Subtract the mean of the summed differences from each of the numbers in the column labeled $(x - y)$ and record the results in the column labeled $(x - y) - \bar{X}_{\text{diff}}$.
d\. Square each of the numbers in the column labeled $(x - y) - \bar{X}_{\text{diff}}$, and record the results in the column labeled $[(x - y) - \bar{X}_{\text{diff}}]^2$.
e\. Add the five numbers in the column labeled $[(x - y) - \bar{X}_{\text{diff}}]^2$ to determine their sum. Record the sum at the bottom of that column.
f\. To determine the **standard deviation of the test-retest differences** ($\text{SD}_{\text{diff}}$), divide this sum by 4, which is the number of measurements minus 1 ($n - 1$). Then find the square root of this number. Space to calculate and record the standard deviation of the differences is provided on the recording form.
g\. To determine the **standard error of measurement (SEM),** divide the standard deviation of the differences by the square root of 2. Record this value in the space provided. Remember to report the SEM in the same units as the original measurements (i.e., degrees).
h\. To determine the **minimal detectable change** ($\text{MDC}_{90}$), multiply the standard error of measurement by the square root of 2, and then multiply this value by 1.65. Space to calculate and record this value is provided on the recording form. Remember that your result for the minimal detectable change will be in the units of the original measurement. Because the SEM equals $\text{SD}_{\text{diff}}$ divided by the square root of 2, you may also calculate the minimal detectable change by multiplying the standard deviation of the test-retest differences by 1.65.
6\. Repeat this exercise with other joints and motions after you have learned the testing procedures.
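The arithmetic in steps 5a through 5h can be sketched in Python. This is only an illustrative sketch and not part of the original exercise: the function name `sem_and_mdc90` and the sample elbow flexion values are assumptions made up for demonstration, not data from any study.

```python
import math


def sem_and_mdc90(first, second):
    """Return (sd_diff, sem, mdc90) for paired test-retest measurements.

    `first` holds the x values and `second` the y values, one pair per subject.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two subjects")
    n = len(first)
    diffs = [x - y for x, y in zip(first, second)]        # step a: (x - y)
    mean_diff = sum(diffs) / n                             # step b: mean difference
    squared = [(d - mean_diff) ** 2 for d in diffs]        # steps c and d
    sd_diff = math.sqrt(sum(squared) / (n - 1))            # steps e and f
    sem = sd_diff / math.sqrt(2)                           # step g
    mdc90 = sem * math.sqrt(2) * 1.65                      # step h
    return sd_diff, sem, mdc90


# Hypothetical elbow flexion ROM values in degrees (invented for illustration).
x = [140, 145, 138, 150, 142]
y = [142, 144, 141, 148, 143]
sd_diff, sem, mdc90 = sem_and_mdc90(x, y)
print(f"SDdiff = {sd_diff:.2f}, SEM = {sem:.2f}, MDC90 = {mdc90:.2f} degrees")
```

Dividing by the square root of 2 in step g and multiplying by it again in step h cancel, which is why the function's `mdc90` always equals `1.65 * sd_diff`, the shortcut mentioned in step h.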
**RECORDING FORM FOR EXERCISE 10: CALCULATION OF THE STANDARD ERROR OF MEASUREMENT AND MINIMAL DETECTABLE CHANGE**

Follow the steps outlined in Exercise 10. Use this form to record your measurements and the results of your calculations.

Subject 1 Name:
Subject 2 Name:
Subject 3 Name:
Subject 4 Name:
Subject 5 Name:
Examiner Name:
Date:
Joint and Motion:
Right or Left Side:
Passive or Active Motion:
Type of Goniometer:

| Subject | $x$ | $y$ | $(x - y)$ | $(x - y) - \bar{X}_{\text{diff}}$ | $[(x - y) - \bar{X}_{\text{diff}}]^2$ |
|---|---|---|---|---|---|
| 1 | | | | | |
| 2 | | | | | |
| 3 | | | | | |
| 4 | | | | | |
| 5 | | | | | |
| $n = 5$ | | | $\Sigma(x - y) =$ | | $\Sigma[(x - y) - \bar{X}_{\text{diff}}]^2 =$ |

Mean of summed test-retest differences: $\bar{X}_{\text{diff}} = \dfrac{\Sigma(x - y)}{n} =$

Standard deviation of test-retest differences: $\text{SD}_{\text{diff}} = \sqrt{\dfrac{\Sigma[(x - y) - \bar{X}_{\text{diff}}]^2}{n - 1}} =$

Standard error of measurement: $\text{SEM} = \dfrac{\text{SD}_{\text{diff}}}{\sqrt{2}} =$

Minimal detectable change: $\text{MDC}_{90} = \text{SEM} \cdot \sqrt{2} \cdot 1.65 =$

Or use the equation: $\text{MDC}_{90} = \text{SD}_{\text{diff}} \cdot 1.65 =$

Exercise 11 Calculation of the Pearson Product-Moment Correlation Coefficient, Standard Error of Measurement, and Minimal Detectable Change

This exercise describes the calculation of the Pearson product-moment correlation coefficient (*r*) from repeated measurements of five subjects. This correlation coefficient is then used to determine the standard error of measurement (SEM) and minimal detectable change (MDC). Alternatively, the learner may use the intraclass correlation coefficient (ICC) for the calculation of the SEM and MDC. The ICC, however, is best obtained from statistical software rather than calculated by hand.

1\. Select five subjects and a universal goniometer. (If you have completed Exercise 10, you may wish to use the same data. In this case, record the *x* and *y* values from Exercise 10 as described in Step 4 and then begin the calculations with Step 5.)
2\. Measure elbow flexion ROM on each subject once, following the steps outlined in Chapter 2, Exercise 7.
3\. After a short rest, repeat the measurement of the same elbow flexion ROM on the five subjects, using the same goniometer and following the steps outlined in Chapter 2, Exercise 7. Avoid referring to the value for the first measurement when obtaining the second measurement.
4\. Record each measurement on the recording form (see the form that follows this exercise) in the column labeled *x* for the first measurement with each subject, and in the column labeled *y* for the second measurement with each subject.
5\. To gain an understanding of the statistics used to evaluate relative reliability, calculate the **Pearson product-moment correlation coefficient** by completing the following steps.
a\. Add the measurements together to determine the sum of the measurements in the *x* and *y* columns. The symbol for summation is Σ. Record the sums at the bottom of the column labeled *x* and the column labeled *y*.
b\. To determine the **mean** of each column, divide each sum by 5, which is the number of measurements. The number of measurements is denoted by *n*. The means are denoted by $\bar{x}$ and $\bar{y}$. Space to calculate the means is provided on the recording form.
c\. To determine the **reliability correlation coefficient (Pearson's *r*),** first subtract the mean from each measurement for each subject and record the results in the appropriate columns [$(x - \bar{x})$ and $(y - \bar{y})$, respectively].
d\. Multiply each value for $(x - \bar{x})$ by the corresponding value for $(y - \bar{y})$ and record the results in the column labeled $(x - \bar{x})(y - \bar{y})$.
e\. Square each of the numbers in the columns labeled $(x - \bar{x})$ and $(y - \bar{y})$ and record the results in the appropriate columns [$(x - \bar{x})^2$ and $(y - \bar{y})^2$, respectively].
f\. Add the five numbers in each of the columns $(x - \bar{x})(y - \bar{y})$, $(x - \bar{x})^2$, and $(y - \bar{y})^2$ to determine their sums.
Record the sums in each respective column in the space provided beneath the five scores.
g\. Calculate the square roots of $\Sigma(x - \bar{x})^2$ and $\Sigma(y - \bar{y})^2$. Record these values in the space provided.
h\. Calculate the correlation coefficient (*r*) by dividing $\Sigma(x - \bar{x})(y - \bar{y})$ by the product of $\sqrt{\Sigma(x - \bar{x})^2}$ and $\sqrt{\Sigma(y - \bar{y})^2}$. Space to calculate and record this value is provided on the recording form.
6\. To gain an understanding of statistics used to evaluate absolute reliability, calculate the standard error of measurement and minimal detectable change ($\text{MDC}_{90}$) by completing the following steps.
a\. To determine the **standard error of measurement,** first determine the **standard deviation of *x*** ($\text{sd}_x$) and the **standard deviation of *y*** ($\text{sd}_y$). Use $\Sigma(x - \bar{x})^2$ and $\Sigma(y - \bar{y})^2$, which were previously calculated. Divide each of these sums by 4, which is the number of measurements minus 1 ($n - 1$). Then find the square root of each result. Record these values in the space provided.
b\. To obtain the **pooled standard deviation** ($\text{SD}_p$), square the values for the standard deviation of *x* ($\text{sd}_x$) and the standard deviation of *y* ($\text{sd}_y$) and then add the squared values together. Divide this sum by 2 (which is the number of times each subject was measured). Then obtain the square root of this value. Calculate and record this value in the space provided.
c\. Multiply the pooled standard deviation by the square root of 1 minus the correlation coefficient (*r*) to obtain the standard error of measurement. Space to calculate and record this value is provided on the recording form. Remember that your result for the standard error of measurement will be in the units of the original measurement, which in this case is degrees.
d\. To determine the **minimal detectable change** ($\text{MDC}_{90}$), multiply the standard error of measurement by the square root of 2 and then multiply this value by 1.65.
Space to calculate and record this value is provided on the recording form. Remember that your result for the minimal detectable change will be in the units of the original measurement.
7\. Repeat this exercise with other joints and motions after you have learned the testing procedures.

**RECORDING FORM FOR EXERCISE 11: CALCULATION OF THE PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT, STANDARD ERROR OF MEASUREMENT, AND MINIMAL DETECTABLE CHANGE**

Follow the steps outlined in Exercise 11. Use this form to record your measurements and the results of your calculations.

Subject 1 Name:
Subject 2 Name:
Subject 3 Name:
Subject 4 Name:
Subject 5 Name:
Examiner Name:
Date:
Joint and Motion:
Right or Left Side:
Passive or Active Motion:
Type of Goniometer:

| Subject | $x$ | $y$ | $(x - \bar{x})$ | $(y - \bar{y})$ | $(x - \bar{x})(y - \bar{y})$ | $(x - \bar{x})^2$ | $(y - \bar{y})^2$ |
|---|---|---|---|---|---|---|---|
| 1 | | | | | | | |
| 2 | | | | | | | |
| 3 | | | | | | | |
| 4 | | | | | | | |
| 5 | | | | | | | |
| $n = 5$ | $\Sigma x =$ | $\Sigma y =$ | | | $\Sigma(x - \bar{x})(y - \bar{y}) =$ | $\Sigma(x - \bar{x})^2 =$ | $\Sigma(y - \bar{y})^2 =$ |

Square roots of the sums of squares: $\sqrt{\Sigma(x - \bar{x})^2} =$ and $\sqrt{\Sigma(y - \bar{y})^2} =$

Mean of the first measurements: $\bar{x} = \dfrac{\Sigma x}{n} =$

Mean of the second measurements: $\bar{y} = \dfrac{\Sigma y}{n} =$

Pearson product-moment correlation coefficient: $r = \dfrac{\Sigma(x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma(x - \bar{x})^2}\,\sqrt{\Sigma(y - \bar{y})^2}} =$

Standard deviation of *x*: $\text{sd}_x = \sqrt{\dfrac{\Sigma(x - \bar{x})^2}{n - 1}} =$

Standard deviation of *y*: $\text{sd}_y = \sqrt{\dfrac{\Sigma(y - \bar{y})^2}{n - 1}} =$

Pooled standard deviation: $\text{SD}_p = \sqrt{\dfrac{\text{sd}_x^2 + \text{sd}_y^2}{2}} =$

Standard error of measurement: $\text{SEM} = \text{SD}_p\sqrt{1 - r} =$

Minimal detectable change: $\text{MDC}_{90} = \text{SEM} \cdot \sqrt{2} \cdot 1.65 =$

REFERENCES

1\. Rothstein, JM, and Echternach, JL: Primer on Measurement: An Introductory Guide to Measurement Issues. American Physical Therapy Association, Alexandria, VA, 1993.
2\. Guide to Physical Therapist Practice 3.0. Alexandria, VA, American Physical Therapy Association, 2014. Available at: http://guidetopractice.apta.org.
3\. Portney, LG, and Watkins, MP: Foundations of Clinical Research: Applications to Practice, ed 3. FA Davis, Philadelphia, PA, 2015.
4\.
Sim, J, and Arnell, P: Measurement validity in physical therapy research. Phys Ther 73:102, 1993.
5\. Gajdosik, RL, and Bohannon, RW: Clinical measurement of range of motion: Review of goniometry emphasizing reliability and validity. Phys Ther 67:1867, 1987.
6\. Piriyaprasarth, P, and Morris, ME: Psychometric properties of measurement tools for quantifying knee joint position and movement: A systematic review. Knee 14:2, 2007.
7\. Milani, P, et al: Mobile smartphone applications for body position measurement in rehabilitation: A review of goniometric tools. PM&R 6:1038, 2014.
8\. Gogia, PP, et al: Reliability and validity of goniometric measurements at the knee. Phys Ther 67:192, 1987.
9\. Enwemeka, CS: Radiographic verification of knee goniometry. Scand J Rehabil Med 18:47, 1986.
10\. Ahlback, SO, and Lindahl, O: Sagittal mobility of the hip-joint. Acta Orthop Scand 34:310, 1964.
11\. Kato, M, et al: The accuracy of goniometric measurements of proximal interphalangeal joints in fresh cadavers: Comparison between methods of measurement, types of goniometers, and fingers. J Hand Ther 20:12, 2007.
12\. Chen, J, et al: Meta-analysis of normative cervical motion. Spine 24:1571, 1999.
13\. Herrmann, DB: Validity study of head and neck flexion-extension motion comparing measurements of a pendulum goniometer and roentgenograms. J Orthop Sports Phys Ther 11:414, 1990.
14\. Ordway, NR, et al: Cervical sagittal range-of-motion analysis using three methods: Cervical range-of-motion device, space, and radiography. Spine 22:501, 1997.
15\. Tousignant, M, et al: Criterion validity of the cervical range of motion (CROM) goniometer for cervical flexion and extension. Spine 25:324, 2000.
16\. Tousignant, M, et al: Criterion validity of the cervical range of motion (CROM) device for rotational range of motion on healthy adults. J Orthop Sports Phys Ther 35:242, 2006.
17\. Macrae, JF, and Wright, V: Measurement of back movement. Ann Rheum Dis 28:584, 1969.
18\.
Portek, I, et al: Correlation between radiographic and clinical measurement of lumbar spine movement. Br J Rheumatol 22:197, 1983.
19\. Burdett, RG, Brown, KE, and Fall, MP: Reliability and validity of four instruments for measuring lumbar spine and pelvic positions. Phys Ther 66:677, 1986.
20\. Mayer, TG, et al: Use of noninvasive techniques for quantification of spinal range-of-motion in normal subjects and chronic low-back dysfunction patients. Spine 9:588, 1984.
21\. Saur, PM, et al: Lumbar range of motion: Reliability and validity of the inclinometer technique in the clinical measurement of trunk flexibility. Spine 21:1332, 1996.
22\. Samo, DG, et al: Validity of three lumbar sagittal motion measurement methods: Surface inclinometers compared with radiographs. J Occup Environ Med 39:209, 1997.
23\. Vasen, AP, et al: Functional range of motion of the elbow. J Hand Surg Br 20A:288, 1995.
24\. Cooper, JE, et al: Elbow joint restriction: Effect on functional upper limb motion during performance of three feeding activities. Arch Phys Med Rehabil 74:805, 1993.
25\. Nelson, DL: Functional wrist motion. Hand Clin 13:83, 1997.
26\. Hermann, KM, and Reese, CS: Relationships among selected measures of impairment, functional limitation, and disability in patients with cervical spine disorder. Phys Ther 81:903, 2001.
27\. Triffitt, PD: The relationship between motion of the shoulder and the stated ability to perform activities of daily living. J Bone Joint Surg 80:41, 1998.
28\. Wagner, MB, et al: Assessment of hand function in Duchenne muscular dystrophy. Arch Phys Med Rehabil 74:801, 1993.
29\. Waddell, G, et al: Objective clinical evaluation of physical impairment in chronic low back pain. Spine 17:617, 1992.
30\. World Health Organization: International Classification of Functioning, Disability and Health: ICF. World Health Organization, Geneva, 2001.
31\. Scalzitti, DA: Examination of Function.
In O'Sullivan, SB, Schmitz, TJ, and Fulk, GD (eds): Physical Rehabilitation, ed 6. FA Davis, Philadelphia, 2014.
32\. Moore, ML: Clinical Assessment of Joint Motion. In Basmajian, JV (ed): Therapeutic Exercise, ed 3. Williams & Wilkins, Baltimore, 1978.
33\. Miller, PJ: Assessment of Joint Motion. In Rothstein, JM (ed): Measurement in Physical Therapy. Churchill Livingstone, New York, 1985.
34\. Lea, RD, and Gerhardt, JJ: Current concepts review: Range-of-motion measurements. J Bone Joint Surg Am 77:784, 1995.
35\. Williams, MA, et al: A systematic review of reliability and validity studies of methods for measuring active and passive cervical range of motion. J Manipulative Physiol Ther 33:138, 2010.
36\. van de Pol, R, et al: Inter-rater reliability for measurement of passive physiological range of motion of upper extremity joints is better if instruments are used: A systematic review. J Physiother 56:7, 2010.
37\. van Trijffel, E, et al: Inter-rater reliability for measurement of passive physiological movements in lower extremity joints is generally low: A systematic review. J Physiother 56:223, 2010.
38\. Grohmann, JE: Comparison of two methods of goniometry. Phys Ther 63:922, 1983.
39\. Hamilton, GF, and Lachenbruch, PA: Reliability of goniometers in assessing finger joint angle. Phys Ther 49:465, 1969.
40\. Boone, DC, et al: Reliability of goniometric measurements. Phys Ther 58:1355, 1978.
41\. Pandya, S, et al: Reliability of goniometric measurements in patients with Duchenne muscular dystrophy. Phys Ther 65:1339, 1985.
42\. Bovens, AM, et al: Variability and reliability of joint measurements. Am J Sport Med 18:58, 1990.
43\. Hellebrandt, FA, Duvall, EN, and Moore, ML: The measurement of joint motion. Part III: Reliability of goniometry. Phys Ther Rev 29:302, 1949.
44\. Low, JL: The reliability of joint measurement. Physiotherapy 62:227, 1976.
45\. Greene, BL, and Wolf, SL: Upper extremity joint movement: Comparison of two measurement devices.
Arch Phys Med Rehabil 70:299, 1989.
46\. Tucci, SM, et al: Cervical motion assessment: A new, simple and accurate method. Arch Phys Med Rehabil 67:225, 1986.
47\. Youdas, JW, Carey, JR, and Garrett, TR: Reliability of measurements of cervical spine range of motion: Comparison of three methods. Phys Ther 71:2, 1991.
48\. Fitzgerald, GK, et al: Objective assessment with establishment of normal values for lumbar spine range of motion. Phys Ther 63:1776, 1983.
49\. Nitschke, JE, et al: Reliability of the American Medical Association Guides' model for measuring spinal range of motion. Spine 24:262, 1999.
50\. Mayerson, NH, and Milano, RA: Goniometric measurement reliability in physical medicine. Arch Phys Med Rehabil 65:92, 1984.
51\. Watkins, MA, et al: Reliability of goniometric measurements and visual estimates of knee range of motion obtained in a clinical setting. Phys Ther 71:90, 1991.
52\. Riddle, DL, Rothstein, JM, and Lamb, RL: Goniometric reliability in a clinical setting: Shoulder measurements. Phys Ther 67:668, 1987.
53\. Ekstrand, J, et al: Lower extremity goniometric measurements: A study to determine their reliability. Arch Phys Med Rehabil 63:171, 1982.
54\. Rothstein, JM, Miller, PJ, and Roettger, RF: Goniometric reliability in a clinical setting: Elbow and knee measurements. Phys Ther 63:1611, 1983.
55\. Solgaard, S, et al: Reproducibility of goniometry of the wrist. Scand J Rehabil Med 18:5, 1986.
56\. Lovell, FW, Rothstein, JM, and Personius, WJ: Reliability of clinical measurements of lumbar lordosis taken with a flexible rule. Phys Ther 69:96, 1989.
57\. Bartlett, JD, et al: Hip flexion contractures: A comparison of measurement methods. Arch Phys Med Rehabil 66:620, 1985.
58\. Jonson, SR, and Gross, MT: Intraexaminer reliability, interexaminer reliability, and mean values for nine lower extremity skeletal measures in healthy naval midshipmen. J Orthop Sports Phys Ther 25:253, 1997.
59\.
Elveru, RA, Rothstein, JM, and Lamb, RL: Goniometric reliability in a clinical setting. Phys Ther 68:672, 1988.
60\. Diamond, JE, et al: Reliability of a diabetic foot evaluation. Phys Ther 69:797, 1989.
61\. MacDermid, JC, et al: Intratester and intertester reliability of goniometric measurement of passive lateral shoulder rotation. J Hand Ther 12:187, 1999.
62\. Armstrong, AD, et al: Reliability of range-of-motion measurement in the elbow and forearm. J Shoulder Elbow Surg 7:573, 1998.
63\. Boon, AJ, and Smith, J: Manual scapular stabilization: Its effect on shoulder rotational range of motion. Arch Phys Med Rehabil 81:978, 2000.
64\. Horger, MM: The reliability of goniometric measurements of active and passive wrist motions. Am J Occup Ther 44:342, 1990.
65\. Ellis, B, Bruton, A, and Goddard, JR: Joint angle measurement: A comparative study of the reliability of goniometry and wire tracing for the hand. Clin Rehabil 11:314, 1997.
66\. Pellecchia, GL, and Bohannon, RW: Active lateral neck flexion range of motion measurements obtained with a modified goniometer. Reliability and estimates of normal. J Manipulative Physiol Ther 21:443, 1998.
67\. Nilsson, N: Measuring passive cervical motion: A study of reliability. J Manipulative Physiol Ther 18:293, 1995.
68\. Williams, R, et al: Reliability of the modified-modified Schober and double inclinometer methods for measuring lumbar flexion and extension. Phys Ther 73:26, 1993.
69\. Defibaugh, JJ: Measurement of head motion. Part II: An experimental study of head motion in adult males. Phys Ther 44:163, 1964.
70\. Balogun, JA, et al: Inter- and intratester reliability of measuring neck motions with tape measure and Myrin Gravity-Reference Goniometer. J Orthop Sports Phys Ther 10:248, 1989.
71\. Capuano-Pucci, D, et al: Intratester and intertester reliability of the cervical range of motion. Arch Phys Med Rehabil 72:338, 1991.
72\.
LaStayo, PC, and Wheeler, DL: Reliability of passive wrist flexion and extension goniometric measurements: A multicenter study. Phys Ther 74:162, 1994.
73\. Macedo, LG, and Magee, DJ: Effects of age on passive range of motion of selected peripheral joints in healthy adult females. Physiother Theory Pract 25:145, 2009.
74\. Mayer, TG, et al: Spinal range of motion. Spine 22:1976, 1997.
75\. Cobe, HM: The range of active motion at the wrist of white adults. J Bone Joint Surg Br 10:763, 1928.
76\. Hewitt, D: The range of active motion at the wrist of women. J Bone Joint Surg Br 10:775, 1928.
77\. Palmer, ML, and Epler, M: Clinical Assessment Procedures in Physical Therapy, ed 2. JB Lippincott, Philadelphia, 1998.
78\. Clarkson, HM: Musculoskeletal Assessment: Joint Range of Motion and Manual Muscle Strength, ed 2. Williams & Wilkins, Baltimore, 2000.
79\. Robson, P: A method to reduce the variable error in joint range measurement. Ann Phys Med 8:262, 1966.
80\. Mullaney, MJ, et al: Reliability of shoulder range of motion comparing a goniometer to a digital level. Physiother Theory Pract 26:327, 2010.
81\. Goodwin, J, et al: Clinical methods of goniometry: A comparative study. Disabil Rehabil 14:10, 1992.
82\. Petherick, M, et al: Concurrent validity and intertester reliability of universal and fluid-based goniometers for active elbow range of motion. Phys Ther 68:966, 1988.
83\. Brown, A, et al: Validity and reliability of the Dexter hand evaluation and therapy system in hand-injured patients. J Hand Ther 13:37, 2000.
84\. Weiss, PL, et al: Using the Exos Handmaster to measure digital range of motion: Reliability and validity. Med Eng Phys 16:323, 1994.
85\. Clapper, MP, and Wolf, SL: Comparison of the reliability of the Ortho Ranger and the standard goniometer for assessing active lower extremity range of motion. Phys Ther 68:214, 1988.
86\.
Ellison, JB, Rose, SJ, and Sahrman, SA: Patterns of hip rotation: A comparison between healthy subjects and patients with low back pain. Phys Ther 70:537, 1990.
87\. Rheault, W, et al: Intertester reliability and concurrent validity of fluid-based and universal goniometers for active knee flexion. Phys Ther 68:1676, 1988.
88\. Rome, K, and Cowieson, F: A reliability study of the universal goniometer, fluid goniometer, and electrogoniometer for the measurement of ankle dorsiflexion. Foot Ankle Int 17:28, 1996.
89\. Reynolds, PM: Measurement of spinal mobility: A comparison of three methods. Rheumatol Rehabil 14:180, 1975.
90\. Miller, MH, et al: Measurement of spinal mobility in the sagittal plane: New skin distraction technique compared with established methods. Br J Rheumatol 11:4, 1984.
91\. Gill, K, et al: Repeatability of four clinical methods for assessment of lumbar spinal motion. Spine 13:50, 1988.
92\. Lindahl, O: Determination of the sagittal mobility of the lumbar spine. Acta Orthop Scand 37:241, 1966.
93\. Mayer, RS, et al: Variance in the measurement of sagittal lumbar range of motion among examiners, subjects, and instruments. Spine 20:1489, 1995.
94\. Chen, SP, et al: Reliability of the lumbar sagittal motion measurement methods: Surface inclinometers. J Occup Environ Med 39:217, 1997.
95\. Breum, J, Wilberg, J, and Bolton, JE: Reliability and concurrent validity of the BROM II for measuring lumbar mobility. J Manipulative Physiol Ther 18:497, 1995.
96\. Colton, T: Statistics in Medicine. Little, Brown, Boston, 1974.
97\. Fetters, L, and Tilson, J: Evidence Based Physical Therapy. FA Davis, Philadelphia, 2013.
98\. Di Fabio, RD: Essentials of Rehabilitation Research: A Statistical Guide to Clinical Practice. FA Davis, Philadelphia, 2013.
99\. Bland, JM, and Altman, DG: Measurement error and correlation coefficients \[statistics notes\]. BMJ 313:41, 1996.
100\.
Lahey, MA, Downey, RG, and Saal, FE: Intraclass correlations: There's more there than meets the eye. Psychol Bull 93:586, 1983.
101\. Shrout, PE, and Fleiss, JL: Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 86:420, 1979.
102\. Weir, JP: Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 19:231, 2005.
103\. Stratford, P: Reliability: Consistency or differentiating among subjects? \[letters to the editor\]. Phys Ther 69:299, 1989.
104\. Bland, JM, and Altman, DG: Measurement error \[statistics notes\]. BMJ 312:1654, 1996.
105\. Rothstein, JM: Measurement and Clinical Practice: Theory and Application. In Rothstein, JM (ed): Measurement in Physical Therapy. Churchill Livingstone, New York, 1985, p 41.
106\. DuBois, PH: An Introduction to Psychological Statistics. Harper & Row, New York, 1965, p 401.
107\. Riddle, DL, and Stratford, PW: Is This Change Real? Interpreting Patient Outcomes in Physical Therapy. FA Davis, Philadelphia, 2013.
108\. Bartko, JJ: Rationale for reporting standard deviations rather than standard errors of the mean. Am J Psychiatry 142:1060, 1985.
109\. Stratford, P: Use of the standard error as a reliability index of interest: An applied example using elbow flexor strength data. Phys Ther 77:745, 1997.
110\. Eliasziw, M, et al: Statistical methodology for the concurrent assessment of interrater and intrarater reliability: Using goniometric measurement as an example. Phys Ther 74:777, 1994.
111\. Stratford, PW, and Riddle, DL: When minimal detectable change exceeds a diagnostic test-based threshold change value for an outcome measure: Resolving the conflict. Phys Ther 92:1338, 2012.
112\. Haley, SM, and Fragala-Pinkham, MA: Interpreting change scores of tests and measures used in physical therapy. Phys Ther 86:735, 2006.
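
Returning to Exercise 11, the hand calculation of Pearson's *r*, the pooled standard deviation, the SEM, and the MDC90 can be sketched in Python as a way to check a completed recording form. This is an illustrative sketch under stated assumptions, not part of the original exercise: the function name `pearson_sem_mdc90` and the sample values are hypothetical and invented for demonstration.

```python
import math


def pearson_sem_mdc90(first, second):
    """Return (r, sd_pooled, sem, mdc90) for paired test-retest measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two subjects")
    n = len(first)
    mean_x = sum(first) / n                                # steps 5a and 5b
    mean_y = sum(second) / n
    dx = [x - mean_x for x in first]                       # step 5c
    dy = [y - mean_y for y in second]
    sum_xy = sum(a * b for a, b in zip(dx, dy))            # steps 5d and 5f
    ss_x = sum(a * a for a in dx)                          # steps 5e and 5f
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one set of measurements has no spread")
    r = sum_xy / (math.sqrt(ss_x) * math.sqrt(ss_y))       # steps 5g and 5h
    sd_x = math.sqrt(ss_x / (n - 1))                       # step 6a
    sd_y = math.sqrt(ss_y / (n - 1))
    sd_pooled = math.sqrt((sd_x ** 2 + sd_y ** 2) / 2)     # step 6b
    # max() guards against a tiny negative value from floating-point rounding.
    sem = sd_pooled * math.sqrt(max(0.0, 1 - r))           # step 6c
    mdc90 = sem * math.sqrt(2) * 1.65                      # step 6d
    return r, sd_pooled, sem, mdc90


# Hypothetical elbow flexion ROM values in degrees (invented for illustration).
x = [140, 145, 138, 150, 142]
y = [142, 144, 141, 148, 143]
r, sd_pooled, sem, mdc90 = pearson_sem_mdc90(x, y)
print(f"r = {r:.3f}, SDp = {sd_pooled:.2f}, SEM = {sem:.2f}, MDC90 = {mdc90:.2f}")
```

The intermediate names (`ss_x`, `sum_xy`, and so on) mirror the column sums on the Exercise 11 recording form, so each can be compared with the corresponding hand-calculated entry.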
