RELIABILITY and VALIDITY of MEASUREMENTS of RANGE of MOTION and MUSCLE LENGTH TESTING of the LOWER EXTREMITY

Chapter 15





RELIABILITY AND VALIDITY OF LOWER EXTREMITY GONIOMETRY


Chapters 11 through 14 presented techniques for measuring range of motion of joints and length of muscles in the lower extremities. When selecting appropriate techniques for measuring range of motion and muscle length, one must consider whether the technique selected has been shown to be reliable and valid.15 This chapter presents information regarding the reliability and validity (when available) of techniques used for measuring lower extremity range of motion and muscle length. In accordance with the discussion of the preferred methods of analyzing reliability presented in Chapter 2, only those studies that examined reliability using the intraclass correlation coefficient (ICC) or Pearson product moment correlation coefficient (Pearson’s r) with a follow-up test are included.


As is apparent from the information and tables that follow, seldom is one method of goniometry or muscle length testing shown to be clearly preferable in terms of reliability as demonstrated by more than one investigator. In fact, many studies are so vaguely described as to be unrepeatable by others, and studies that are repeated in some form often produce conflicting results. Therefore, unless obvious conclusions can be made regarding the efficacy of one technique over another, no interpretive comments are made regarding the information presented in this chapter. Rather, the chapter serves as a reference to the reader and, it is hoped, makes obvious the areas of research in lower extremity range of motion and muscle length testing that have yet to be addressed.



Hip Flexion/Extension


Several studies that examined the reliability of hip flexion and extension range of motion have been published. Using a combination of the Thomas and Mundale techniques (see Chapters 11 and 14 for a description of the Mundale and Thomas techniques), Stuberg and colleagues51 measured the reliability of measurements of passive hip flexion with the knee extended (straight leg raise) and passive hip extension in 20 children, aged 5 to 21 years, with moderate to severe hypertonicity. To examine inter-rater reliability, three pediatric physical therapists repeated each of the measurements three times on each child in one testing session, using a blinded goniometer. Measurements were repeated 5 to 7 days later on five of the subjects to determine intrarater reliability. A two-way analysis of variance (ANOVA) for repeated measures was used to determine intrarater and inter-rater reliability for each motion. Analysis of intrarater reliability showed no significant difference between the three measures taken by a single examiner in one session, and intrarater error was calculated at less than or equal to 5 degrees for most measurements, based on the 95% confidence interval. Conversely, significant inter-rater variation was found in hip flexion and extension measurements.


Active hip flexion and extension, along with 26 other motions of the upper and lower extremities, were measured in 60 adults, aged 60 to 84 years, by Walker and colleagues.53 Techniques recommended by the American Academy of Orthopaedic Surgeons4 (AAOS) were used for all measurements. Before data were collected, intrarater reliability was determined using four subjects. Although the exact number of motions measured to determine reliability is unclear from the procedure, the authors reported a Pearson’s r for intrarater reliability greater than .81 for all hip motions (Table 15-1). Mean error between measurements was calculated to be 5 degrees ± 1 degree.



In a study designed to compare reliability of the Orthoranger (an electronic, computerized goniometer) and the universal goniometer, Clapper and Wolf13 examined intrarater reliability of active hip flexion and extension goniometry, in addition to eight other motions of the lower extremities. Twenty healthy adults were included in the examination of reliability. The specific technique for measuring hip flexion and extension was not delineated in the article, so comparison with other studies is difficult. Intraclass correlation coefficients reported for hip flexion and extension were .95 and .83, respectively (see Table 15-1).


Owing to the variation in measuring techniques for hip flexion and extension, reliability of measurement of these two motions would be expected to vary, depending on the technique used. Two different groups of investigators compared reliability characteristics of different methods of measuring hip flexion or extension. Bartlett et al7 measured hip extension in healthy children and in children with meningomyelocele or spastic diplegia. All subjects were between the ages of 4 and 20 years. Four different positioning techniques were compared: AAOS (contralateral hip flexed), Mundale, pelvifemoral angle, and Thomas (see Chapters 11 and 14 for a description of techniques). Intrarater and inter-rater reliability was reported using Pearson’s r. Values for intrarater reliability ranged from .63 for the Mundale test in the group with spastic diplegia to .93 for the AAOS test in the group with meningomyelocele (see Table 15-1). Single-rater error in the group of healthy children was reported as 5 degrees when the AAOS and Thomas techniques were used, and 10 degrees when the Mundale and pelvifemoral angle techniques were used. Inter-rater reliability was generally lower than intrarater reliability, and correlation values ranged from .70 for the Thomas test in patients with spastic diplegia to .92 for the AAOS technique in patients with meningomyelocele (Table 15-2). Rater error was calculated based on the 95% confidence interval for the mean difference between raters, and was reported as 10 degrees for all techniques except the Mundale (14 degrees) in children with meningomyelocele, 10 degrees for the Mundale and pelvifemoral angle techniques in healthy children, 3 degrees for the AAOS and Thomas techniques in healthy children, and 11.5 degrees and 12.2 degrees, respectively, for the AAOS and Thomas techniques in patients with spastic diplegia.



A second group of investigators2 measured hip flexion in 20 healthy adults of unstated age using both the AAOS technique (but with the contralateral hip extended) and the pelvifemoral angle technique, as well as hip extension in the same 20 healthy adults using the pelvifemoral angle technique. Two examiners performed the same measurements in each subject to examine variability between raters (intrarater reliability was not considered). Although the investigators did not use inferential statistics to report inter-rater reliability, raw data were reported, allowing the reader to calculate the ICCs for inter-rater reliability for each test. Intraclass correlation coefficients and the standard error of the measurement (SEMm) were calculated by the author of this text for each set of data (hip flexion, AAOS technique with contralateral hip extended; hip flexion, pelvifemoral angle technique; hip extension, pelvifemoral angle technique). Intraclass correlation coefficients were calculated using a two-way random effects model with absolute agreement. The ICCs, which are reported in Table 15-2, indicate higher inter-rater reliability for measuring hip flexion when the AAOS technique rather than the pelvifemoral angle technique was used in this group of examiners. Reliabilities for measuring hip extension using the pelvifemoral angle technique were similar to those obtained in measuring hip flexion using the same technique. The SEMm for hip flexion was 4.2 degrees using the pelvifemoral angle technique and 5.2 degrees using the AAOS technique with the contralateral hip extended. When hip extension was performed using the pelvifemoral angle technique, the SEMm was 1.9 degrees.
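The two-way random effects, absolute agreement ICC mentioned above can be computed directly from a subjects-by-raters table of raw measurements. The following is a minimal sketch, assuming the single-measurement form often labeled ICC(2,1); the source does not state whether single or average measures were used. The function name icc_2_1 and the sample readings are hypothetical and are not data from the cited study.

```python
def icc_2_1(scores):
    """Two-way random effects, absolute agreement, single-measurement ICC.

    scores: list of rows, one row per subject, one value per rater.
    Assumption: every subject was measured once by every rater.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical hip flexion readings in degrees: [rater A, rater B] per subject
readings = [[118, 121], [105, 109], [126, 124], [112, 117], [131, 133]]
print(round(icc_2_1(readings), 2))
```

Because this form counts systematic differences between raters as error, two raters whose readings differ by a constant offset score lower here than they would on a consistency-only coefficient.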



Hip Abduction/Adduction


As was true in hip flexion and extension, several studies have examined the reliability of hip abduction and adduction range of motion measurements. Intrarater and inter-rater reliability of active hip abduction measurement, along with five other motions of the upper and lower extremities, was examined in a group of 12 healthy adult males aged 26 to 54 years.9 All motions were measured in each subject three times per session by each of four different physical therapists. Values were reported as .75 for intrarater reliability and .55 for inter-rater reliability (ICC) (Tables 15-3 and 15-4). Repeated measures ANOVA revealed significant intrarater variation for two of the four examiners, and significant inter-rater variation among all four examiners, for measurements of hip abduction.




Other investigators who have examined the reliability of active hip abduction and adduction include Clapper and Wolf13 and Walker et al,53 although these investigators examined only intrarater reliability. Both studies have been described previously, and each used a different statistical method for reporting reliability. Clapper and Wolf13 reported ICC levels of .86 and .80 for hip abduction and adduction, respectively, whereas Walker et al53 used Pearson’s r and reported values “greater than .81” for both hip abduction and hip adduction (see Table 15-3) and a mean error between repeated measures of 5 degrees.


The reliability of passive hip abduction and adduction range of motion has been studied primarily in the pediatric population. However, the intrarater reliability of measurement of passive hip abduction, as well as several other hip motions, also has been examined in a group of patients with osteoarthritis of the hip.43 Of the hip motions measured, only hip abduction was measured with a goniometer, specifically a Lafayette Gollehon extendable goniometer. Twenty-two patients (aged 50 to 84 years) participated in this study, in which measurements were taken on two separate occasions by a single examiner with 7 years of experience as a physical therapist. Data analysis included calculation of ICCs (model 2.2), standard error of the measurement, 95% confidence intervals, and minimum detectable change. For hip abduction measurements, intrarater reliability was .94 with a standard error of measurement of 3.2 degrees (see Table 15-3).
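The relationship among a reliability coefficient, the standard error of the measurement, and the minimum detectable change can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses the commonly cited formulas SEM = SD × √(1 − reliability) and MDC95 = 1.96 × √2 × SEM, which may not be the exact formulas used in the study described above, and the function names are hypothetical.

```python
import math


def sem_from_reliability(sd, reliability):
    """Standard error of the measurement from a sample SD and an ICC or r.

    Assumption: SEM = SD * sqrt(1 - reliability).
    """
    return sd * math.sqrt(1.0 - reliability)


def mdc95(sem):
    """Minimum detectable change at the 95% confidence level.

    Assumption: MDC95 = 1.96 * sqrt(2) * SEM, where sqrt(2) accounts for
    measurement error being present in both of two repeated measurements.
    """
    return 1.96 * math.sqrt(2.0) * sem


# Using the SEM of 3.2 degrees reported above for passive hip abduction.
# The result is illustrative only and is not a value taken from the study.
print(round(mdc95(3.2), 1))
```

Under these assumptions, a change in passive hip abduction smaller than roughly 9 degrees could not be distinguished from measurement error with 95% confidence.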


Several studies involving the reliability of passive hip abduction and adduction range of motion in children have been published. Inter-rater reliability was examined for hip abduction and adduction measurements in a subgroup drawn from a sample of 54 healthy infants aged 12 hours to 6 days old.16 The subgroup consisted of nine infants in whom passive hip abduction and adduction were measured. Abduction was measured twice, once with the hip in 0 degrees of extension, and once with the hip flexed to 90 degrees. Adduction was measured with the hip in 0 degrees of extension. Seven other motions of the lower extremities also were examined in this study (see the remainder of this chapter for other motions of the lower extremity). Specific goniometric alignment and techniques were difficult to discern from the description of the study. Inter-rater reliabilities (Pearson’s r) ranged from a high of .97 for hip abduction with the hip extended in the left lower extremity, to a low of .57 for hip abduction with the hip flexed in the same extremity (see Table 15-4). The SEMm from the Drews et al16 study (calculated by the author of this text from data provided) ranged from 1.7 degrees for left hip abduction with the hip extended to 6.4 degrees for left hip abduction with the hip flexed.


Much lower reliability for the measurements of hip abduction and adduction was reported by Owen and colleagues,40 who examined the reliability of goniometric measurements of all motions of the hip in a group of 82 children (aged 4 to 10 years) at 15 and 24 months post femoral shaft fracture. Subjects from four separate clinical sites were included in the study. Measurements were taken of both hips of each subject by an undefined number of examiners with undefined levels of experience, using AAOS measurement techniques and the examiner’s choice of goniometer. Reliability was calculated using ICCs and 95% confidence intervals. Inter-rater reliability for hip abduction ranged from .28 to .43 with 95% confidence intervals of 24 to 32 degrees. For hip adduction, ICCs ranged from .19 to .20, with 95% confidence intervals ranging from 15 to 20 degrees (see Table 15-4).


At least three groups of investigators have examined the reliability of range of motion measurements of the hip and other joints in children with spastic cerebral palsy. Fosang and colleagues19 investigated the reliability of selected range of motion measurements, including hip abduction, as part of a larger study of reliability of various lower extremity clinical measures. Eighteen children (aged 2 to 10 years) participated in the study, in which six experienced physical therapists measured the spastic hip of each of the subjects twice over 6 days. All examiners received a day of training in measurement techniques before the time of data collection. Intrarater and inter-rater reliability was calculated using ICCs. Standard errors of the measurements and 95% confidence intervals also were calculated. Intraclass correlation coefficients ranged from .58 to .83 for intrarater reliability (see Table 15-3), and from .62 to .73 for inter-rater reliability (see Table 15-4). Standard errors of measurements for individual raters ranged from 3.7 to 6.9 degrees, and between examiners, ranges were from 5.3 to 5.6 degrees. McWhirk and Glanzman35 investigated the ability of an experienced and an inexperienced examiner to achieve similar range of motion measurements of five lower extremity motions in children with spastic cerebral palsy. Forty-six lower extremities were measured in 25 children by two physical therapists—one with 10 years and one with a single year of experience. Standardized measurement techniques were reviewed and used during the data collection process. In addition, each examiner served as an assistant for the other by stabilizing the extremities as needed during the measuring process. Inter-rater reliability for hip abduction measurements was high with an ICC of .91 (see Table 15-4) and a 95% confidence interval of 3.57 ± 1.35 degrees.


Stuberg et al51 examined intrarater and inter-rater reliability for hip abduction and adduction using a two-way ANOVA for repeated measures (see the Hip Flexion/Extension section of this chapter). No significant difference was found between the three measures of hip abduction or adduction taken by a single examiner, and intrarater error was calculated at less than or equal to 5 degrees for most measurements, based on the 95% confidence interval. Significant within-session inter-rater variation was noted for hip adduction but not for abduction, although across-session inter-rater variation was significant for both measures.



Hip Medial-Lateral Rotation


Intrarater reliability of hip rotation measurements has been reported by two groups of investigators, whose studies have been described previously (see the Hip Flexion/Extension and Hip Abduction/Adduction sections of this chapter).13,53 One of the studies indicated that goniometric measurements were performed as described by the AAOS; the second study did not describe the goniometric techniques used.13 However, in neither of the above studies can the relative flexed or extended position of the hip be determined, as the AAOS guidelines describe techniques for measuring hip rotation with the hip flexed or extended.4,26 Intrarater reliability of hip medial and lateral rotation measurements was reported as “greater than .81” by Walker et al,53 with a mean error between repeated measures of 5 degrees. The study by Clapper and Wolf13 demonstrated lower reliability for hip lateral rotation measurements (.80) than for measurements of hip medial rotation (.92) (Table 15-5).



Aalto and colleagues1 investigated the reliability of passive hip internal rotation range of motion goniometry as part of a larger study designed to examine the effects of passive stretch on the reliability of hip range of motion measurements. Measurements were taken in 20 healthy adults (aged 18 to 45 years) by two experienced physical therapists, using standardized techniques. During the initial measurement session, each examiner measured hip internal rotation range of motion twice following a single passive stretch of the internal rotators, and twice again following eight passive stretches of the internal rotators. During the second measurement session, each examiner repeated the prior measurements a single time per stretching session. Measurements were taken with subjects placed in a sitting position. Reliability was determined through the calculation of ICCs. Intrarater reliability ranged from .93 to .97 within session and from .72 to .97 between sessions (see Table 15-5). Inter-rater reliability ranged from .83 to .91 within session and from .75 to .82 between sessions (Table 15-6).



Two studies that investigated the inter-rater reliability of hip rotation have been described previously16,40 (see the Hip Abduction/Adduction section). Drews et al16 measured passive hip rotation with the hip and knee flexed to 90 degrees and the patient in the supine position; Owen et al40 measured hip rotation with the child supine, the hip in neutral, and the knee flexed to 90 degrees. Drews et al16 reported correlation values (Pearson’s r) for inter-rater reliability of hip medial rotation as .78 on the right and .91 on the left, and for hip lateral rotation as .63 on the right and .79 on the left (see Table 15-6). The SEMm from the Drews et al16 study (calculated by the author of this text from data provided) ranged from 2.8 degrees for medial rotation of the left hip to 7.0 degrees for lateral rotation of the right hip. Inter-rater reliability in the study by Owen et al40 ranged from .06 for measurements of lateral rotation of the right hip to .41 for measurements of medial rotation of the right hip (see Table 15-6).


Simoneau et al49 compared the influence of hip position and sex on active hip rotation in 60 college-age individuals. Hip medial and lateral rotation was measured in each individual by two examiners with the subject in the seated and the prone position. Inter-rater reliabilities were calculated using ICCs and were reported to range from .90 to .94 for all measurements of hip rotation (see Table 15-6), regardless of whether the hip was flexed or extended when the measurement was taken. Calculation of the SEMm from the data provided in the Simoneau et al49 study revealed SEMm values between 2.1 and 2.6 degrees for all measurements of hip rotation, again regardless of whether the hip was flexed or extended during the measurement.



Knee Flexion/Extension


Various investigators have examined the reliability of goniometric measurement of knee flexion and extension. Intrarater reliability of active knee flexion and extension range of motion was examined by several groups,9–11,13,53 some of whose studies have been described previously (see the Hip section of this chapter). Brosseau et al11 compared the reliability of the universal goniometer with that of the parallelogram goniometer for measuring active knee flexion in 60 healthy college-age adults. Measurements were made with the universal goniometer, using standard landmarks, with subjects positioned supine and with the knee in two separate positions, slightly flexed and flexed at a larger angle. Intraclass correlation coefficients for intrarater reliability for the two positions of knee flexion ranged from a low of .86 to a high of .97 (Table 15-7); inter-rater reliability ranged from a low of .62 to a high of .94 (Table 15-8). Intrarater error ranged from 3.8 to 5.5 degrees, and inter-rater error ranged from 7.3 to 18.1 degrees. The actual level of reliability and the measurement error obtained depended on the examiner performing the measurement, which measurement was used for the analysis, and the position of the knee (less or more flexed). Intrarater and inter-rater reliability levels were higher with the knee more flexed and were lower with the knee in the less flexed position.




In a follow-up investigation, Brosseau and colleagues10 repeated their study of active knee range of motion with slight modifications in a group of 60 subjects (average age, 52 years) with knee restrictions. Reliability of the universal goniometer versus that of the parallelogram goniometer again was compared, but in this study, measurements and radiographs were taken in each subject’s maximally flexed and maximally extended positions. Standard positioning and measurement techniques were used by both examiners, and the range of motion of the knee also was measured on radiographs for the purpose of establishing validity. Reliability of the goniometric measurements was calculated using ICC values, and intrarater reliability was .99 for knee flexion and .97 to .98 for knee extension (see Table 15-7). Inter-rater reliability was .98 for knee flexion measurements and .89 to .93 for knee extension measurements (see Table 15-8). Criterion validity was examined through the calculation of Pearson’s product-moment correlation coefficients between radiographic and universal goniometer measurements; these were .98 for knee flexion and between .39 and .44 for knee extension.


Boone et al,9 Clapper and Wolf,13 and Walker et al53 also have examined the intrarater reliability of active knee flexion range of motion using the universal goniometer. Exact positioning of subjects in the Clapper and Wolf13 and Walker et al53 studies was not described in sufficient detail to determine whether subjects were positioned prone or supine, nor were the landmarks that were used listed. Subjects in the Boone et al9 study were positioned supine for knee measurement, but the distal arm of the goniometer was aligned with the tibia rather than with the fibula. Other details of each of these studies have been described previously (see the Hip section of this chapter). Two of these groups of investigators used ICCs for determining intrarater reliability and obtained values ranging from .85 for knee extension to .95 for knee flexion.9,13 Repeated measures ANOVA performed on data for measurements of knee flexion in the Boone et al9 study revealed significant intrarater variation for one of the four examiners and significant inter-rater variation among all four examiners. Walker et al53 calculated reliability using Pearson’s r and obtained values for intrarater reliability of greater than .81 (see Table 15-7) and a mean error between repeated measures of 5 degrees.


Other groups of investigators have examined the reliability of measuring passive, rather than active, knee flexion range of motion. Both Rothstein et al48 and Watkins et al55 examined the reliability of passive knee flexion and extension measurements on patients in a clinical setting. The two groups of 12 and 43 patients, respectively, had been given a variety of diagnoses. No standardization of patient positioning or landmarks was used in either study. Patients in the study conducted by Rothstein et al48 had measurements of knee motion taken with three different goniometers, and reliability using each instrument was compared. Data were analyzed using both Pearson’s r48 and ICCs.48,55 Intrarater reliability for all measurements was high (see Table 15-7), regardless of the type of goniometer used.48


Three additional groups of investigators have examined the reliability of passive knee extension measurements in children. One group measured passive knee extension in a sample of 150 children with Duchenne’s muscular dystrophy;41 the other two groups measured the same motion in children with spastic cerebral palsy.35,51 All studies have been described previously (see the Hip section of this chapter). Intrarater reliability for the measurement of passive knee extension in children with Duchenne’s muscular dystrophy was .93 (ICC)41 (see Table 15-7). Inter-rater reliability for measurement of the same motion was lower (ICC = .78; see Table 15-8), although measurements in this case were taken in a group of children with spastic cerebral palsy.35 Finally, Stuberg et al51 reported no significant differences among the three measurements of passive knee extension taken by a single examiner in each of 20 children with cerebral palsy based on a two-way ANOVA for repeated measures.


Most studies of inter-rater reliability of goniometric measurement of knee motion demonstrate much higher reliability for knee flexion than for knee extension measurements (see Table 15-8). Inter-rater reliabilities at or above .90 (ICC) were obtained in three studies that measured knee flexion range of motion, all of which have been described previously.11,48,55 These studies included measurements of passive and active knee flexion in healthy adults and in adult patients with varied diagnoses.


Gogia et al25 examined inter-rater reliability and validity of flexion and extension measurements of the knee joint in 30 healthy adults between the ages of 20 and 60 years. Subjects were positioned passively in some arbitrarily determined degree of knee flexion, then goniometric measurement of the knee position was taken separately by two examiners. An x-ray was taken of each subject’s knee before the subject was allowed to move. Inter-rater reliability and validity of goniometric measurements were calculated using both the ICC and Pearson’s r. Reliabilities ranged from .98 (Pearson’s r) to .99 (ICC) for inter-rater reliability, and from .97 (Pearson’s r) to .99 (ICC) for validity. This study, as did the study by Brosseau et al,11 as described previously, provided support for the reliability and validity of goniometric measurements of knee flexion (see Table 15-8).


High inter-rater reliability of knee flexion measurements taken with a universal goniometer also was reported by Mitchell and colleagues.38 This group of investigators measured active knee flexion in a group of 20 adults who were healthy or who had a diagnosis of rheumatoid arthritis. A standardized technique was used for aligning the goniometer that involved positioning the proximal and distal arms of the instrument parallel to the anterior aspect of the thigh and the tibia and the axis parallel to the lateral knee joint line. Despite the fact that neither examiner had previous clinical experience in using a goniometer, inter-rater reliabilities (Pearson’s r) were high (.96), with a standard error reported of 0.16 degree (see Table 15-8).


Only three studies were found in which inter-rater reliability levels for knee flexion in adult subjects fell below .90. One study involved examination of inter-rater reliability of knee flexion range of motion in a group of 20 healthy adults.46 Data were analyzed using Pearson’s r to determine correlation and paired t tests to determine whether a significant difference could be discerned between data obtained by the two examiners. Although a Pearson’s r of .87 was obtained, indicating good reliability, paired t tests revealed a significant difference between examiners.46 A second study, in which inter-rater reliability of knee flexion measurements was low, involved the measurement of active knee flexion in a group of 12 healthy adult males aged 26 to 54 years.9 Standardized patient positioning and landmarks for goniometry were used. Inter-rater reliability was calculated using ICCs, and reliability for knee flexion equaled .50.
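The combination of a high Pearson’s r with a significant paired t test, as described above, arises when one examiner reads consistently higher than the other. The sketch below illustrates this with hypothetical knee flexion readings (not data from any cited study); the function names are hypothetical, and the critical value of 2.571 assumes a two-tailed test at the .05 level with 5 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic for the mean of the differences x - y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


# Hypothetical readings in degrees; rater B reads about 5 degrees higher
rater_a = [130, 135, 140, 125, 145, 138]
rater_b = [136, 139, 146, 130, 149, 144]

r = pearson_r(rater_a, rater_b)
t = paired_t(rater_b, rater_a)
T_CRITICAL = 2.571  # two-tailed, alpha = .05, df = 5

print(f"r = {r:.2f}, t = {t:.2f}, significant difference: {abs(t) > T_CRITICAL}")
```

Pearson’s r reflects only how closely the two sets of readings follow a straight line, so it is unaffected by a constant offset between examiners, whereas the paired t test and an absolute agreement ICC are both sensitive to it.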


Although both of the previous studies examined reliability in healthy subjects, Lenssen et al34 looked at reliability of measurements of active and passive knee range of motion in patients in the first few days following total knee arthroplasty. Thirty patients (aged 51 to 77 years) had both active and passive knee motions measured by two experienced examiners, who used standardized techniques. Inter-rater reliability was calculated using ICCs and ranged from a low of .62 for passive knee flexion to a high of .89 for active knee flexion (see Table 15-8).


In general, values for inter-rater reliability for knee extension goniometry are less than those reported for knee flexion (see Table 15-8). Most of the studies encountered in the literature have examined reliability of passive knee extension measurements.16,34,35,41,48,55 Reports of inter-rater reliability for knee extension goniometry ranged from a low of .58 to a high of .86 when ICCs were used to analyze the data, regardless of whether standardized testing positions and techniques were used during measurement. In fact, the highest inter-rater reliability for knee extension measurements was obtained when examiners were allowed to use their own techniques for measurement,55 although Rothstein et al48 did find that inter-rater reliability of knee extension measurements improved “dramatically” when standardized patient positioning was used. In the single study in which Pearson’s r was used to analyze the data,16 inter-rater reliability for knee extension goniometry was reported as .69 for the left knee and .89 for the right knee. The SEMm from this study (calculated by the author of this text from data provided) was 2.2 degrees for the right knee and 3.7 degrees for the left knee.


Posted Aug 10, 2016, in PHYSICAL MEDICINE & REHABILITATION
