RELIABILITY and VALIDITY of MEASUREMENT of RANGE of MOTION for the SPINE and TEMPOROMANDIBULAR JOINT

Chapter 10




Chapters 8 and 9 described techniques for measurement of the spine and the temporomandibular joint. The purpose of this chapter is to present information on the reliability and validity of these techniques of measurement of the spine. After an extensive review of published literature, each study related to reliability and validity was screened. Inclusion in this chapter was dependent on the study comprising appropriate statistical analyses that included the use of an intraclass correlation coefficient (ICC) or Pearson product moment correlation coefficient (Pearson’s r) with appropriate follow-up procedures (refer to Chapter 2 for further discussion of reliability and validity). In a few instances in which only one study was performed using a specific technique, an article that did not meet the established criteria was nevertheless included in this chapter, but exceptions to the criteria were rare and are specifically noted in the text.
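The inclusion criterion above pairs Pearson’s r with a follow-up procedure because a correlation alone cannot detect a systematic offset between two sets of measurements. The following is a minimal sketch of that idea, not a procedure taken from any of the reviewed studies. The data are hypothetical, and a paired t statistic is assumed as the follow-up test.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired t statistic for the mean difference between two measurement sets."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))

# Hypothetical tape measurements (cm): tester B reads about 1 cm higher than tester A.
tester_a = [4.0, 5.0, 6.0, 7.0, 8.0]
tester_b = [5.1, 5.9, 7.0, 8.1, 8.9]

r = pearson_r(tester_a, tester_b)   # very high correlation
t = paired_t(tester_a, tester_b)    # large |t|: the testers differ systematically
print(round(r, 3), round(t, 2))
```

In this made-up example the correlation is close to 1 even though the two testers never agree, which is the situation the follow-up test is meant to expose.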


No attempt was made to rate one measurement technique as better or worse than another technique. As was indicated previously, the purpose of this chapter is to present information on the accuracy and reproducibility of measurement techniques of the spine. This information, with the accompanying tables, will enable the reader to make an educated decision as to the most appropriate measurement technique for a particular clinical situation.



THORACIC AND LUMBAR SPINE



TAPE MEASURE



Flexion



Schober Method: Methods for using the tape measure for measuring range of motion of the lumbar spine are numerous. The earliest technique used was the Schober method, in which the distance between the lumbosacral junction and a point 10 cm above the lumbosacral junction was measured before and after the patient flexed and extended his or her spine.34,42 The original Schober method has been modified by changing the landmarks used when range of motion of the spine is measured. These changes in landmarks include measuring the distance between points 5 cm inferior and 10 cm superior to the lumbosacral junction (known as the modified Schober34) and measuring from a point in the center of a line connecting the two posterior superior iliac spines to a mark 15 cm superior to this baseline landmark (the modified-modified Schober).62 Chapter 8 provides detailed descriptions of these measurement techniques.
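The arithmetic shared by the three Schober variants can be sketched as follows. This is an illustration only: the function name and the end-range reading are hypothetical, and the baseline distances are simply those stated in the paragraph above.

```python
def tape_excursion_cm(standing_cm, end_range_cm):
    """Change in distance between the two skin marks, in cm.

    Positive values mean the marks moved apart (flexion);
    negative values mean they moved together (extension).
    """
    return end_range_cm - standing_cm

# Distance between the skin marks in erect standing, as described in the text.
BASELINE_CM = {
    "original Schober": 10.0,           # lumbosacral junction to 10 cm above
    "modified Schober": 15.0,           # 5 cm inferior to 10 cm superior
    "modified-modified Schober": 15.0,  # PSIS midline to 15 cm superior
}

# Hypothetical reading taken at full flexion with the modified Schober marks.
flexed_cm = 21.5
flexion_cm = tape_excursion_cm(BASELINE_CM["modified Schober"], flexed_cm)
print(flexion_cm)
```

The reliability coefficients reported in the studies that follow are computed on repeated values of this kind of excursion.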


In a study that examined lumbar range of motion of 172 individuals, Fitzgerald et al15 used the original Schober method. Before data collection, reliability of the Schober technique was determined by two independent testers, who used as subjects 17 college-age students not involved in the larger study. Inter-rater reliability of the original Schober technique was reported to be 1.0 (Pearson’s r). Although the Pearson correlation analysis was not followed by the appropriate follow-up statistical test (refer to Chapter 2), this study was included in this chapter because it is the only reliability study performed by using the original Schober technique.


Before collecting values of back mobility in 282 children without disability, Haley et al20 established reliability in a pilot study. In one of the few studies conducted to examine intrarater reliability of the modified Schober test, one tester measured six children between the ages of 5 and 9 years. Intrarater reliability was analyzed statistically by using an ICC, yielding results of .83. The authors reported that the test was not only accurate but was “relatively easy and quick to perform on young children.”


Inter-rater reliability of the modified Schober technique for measuring lumbar flexion was reported by Burdett et al,9 who measured 23 individuals between the ages of 20 and 40 years. The authors reported inter-rater reliability of .72 using an ICC and .71 using Pearson’s r. Follow-up testing with an analysis of variance (ANOVA) indicated no significant difference between testers.


A comprehensive study by Hyytiainen et al26 provided intrarater and inter-rater reliability on the modified Schober test administered to measure lumbar flexion. After examining 30 males using the modified Schober method, the authors reported intrarater reliability of .88 and inter-rater reliability of .87 (Pearson’s r). Follow-up testing using a paired t test indicated no significant difference related to intrarater or inter-rater reliability. The authors concluded that the tape measure “was easy to use and required no expensive equipment.”


Williams et al62 examined the intrarater and inter-rater reliability of the modified-modified Schober method for measuring lumbar flexion by using three clinicians whose clinical experience ranged from 3 to 12 years. Examination of 15 patients with low back pain resulted in intrarater reliability using Pearson correlation coefficients of .89 for clinician #1, .78 for clinician #2, and .83 for clinician #3. An ICC performed across all three clinicians resulted in an overall inter-tester reliability coefficient of .72.
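For readers unfamiliar with how an ICC across several clinicians differs from a pairwise Pearson’s r, the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-measure ICC (often written ICC(2,1)). The choice of this form is an assumption; the text does not state which ICC model Williams et al used, and the ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)      # subjects
    k = len(ratings[0])   # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical flexion values (cm): each clinician reads 1 cm higher than the last.
ratings = [
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0],
    [6.0, 7.0, 8.0],
    [7.0, 8.0, 9.0],
]
icc = icc_2_1(ratings)
print(round(icc, 3))
```

In this constructed case any two clinicians correlate perfectly with each other, yet the agreement-type ICC is well below 1 because the constant offset between clinicians is counted as error.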


Also examining the intratester reliability and validity of the modified-modified Schober, Tousignant et al59 studied 31 adults with low back pain. Results indicated intratester reliability for two testers to be .79 and .81; inter-tester reliability between the two testers was .91. Validity was examined by correlating the range measured with the modified-modified Schober against the range obtained by x-ray examination; a Pearson’s correlation of .67 was found. (Note: This study is included despite the fact that only a Pearson correlation was performed, because of the few studies that are related to validity of the Schober technique.)


Macrae and Wright34 tested their contention that the modified Schober was a better test than the original Schober by comparing the correlations of lumbar flexion measurements obtained by both methods versus measurements obtained radiographically (x-rays). The correlation coefficient (Pearson’s r) between the original Schober and the x-ray (validity) was .90 (standard error = 6.2 degrees), and between the modified Schober technique and the x-ray (validity), .97 (standard error = 3.3 degrees). Although data on test-retest reliability were not obtained, the authors concluded that “the proposed modification was an improvement over the original Schober’s.”


In a second study that compared the modified Schober versus radiographic examination of lumbar flexion in an attempt to determine validity, Portek et al52 evaluated 11 subjects. The correlation between the modified Schober technique and x-ray (validity) was reported as .43 (Pearson’s r). However, a t test revealed no significant difference between measures obtained with the modified Schober and with x-rays. In contrast to the study by Macrae and Wright,34 this study demonstrated little correlation between clinical and radiographic techniques. The authors concluded that the modified Schober “only gave indices of back movement which did not reflect true intervertebral movement.”



Summary: Tape Measure for Measurement of Lumbar Flexion: Tables 10-1 to 10-3 provide a summary of reviewed studies that related to the reliability and validity of using a tape measure to measure lumbar flexion. As is indicated in these tables, intrarater reliability ranged from .78 to .89 (Table 10-1), and inter-rater reliability ranged from .71 to 1.0 (Table 10-2), for all techniques in which a tape measure was used. Correlation between measurements made with a tape measure using one of the Schober techniques and radiographic examination yielded validity coefficients of .90 or greater for one study and below .70 for two additional studies (Table 10-3).
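As a quick check on the ranges quoted in this summary, the coefficients reported in the preceding paragraphs can be collected and their minimum and maximum taken. The sketch below uses only values stated in the text above; the tables themselves are not reproduced here, so any entries that appear only in the tables are not included.

```python
# Coefficients for tape-measure lumbar flexion, as reported in the paragraphs above.
intrarater = {
    "Haley et al": [0.83],
    "Hyytiainen et al": [0.88],
    "Williams et al": [0.89, 0.78, 0.83],
    "Tousignant et al": [0.79, 0.81],
}
interrater = {
    "Fitzgerald et al": [1.0],
    "Burdett et al": [0.72, 0.71],
    "Hyytiainen et al": [0.87],
    "Williams et al": [0.72],
    "Tousignant et al": [0.91],
}

def coefficient_range(studies):
    """Return (lowest, highest) coefficient across all listed studies."""
    values = [v for coefficients in studies.values() for v in coefficients]
    return min(values), max(values)

intrarater_range = coefficient_range(intrarater)
interrater_range = coefficient_range(interrater)
print(intrarater_range, interrater_range)
```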






Extension


A modification of the Schober technique was used to measure extension in two studies. Williams et al62 examined the intrarater reliability of three clinicians using the modified-modified Schober technique on 15 subjects with low back pain, reporting correlation coefficients ranging from .69 to .91 (Pearson’s r and ICC). Using a similar measurement technique in the examination of 100 patients with low back pain and 100 individuals without low back pain, Beattie et al4 reported slightly higher intrarater reliability than was reported by Williams et al.62 Test-retest reliability for individuals with low back pain was .93, and for those without low back pain, reliability was .90 (ICC). Beattie et al4 also examined inter-tester reliability in 11 subjects without low back pain, reporting a correlation coefficient of .94 (ICC).


Using a technique that was slightly different from the Schober method, Frost et al18 used a tape measure to examine the change in distance between the spinous process of C7 and the posterior superior iliac spine during spinal extension. After examining 24 subjects, Frost et al18 reported an intrarater reliability of .78 and an inter-rater reliability of .79 (Pearson’s r). ANOVA performed to analyze the difference between first and second measurements (intrarater) indicated no significant difference. However, ANOVA performed to analyze the difference between examiners (inter-rater) revealed that a significant difference existed (p < .05).


Tables 10-4 and 10-5 provide a summary of reliability studies that used the tape measure to examine extension of the spine. As is indicated in these tables, intrarater reliability ranged from .69 to .93 (see Table 10-4), and inter-rater reliability was reported as .79 and .94 (see Table 10-5).




A unique method for using a tape measure to measure lumbar extension range of motion was investigated by Bandy and Reese.3 By using a prone press up and measuring the perpendicular distance between the sternal notch and the support surface, the authors found that the technique was reliable, irrespective of whether or not the examiner was experienced. Intrarater reliability for the experienced examiner was .90 and for the inexperienced examiner, .82. Inter-rater reliability between the experienced and the inexperienced examiner was .85. The authors concluded that the “prone press up appears to be a reliable method to measure lumbar extension.”



Lateral Flexion



Fingertip to Floor: The fingertip-to-floor method measures the distance from the third fingertip to the floor after the patient laterally flexes the spine (a detailed description is presented in Chapter 8). Frost et al18 examined right lateral flexion in 24 individuals using the fingertip-to-floor method. Both intrarater reliability and inter-rater reliability were reported as .91. However, follow-up ANOVA revealed a significant difference (p < .01) between measurements for both intrarater and inter-rater reliability.



Marks at Lateral Thigh: A second technique for measuring lateral flexion is to place marks at the points on the lateral thigh that the third fingertip touches during erect standing and after lateral flexion (a detailed description is presented in Chapter 8). After measuring 18 subjects, Rose56 reported intrarater reliability of .89 for right lateral flexion and .78 for left lateral flexion (Pearson’s r). The least significant difference (defined as the extent to which repeated measures must differ for significant difference to occur) was reported as 3.0 cm and 4.0 cm for right and left lateral flexion, respectively.


Hyytiainen et al26 examined 30 subjects and reported intrarater reliability of .85 and inter-rater reliability of .86 (Pearson’s r). Follow-up testing using an ANOVA for both intrarater and inter-rater reliability revealed no significant differences between measurements taken. Slightly higher intertester reliability was reported by Alaranta et al,2 who reported a correlation of .91 (Pearson’s r) in the measurement of 24 individuals. Follow-up testing in which a paired t test was used revealed no significant differences between testers.





Rotation


A unique method of measuring rotation of the thoracolumbar spine using a tape measure was described by Frost et al,18 who measured the distance between the ipsilateral acromion and the contralateral greater trochanter before and after the subject rotated the spine (a detailed description is presented in Chapter 8). Only one study has attempted to document the use of the tape measure to examine the amount of spinal rotation. Frost et al18 not only provided a description but also determined the reliability of the rotation technique using the tape measure. Intratester reliability on 24 subjects was reported as .71; inter-tester reliability was extremely low, with a reliability coefficient of .13. Follow-up testing using ANOVA indicated no significant difference between measurements related to intrarater reliability, but a significant difference (p < .05) between testers related to inter-tester reliability. The authors indicated that the inability of the two testers to accurately define the landmarks was a limiting factor in this measurement technique and was the cause of the low correlation for inter-rater reliability.



GONIOMETER


Goniometry is a relatively quick and easy method of measuring spinal mobility. In addition, goniometers are readily accessible to the clinician and are commonly used.15



Flexion and Extension


Burdett et al9 examined inter-tester reliability by using goniometry to measure flexion and extension in 23 subjects. These authors reported inter-tester reliability coefficients of .85 (ICC and Pearson’s r) for flexion and .75 (ICC) and .77 (Pearson’s r) for extension. Testing using ANOVA indicated no significant difference between testers for measurements of lumbar flexion or extension.


Although similar results for inter-tester correlation coefficients were reported by Nitschke et al,46 the authors’ interpretation of the findings was very different. After examining inter-tester reliability in measuring flexion and extension in 34 patients with low back pain, Nitschke et al46 reported correlations of .84 (ICC) and .90 (Pearson’s r) for flexion. The 95% confidence interval (CI) for flexion was 30.37 degrees, and the t test showed no significant difference. For extension, the correlation reported was .63 (ICC) and .76 (Pearson’s r) (95% CI = 18.34 degrees; t test not significant). In addition, this study examined these 34 patients for test-retest intrarater reliability, reporting correlations of .92 (ICC and Pearson’s r) for flexion (95% CI = 29.12 degrees; t test not significant), and .81 (ICC) and .82 (Pearson’s r) for extension (95% CI = 17.15 degrees; t test not significant). Nitschke et al46 suggested that although the t test performed did not indicate systematic error, the large 95% CI indicated the presence of random error, revealing that “the measurement with a long arm goniometer had poor reliability.”
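The distinction drawn by Nitschke et al between systematic error (a nonzero average difference) and random error (a wide spread of differences) can be illustrated with a short sketch. The data are hypothetical, and the spread is summarized here as 1.96 times the standard deviation of the test-retest differences; whether this matches the 95% CI calculation used in the original study is an assumption, because the text does not give the formula.

```python
import math

def difference_summary(first, second):
    """Return (mean difference, 95% half-width of the differences) in degrees."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d, 1.96 * sd_d

# Hypothetical goniometer readings (degrees) for six patients, test and retest.
test = [50.0, 60.0, 55.0, 45.0, 65.0, 58.0]
retest = [60.0, 50.0, 63.0, 37.0, 71.0, 52.0]

mean_diff, half_width = difference_summary(test, retest)
print(round(mean_diff, 2), round(half_width, 2))
```

With these made-up values the average difference is zero, so a t test would find no systematic error, yet an individual retest reading can differ from the first by well over 15 degrees.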


Tables 10-8 and 10-9 present a summary of studies related to use of the goniometer to measure lumbar flexion and extension. As indicated in the tables, only one study reported intratester reliability (see Table 10-8), and intertester reliability ranged from .63 to .90 (see Table 10-9).





Lateral Flexion


Fitzgerald et al15 examined inter-tester reliability for lateral flexion using two testers and 17 subjects. Inter-tester correlations reported were .76 for right lateral flexion and .91 for left lateral flexion (Pearson’s r). Although the Pearson correlation was not followed up with an appropriate test to analyze random or systematic error (refer to Chapter 2), this study was included because only one other study explored the reliability of the goniometer in measuring lateral flexion. The authors suggested that the goniometer was “an objective and reliable method for measuring spinal range of motion.”15


Nitschke et al46 also established inter-tester reliability for lateral flexion as part of their study, which was previously described. Inter-tester reliability correlations were .62 (ICC and Pearson’s r) for right lateral flexion (95% CI = 14.23 degrees; t test not significant) and .80 (ICC and Pearson’s r) for left lateral flexion (95% CI = 10.33 degrees; t test not significant). In addition to examining inter-tester reliability, Nitschke et al46 examined these same 34 patients with low back pain to establish intratester reliability. The authors reported intratester reliabilities of .76 (ICC and Pearson’s r) for right lateral flexion (95% CI = 10.91 degrees; t test not significant) and .84 (ICC and Pearson’s r) for left lateral flexion (95% CI = 9.43 degrees; t test not significant). On the basis of these results, the authors suggested that the use of the goniometer for measurement of spinal range of motion “is inadequate.”


A summary of inter-tester reliabilities for use of the goniometer for measurement of lateral flexion is presented in Table 10-10. As indicated in the table, inter-tester reliability ranged from .62 to .91.




INCLINOMETER


Expressing the concern that “joint movements in the spine are still being assessed largely by clinical observation and subjective impression” and not by objective measurement, Loebl33 in 1967 described the use of the inclinometer, which he referred to as “a new, simple method for accurate clinical measure of spinal posture and movement.” Although his study was descriptive in nature, with no reliability data to support any contention of accuracy, Loebl33 was one of the first to describe the use of the inclinometer.


Since Loebl’s33 article appeared, much needed research has been published on the reliability and validity of the inclinometer in measuring spinal mobility. In contrast to the reliability reported for the tape measure procedures, which is relatively consistent and high, the reliability of inclinometer measurements reported in the literature varies widely.



Flexion and Extension


Several studies used a test-retest design, with one tester performing the inclinometer technique to determine intrarater reliability for measurements of flexion and extension. Other studies used two testers to perform the inclinometer technique, comparing the results obtained by the two testers to determine inter-rater reliability. Because of the number of publications related to the reliability of using inclinometers to measure flexion and extension, this section is divided into the following subsections for clarity: studies dealing with intrarater reliability, investigations related to inter-tester reliability, and research comparing results obtained with the inclinometer versus data derived from radiographic (x-ray) examination (validity).


Techniques used for each study vary, with some authors placing the inclinometer at locations similar to those used with the Schober technique, previously described in the tape measure section of Chapter 8. This inclinometer technique is designated as measurement of lumbar flexion and extension. Other authors placed one inclinometer at the sacral base and a second inclinometer at the level of the C7-T1 spinous process. This measurement is designated as thoracolumbar flexion and extension.


Finally, some studies reported not only reliability of flexion and extension, but also reliability of “total” movement. Total movement is the measurement of maximal flexion added to maximal extension, with a correlation performed on the sum.
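The “total” movement described above can be sketched as a per-subject sum followed by a correlation between testers. The readings below are hypothetical, and Pearson’s r is used only as an example of the correlation step; individual studies may have used an ICC instead.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def total_movement(flexion, extension):
    """Maximal flexion plus maximal extension for each subject (degrees)."""
    return [f + e for f, e in zip(flexion, extension)]

# Hypothetical inclinometer readings (degrees) from two testers on four subjects.
tester1_total = total_movement([50, 55, 60, 45], [20, 25, 15, 30])
tester2_total = total_movement([52, 54, 61, 47], [18, 27, 16, 28])

r_total = pearson_r(tester1_total, tester2_total)
print(tester1_total, tester2_total, round(r_total, 3))
```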



Intrarater Reliability: Using an inclinometer, Mellin39 reported intrarater reliability coefficients in the examination of 10 subjects as .86 for lumbar flexion, .93 for thoracolumbar flexion, .93 for lumbar extension, and .98 for thoracolumbar extension (Pearson’s r). However, matched t tests comparing the first measure versus the second measure for each motion indicated that a significant difference (p < .05) existed for each motion. A second study, in which Mellin was involved, provided somewhat different results. Mellin et al40 examined 27 subjects, resulting in an intratester reliability of .91 for lumbar flexion, .94 for thoracolumbar flexion, .79 for lumbar extension, and .87 for thoracolumbar extension (Pearson’s r). In this study, a matched t test comparing the first measurement versus the second measurement resulted in no significant difference. The authors concluded that “the accuracy of the methods described (inclinometer) makes them useful for measurement of thoracolumbar mobility.”


Nitschke et al46 and Rondinelli et al55 reported reliability coefficients similar to those reported in the studies just presented but came to different conclusions in their analysis of data. Measuring lumbar flexion and extension in 34 individuals with low back pain, Nitschke et al46 reported correlations of .90 (Pearson’s r and ICC) for flexion and .70 (ICC) and .71 (Pearson’s r) for extension. Although no systematic error was found (as determined by t tests between measurements that were not significant), the authors suggested that the large random error (95% CI = 28.46 degrees for flexion, 16.52 degrees for extension) indicated “poor intrarater reliability.” Establishing the intrarater reliability of two testers using three different inclinometer techniques, Rondinelli et al55 measured flexion in eight subjects. The authors reported correlations ranging from .70 to .90 for intrarater reliability for flexion (ICC) and concluded that “these findings appear to undermine the expectations that clinicians can reliably apply surface inclinometry.”


Establishing intratester reliability, Williams et al62 examined lumbar flexion and extension in 15 patients with low back pain using three testers. Results for intratester reliability for each examiner ranged from .13 to .87 for flexion and from .28 to .66 for extension (ICC). The conclusion reached by the authors was that the “inclinometer technique needs improvement.”


Higher reliability than that found by Williams et al62 was reported by Ng et al.44 Examining test-retest intratester reliability in 12 adults with back pain, Ng et al44 reported reliability of .87 for flexion and .92 for extension for each of the two examiners. Very similar results were described by Lee et al,30 who reported intratester reliability of .84 and .88 in the measurement of flexion in 35 healthy adults measured by two examiners. However, Lee et al30 did not find the inclinometer to be as accurate for the measurement of extension, reporting intratester reliability for the two examiners of .79 and .48.


The back range of motion (BROM) device is a specialized measurement tool that consists of two separate plastic frames that are secured to the individual with elastic straps. Within the plastic frames, inclinometers are mounted, allowing measurement of flexion, extension, lateral flexion, and rotation. A detailed description of the BROM device is presented in Chapter 8. Using the BROM device to analyze intrarater reliability in two testers who measured lumbar flexion in eight subjects, Rondinelli et al55 reported reliability correlations of .81 and .90 (ICC). Expanding the study by Rondinelli et al55 to include not only flexion but also measurement of intratester reliability for extension in 47 subjects, Breum et al8 reported correlation coefficients (ICC) of .91 for flexion and .63 for extension. Breum et al8 concluded that the “BROM was found to be a reliable instrument in the measurement of lumbar mobility.” Using the same basic design as was employed in the study by Breum et al,8 Madson et al35 analyzed the reliability of the BROM device in measuring lumbar range of motion in 40 subjects. Intrarater reliability was .67 for flexion and .78 for extension. The 95% CI was 5.0 degrees for both flexion and extension measurements. After examining 91 adults with low back pain, Kachingwe and Phillips28 reported intratester reliability of .84 and .79 for flexion in two examiners.


Tables 10-11 and 10-12 provide a summary of studies investigating intrarater reliability for the measurement of flexion and extension using the inclinometer. As indicated, reliability coefficients across all studies ranged from .13 to .94 for measurement of flexion (see Table 10-11) and from .28 to .98 for measurement of extension (see Table 10-12).





Inter-rater Reliability: Several groups of investigators who examined intrarater reliability also studied inter-rater reliability of measuring spinal flexion and extension using the inclinometer. Mellin39 examined inter-tester reliability in 15 subjects, reporting correlation coefficients of .97 for lumbar flexion, .95 for thoracolumbar flexion, and .89 for both lumbar and thoracolumbar extension (Pearson’s r). Matched t tests comparing the first tester versus the second tester for each motion indicated that a significant difference (p < .001) existed for each motion. Nitschke et al46 examined 34 patients with low back pain and by using the inclinometer reported inter-tester reliability of .52 (ICC) and .67 (Pearson’s r) for flexion (95% CI = 28.46 degrees; t test = significant difference at p < .05) and .35 (ICC and Pearson’s r) for extension (95% CI = 16.52 degrees; t test not significant). In their study examining 35 healthy adults, Lee et al30 reported inter-rater reliability of .83 for flexion and .75 for extension. Inter-tester reliability in measuring eight subjects was reported by Rondinelli et al55 as correlations (ICC) of .76 for lumbar flexion using a single inclinometer, .69 using a double inclinometer, and .77 when the BROM device was used.


A correlation identical to that of Rondinelli et al55 (.77) was reported by Breum et al8 for inter-tester reliability of the BROM device in measuring lumbar flexion in a study of 40 subjects (ICC). The reliability correlation reported when lumbar extension was measured with the BROM device was .35 (ICC). Also with the BROM, similar results to those of Breum et al8 were described by Kachingwe and Phillips,28 who reported that inter-tester reliability (ICC) for flexion was .74 and for extension was .55. The conclusions and opinions proposed by the authors of these studies as to the use of various types of inclinometers for the measurement of flexion and extension are the same as those already presented in the previous section, which discussed intratester reliability.


Other groups of investigators examined only inter-tester reliability of spinal measurements with use of the inclinometer. Burdett et al9 reported reliability coefficients of .91 (ICC) and .93 (Pearson’s r) for lumbar flexion and .71 (ICC) and .72 (Pearson’s r) for lumbar extension in their single-inclinometer examination of 23 subjects. Follow-up testing using ANOVA indicated no significant difference between testers in inter-rater reliability for extension, but a significant difference between testers for flexion (p < .05).


Slightly lower results were reported in a study performed by Chiarello and Savidge12 of 12 subjects without back pain and six patients with back pain. Correlations (ICC) were reported as .74 for lumbar flexion for subjects without back pain, .64 for lumbar flexion for patients with low back pain, .65 for lumbar extension for subjects without back pain, and .83 for lumbar extension for patients with low back pain. The authors concluded that these results indicated “acceptable reliability,” and that use of the inclinometer “in a clinical setting to document lumbar spine range of motion represents a vast improvement over observational methods.”


Newton and Waddell43 examined inter-tester reliability for lumbar flexion and extension in 20 patients with low back pain. Reported reliability correlations (ICC) were good (.98) for flexion but relatively poor (.48) for extension. After examining 24 normal individuals for inter-tester reliability of lumbar flexion, Alaranta et al2 reported a correlation of .61 (Pearson’s r). A t test between measurements by the two testers indicated a significant difference (p < .05). In a study investigating inter-rater reliability in only flexion of the spine, Sullivan et al57 reported test-retest reliability of .75.


Tables 10-13 and 10-14 summarize studies performed on inter-rater reliability for the use of the inclinometer in measuring flexion and extension. As indicated, inter-tester reliability ranged from .52 to .98 (see Table 10-13) for flexion and from .35 to .89 for extension (see Table 10-14).



