Pre-employment and Preplacement Screening for Workers to Prevent Occupational Musculoskeletal Disorders



Fig. 13.1
Relationship between job demands and worker’s capabilities and limitations within a work system (Dempsey et al., 2000)





Fig. 13.2
Applications or interventions having an impact on successful job placement (Armstrong et al., 2001)


One of the most important aspects of the FCE is the measurement of capacity specific to the actual physical demands as required by the job. While most capacity measures are highly specific to the task (i.e., whether it is a measure of aerobic capacity or muscle strength), it is implied that the test may need to consider multiple tasks within a job. Therefore, prior to administering any FCE, a job analysis should be performed. This is a crucial point, both from a legal and scientific perspective. This is further complicated by the fact that the task-specific nature of human capacity may change because of injuries, aging, other complications, etc.

While the concept of functional capacity relative to specific job demands appears to be straightforward, the actual evaluation of functional capacity is a technically challenging process that often occurs within a complex legal and medical context. Because most physical job demands are both dynamic and complex in nature, and because capacity itself changes with morbidity, functional capacity is inevitably dynamic as well. The potential for variation in functional capacity presents a challenge to another conceptual basis for the use and interpretation of FCE results: scientific certainty. FCEs are often erroneously regarded as capable of providing data that are definitive in measuring both capability and sincerity of effort, with accurate projections to actual ability to return to specific jobs. However, questions about sincerity of effort and work capacity remain appropriate after the evaluation. Therefore, the validity of FCE results and associated conclusions can present important limitations to the application of such results (Harten, 1998).



The Functional Capacity Evaluation: An Assessment Tool


The use of FCEs as an evaluation tool necessitates continued scrutiny of the FCE process, in order to ensure that it provides a useful measure in a particular situation. This section discusses validity and reliability of the FCE, which are two important measurement properties.


Validity


Validity is generally defined as the extent to which a test measures what it is intended to measure (Reneman, Wittink, & Gross, 2009). Furthermore, validity reflects the credibility of the test results. A valid test has dependable results, and inferences made from the results must be trustworthy, because a valid test measures what it is supposed to measure. There are several types of validity based on the theory of measurement, as depicted in Table 13.1. To date, the type of validity mostly researched in the FCE literature is criterion validity.


Table 13.1
Types of validity

Criterion validity

• The extent to which performance on the test is related to a criterion. Criterion validity involves comparing the test with an external criterion or with other measures (usually the gold standard in the related area) proven to be valid

• There are two subtypes of criterion validity:

Concurrent validity: the extent to which performance on the test is related to performance on the benchmark/gold standard test at the same time. Higher correlation indicates better criterion validity for the test. For example, how well the ERGOS™ Work Simulator correlates with conventional FCE with respect to dynamic lower and upper lifting (Dusik et al., 1993)

Predictive validity: the extent to which performance on the test accurately predicts performance in the future. For example, performance on short-form FCE predicts time to recovery, but does not predict sustained return to work (Branton et al., 2010)

Construct validity

• The extent to which a test measures a theory-derived construct. For example, poor convergent validity of the five Ergo-kit FCE lifting tests with reported sleep pain intensity and disability suggests poor construct validity of these lifting tests (Gouttebarge et al., 2004)

Content validity

• The extent to which a test covers domains that the test is intended to measure

• For example, job-specific FCE

Face validity

• The extent to which a test measures what it is supposed to measure at “face value”


Adapted from Anastasi and Urbina (1997); Nunnally and Bernstein (1994); Reneman et al. (2009)
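Concurrent validity, as described in Table 13.1, rests on the correlation between a test and a benchmark measured at the same time. The following is a minimal sketch of that calculation using the Pearson correlation coefficient. The function name `pearson_r` and the lifting scores are hypothetical and serve only as illustration; they are not data from any of the cited studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical lifting scores (kg) for five examinees:
# the test under study versus the benchmark test, taken at the same time.
test_scores = [20.0, 25.0, 30.0, 35.0, 40.0]
benchmark_scores = [22.0, 24.0, 33.0, 34.0, 42.0]

r = pearson_r(test_scores, benchmark_scores)
print(f"concurrent validity (Pearson r) = {r:.2f}")
```

A value of r close to 1 would indicate that the two tests rank examinees similarly. It says nothing about whether either test predicts later return to work, which is the separate question of predictive validity.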

Work return and the termination of a disability claim are criteria sometimes used to assess the predictive validity of an FCE, with performance on various individual FCE tasks and overall performance during evaluation as the predictors (Kuijer, Gouttebarge, Brouwer, Reneman, & Frings-Dresen, 2012). For example, Mayer et al. (1986) evaluated the ability of an individual FCE task to predict work return. Results indicated that a positive change in trunk strength, as measured by a Cybex trunk strength tester (Lumex Corp, Ronkonkoma, NY), was associated with an increased likelihood of return to work, relative to negative change or no change. However, other factors are equally predictive of return-to-work outcomes. In addition to performance on the trunk strength test, performance on lifting tasks is also used as a predictor for work return. For example, greater ability on floor-to-waist lift, but not on shoulder-to-overhead lift, was associated with improved likelihood of return to work (Matheson, Isernhagen, & Hart, 2002). Conversely, a lower floor-to-waist lift and a lower maximum ability were associated with a decreased likelihood of work return (Streibelt, Blume, Thren, Reneman, & Mueller-Fahrnow, 2009; Vowles, Gross, & Sorrell, 2004). In addition, more weight lifted on an FCE’s lift task was associated with a faster suspension of workers’ benefits and claim closure (Gross & Battié, 2006). Also, better overall FCE performance, as measured by a lower number of failed tasks or passing all FCE tasks, was associated with increased likelihood of being employed, decreased likelihood of non-work return, and faster termination of disability claim (Branton et al., 2010; Gross & Battié, 2005; Streibelt et al., 2009). Similar conclusions may be applied to short-form FCEs: a better performance on the short-form FCE (consisting of floor-to-waist lifting task, crouching, and standing) has been associated with faster claim benefit suspension in a chronic musculoskeletal condition population (Gross, Battié, & Asante, 2006, 2007).

Even though the FCE is predictive of work return (as found in past studies), the contribution of an FCE to increasing the prediction accuracy for work return and disability claim closure is modest (Gross & Battié, 2004; Gross, Battie, & Cassiday, 2004; Matheson et al., 2002). The modest contribution might be due to the multidimensionality of work return, including economic and psychosocial factors (He, Hu, Yu, & Liang, 2010; Krause, Dasinger, Deegan, Rudolph, & Brand, 2001; MacKenzie et al., 1998), but is most likely due to the poor ability of generic FCE evaluations to predict job performance, especially in jobs with complex and variable work tasks.

It is a common practice to extrapolate expected ability to perform frequent lifting on the job, based on the maximal ability while performing occasional lifting. However, this practice lacks a well-founded scientific basis (Jones & Kumar, 2003). Caution needs to be exercised because performing low-frequency, high-load lifts “taxes” the musculoskeletal system, whereas performing lifts frequently also brings the cardiopulmonary system into the equation. The cardiovascular system, in turn, may limit performance due to fatigue. Thus, the ability to perform frequent lifting, based on the extrapolation from the maximal ability, may not always give a true estimate of repetitive-lifting ability. It should also be noted that the role of psychosocial factors is especially important in evaluating the predictive ability of the FCE. Fishbain, Cutler, Rosomoff, and Steele-Rosomoff (1999) found that the completed number of FCE tasks was predictive for return to work only when combined with pain intensity level and other factors. This result was affirmed by a recent review in which pain intensity was established as an influential confound in FCE validity research (Cutler, Fishbain, Steele-Rosomoff, & Rosomoff, 2003; Kuijer et al., 2012).

Overall, the evidence supporting the FCE as a prediction tool for work return is, at best, mixed. Dusik, Menard, Cooke, Fairburn, and Beach (1993) evaluated the validity of the FCE by using return to work as a criterion outcome. The investigators compared FCE results using a standardized protocol versus a job simulation. They followed the return-to-work outcomes after participants were discharged from rehabilitation. Results indicated that, for a very simple repetitive job without any accommodation potential or flexibility, the FCE was just as accurate as a job simulation in predicting return to work. However, the FCE was much less accurate than the job simulation in predicting ability on a more complex job. In addition, Gross and Battié (2005) found that performance on the FCE was not predictive of sustained work return as indicated by opening a claim on old and new injuries, although it was somewhat predictive of recurrence soon after return to work (Gross & Battié, 2004). These findings cast more doubt on the validity of the generic FCE, partly because the issue related to characterization of job demands has not been satisfactorily resolved without actual job simulation. They also call into question the predictive value of any type of FCE over time.

However, recent research suggests that an FCE, combined with isoinertial and isokinetic testing, may improve the validity of the FCE. Fore et al. (in press) examined whether FCE scores were responsive to functional restoration treatment, predictive of 1-year socioeconomic outcomes, and predictive of physical demand levels (PDL) 1 year after treatment. Results indicated that 89 % of patients demonstrated improvements on their PDL from pre- to posttreatment and 78 % of patients had returned to work. In addition, posttreatment FCE results predicted return to work 1 year later.

Christian and colleagues (2002) followed persons judged to be employable after a formal work capacity assessment related to indemnity compensation benefits in New Zealand. Of those participants who were judged to be employable but not working at follow-up (57 % of the 141 participants in the study), some had repeated or reopened insurance claims. This suggested the possibility of a return to work at jobs that placed them at risk for further injury. Similar findings have been observed in other studies. For example, it was found that limitations documented in the evaluation setting do not correlate with the ability to return to work. These discrepancies appear to be most problematic with static tasks, but less so with dynamic tasks or job simulation (Dempsey, Ayoub, & Westfall, 1998; Ferguson, Marras, & Gupta, 2000). Among the commercially available FCE protocols, only the Physical Work Performance Evaluation (PWPE-FCE) had acceptable documentation of validity, and only for a narrow range of jobs (Innes & Straker, 1999).

A generic FCE purports to assess functional job capacity by comparing performance on various structured, general tasks with categories of physical job demand. The categories of job requirements are determined through job analyses. Often, the job requirements are extrapolated from the job title and work classification provided by the Dictionary of Occupational Titles (DOT) or its successor, the US Department of Labor O*NET database (Pransky & Dempsey, 2004). These systems classify jobs through categorization of physical requirements for each generic occupational title and were not intended to serve as a basis for evaluation of work capabilities; discrepancies between actual job requirements and those listed in the DOT or O*NET are presumed to be the rule, not the exception. Furthermore, these systems do not provide specific measures of activity required (e.g., weights lifted, miles walked, etc.). The performance on an FCE is influenced by personal factors (i.e., motivation and beliefs) and by environmental factors (i.e., assessor and testing condition; Genovese & Galper, 2009). Thus, “direct” comparisons between performance on a generic FCE and the required physical demands based on occupational title are likely to result in an inaccurate representation of an individual’s functional ability relative to a specific job. An accurate job simulation, though, has the potential to increase the predictive ability of test results.

For FCEs projected to measure working ability at a specific job, a formal job assessment is desirable. Several job assessment systems, designed to interface with FCE protocols, are available. However, accurate assessment of job demands can be challenging. There are several threats to FCE validity, including formal and informal job modifications, and a variety of alternatives to perform complex tasks (Chan, Tan, & Koh, 2000; Hoffman & Pransky, 1998). Related to job modifications, workers often alter how a job is executed. Furthermore, workers also utilize informal accommodations in order to perform a job despite physical limitations. Discussion with the examinee regarding job requirements may be helpful, but workers may not always be able to provide reliable data about physical job demands (Lindstrom, Ohlund, & Nachemson, 1994). Standard job descriptions from employers can be equally inaccurate. FCEs based on a job simulation examine only the physical components of the job; however, they fail to simulate the environmental (hot, cold, vibration) or psychosocial components (time pressure, working in isolation) (Mazanec, 1996). Thus, validation is difficult in some situations without some strong evidence for job performance linkage around physical tasks (Schonstein & Kenny, 2001). In instances when FCEs are successful in measuring physical job demands and properly simulating the job environment, return to work is also a function of many other factors, including physical demands and capacity, skill, motivation, workplace, and psychosocial attributes. Therefore, the validation of a particular FCE method is impossible without taking into account all the other factors that may affect a successful return to work (King, Tuckwell, & Barrett, 1998).

When an FCE is being performed to assess ability to perform a broad class of jobs, a high degree of job-specific validity may not be required. However, evaluators should note that results could easily be misleading. For example, the authors have observed multiple employees within a facility who have a job title such as “material handler” or a similarly vague title but who have very different job demands in terms of the loads handled and the frequency of lifting. Thus, the validity of an FCE across workers in the same job title could vary (Chan et al., 2000; Hoffman & Pransky, 1998; Lindstrom et al., 1994).
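The “material handler” example can be made concrete with a small sketch that compares one set of measured capacities against the demands recorded for two workers who share a job title. Everything here is hypothetical: the function `compare_to_job`, the task names, the frequency labels, and the loads are illustrative assumptions, not part of any FCE protocol. The sketch deliberately reports a demand as “not assessed” when no capacity was measured at the same lifting frequency, rather than extrapolating from occasional to frequent lifting.

```python
def compare_to_job(job_demands, measured_capacity):
    """Compare measured capacity with job demands, task by task.

    Both arguments map (task, frequency) -> load in kg.
    A demand with no measurement at the same task and frequency is
    reported as "not assessed"; no extrapolation is attempted.
    """
    result = {}
    for key, required_kg in job_demands.items():
        capacity_kg = measured_capacity.get(key)
        if capacity_kg is None:
            result[key] = "not assessed"
        elif capacity_kg >= required_kg:
            result[key] = "meets demand"
        else:
            result[key] = "below demand"
    return result


# Hypothetical FCE result: only occasional floor-to-waist lifting was measured.
capacity = {("floor-to-waist lift", "occasional"): 25.0}

# Two hypothetical workers with the same job title but different demands,
# as recorded in a job analysis of each worker's actual tasks.
handler_a = {("floor-to-waist lift", "occasional"): 20.0}
handler_b = {
    ("floor-to-waist lift", "occasional"): 30.0,
    ("floor-to-waist lift", "frequent"): 12.0,
}

print(compare_to_job(handler_a, capacity))
print(compare_to_job(handler_b, capacity))
```

The same FCE result meets the demands recorded for the first worker but not for the second, which is why a comparison keyed to job title alone can mislead.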


Reliability


Reliability is related to the consistency of a measure. In general, a test is considered reliable if it produces a relatively similar result over time. The reliability coefficient refers to the degree of consistency of results (Anastasi & Urbina, 1997), with a higher reliability coefficient indicating a higher consistency of a measure. There are several types of reliability including test-retest reliability (consistency over time), inter-rater reliability (consistency between different raters), intra-rater reliability (consistency by the same rater over time), and internal consistency (between equivalent parts in the same test) (Anastasi & Urbina, 1997; Nunnally & Bernstein, 1994). In the context of the FCE, the inter-rater reliability and the test-retest reliability are considered important. The test-retest reliability is important because it ensures that changes in the FCE results are due to the person, rather than a variation of the FCE itself. In the context of illness management, the inter-rater reliability is valuable because it ensures that the test produces consistent results despite the influence of a patient’s and a rater’s subjectivity.
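As an illustration of inter-rater reliability, the sketch below computes Cohen's kappa, a chance-corrected agreement statistic, for two raters who each score the same FCE tasks as pass or fail. Kappa is one common choice of reliability coefficient and is used here as an assumption; the chapter does not state which coefficient the cited studies used. The function name and the ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty lists of ratings")
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)


# Hypothetical pass/fail judgments by two raters on six FCE tasks
rater_1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater_2 = ["pass", "fail", "fail", "pass", "fail", "pass"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"inter-rater agreement (Cohen's kappa) = {kappa:.2f}")
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance. Test-retest reliability could be sketched the same way by treating two sessions of one rater as the two lists.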

To date, the evidence concerning the reliability of the FCE has demonstrated a large and undesirable degree of variability. A systematic review on the validity and reliability of the Blankenship System FCE (BS-FCE), ERGOS Work Simulator FCE (ES-FCE), Ergo-Kit FCE (EK-FCE), and Isernhagen Work System FCE (IWS-FCE) concluded that the inter-rater reliability of the IWS-FCE was good. However, these studies on inter-rater reliability were not rigorous enough to draw any firm conclusions (Gouttebarge, Wind, Kuijer, & Frings-Dresen, 2004). No definitive reliability studies were found for BS-FCE, ES-FCE, and EK-FCE. In sum, sufficient reliability studies of these standardized FCE approaches are lacking.


Functional Capacity Evaluation: Its Utility


The FCE consists of a wide range of activities designed to estimate a person’s functional ability, whether that ability is specifically related to the job or gives a general picture. The activities performed during a general, more extensive FCE range from simple to complex and attempt to fulfill these purposes. The FCE activities are typically categorized into nonmaterial handling and material handling activities. Nonmaterial handling activities include positional tolerance activities, such as sitting, standing, climbing, balancing, and walking (Coupland, Miller, & Galper, 2009). The material handling activities include carrying, floor-to-waist lifting, and waist-to-shoulder lifting (Innes, 2009). The material handling assessments involve a series of standardized tasks with weights and distances that are supervised by a trained professional (e.g., an occupational or physical therapist). The material handling assessments may involve the evaluation of velocity, peak force, and isokinetic lift, using computerized devices from Cybex (Cybex Inc, Medway, MA) and Biodex (Biodex Medical Systems, Inc, Shirley, NY) or standardized weights. Some computerized devices also assess range-of-motion activities, including trunk flexion and extension. Overall, the choice of activities included in an FCE depends on the purposes and contexts of the evaluation. The following section will discuss the utility of FCE based on two main goals: illness management and injury prevention.


Illness Management


Illness management spans a wide range of conditions and situations, from simple to complex and from acute to chronic. The type and purpose of FCEs are slightly different in each condition and situation. Nevertheless, the main goal remains the same: to provide the patients, physicians, employers, benefit adjudicators, insurance companies, etc., with information on physical and functional abilities relative to job demands. For instance, people receiving treatment for acute illnesses are presumed to still be active employees. The purpose of administering an FCE for this group is to identify the job tasks that can be safely performed and also to identify whether adjustment to the workers’ tasks is necessary (Genovese & Isernhagen, 2009). Hence, it is expected that the results of an FCE will help an employee to keep working, or help an employee on sick leave to return to work early (resulting in a shortened length of disability). However, there is insufficient evidence to conclude that an FCE is important in establishing safe alternative duty for return to work.

In the context of chronic illness management, FCEs have been utilized in work hardening/conditioning programs. Work hardening/conditioning programs are a form of tertiary prevention, aimed at preparing the individual to return to work. They are interdisciplinary programs that use real or simulated work tasks and progressively graded conditioning exercises. The patients entering these programs usually have not reached maximal medical improvement (MMI), that is, the point at which a damaged body part or organ system is unlikely to improve further (Civitello & Carter, 2010). Upon admission to these types of programs, the patient may undergo a series of assessments, one of which is the FCE. The purpose of the FCE in this situation is to provide the patient, employer, physicians, therapists, insurance agency, etc., with information on the patient’s residual abilities (Genovese & Isernhagen, 2009). This purpose is achieved by assessing the patient’s functional abilities related to the job and his/her general physical abilities. Thus, the FCE in this situation might be moderate in length, with a combination of generic and job-specific tasks. The FCE result will then be incorporated with the physician’s report in order to set up a rehabilitation program and expected goal. It is important to note that many programs currently evaluate and rehabilitate injured workers without these sorts of structured FCE evaluations. Patients experience improvement during their time in rehabilitation. Even though patients improve, it is not practical to perform a reevaluation every time a change in function or work demands occurs (see Fig. 13.2; Armstrong et al., 2001). Changes in patients’ physical capacity and pain tolerance may still happen, especially in those who are early in their recovery. Thus, obtaining repeated functional measurements during the course of physical rehabilitation may represent an unnecessary expense that is not required to achieve optimal outcomes (Rainville, Sobel, Hartigan, Monlux, & Bean, 1997). There is also little justification to conduct formal FCEs when the full range of available job accommodations has not been explored. Rather, the goal of rehabilitation is to increase the functional abilities and work tolerance so that they “match up” with the physical demands of the job. The FCE is often repeated at least once during the program in order to monitor improvement. At the conclusion of the program, another FCE may be administered to assess the patient’s physical and functional abilities. The physician may incorporate FCE results with a medical evaluation in order to generate a recommendation. The recommendation includes a job-specific PDL and tasks that the patient is able to perform safely.

Structured FCEs, administered in conjunction with a rehabilitation program, usually incorporate a judgment of sincerity of effort. The purpose of incorporating sincerity of effort is to increase the accuracy in interpreting the FCE results. Sincerity of effort generally refers to an individual’s conscious motivation to perform optimally during assessment (Lechner, Bradbury, & Bradley, 1998). There is an underlying assumption that sincere effort leads patients to demonstrate their maximal effort. The evaluation of a patient’s sincerity of effort depends on an evaluator’s perception. There are several methods commonly used to determine the sincerity of effort. Among the methods are the Waddell Nonorganic Signs and Coefficient of Variation (COV) (Matheson & Dakos, 2000; Waddell, McCulloch, Kummel, & Venner, 1980). Unfortunately, though, the evidence supporting sincerity of effort evaluations is weak. For example, more than one variable influences performance in a painful condition, even when a subject is attempting to provide a maximal effort (Robinson & Dannecker, 2004). FCEs are often promoted as a method of “objectively” identifying conscious attempts to reduce effort. However, the scientific proof of their discrimination ability across a range of injured subjects is inconclusive (Hazard, Reid, Fenwick, & Reeves, 1988). One study reported high sensitivity and specificity of tests used to determine sincerity of effort, but only in subjects who were instructed to provide a very significant (50 %) reduction of maximal force (Jay et al., 2000). However, it did not specify the factors utilized to determine sincerity. Other studies have demonstrated that subjects can reproducibly perform at voluntarily reduced strength levels (Robinson, Geisser, Hanson, & O’Conner, 1993). Little evidence, though, exists for an unacceptable COV threshold; the suggested levels range from 5 to 29 % (Lechner et al., 1998).
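The COV mentioned above is the standard deviation of repeated trials expressed as a percentage of their mean. The sketch below shows the arithmetic and how the choice of threshold changes the verdict on the same data. The function names and trial values are hypothetical, and the use of the sample standard deviation is an assumption, since protocols may differ on this point.

```python
from statistics import mean, stdev


def coefficient_of_variation(trials):
    """COV in percent: sample standard deviation divided by the mean, times 100."""
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    average = mean(trials)
    if average <= 0:
        raise ValueError("mean of trials must be positive")
    return 100.0 * stdev(trials) / average


def exceeds_threshold(trials, threshold_pct):
    """True if trial-to-trial variability is above the chosen COV threshold."""
    return coefficient_of_variation(trials) > threshold_pct


# Hypothetical peak forces (kg) from three repeated trials of one task
trials = [30.0, 32.0, 31.0]
cov = coefficient_of_variation(trials)
print(f"COV = {cov:.1f} %")

# The same trials are judged differently depending on the threshold chosen.
print(exceeds_threshold(trials, 3.0))
print(exceeds_threshold(trials, 5.0))
```

Because suggested thresholds span 5 to 29 %, a COV that is flagged under one cutoff can pass under another, so the number alone cannot establish insincere effort.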

The variability in performance observed in people with chronic low back pain may be determined by the variation in pain and function typically associated with that particular disorder. It also applies even to persons who are consistently providing a maximal tolerated effort. Reliability can be poor due to many factors, including variations in pain, position, self-limitation to avoid injury, equipment function, testing protocols, subject comprehension, or ability to follow specific directions (Innes, Tuckwell, Straker, & Barrett, 2002). Poor performance may also be influenced by failure to understand the degree of effort required, anxiety related to the test situation, depression, pain, fear avoidance, unconscious or conscious illness behavior or exaggeration, or malingering (Hirsch, Beach, Cooke, Menard, & Locke, 1991). Reliability can also be affected by training and acclimation. Significant reactivity (learning effect) has been demonstrated in low back pain patients with an isokinetic protocol, resulting in variations of 17–28 % (Grabiner, Jeziorowski, & Divekar, 1990). Patients may have reasonable fears about overexerting themselves that might lead to re-injury. Thus, insincere effort may not be the only factor behind the occurrence of the significant performance variability (Croft, Macfarlane, Papageorgiou, Thomas, & Silman, 1998; van den Hoogen, Koes, van Eijk, Bouter, & Deville, 1998).

Conversely, patients may demonstrate self-limitation that can be interpreted as valid given consistent occurrence; and overexertion (effort in a range that is unsafe for the individual) is also a possibility. FCE performance can be greatly hindered by pain. In this situation, testing may actually provide a measure of pain tolerance instead of peak functional capacity (Beimborn & Morrissey, 1988). Thus, changes over time may reflect changed psychosocial or behavioral factors affecting pain tolerance, and not muscle strength (Cooke, Menard, Beach, Locke, & Hirsch, 1992). In point of fact, Hazard et al. (1988) compared several indices of subject effort, including isokinetic force/distance curve patterns, peak force variations, blood pressure, and heart rates. They concluded that even the best physiologic measures and force curve analyses are not as reliable as an expert observer in detecting voluntary self-limitation. Thus, in essence, determining the underlying cause of limitation is a challenging task. The limitations demonstrated by patients may be due to their inability, or it may be due to their unwillingness to perform or put forth maximum effort. Unfortunately, the mislabeling of underperformance as insincere may lead to pervasive adverse consequences for workers, including misdiagnosis, improper treatment, increased litigation, and increased cost of care (Lechner et al., 1998). Therefore, it is important to clarify the distinction between validity as a scientific concept and attempts to measure sincerity of effort (the latter term is preferred). For practical purposes, FCEs appear to be effective in detecting submaximal efforts only when variation is high and the lack of full effort is obvious.
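The sensitivity and specificity reported for sincerity-of-effort tests, and the cost of mislabeling sincere underperformance, can be illustrated with a simple tally. In the sketch below, the function name and the data are hypothetical. The "truth" column assumes an experimental design in which subjects are instructed to give either full or reduced effort, which is how ground truth is known in such studies and not in clinical practice.

```python
def sensitivity_specificity(truly_submaximal, flagged):
    """Return (sensitivity, specificity) for a test that flags submaximal effort.

    truly_submaximal: list of bool, True if the subject actually reduced effort.
    flagged: list of bool, True if the test labeled the effort as submaximal.
    """
    if len(truly_submaximal) != len(flagged):
        raise ValueError("lists must have the same length")
    tp = fn = tn = fp = 0
    for truth, flag in zip(truly_submaximal, flagged):
        if truth and flag:
            tp += 1  # reduced effort, correctly flagged
        elif truth and not flag:
            fn += 1  # reduced effort, missed
        elif not truth and flag:
            fp += 1  # full effort, mislabeled as insincere
        else:
            tn += 1  # full effort, correctly not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one subject in each true category")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical results for ten subjects: four instructed to reduce effort,
# six instructed to give full effort.
truth = [True, True, True, True, False, False, False, False, False, False]
test_flag = [True, True, True, False, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(truth, test_flag)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Each false positive in this tally corresponds to a worker giving full effort who is labeled insincere, which is the mislabeling the text warns about.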

Sep 24, 2016 | Posted by in MUSCULOSKELETAL MEDICINE | Comments Off on Pre-employment and Preplacement Screening for Workers to Prevent Occupational Musculoskeletal Disorders