Epidemiology and the Work-Relatedness of Regional Musculoskeletal Disorders
As described in Chapter 10, there are three prerequisites for redress for work incapacity under the Workers’ Compensation Insurance scheme as constituted in the United States and most democracies:
To be compensated under Workers’ Compensation Insurance schemes, the worker must have experienced a personal injury that arose by accident out of and in the course of working. To qualify as an “injury,” must there be damage, and must the precipitant be a violent and extraordinary event?
Workers’ Compensation Insurance indemnifies wages and clinical costs until healing has occurred. Is healing always a definable, discrete endpoint?
Workers’ Compensation Insurance indemnifies wages. Therefore, awards are based on residual work capacity. How can disability be quantified?
These questions, and many related questions, are inherently contentious:
They beg consensus as to definitions: When is an individual difference in biology a health effect rather than an unavoidable consequence of living? Are “well” and “healed” synonyms? Is disability defined by the details of the task, of the pathobiology, or both? In the latter case, are both equal in importance?
They beg consensus as to the meaningfulness of measured differences: If there is a 2% difference between exposed and unexposed workers in exposure or health effect, is it meaningful? Would it matter if the difference were in repetitiveness, monotony, force, anxiety, and so forth? Would it matter if the difference were in a measurement of skin temperature as opposed to a difference in mortality? Would it matter if the difference were 5% or 10%?
They beg consensus as to the weight given to perception: How necessary is it to validate a worker’s assertion that she or he feels poorly or incapable? How necessary is it to validate the assessment of a treating physician or of a physician contracted to assess disability? How necessary is it to validate the worker’s perception of task demands or of the psychosocial context of working? How valid must the validating measures be?
The answers to these questions are no longer solely a reflection of the inability of the advocates of any belief system to withstand the onslaught of their critics. Epidemiology has been brought to bear on all these issues. In this chapter we will focus on the fashion in which epidemiology can probe the relationship between exposures at work and disabling regional back pain. In Section II the reader can find summary discussions of the relationship between the physical demands of tasks and disorders of the upper and lower extremity, including such neurovascular disorders as carpal tunnel syndrome, digital vasospasm, and reflex sympathetic dystrophy.
Epidemiology is the discipline that recruits statistical methodologies to the task of testing whether we are fooled by beliefs in particular influences that perturb our well-being or the well-being of others. Epidemiology can never prove us right; it is a discipline that can only prove us wrong. If you have a belief that withstands the onslaught of epidemiology, it can still be wrong. Only now your level of uncertainty can diminish to whatever degree the epidemiologic testing was powerful and compelling. This is the best epidemiology can do for us! It can tell us how likely it is that our belief, our explanation, our hypothesis about well-being is false. For many contentious issues in clinical medicine, but not all, rigorous and structured epidemiologic analysis can be brought to bear to discard the chaff and sometimes to define the limits of certainty regarding the wheat. Clinical medicine is growing accustomed to such exercises under the banner of evidence-based clinical decision making, a trend prodded by a call for cost/benefit analyses. Occupational medicine is further charged with the exercise in the contexts of workplace safety, employer culpability, and indemnification. In these contexts, the exercise is prodded by regulatory agencies and litigation.
THE AGE OF THE DAUBERT TEST
Until recently, contentious scientific issues were litigated by a process that reiterated the bickering between belief systems. “Experts” brandished ideas with “general acceptance” to do battle, and the court decided the victor. No longer. The fashion in which any scientific legacy is to be presented in the judicial context was redefined by the Supreme Court of the United States in 1993. This action represents a watershed for the interface of science and society. Not surprisingly, contentious issues in occupational musculoskeletal disorders are providing a lightning rod for this transition.
The upshot of the ruling in Daubert v Merrell Dow Pharmaceuticals, Inc. (113 S. Ct. 2786, 2799 [1993]) is generally referred to as the “Daubert test.” The Supreme Court was attempting to ensure that the science recruited to the cause of seeking and finding the truth in the legal setting was of quality. The Daubert test supersedes 70 years of the “Frye test,” according to which scientific evidence was admissible if it was based on a scientific technique generally accepted as reliable within the scientific community. The minimal constraint on “general acceptance” countenanced hubris. Now, the Daubert test burdens the court with “the task of ensuring that an expert’s testimony both rests on a reliable foundation and is relevant to the task at hand.” This is no small distinction.
Having thus declared, the Supreme Court returned the Daubert case to the Ninth Circuit Court of Appeals to put this new principle into practice (95 C.D.O.S. 131). The fashion in which the Ninth Circuit Court tackled the “complex and daunting task” of establishing whether an expert’s testimony reflects “scientific knowledge … derived by the scientific method” and therefore amounts to “good science” is not binding. The criteria this court devised for making these determinations will prod legal scholarship for years to come; revisions to Federal Rules of Evidence 701 and 702 are already before the court. The Daubert test requires the expert to base assertions on a science that, to be credible, must have attributes such as those listed in Table 11.1.
There was dissent among the justices as to whether such criteria should be stipulated. Chief Justice Rehnquist, joined by Justice Stevens, wanted the court to restrict its opinion to the particular issues raised by the Daubert case (regarding a certain drug’s toxicity). Rather than attempt to formulate general rules, Justice Rehnquist wanted to “leave further development of this important area of the law to future cases.” Had Justice Rehnquist been heeded, American courts might not find themselves deluged with “Daubert hearings,” which Crump describes as “more lengthy, technical and diffuse than anything that preceded them.”1 Instead of judges and juries being subjected to the persuasiveness of experts regarding their beliefs, judges are expected to determine the admissibility of science after being treated to the persuasiveness of experts regarding the degree to which the relevant science satisfies the criteria. The contentious issues I raised previously regarding compensability are typical of the issues that are debated in Daubert hearings in other contexts.2 These are issues that test the limits of biostatistics and epidemiology, and the patience and intellect of the uninitiated.3 Furthermore, if one is able to see through the clouds of jargon and mathematics, many of the analyses pivot on the values and judgments of the experts whose focus is methodology, just as they once pivoted on those of the experts whose focus was the particular clinical event.4 No wonder many a judge will take the fallback position and declare that the science is uncertain, rather than admissible or not, and defer to the wisdom of the jury.
The recent legal scholarship exploring the limitations of the Daubert test should prod science to reexamine “peer review” with at least as much rigor. Do not imagine that “science” is pursued in the abstract while law is subject to realpolitik; science in general and clinical science in particular are as subject to the political pressures of the day in the formulation of the hypotheses to be tested and in the frame of reference for interpreting the results. The attributes in Table 11.1 are only a first cut. To do justice to the justices, one needs to assess whether the “generally accepted methods” are applied appropriately so that their error rates are not spurious. That turns out to be quite exceptional; the experience of the Cochrane Collaboration bears witness (Chapter 4). Furthermore, “publication in reputable scientific journals” guarantees peer review but does not guarantee that the “peers” are capable of the objectivity that might question the premises they have in common with the author.
TABLE 11.1. ATTRIBUTES THAT RENDER SCIENTIFIC ASSERTIONS CREDIBLE IN THE DISPUTATIVE ARENA AS PERCEIVED BY THE U.S. COURT OF APPEALS FOR THE NINTH CIRCUIT (95 C.D.O.S. 131)
They are based on generally accepted methodologies with known error rates.
They have been subjected to the scrutiny of peers as evidenced by publication in reputable scientific journals.
They are based on personal, independent investigative experience of the individual making the assertions, particularly if the investigations predated and were independent of the litigation.
Lacking the above, corroborating opinions of learned bodies, or opinions expressed in learned treatises may hold sway.
IDENTIFYING AND ABROGATING MUSCULOSKELETAL HAZARDS AT WORK
I am not suggesting that every one of us become a statistician or even an epidemiologist. Both disciplines have flourished in the past few decades, and our country is teeming with practitioners of varied competence. But if the judge and the jury can be called on to evaluate epidemiology, then we are also called on. In the context of occupational musculoskeletal disorders, epidemiology serves two masters: It is recruited to the task of discerning whether particular exposures in the workplace are hazardous. It has been recruited occasionally, and will be recruited frequently in the future, to discern whether alterations in exposures at work are salutary.
This monograph is an exposition on the regional musculoskeletal disorders as they relate to the workplace. By definition, we are not considering disorders that result from violent events. It has long seemed good common sense that regional disorders result from usage of the musculoskeletal region that is hurting. After all, because using the musculoskeletal region that is hurting makes it hurt more, it follows that such usages made it hurt in the first place. This common sense, this belief, is a hypothesis. The hypothesis suggests that a particular exposure, such as materials handling or some repetitive upper extremity usage, leads to particular health effects, often termed “injuries.” Epidemiology can be recruited to discern the likelihood that this causal inference, so intuitively seductive, is wrong. There are three elements necessary to test the hypothesis: define exposure, define health effects, and develop some systematic way, one that minimizes biases and renders it unlikely that we are lulled into ignoring more important exposures by our zeal to support our preconceived notions, to determine the likelihood that the exposures and health effects are not cause and effect.
Flawed Studies
Any study that compromises on even one of these three prerequisites should be read as hypothesis generating at best. For example, observations on convenience samples are of limited value because they are so subject to biases (i.e., systematic errors) in referral, selection, reporting, and observing. Thus, if a hand surgeon works in a town where the principal employer is a textile mill, that surgeon might very well discover that all employed patients with carpal tunnel syndrome are exposed to cotton dust. The association is real, but it reflects bias rather than causation. Observational studies, also called ecologic studies, are part of the clinical lexicon and should be. After all, they are the fodder for the peer review of clinical inferences and the fountain of hypotheses for clinical investigation. But they do not test hypotheses.
Observational studies are not the bane of the relevant literature. That distinction falls on the studies that purport to be systematic, to be “science,” but are not. Sometimes an overtly prejudicial research design is used. More typically the author asserts causality without directly measuring either the exposure or the health effect. Often the exposure or the health effect is just presumed, or some surrogate measure is assumed to be valid. The ergonomic literature is wont to assume health effect or use either physiologic measurements such as surface electromyograms or insurance claims as surrogate measures. The clinical literature is wont to assume a relationship to work based on compensability. How many “ergonomically” designed contraptions have been heralded, purveyed, or mandated because junk science “predicts” that they assuage risk? How many vertebral laminae and volar carpal ligaments have been sacrificed on the altar of indemnification? How many job applicants have submitted to radiographic or electrodiagnostic screening on the basis of ungrounded assumptions?
Some overtly flawed studies will always escape the “peer review” filter of journals and funding agencies; reviewers and authors too often wear the same rose-colored glasses. Therefore, practicing physicians and surgeons must be responsible for determining the quality of any “literature” they transpose into bedside inferences, as must anyone else responsible for ensuring that the citizenry is not duped.5 We must see the primary data if we are to be a match for junk science. If the judiciary is now expected to face this challenge, shouldn’t medicine lead the way?6 In this spirit, I will not provide a compendium of the observational studies or junk science that relates to occupational musculoskeletal disorders in this chapter or elsewhere in the monograph. Rather, I will defend my assertions by displaying details of the defensible and interpretable science. In each instance, I am aware of no substantive science to the contrary. If you think there is some, show me the data that convince you.
Defining Exposure
The causal hypotheses that relate to occupational musculoskeletal disorders are not inclusive of the extremes of the context in which jobs are performed or the extremes of task content; neither outrageous management practices such as those that characterize the lot of some migrant laborers nor violent events are encompassed whether the violence is a result of external forces or extremes of volitional usage (such as by a professional athlete). The hypotheses we wish to test relate to whether ordinary exposures are hazardous over time. The exposures at issue include work contexts that are not patently evil and musculoskeletal usages wherein the elements are neither exceptional nor uncomfortable. Defining either category of exposure is challenging, indeed.
The “context” of work is clearly multidimensional. It includes elements of worker acceptance and acceptability that can be idiosyncratic and vary over time. It includes aspects of macro- and micromanagement that also vary over time. And it includes reality issues such as fear of redundancy, redundancy, and level of fiscal reward. The need for capturing this multidimensionality as a measurement is driving investigative industrial psychology.
Measuring ergonomic exposures is no less challenging. Many such exposures are common outside the workplace where they may be assumed to be examples of normal, perhaps healthful, biomechanics. But delineating “normal,” particularly for musculoskeletal usage, is an insurmountable challenge given the myriad events in the lives of myriad people. So the only feasible compromise in experimental design is to contrast individuals who are performing one relatively stereotypical vocational or avocational task over time with individuals performing either another stereotypical task or tasks that are not so stereotypical. Obviously, defining “stereotypical” is operational. It assumes some notion that “normal” usages are more varied. Even this assumption is tenuous. For example, training and practice in hand surgery may call for upper extremity usages that are as stereotypical as keyboard tasks and may be as repetitive as a gentle keyboard task that calls for 10,000 keystrokes per day. Likewise, general orthopedic surgery practice has many elements that rival materials handling in industry.
Reality intrudes on the experimental design in other ways. How many contemporary workers stay at the same job, let alone the same task, over time? Defining ergonomic exposures, by their very nature, is a contrivance. Every study must be examined closely to see whether exposure is meaningfully defined, is defined a priori (defining the exposure after the study is a classic form of data massaging), and whether the definitions generalize.
Defining Health Effects
As was discussed at length in Chapter 10, regional occupational musculoskeletal disorders are often labeled “injuries.” If they were not, neither clinical care nor disability could be indemnified under Workers’ Compensation Insurance plans. There is circularity in this reasoning. For most of us in medicine, the assumption would be that the health effect reflects some special, demonstrable pathoanatomic outcome before we would apply the “injury” label. Society has a broader definition of injury, one that holds pain, dysfunction, and any assault on one’s joie de vivre to be “injuries” even in the absence of tissue damage.
For some of the regional disorders, notably carpal tunnel syndrome and reflex sympathetic dystrophy (Chapter 9), there is specific quantifiable pathophysiology. But that is the exception. For most regional disorders, there is either no demonstrable pathoanatomy or pathophysiology, or whatever is demonstrable is not specific; it can be found readily in individuals who have no symptoms and will persist in the individuals with symptoms once they have returned to health. As a result, defining health effects requires consideration of the options presented in Table 11.2.
TABLE 11.2. OPTIONS FOR MEASURING HEALTH EFFECTS
Current regional pain/restriction in motion.
Recall of regional pain/restriction in motion.
Need to report regional pain/restriction in motion
to a provider outside the context of the workplace.
to a provider within the context of the workplace.
Registration of an insurance claim for health care
under health insurance.
under Workers’ Compensation Insurance.
Time lost from work
covered by sick leave.
indemnified by Workers’ Compensation Insurance.
All of these are health effects that depend on perception and process for quantification. Therefore, all are subject to biases, that is, systematic influences that change the likelihood of recall (recall bias), or change the likelihood of reporting (reporting bias), and so forth. We will see, time and again, how the inferences to be drawn from epidemiologic studies of the occupational musculoskeletal disorders vary dramatically depending on which of these health effects is selected as the outcome measure.
Systematic Investigations
As we discussed, the first level of epidemiologic investigation is observational. Someone has to have an idea that is usually based on some personal observation that there seems to be an association of an exposure and a health effect. Such observations are hypothesis generating but no more than that. The reason they are of such limited value is that they are so subject to biases of selection, referral, reporting, and observing. No such observations, no matter who makes them or how often they appear in the literature, should be viewed as scientific support for the existence of a causal association, or lack thereof. To test any such hypothesis, one must go on to systematic studies. There are three designs that are feasible, each with strengths and limitations.
Cross-Sectional Studies
Cross-sectional studies look at populations of individuals at a given point in time (Table 11.3). This design allows one to measure the prevalence of a health outcome in those individuals in a group with the exposure at issue and compare it with the prevalence in individuals in the same or other groups who are as similar as possible but are without the exposure. Cross-sectional studies are the most feasible of the systematic studies and therefore relied on by all as the first test of observational hypotheses. Although they are the most feasible, they are also most subject to systematic error, biases, and skew, and therefore offer the greatest challenge to design, data analysis, and interpretation. The following are a few of the crucial considerations in this regard relevant to musculoskeletal disorders in the workplace.
TABLE 11.3. FEATURES OF A CROSS-SECTIONAL STUDY
Description:
A “Snapshot of Life” in that it measures the prevalence of health effect in exposed versus unexposed populations at a given point in time.
Advantages:
Most feasible
First test of observational hypotheses
Disadvantages:
Survivor bias or “healthy worker effect” is inherent to the design
Cannot establish temporality; the health effect might predate exposure
Point prevalence is quantifiable, but it is difficult to assess health effects retrospectively
Likewise, it is difficult to assess exposure retrospectively
It is difficult to avoid presumptions regarding choice of “control” or comparison variables
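The prevalence comparison at the heart of this design can be made concrete in a few lines. The following is a minimal sketch with hypothetical counts; the numbers are illustrative and are not drawn from any study discussed in this monograph.

```python
# Illustrative sketch (hypothetical numbers): point prevalence and a
# prevalence ratio from a cross-sectional "snapshot" of two groups.

def prevalence(cases: int, total: int) -> float:
    """Proportion of the group exhibiting the health effect at the snapshot."""
    return cases / total

# Hypothetical snapshot: 30 of 200 exposed workers report the health effect,
# versus 20 of 250 unexposed workers.
p_exposed = prevalence(30, 200)    # 0.15
p_unexposed = prevalence(20, 250)  # 0.08
prevalence_ratio = p_exposed / p_unexposed

print(f"Prevalence (exposed):   {p_exposed:.2%}")
print(f"Prevalence (unexposed): {p_unexposed:.2%}")
print(f"Prevalence ratio:       {prevalence_ratio:.2f}")
```

A prevalence ratio above 1 suggests an association, but, per the disadvantages listed in Table 11.3, it cannot by itself establish temporality or exclude survivor bias.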
This design measures exposure and health effect at one point in time, a “snapshot” of life, if you will. It is usually impossible to determine whether individuals who experienced an untoward outcome had been selected out in the past, so that the population under study is enriched for “healthy workers.” This is termed survivor bias. It is similarly difficult to establish whether the health effect predates the exposure. This intrinsic uncertainty about temporality compromises inferences about causation.
It is difficult, in a “snapshot,” to define health, particularly if the outcome to be measured is the experience of discomfort. Do we ask if the worker is hurting now, or within the last year, or repeatedly? We know that the fashion in which this information is elicited can change the response rate, maybe even change the experience of discomfort itself; informational, reporting, and conceptual biases operate. Furthermore, although some cross-sectional studies seek the prevalence of discomfort today, or right now, most ask about morbidity that occurred over some greater interval of time. Our memory of such events depends on a number of influences generally termed “recall biases.” Epidemiology is developing tools to measure these and other biases with some promise of measuring their influence on the results of cross-sectional studies. Today, this is an inexact science, rendering the measurement of symptoms a tenuous outcome for cross-sectional studies.
Measuring any exposure, including biomechanical exposure, is equally demanding. How does one take into account nonvocational exposures and exposures in the distant past? Tasks such as those at issue are repetitive, but often complex. Which element or combination of elements should be analyzed? Should deviations from the usual fashion of performing the task by a given worker be ignored? The musculoskeletal system is highly integrated in function. Where in the body should an ergonomic health hazard be assessed? For example, it is hard to lift a heavy object without bringing to the task the forceful operation of all joints of the arm, as well as the trunk and lower extremities in bracing and counterbalancing.
The most demanding aspect of the measurement of exposure is to avoid presumption. That is not so easy: The ergonomists think in ergonomic terms, the industrial psychologists think in psychologic terms, the physiotherapists think in anthropometric terms, and so on. If you do not measure the exposure in the first place, you will certainly miss its influence on the prevalence of the health effect. This error in design plagues much of the early literature that seeks only an influence of ergonomic exposures on upper extremity or axial symptoms. We have known for some time that such a design is inadequate. To gain insight into causation, one must account for all the influences known to be important and hope there are none about which we are unaware. The design must control for the confounding influence(s) by comparing groups matched for the presence of confounding influences yet distinguished by the absence of the putatively important influence in one and not the other. Alternatively, one can measure each influence. Then, using statistical techniques such as multivariate analysis, one can order the magnitude of influence of a number of exposures on a particular health effect.
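The effect of adjusting for a measured confounder can be illustrated with stratification. The sketch below contrasts a crude odds ratio with a Mantel-Haenszel summary odds ratio; stratification is used here as a simple stand-in for the multivariate analysis mentioned above, and all counts are hypothetical.

```python
# Hedged sketch (hypothetical counts): how accounting for a confounding
# influence can change the apparent strength of an exposure-effect
# association. The strata might be, say, age groups or a psychosocial
# variable measured alongside the ergonomic exposure.

def odds_ratio(a, b, c, d):
    """2x2 table: a=exposed cases, b=exposed non-cases,
    c=unexposed cases, d=unexposed non-cases."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Summary odds ratio across strata, each a (a, b, c, d) tuple."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two hypothetical strata in which exposure is unevenly distributed:
strata = [
    (10, 90, 5, 95),   # stratum 1
    (40, 60, 20, 30),  # stratum 2
]
# Crude 2x2 table: collapse the strata, ignoring the confounder.
a = sum(s[0] for s in strata); b = sum(s[1] for s in strata)
c = sum(s[2] for s in strata); d = sum(s[3] for s in strata)

print(f"Crude OR:           {odds_ratio(a, b, c, d):.2f}")
print(f"Mantel-Haenszel OR: {mantel_haenszel_or(strata):.2f}")
```

With these numbers, the crude estimate overstates the association relative to the stratum-adjusted summary, which is the signature of confounding.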
Longitudinal or Cohort Studies
Longitudinal or cohort studies are far more demanding of the investigator (Table 11.4). Here, a population is assessed as to the health effect, then exposed for some period of time, and then assessed again. Each individual is his or her own control, which eliminates many of the confounders that plague the cross-sectional design. And if the cohort is intact over time, the longitudinal design has no survivor bias. Of course, the human predicament thwarts the ideal longitudinal study. People move and change jobs or tasks, and jobs and tasks come and go. Any investigator who sets out to perform a longitudinal study in modern American industry needs considerable courage, determination, and funding.
TABLE 11.4. FEATURES OF A LONGITUDINAL (COHORT) STUDY
Description:
A population is assessed about prevalence of health effect and then followed over time to determine whether those with the exposure are more likely to develop the health effect de novo than those not so exposed.
Advantages:
Only way to measure incidence of effect
Exposure is defined a priori, without bias inherent in already knowing the outcome
No survivor bias, in theory
Each individual is own control
Can establish the temporal relationship between exposure and health effect
Comparison groups are necessary to establish the spontaneous incidence rate
Disadvantages:
Results not available for a long time
Demands that all exposures and effects to be measured be established at the outset; secondary and subset analyses are tenuous at best
Mobility, redundancy, and other reasons to drop out compromise the cohort
Changes in jobs, tasks, and industries compromise assessment of exposures
Still cannot prove causation, although more powerful than cross-sectional designs
Expensive to perform
TABLE 11.5. FEATURES OF AN EXPERIMENTAL STUDY
Description:
Experimental population is randomly assigned to be exposed or unexposed and then followed longitudinally.
Advantages:
Comes closest to establishing causation
Need not wait long for results if the effect is sizable
Can use small numbers of subjects if the effect is sizable
Disadvantages:
If the health effect is small, randomization errors can prove overwhelming
Hawthorne (placebo) effects are difficult to avoid
The more restricted the experimental design, the more powerful the experiment, but generalizability is compromised
Often costly
Experimental Studies
Neither cross-sectional nor longitudinal studies can prove causation; they are better at asserting the lack thereof. Experimental studies are more powerful in enhancing one’s certainty that an association is causal and that lack of association argues against causation (Table 11.5). Here one establishes an experimental population and matches that to a control population. One group is experimentally exposed to the putative hazard and the other is not. If the populations are similar in all ways except the exposure and yet they differ in outcome to a meaningful degree, the causal inference is compelling.
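One conventional way to judge whether two randomized groups “differ in outcome to a meaningful degree” is a two-proportion z-test. The sketch below uses the normal approximation, hypothetical counts, and only the Python standard library.

```python
# Illustrative sketch (hypothetical counts): comparing outcome frequencies
# between a randomized exposed group and a control group.
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: the two groups share
    the same underlying outcome proportion (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical trial: 18 of 120 exposed versus 8 of 120 unexposed
# develop the health effect over the study period.
z, p = two_proportion_z(18, 120, 8, 120)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Note that statistical significance addresses chance alone; it does nothing about the Hawthorne effects and randomization errors listed in Table 11.5, which must be handled by design.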
THE NEVER-ENDING SAGA OF ERGONOMIC REGULATIONS
Epidemiology, Ergonomics, and Politics
There is an exercise in evidence-based occupational medicine that is central to the theme of this monograph and that happens to be a contentious cause célèbre in the politics of workplace health and safety. It is an object lesson for this chapter. It displays all the inherent limitations of modern epidemiology, as well as the pitfalls that await any who approach science to support their preconceived notions rather than to test them. The exercise has played out over the past two decades with no closure in sight. After generations of varying advice about the safety of materials handling tasks, it is now argued that the science is ripe to define limits, much as limits are promulgated for noise in the workplace. To this end, there has been a concerted effort to promulgate a “standard” to define the limits for acceptable physical demands in the modern workplace. Furthermore, the proponents are seeking regulatory clout. The latest iteration of this exercise is ongoing. This offers a priceless opportunity to dissect the science, the motivations that underlie the exercise, and the fashion in which the limits of certainty are molded to fit a political agenda. There is nothing new about such dialectic, nor is it peculiar to the realm of workplace health and safety. That realm just happens to be our subject.
On November 23, 1999, the Occupational Safety and Health Administration (OSHA) of the U.S. Department of Labor proposed a rule for an “Ergonomics Program Standard.” The document is available in the Federal Register (1999; 64[225]:65768-66078). This initiated a “rule-making” process that eventuated in the Proposed Standard becoming regulation in the waning days of the Clinton administration. It was a short-lived standard; the executive fiat was reversed by an act of Congress early in the Bush Administration. Like Lazarus, it has risen yet again in the guise of a “Comprehensive Plan to Reduce Ergonomic Injuries” put forward by OSHA in the spring of 2002 (United States Department of Labor 02-201). The plan has two elements: it will promulgate “targeted guidelines” and follow that with “tough enforcement.” To serve the first element, a National Advisory Committee on Ergonomics is charged with deliberating for 2 years. The charge to the National Advisory Committee on Ergonomics represents the third attempt by agencies of the U.S. Federal Government in recent years to make sense of the relevant science. The first involved an intramural exercise by the National Institute for Occupational Safety and Health (NIOSH). For the other, Congress charged the National Academy of Sciences (NAS) with the task of convening a “consensus” panel. The resulting NIOSH7 and NAS8 documents were relied on by the Secretary of Labor in the Clinton Administration in promulgating the Ergonomics Standard. Scientists in other countries have undertaken similar exercises in the past several years.9,10,11,12 In any such exercise, the participants must reach methodologic consensus as to which articles will be considered high in quality and used as such, and which are marginal and to be ignored. The process invokes small group psychology, perhaps much more group psychology than rigorous science.
No wonder these bodies of scientists examine the same literature and reach very different conclusions. For example, whereas NIOSH discerned “strong” evidence that non-neutral posture was a risk factor for disabling neck pain, another group found the evidence “inconclusive.” But none of these interactive groups of systematic literature reviewers was willing to step back and ask whether the literature tests the ergonomic social construction in toto. None was willing to ask whether physical demands are the sole, or even the predominant, explanation for disablement consequent to regional musculoskeletal disorders. That narrow-mindedness is not a surprise to those of us who are students of the history of science.
There were two engines driving the push for a standard during the Clinton Administration and for guidelines under Labor Secretary Chao. One was the agenda of organized labor. The other was the documentation of the incidence of disabling regional musculoskeletal disorders in the workforce. There is no debate about the importance of disabling regional musculoskeletal disorders for the worker who is experiencing pain. Clearly, large numbers of workers who are otherwise well continue to experience the illness of work incapacity as a consequence of musculoskeletal pain that is occasioned by no discrete traumatic or even uncustomary happenstance at work. For example, there were nearly a quarter of a million “repeated trauma” cases (upper extremity regional musculoskeletal disorders) reported to the Bureau of Labor Statistics (U.S. Department of Labor) in 2001. There is no debate about the importance of disabling illness for an economy that generated $1000 per worker in direct disability lost-time costs and $5000 per worker in total disability costs, half of which were medical expenditures. Approximately two thirds of these moneys flow under the rubric of one or the other of the regional musculoskeletal disorders.
The debate relates to cause and remedy. For 50 years, there has been a concerted effort to define physical hazards that might account for these illnesses, with very limited success. For 50 years we have been disappointed as one empiric attempt after another failed to reduce the incidence of disabling regional musculoskeletal disorders by diminishing putatively hazardous physical exposures. Neither modifying the task content nor promulgating alternative body mechanics for materials handling has proven effective. Nonetheless, Federal policy is being crafted to reduce the putative physical hazards by virtue of a new “standard” or new “guidelines” that are but an extension of these same approaches, now to be termed “ergonomics.” It is my position that basing redress on “ergonomics” is ill advised. Furthermore, this focus is impeding the progress that might be forthcoming from the science that has superseded “ergonomics.” In the discussion that follows, I will demonstrate the fallacy in the “ergonomic” idée fixe.
The Fallacy in the Occupational Safety and Health Administration Standard Regarding “Health Effects”
Most regional musculoskeletal pain is exacerbated by usage of the particular musculoskeletal region that is hurting. Often, there is no discomfort without such usage. The association between usage and exacerbation of symptoms is so reliable and predictable as to render its causal nature incontrovertible. Swayed by this association, generations of observers and people in pain have presumed a corollary association: the usage that exacerbates the pain must be the usage that caused it in the first place. Similar reasoning has long been applied to occupational musculoskeletal disorders and is the cornerstone of the drive for an ergonomic standard: when a worker declares incapacity for particular tasks because of regional musculoskeletal symptoms, those tasks, pari passu, are hazardous. These corollary inferences are at issue. An analogy to angina is instructive. If you have angina, you are likely to experience chest pain in climbing stairs. However, does that association impugn stair climbing as the cause of coronary artery disease, or suggest that an escalator is the remedy for angina?
Although the association between exacerbation of most regional musculoskeletal symptoms and particular musculoskeletal usage is incontrovertible, how can one generate confidence in the corollary inferences? Could they represent just coincidence? Or, more daunting, could we be overlooking some association other than usage that is more likely to be primary, even causal? Could it be that the worker whose back hurts worse when bending in the warehouse would have the same backache if he or she had a desk job or was a homemaker and would hurt even more bending to get into an automobile or caring for a toddler? Could it be that most workers experiencing regional back or arm pain do not find the condition incapacitating regardless of the content of their tasks? There is no reason to be certain that any association between regional musculoskeletal pain and either biomechanical exposure or work incapacity represents cause and effect. There is every reason to be cautious in leaping to any such causal inference. History offers far too many examples of false associations so seductive that they became systems of firmly held belief, with consequences that were often bleak, to say the least. Science offers nearly as many. When unmeasured associations are stronger than those measured, there is confounding, and the unmeasured variables are termed confounders.
Fear of spurious associations has driven some of the keenest intellects for centuries, from Francis Bacon in the 17th century, to David Hume in the 18th century, to John Stuart Mill in the 19th century, and to Karl Popper in the 20th century. All have tried to reason a solution to this Sisyphean quest for certainty. None has emerged. Rather, there are two categories of compromise. The approach championed by Popper is deduction. Essentially, no one should feel or promulgate a comfortable degree of certainty unless and until the association has been put to some systematic testing and has emerged unscathed, repeatedly. Only then is the association less likely to be false. I subscribe to this philosophy and take my place among the refutationists and Bayesian scholars of like mind. We all admit that many associations do not readily lend themselves to systematic testing, and even those that do may first have to be honed so that the results of the testing may not generalize beyond the honing. But without testing, associations are but associations, and any causal inferences drawn from them are nothing but leaps of faith. The basic tenets of ergonomics standards have not survived systematic testing, although the relevant studies are barely alluded to by the scientists convened to formulate the NIOSH and NAS documents. I will remedy that shortly. For our purposes in learning epidemiology, the “Proposed Standard” offers an ideal straw man. I will set out the major premises of the “Proposed Standard” and then demonstrate the fallacies.
CONFOUNDING
When I was a medical student, epidemiologists observed that the risk for Down syndrome (Trisomy 21) was not uniform among siblings. The youngest child was more likely to be afflicted with this congenital disorder. That led to hypotheses and research about what caused the fertilized egg to divide abnormally in the multiparous uterus.
Several years later, epidemiologists returned to this issue to test whether they had missed the real association. The younger the child, the older the mother. Could it be that the likelihood of bearing a child with Down syndrome is associated more with maternal age than birth rank? The answer proved to be yes. The old hypothesis was superseded, and research shifted to the biology of the aging ovary.
Several years later, epidemiologists returned to this issue to test whether they had missed the real association. The older the mother, the older the father. Could it be the father was the cause of the malady? The answer was yes and no. The likelihood of bearing a child with Down syndrome is associated with both maternal and paternal age. The old hypothesis was superseded, and research shifted to the biology of the aging ovary and testis.
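The logic of this classic confounding story can be made concrete with a small simulation. The following Python sketch uses entirely invented rates and age distributions, not real epidemiologic data; only the structure mirrors the example. It builds a population in which maternal age alone determines risk, yet birth order shows a strong crude association with the outcome. Stratifying on the confounder (maternal age) makes that association collapse:

```python
import random

random.seed(0)

# Hypothetical simulation of the Down syndrome example. All numbers
# are invented for illustration. Maternal age is the true risk
# factor; birth order is associated with the outcome only because
# later-born children tend to have older mothers (confounding).

def simulate(n=200_000):
    records = []
    for _ in range(n):
        birth_order = random.randint(1, 4)
        # Later births come from older mothers.
        maternal_age = 18 + 3 * birth_order + random.randint(0, 14)
        # Risk depends ONLY on maternal age, never on birth order.
        if maternal_age < 35:
            p = 0.01
        elif maternal_age < 40:
            p = 0.02
        else:
            p = 0.04
        records.append((birth_order, maternal_age, random.random() < p))
    return records

def rate(records, keep):
    """Proportion affected among records satisfying the predicate."""
    subset = [r for r in records if keep(r)]
    return sum(r[2] for r in subset) / len(subset)

data = simulate()

# Crude comparison: fourth-born children appear to be at much higher
# risk, tempting a causal story about birth rank.
crude_first = rate(data, lambda r: r[0] == 1)
crude_last = rate(data, lambda r: r[0] == 4)
print(f"crude risk: 1st-born {crude_first:.4f}, 4th-born {crude_last:.4f}")

# Stratify on the confounder: among mothers aged 30-34, the apparent
# birth-order effect all but vanishes.
strat_first = rate(data, lambda r: r[0] == 1 and 30 <= r[1] <= 34)
strat_last = rate(data, lambda r: r[0] == 4 and 30 <= r[1] <= 34)
print(f"age 30-34 risk: 1st-born {strat_first:.4f}, 4th-born {strat_last:.4f}")
```

Stratification (or regression adjustment) on the suspected confounder is the epidemiologist's standard maneuver for testing whether a measured association, such as birth rank here, survives once the stronger, previously unmeasured association is taken into account.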
Such is the scientific method. We learn from the old hypotheses and the old false starts. And we move on. Today, no one would consider studies of the microenvironment of the multiparous uterus relevant to the pathogenesis of Down syndrome. Scientists who were only prepared or willing to continue in that previous vein of research could no longer effectively study Down syndrome. To advance the research and our understanding of Down syndrome, they had to retrain and redirect their efforts.
Because deductive reasoning is so demanding in a methodologic sense and often simply not feasible, inductivism has been championed as an expedient. The authors of OSHA’s Proposed Ergonomic Standard assert that inductivist philosophy is the foundation of the proposal. They cite Sir Austin Bradford Hill’s famous 1965 President’s Address before the Section of Occupational Medicine of the Royal Society of Medicine in London titled “The Environment and Disease: Association or Causation?”13 But they misquote and misinterpret this article in culling a list of “criteria” that they claim they have satisfied to such a degree as to “strongly” argue “for a causal relationship between the risk factors presented in this section and MSDs.” By “this section” they mean the “Health Effects” section of the Proposed Standard, and by “MSDs” they mean the regional musculoskeletal disorders.
In the first instance, neither Hill nor most philosophers of science would term this list “criteria.” Hill goes to lengths to term the list “viewpoints” and to offer them as “aspects” of the association that we should “especially consider before deciding that the most likely interpretation of it is causation…. None of my viewpoints can bring indisputable evidence for or against the cause-and-effect hypothesis and none can be required sine qua non. What they can do is help us make up our minds on the fundamental question—is there any other way of explaining the set of facts before us, is there any other answer equally, or more, likely than cause and effect?” The “viewpoints” are presented in Table 11.6.
The first and most egregious error in logic that the authors of the proposed Ergonomic Standard made relates to their definitions of the association they are touting. We will say much more about their definition of health effects, that is, “MSDs,” shortly. For the moment, let us examine their definition of exposure. They talk of “multifactorial causation” and “multifactorial etiology,” but these rubrics are further defined to exclude all exposures except “biomechanical exposures” in the workplace. The exercise of induction is to question whether there are other exposures that associate even more strongly than “biomechanical exposures.” If other exposures are excluded from consideration, then logic is subjugated to presupposition. More particularly, the authors declare that they have excluded from consideration a highly relevant literature: “Although there is a growing body of evidence linking psychosocial and work organization factors with the development of MSDs, those factors are not addressed here (other than the obvious impact of work organization on work pace).” I will demonstrate next how this particular literature, on inductive and deductive grounds, supersedes the literature on “biomechanical exposures” and renders unreasonable any standard that treats ergonomic variables as primary in matters of either cause or remedy.
TABLE 11.6. A. B. HILL’S VIEWPOINTS
Strength
Consistency
Specificity
Temporality
Biologic gradient
Plausibility
Coherence
Experimental evidence
Analogy