Reason Versus The Therapeutic Menu: Conservative, Empiric, Aggressive, Alternative, And Complementary Therapies








“Conservative” is one of the more beleaguered, if not abused, terms in the contemporary clinical lexicon. Terminology, such as “conservative care for,” was insinuated into clinical jargon in the mid-20th century to draw a distinction between radical and heroic therapies. “Conservative therapy” was not a pejorative term; if anything, it was to connote reasoned intervention using remedies in which the profession had confidence. It implied a cautious, moderate approach with less risk. Perhaps, for illness in general and back pain in particular, that explains why the term “conservative” does not appear in clinical writing at the turn of the last century.1,2 It would be difficult to advocate any but conservative remedies. More to the point, one had to argue that even a patently heroic intervention was really conservative to practice within the bounds of acceptance; heroic interventions without such acceptance were the business of quacks. All these distinctions were based on the leading opinion, that is, the convictions based on inferential reasoning of the medical Pooh-Bahs of the day. Scientific testing of clinical inferences had to wait for the epidemiology of the last half of the 20th century. Conservative therapy was the accepted and acceptable method.

These are no longer the implications of the conservative rubric. Today the distinction is drawn more with “aggressive” or “empiric” therapies than with those that might be considered heroic, radical, or out of bounds. With this evolution, the term “conservative” has accrued some baggage. Conservative therapy is no longer that which we all concur is reasonable. It is no longer the best we can do; it is the least we can do! By 1970, even the British had accepted this precept for back pain, “the results of conservative treatment are by no means satisfactory.”3 The implication is that we need an alternative to conservative therapy rather than more or new conservative therapies.

Contemporary medicine even applauds “empiric” therapy. The term is bandied about in its Lockean sense; the implication is that wisdom has been gained or will be gained through personal experience. If this is the rationalization for trying something on a patient, the act is termed “empiric therapy,” and pride is taken in exercising the prerogative. However, empiric therapy has not always been a source of pride. For the classic Greek physicians it was an act of hubris. In fact the term “empiric” was reserved for the charlatan, and “empiric therapy” signified quackery. The contemporary inversion of the label is fascinating from the perspective of semiotics and disquieting from the perspective of ethics. Although empiricism is on the rise, conservatism is suffering aspersion.

Conservative therapy in America has acquired overtones of nihilism; Americans look to aggressive therapy for solutions. I am singling out American perceptions because they are not those of other comparably advanced industrial countries. This duality (conservative therapeusis approaches nihilism while aggressive therapeusis approaches curative) has become ingrained in the minds of the American people and entrenched in the profit centers of the American health insurer. The American physician has been at the vanguard of advocacy for this syllogism. In the 19th century, medical professionals waved their banner of aggressive intervention to denigrate the ministrations of the many, often popular schools of alternative therapy.4 The profession remained on shaky ground until, at the turn of the 20th century, it embraced with zeal the German reductionistic notions of medicine; “adopting such a concept of medicine rejected patient-oriented medicine in favor of disease-oriented or, worse still, theory-oriented medicine.”5 Today, doing something, particularly something dramatic, that places the patient at risk while rendering the patient utterly dependent has cachet in America. Other industrialized countries are inherently more skeptical and uniformly more tentative in their adoption of such interventions.


THE INHERENT RISKS OF AN ETHOS OF CLINICAL AGGRESSIVENESS

“To cure”—not just to manage, treat, minister to, or even heal. “To cure” is the quest of every physician since Hippocrates extricated medicine from the temples of religion. “To cure” is succor to clinical science and balm for the despair of any sufferer. If “to cure” had been realized through hygiene and antibiotics, Camus would never have had to write The Plague. Mortality from erysipelas no longer approaches 90% thanks to penicillin. The condition, its diagnosis, and its treatment have become almost trivial. And undertreatment or tardiness in treatment of bacterial meningitis is malpractice. “To cure” is worthy, glorious, and seductive. There is no argument. When cure is realized, physicians strut and society applauds.

But what price are we willing to pay in the quest? How certain do we have to be of the reality of the cure? How much error of judgment and conviction will we tolerate? The answer is that the quest for cure can commandeer temperance, and if that quest has the patina of authority, it can commandeer judgment and common sense. In the past century, thanks to authority, countless hysterectomies have been performed to remove retroverted uteruses held responsible for such ills as backache. Thanks to authority, “floating kidneys” and “ptotic colons” have been subjected to one or another form of surgical violence for the same reason. For other reasons, authority has conspired to remove countless tonsils, breasts, and teeth while administering an array of nostrums, all in the past century. Every authority has held a conviction; every conviction has had theoretic underpinnings; every authority’s personal experience has been interpreted as confirmatory. In the 20th century, medicine advanced in the quest to cure, but the path was peripatetic.

Today, authority can be redefined. As will be discussed in Chapter 10, no theory should hold sway in the clinical arena unless it has been put to the test and escaped unscathed. Aggressive interventions backed only by theory, conviction, and zeal are disquieting at best. At worst, they are harmful. Yet America learns this lesson painfully. We refuse to demand a demonstration of benefit and a quantification of risk whenever the intervention is nonpharmacologic and heralded as curative. We are taught to fear death with a vehemence that allows us to take risk with our life. We compensate our physicians and reward our interventionists commensurate with their promise of cure regardless of its reality.6

Modern American cardiology provides the most disturbing illustration. So much of the enterprise persists in the face of study after study that fails to demonstrate benefit. Yet, the nation accepts the arguments that proponents base on theory and promise. When studies document small benefit, the nation accepts egregious marketing. The contract between the American cardiologic enterprise and the American people is disturbing if not unconscionable. The contract with the American community of spine surgeons is more so given that it is perpetrated without the specter of death. Since World War II, Americans have learned to think of backache as an injury reflecting forms of intervertebral pathology that are potentially surgically remediable. The concept gained firm footing when it was accepted by Workers’ Compensation Insurance programs.7,8 What followed was predictable. Patients in America were primed and ready to expect specific diagnoses and attempts at aggressive intervention. Surgeons leaped to the challenge. It is estimated that more than 2.0% of Americans have already availed themselves of the surgical option,9,10 and their need continues apace. More than 250,000 lumbar spinal operations are performed in the United States annually, for a total of more than 1000 operations per million inhabitants.11 Contrast this with the 100 operations per million in Great Britain12 and the 350 operations per million in Finland.13
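The arithmetic behind this comparison can be made explicit. What follows is a minimal sketch in Python that uses only the figures quoted above; the variable and function names are illustrative assumptions, and because the text gives the American figures as lower bounds (“more than”), the results are rough floors rather than precise estimates.

```python
# Lumbar spinal operations per million inhabitants per year, as quoted in the text.
# The U.S. figure is stated as "more than 1000", so it is treated here as a floor.
rates_per_million = {
    "United States": 1000,
    "Finland": 350,
    "Great Britain": 100,
}


def rate_ratio(reference, comparator, rates=rates_per_million):
    """Return how many times higher the reference rate is than the comparator rate."""
    return rates[reference] / rates[comparator]


us_vs_gb = rate_ratio("United States", "Great Britain")
us_vs_fi = rate_ratio("United States", "Finland")

# The two U.S. numbers in the text (250,000 operations a year, 1000 per million)
# together imply a population base, in millions, of:
implied_population_millions = 250_000 / rates_per_million["United States"]

print(f"U.S. rate relative to Great Britain: {us_vs_gb:.1f}x")
print(f"U.S. rate relative to Finland: {us_vs_fi:.2f}x")
print(f"Implied U.S. population base: {implied_population_millions:.0f} million")
```

On these figures the American rate is about 10 times the British rate and nearly 3 times the Finnish rate.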

Aggressive or not, an unproved remedy is still an unproved remedy. Estimating its risk/benefit ratio or its cost/benefit ratio is fatuous; if the remedy has no benefit, it is worthless and no risk is tolerable. Yet, to varying degrees, industrial nations remain wedded to the unproved remedy if it is aggressive, offered dressed in scientific inference, and championed by authorities. Furthermore, in fee-for-service medical systems, in which the authorities who offer unproved remedies are handsomely rewarded for operating on their convictions, unproved remedies become the standard of practice. Even unprovable remedies remain the standard of practice when the enterprise is entrenched. And neither “managed care” nor “evidence-based medicine” is a match for these high-ticket, aggressive interventions. The interventions represent an enormous cash flow and transfer of wealth; the economics seem to perpetuate the belief that they are “good” for us. All managed care is accomplishing is a shift in the transfer of wealth from the providers of the putative cures to the administrators and stockholders.14 It is time to reeducate society and to educate physicians; there is no inherent value in aggressiveness. Aggressiveness per se is anathema; the ill deserve reasoned, proven interventions, aggressive or not.



CONSERVATIVE CARE FOR LOW BACK PAIN: CAVEAT EMPTOR

“Regional back pain” is the backache experienced by someone of working age who would otherwise be well if it were not for the low back pain and whose pain was not precipitated by an overtly traumatic event involving external force.15 Regional back pain is the appropriate diagnosis for the majority of episodes of back pain in this age group and is all that is being considered in this chapter. The diagnosis and management of regional low back pain are considered in depth in Chapter 6. That very rare younger individual whose backache is a consequence of metastatic infection, neoplasia, or primary neoplasia is not a candidate for the conservative therapy that is appropriate for the care of a patient with a regional backache. The patient with a systemic backache is deserving of conservative care appropriate for the particular cause of the back pain.

Regional backache is a ubiquitous remittent and intermittent experience.16 No doubt it has always been so. And no doubt some patients have chosen medical advice in quest of a remedy throughout the history of the profession. The dawn of medical interest in back pain was auspicious. The relevant hieroglyphics of the Edwin Smith Papyrus, the famous clinical document written around 1500 BCE and unearthed in Thebes in 1862, are translated as follows17:

“If thou examinest a man having a sprain of the vertebra of his spinal column, Thou should say to him: Extend your legs and contract them both. He contracts them both immediately because of the pain he causes in the vertebra of the spinal column. Thou should say to him: One having a sprain in the vertebra of his spinal column, an ailment I shall treat. Thou should place him prostrate on his back. …”

Medicine in Egypt at that time was preeminent. Practitioners were men of learning. Specialization flourished, and back pain commanded attention in a fashion that has returned to vogue today. The pain was held to emanate from the vertebral body itself. The scribe is recording a description of a diagnostic maneuver that harkens to some form of Lasègue’s sign. If only the next phrases of the papyrus had survived, perhaps the Egyptian physicians of old could have provided the therapeutic insight that we still lack. More likely, the next phrases were unfounded therapeutic assertions. So were the assertions of Hippocrates (400 BC) in “Peri Arthron” and “Mochlikon,” Avicenna (1073 AD), Charef-Ed-Din (1465 AD), Ambroise Paré (1590 AD), and others in the medical pantheon who have pontificated on the diagnosis and management of backache.18 No wonder that in the 3500 years since the inscription in Thebes, patients have gravitated to other remedies more often than to those offered by physicians.

They do so today. The choice to be a patient with a backache is, nearly always, just that, a choice. Other options have been considered and discarded or tried and found to be lacking. The volitional aspect of seeking a physician’s care must be appreciated to do justice to the choice; often the choice is driven by life’s issues that confound the backache and that must be addressed if therapy is to succeed.19 This truth, which seldom finds its way into medical textbooks on any topic, is the secret to diagnosing and managing low back pain. It is the subliminal force that has driven patients to seek cures high and low for millennia. The “shepherd’s hug” and the “trampling cure” persist from antiquity, and “bonesetters” long plied their trade.20 To come to grips with the evolution of recourse to and acceptance of the conservative management of backache, it is important to realize that “common sense” is not simply wisdom gained from personal experience or the experience of others. It has long been tempered by advice, offered by both physicians and alternative practitioners, based on theory, sometimes substantive data, and always conviction. Such advice has seldom been offered without an element of self-service.21 Consequently, the “common sense” of one stratum of society and of one geographic region may have little in common with that of another. The following epidemiologic studies illustrate this maxim.

In 1978 Verbrugge and Ascione undertook to assess the everyday experience of morbidity.22 The “Health in Detroit” survey was based on a probability sample of 589 white households in the Detroit metropolitan area. An initial interview was conducted with one adult in each household who then maintained a daily diary for 6 weeks, after which there was a closing interview. The average adult had 16 symptomatic days during the 6 weeks; only 11% of men and 5% of women remained free of symptoms. Respiratory symptoms were the most common, followed by musculoskeletal symptoms. The incidences of musculoskeletal morbidities are summarized in Table 4.1, and their quality and outcome are summarized in Table 4.2. More than half the people were coping with musculoskeletal symptoms for an average of 8 days every 6 weeks. The majority of the morbidity was backache. Only 3% of the people in this sample sought medical advice; only 0.3% actually received medical care.

Contrast this prospectively followed small sample with the recalled experience and behavior revealed in a national survey conducted at about the same time. The National Health and Nutrition Examination Survey (NHANES) II was performed between 1976 and 1980 on a probability sample of 27,801 non-institutionalized civilians. Of this sample, 10,404 adults were also subjected to a physical examination and formal interview. Of these, 1763 recalled “pain in your back on most days for at least 2 weeks”23; in 1516 adults it was primarily in their low back, resulting in a 13.8% cumulative lifetime prevalence of memorable backache lasting more than 2 weeks. The national experience regarding accession of health care for more prolonged backache contrasts strikingly with that revealed in the Health in Detroit survey for brief episodes. The majority of Americans with more than 2 weeks of memorable backache visit health professionals (Table 4.3). In fact, backache was the second leading symptom engendering physician visits at the time of the NHANES II.24 Even more remarkable than just the professional contact is the array of remedies that were introduced to treat backaches (Table 4.4).








TABLE 4.1. INCIDENCE OF MUSCULOSKELETAL MORBIDITIES DURING THE 6-WEEK COURSE OF THE HEALTH IN DETROIT SURVEY22

| Measure | Men | Women | Total |
|---|---|---|---|
| Percentage with any musculoskeletal symptoms | 44 | 56 | 51 |
| Percentage of all days patients had musculoskeletal symptoms | 8 | 12 | 11 |
| Days with musculoskeletal symptoms as percentage of all days | 26 | 30 | 29 |
| Average number of days of musculoskeletal symptoms experienced by those with any such symptoms | 7 | 9 | 8 |









TABLE 4.2. QUALITY AND OUTCOME OF MUSCULOSKELETAL MORBIDITY IN THE HEALTH IN DETROIT SURVEY22

For the majority:
- They were otherwise asymptomatic
- They thought they had “arthritis”
- They talked to their spouse
- They took OTC analgesics
- They thought their symptoms were “not very serious”
- They experienced back or leg pain

For <10%:
- They experienced neck pain (9%)
- They experienced hand pain (6%)
- They thought their symptoms were “very severe” (8%)
- They sought medical care (3%)
- They received medical care (0.3%)

OTC, over-the-counter.


Before attempting to draw inferences by contrasting the NHANES II with the Health in Detroit survey, we have to come to grips with yet another data set collected at about the same time. Biering-Sorensen managed to enroll 928 adults, representing 82% of all 30-, 40-, 50-, and 60-year-old residents of Glostrup, Denmark, in a 1-year survey focusing on backache.25 At entry in the study, 62% were experiencing back pain or recalled having experienced back pain within the prior 12 months. These entry criteria combine the point prevalence of the Health in Detroit survey with the considerable recall uncertainties26 of the design used in the NHANES II. Nonetheless, this is an extraordinary level of awareness of past episodes of backache, far greater than one would predict from the American surveys. Part of the explanation is that Biering-Sorensen accepts “insufficientia dorsi” as a qualifying illness; this is a “feeling of weakness, fatigue and/or stiffness in the lower back” that accounts for approximately 25% of the recalled morbidity. But this still does not explain the extraordinary excess of morbidity ascribed to the low back that is recalled by these Danes. Recourse to health professionals because of backache was as likely as in the NHANES II (Table 4.3), although the reservoir of morbidity in Glostrup exceeded that in America by more than fivefold! In Glostrup, 60% of the patients consulted their general practitioner, 25% consulted a specialist, and 15% consulted a chiropractor.27 The therapies prescribed in Glostrup (Table 4.5) differ from the American experiences, particularly in the reliance on some physical modalities such as injection and manipulative therapies while deemphasizing rest and analgesics. The menu of practitioners and modalities is the same; only the proclivities differ.








TABLE 4.3. THE USE OF PROFESSIONAL CARE BY PATIENTS WITH MORE THAN 2 WEEKS OF MEMORABLE BACKACHE IN THE NHANES II23

| Health Professional | % |
|---|---|
| General practitioner | 58.6 |
| Orthopedist | 36.9 |
| Chiropractor | 30.8 |
| Osteopath | 13.8 |
| Internist | 7.6 |
| Rheumatologist | 2.5 |
| Any | 84.6 |

NHANES, National Health and Nutrition Examination Survey.









TABLE 4.4. REMEDIES USED BY THE NHANES II PARTICIPANTS IN THE MANAGEMENT OF THEIR BACKACHE24

| Treatment Ever Used | Percentage of Those Using | Percentage Who Thought It Helpful | Of Those Who Thought It Helpful, Percentage Still Using |
|---|---|---|---|
| Rest | 80.8 | 85.5 | 48.5 |
| Heat | 73.9 | 80.4 | 32.1 |
| Aspirin | 58.2 | 76.7 | 48.1 |
| Stiff mattress | 57.9 | 84.8 | 89.2 |
| Exercises | 40.5 | 78.1 | 43.0 |
| Bedboard | 36.1 | 84.8 | 63.6 |
| Back brace | 27.0 | 70.8 | 28.1 |
| Traction | 20.7 | 62.9 | 9.6 |
| Diathermy or paraffin | 16.7 | 75.3 | 3.9 |
| Cold | 7.2 | 55.9 | 15.5 |
| Splints/casts | 3.6 | 73.9 | 5.1 |









TABLE 4.5. REMEDIES USED BY THE GLOSTRUP SURVEY PARTICIPANTS IN THE MANAGEMENT OF THEIR BACKACHE29

| Treatment received | Percentage subjected |
|---|---|
| Bed rest | 27 |
| Physiotherapy | 49 |
| Local muscle injection | 20 |
| Exercise program | 15 |
| Lumbar traction | 12 |
| Manipulative therapy | 20 |
| Spinal support | 3 |
| Analgesics | 43 |


And that, indeed, is the compelling message of these and many other surveys of this sort. We will all experience backache, repeatedly and sometimes intensely. We will all be forced to cope with our discomfort. There is nothing reflexive about our coping. How we cope, what we do or do not do is learned! And the lessons differ across socioeconomic strata, sociopolitical boundaries, and time. The reason no best way has emerged is that there is no best way.28 Rather there is a cacophony of heuristic pathogenetic inferences playing on our anxieties and a plethora of unproved and marginal remedies vying for our patronage.

This reproach pertained to the pharmaceutical industry until 50 years ago. Requiring quantification of the benefit/risk ratio of prescription agents is one of the triumphs of reason and science. It is an object lesson in terms of its success—and its vulnerability.


PHARMACEUTICAL SAFETY

The contemporary marketplace is alive with keyboards, chairs, tools, and work-practice consultants all barking “ergonomically sound” with the implicit promise of abatement of regional musculoskeletal “injuries,” rendering the intervention cost-effective and advisable, if not mandatory. The scene harkens back to the wanton purveyance of putatively effective medicinals in the 19th century. President Theodore Roosevelt signed the Food and Drug Act into law in 1906, banning any traffic in misbranded or adulterated drugs from interstate commerce. A “drug” was defined as any preparation recognized by entry in the United States Pharmacopoeia or National Formulary and therefore intended for the treatment of afflictions of humans or animals. By stipulating “misbranding,” the Act gave notice regarding labeling of such agents. A bureau in the Department of Agriculture, the predecessor of the Food and Drug Administration (FDA), was charged with the execution of this statute with the power to seize adulterated or misbranded agents and condemn the purveyor. This ushered in the century of the regulatory agency and therefore was an auspicious event. It recognized that the federal government had some responsibility in terms of consumer protection. However, consumers were provided no reassurance that the concoction they consumed was effective or safe, only that it was what it was labeled. In fact, a 1911 Supreme Court decision found that the Act only prohibited mislabeling of contents, not false claims about therapeutic benefits!

President Taft led the charge on quackery. With his urging, the Sherley Amendment was passed in 1912 prohibiting fraudulent therapeutic claims on the label. The Amendment created more problems than any degree of consumer protection it fostered. Now the government had to devise a method for proving that the claims for benefit were not only wrong but also fraudulent; to prove “fraud” the government had to prove an intent to deceive rather than simply misstatement or overstatement. Both the Act and the Amendment were no match for the purveyors.

And so it remained caveat emptor until the summer of 1937. The scientific community was aglow with the benefits of the newly introduced sulfa antibiotics. The pharmaceutical industry was faced with the task of providing these wonder drugs for a needy populace. An inventive chemist employed by the Massengill Company of Bristol, Tennessee, devised a way to prepare sulfanilamide as a palatable elixir; he dissolved the drug in dilute diethylene glycol and flavored the solution with raspberry extract. Quality control was delegated to his own palate and nares. Massengill was soon shipping hundreds of gallons of Elixir of Sulfanilamide. By the fall of 1937 the first reports of the death of patients who imbibed the elixir surfaced, and distribution ceased. Even though the FDA instituted a search for every bottle, more than 100 people died of diethylene glycol poisoning. The owner of the company, Dr. S. E. Massengill, publicly defended the distribution of the elixir without prior testing; his inventive chief chemist committed suicide.

A homeopathic physician in Congress, Senator Royal S. Copeland, introduced the revised Federal Food, Drug, and Cosmetic Act, which was passed into law in 1938. This will not be the only example of tragedy forcing the passage of enlightened legislation. The 1938 Act stipulated that any future drug to be marketed must first be demonstrably safe. The manufacturer must submit data to that effect to the FDA. Furthermore, the label must reflect any degree of deviation from a standard of purity and quality for each agent. The FDA could inspect manufacturing facilities and hold the manufacturer responsible for adulteration even if there was no intent to defraud.

We were making progress, but we had a long way to go. The impetus was another disaster 24 years later. Thalidomide is a remarkably effective and relatively safe soporific in adults. It was licensed and widely used in Europe. In America, probably because of bureaucratic inefficiency rather than prescience, it had not been released for use. However, the manufacturer anticipated such release, and the sales representatives of the day managed to convince physicians to enroll almost 4000 pregnant American women in an investigative program designed, I would surmise, to improve physicians’ familiarity with the agent with a view toward impending licensure. (Using so-called clinical trials as marketing ploys has become commonplace today.) Sadly, the agent places the fetus of any pregnant mother at risk, particularly of phocomelia, deformity from arrest of embryologic limb development. When this came to light, the world was enraged.

From 1960 to 1962, Senator Estes Kefauver of Tennessee became the champion of regulatory reform targeting the pharmaceutical industry. His hearings exposed the flaws in evaluation, the lack of informed consent on the part of participants in drug trials, and the use of “investigation” as a promotional technique. Congress responded with the Kefauver-Harris amendments of 1962 expanding the purview of the 1938 Act. The critical advance was that “substantial evidence” of benefit must be forthcoming before the FDA could license the distribution of any new drug. The FDA, in essence, must generate some confidence that there is a favorable risk/benefit ratio before marketing, and the basis for that confidence was to be “adequate and well-controlled investigations.” Today, Section 355 of Title 21 of the U.S. Code stipulates that no drug may be introduced into interstate commerce until investigations have been performed that “show whether or not such drug is safe for use and whether such drug is effective in use.”

Nearly 20 years after the thalidomide disaster, Congress provided the FDA with the last of the authorities necessary to serve the protection of the consumer. Pharmaceutical firms were required to monitor consumers after the drug had been approved for marketing based on the studies we will discuss shortly. In that way, unanticipated toxicities can be detected; if the risk/benefit ratio turns unfavorable, the agent can be withdrawn from the marketplace as an “imminent hazard” or even an “unreasonable risk.”


The Food and Drug Administration’s Approval Process

To “show whether or not such drug is safe for use and whether such drug is effective in use,” the purveyor must place before the Secretary of Health and Human Services an application for approval that includes “(1) full reports of investigations which have been made to show whether or not such drug is safe for use and whether such drug is effective in use; (2) a full list of the investigations which have been made to show whether or not such drug is safe for use and whether such drug is effective in use; (3) a full list of articles used as components of such drug; (4) a full statement of the composition of such drug; (5) a full description of the methods used in, and the facilities and controls for, the manufacture, processing, and packing of such drug; (6) such samples … as the Secretary may require; and (7) specimens of the labeling to be used for such drug.” The responsibility for oversight, on behalf of the Secretary, belongs to the FDA. The task was daunting in 1962 when 46 new drugs were introduced into the United States market; each cost $2 million to research during a 2-year development period. By 1980 it took 5 times as long and cost 35 times as much for a drug to reach approval. Much of this reflects the evolution in the sophistication and validity of the process. The FDA is faced with the mountain of “substantial evidence” it demands and the mandate by both patient advocates and industry to decide expeditiously. This latter mandate has led to pathways around the customary algorithm for drugs thought to be of value in desperate situations. However, the algorithm will seem familiar. After all, the charge by the FDA to provide substantial evidence of a favorable risk/benefit ratio was not aimed just at the pharmaceutical industry; it was this charge that promulgated great advances in biostatistics and clinical epidemiology. These same advances pertain to the definition of workplace hazards that will be discussed in Chapter 10.
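The escalation in development burden between 1962 and 1980 follows directly from the multipliers just quoted. The sketch below, in Python, simply applies them to the 1962 baseline; the names are illustrative assumptions, and the dollar figures are treated as nominal because the text does not say whether they are adjusted for inflation.

```python
# 1962 baseline per new drug, as quoted in the text.
baseline_1962 = {"cost_million_usd": 2.0, "development_years": 2.0}

# By 1980 development took 5 times as long and cost 35 times as much.
TIME_MULTIPLIER = 5
COST_MULTIPLIER = 35


def scale(baseline, time_multiplier, cost_multiplier):
    """Apply the quoted multipliers to a baseline cost and development time."""
    return {
        "cost_million_usd": baseline["cost_million_usd"] * cost_multiplier,
        "development_years": baseline["development_years"] * time_multiplier,
    }


estimate_1980 = scale(baseline_1962, TIME_MULTIPLIER, COST_MULTIPLIER)

# Average spend per year of development, in millions of dollars (nominal, by assumption).
per_year_1962 = baseline_1962["cost_million_usd"] / baseline_1962["development_years"]
per_year_1980 = estimate_1980["cost_million_usd"] / estimate_1980["development_years"]

print(estimate_1980)
print(f"Spend per development year: {per_year_1962:.1f} -> {per_year_1980:.1f} million USD")
```

Read this way, a drug approved in 1980 implies roughly $70 million and 10 years of development, so the spend per year of development rose about sevenfold even as the timeline lengthened.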


Investigational New Drug Application

Before a new drug can be tested on humans, the sponsor or responsible investigator must submit a “Notice of Claimed Investigational Exemption for a New Drug.” This is the vaunted Investigational New Drug Application (IND). This document details drug composition and manufacturing processes, safeguards, and quality control. Short-term toxicities and pharmacokinetics are described based on animal experiments. In addition, explicit protocols for the performance of trials in humans are presented. These protocols include drafts of the “informed consent” documents to be used and guarantees that any untoward event will be reported to the FDA in a timely fashion. Finally, the protocols must receive prior approval by an institutional review board (IRB). The IRB is a local body composed of individuals with appropriate expertise and perspectives to objectively review the protocols from both a technical and ethical perspective and to monitor the trials with some periodicity (usually by annual report). Most IRBs are based in hospitals or medical schools; however, several are based in industry. For example, there are IRBs in companies that profit from the performance of clinical trials. This is considered to be within the intent of the legislation, although I am always bothered by a circumstance that has such potential for conflict of interest. Because of inadequacies on the part of some IRBs (some in august academic institutions) in meeting their responsibilities, oversight has been stepped up considerably.


The New Drug Application

Once the IND is approved, the protocols are activated. The object is to generate sufficient data to evaluate the benefits and risks of using the new agent in a specified population of patients. When, in the opinion of the investigators and sponsors, sufficient data are accumulated and analyzed, a New Drug Application (NDA) is submitted to the FDA. This is a ponderous document detailing further animal data and the customary human trials, which the FDA is to digest rapidly to render an opinion within 180 days. Usually the opinion on the first application is to request an important revision, so the average time from first submission to approval is 2 years. This lag is remarkably brief given the challenge faced by the FDA. The FDA reviewing staff of physicians, chemists, pharmacologists, and consumer safety officers numbers several hundred. Every year, the FDA handles more than 1000 new INDs, hundreds of original NDAs, thousands of supplements to the NDAs, and thousands of amendments. Nonetheless, the agency has protected the American consumer for decades in a fashion that is the gold standard for the industrial world. All involved are wary of streamlining the process; it is hard to argue with the agency’s track record.


Guidelines for Clinical Trials

For most pharmaceutical classes, the FDA Guidelines call for three types of clinical trial in sequence:


Phase I Trial

A Phase I trial is an exercise in toxicology. It represents the initial introduction of the new agent into a healthy volunteer. This usually involves only a small number of subjects, always healthy, often young, and always well paid. The exercise is anxiety provoking for all involved. After all, the only reassurance available regarding toxicity and appropriate dosing is extrapolation from the animal experimentation. The study is usually performed on an in-patient basis so the volunteers can be closely monitored. The intent is to screen for overt toxicity at particular doses; efficacy and subtle toxicity are not at issue.


Phase II Trial

A Phase II trial recruits informed patients, not healthy volunteers. The intent is to see whether the Phase I insights generalize to patients for whom the drug is intended. In addition, there is a quest for some indication of efficacy and some idea of the dose-response parameters of the agent. The Phase II studies are both uncontrolled and open-label comparisons with placebo agents or active nonsteroidal antiinflammatory drugs (NSAIDs) in modest numbers of patients for a brief interval (usually <2 months). Essentially, Phase I and II trials collect observational data.


Phase III Trial

If an agent has gained approval in the Phase I and II protocols, it is subjected to a Phase III trial. Phase III trials are the crowning glory of clinical epidemiology. Methodologists abound; much effort is directed at avoiding bias and confounding. The trials recruit hundreds of patients and are lengthy, controlled, and often double-blind. The intent is to test efficacy and probe for toxicity in such a fashion that the FDA can arrive at a risk/benefit assessment on which it can base a decision regarding approval for marketing. For that reason, the FDA Guidelines require a comparison with a placebo as well as with an agent of proven benefit. The placebo comparison is necessary to prove that the drug is effective. However, the comparison with the agent of proven benefit is a different exercise. A new drug does not have to be more effective than aspirin or even be shown to be more effective; it will be released if it is indistinguishable and is better tolerated either in terms of toxicity or convenience of dosing. Phase III trials are all experimental in design.


Phase IV Trial

The FDA currently has no guidelines for formal postmarketing surveillance. It is hoped that physicians and pharmaceutical firms will report side effects to the FDA in some timely fashion. However, what is missing is some structured attempt to monitor consumers: a Phase IV trial. The reason relates to the design of the Phase III trial. After all, Phase III trials enroll hundreds of patients, rarely a few thousand. There is some likelihood that they might detect major toxicities that occur with a frequency of a few percent. However, they are not powerful enough to detect major toxicities that occur with less frequency or minor toxicities that occur with some frequency. Realize that both eventualities can have considerable impact when marketing is widespread and consumers number in the tens or hundreds of thousands. Furthermore, Phase III trials are designed to sample a specific subset of patients; the favorable risk/benefit ratio necessary to gain FDA approval may not generalize to other subsets who might very well be exposed to the drug once it is released; marketing is seldom as restricted as the parameters for enrolling patients in the Phase III trials, and clinical judgment is never that restricted. Here again, a Phase IV “trial,” a form of structured surveillance, might serve the public better than the current uneven method of recognition and reporting.
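
The arithmetic behind the limited power of a Phase III trial can be sketched briefly. What follows is a minimal illustration, not anything drawn from the FDA Guidelines. It assumes that a toxicity strikes each patient independently at a fixed rate. The enrollment of 500 and the three rates are hypothetical figures chosen only to echo "hundreds of patients" and "a few percent," and the function names are likewise invented for the example.

```python
import math


def prob_at_least_one(rate: float, n_patients: int) -> float:
    """Chance that a trial of n_patients observes at least one event.

    Assumes each patient independently has the same event rate.
    """
    return 1.0 - (1.0 - rate) ** n_patients


def patients_needed(rate: float, confidence: float = 0.95) -> int:
    """Smallest enrollment that observes at least one event with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


if __name__ == "__main__":
    trial_size = 500  # hypothetical Phase III enrollment
    for rate in (0.02, 0.001, 0.0001):  # hypothetical toxicity rates
        chance = prob_at_least_one(rate, trial_size)
        print(f"rate {rate:.2%}: chance of seeing any case = {chance:.1%}")
    print("patients needed for a 1-in-1000 event:", patients_needed(0.001))
```

Under these assumptions a toxicity affecting 2% of patients is almost certain to surface among 500 subjects, whereas one affecting 1 in 1000 would more likely than not go unobserved, and seeing even a single case with 95% confidence would take an enrollment of about 3000.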


DRUG TRIALS AND THE LIMITS OF CLINICAL EPIDEMIOLOGY

I have long taught my students that recognizing clinical mysteries is not difficult, and further, that testing clinical inferences requires some effort in learning methodologies but also is not difficult. The genius relates to identifying the rare mystery that might actually yield to the testing. Clinical epidemiology is no match for all life’s mysteries. In fact clinical epidemiology is no match for most of them.

The poetry of life is in its heterogeneity. The magic of life is in its idiosyncrasies. Clinical epidemiology is an exercise in defining the common threads. Thankfully, and often, these threads account for so little of the observable heterogeneity. So when one is identified, really and reproducibly and meaningfully identified, that is extraordinary. All analytic epidemiologic inquiry must yield some ground to the four horsemen of this heterogeneity: chance, confounding, bias, and generalizability. Although studies are designed to minimize their influence (see Chapter 10), they are never vanquished. The challenge in reading epidemiologic inferences, let alone applying them, relates to the amount of ground yielded.


Chance

No matter how careful you are, establishing controls and comparison groups is fraught with the likelihood that important attributes will be concentrated in the experimental or the referent groups, and you will not be the wiser. For experiments, including “randomized controlled trials,” this is termed “randomization error.” It is the reason that I never put much stock in any analytic epidemiology that discerns only a small health effect, and that is most of the studies. Here is why. Modern biostatistics is more and more recruited to confront three clinical nemeses: discerning risk for a few among many, discerning benefit for a few among many, or discerning even a small benefit for most. Indefatigably, investigation after investigation is undertaken in the hopes of demonstrating a small difference to which the authors will impute meaning because it is an outcome unlikely to have occurred by chance alone. Bolstered by the magnitude of P values, results are trumpeted by the lay press, too often even before inferences pass peer review to appear in professional journals. These studies provide the underpinnings for lots of therapeutics and for policies aimed at “health promotion and disease prevention” inside and outside the workplace. My nagging doubts pale next to the wonderful assurance that authors, their peer reviewers, and editors derive from P values.


But doubt I must. Biostatistics may be the best method available, but biostatistics meets its match in these three challenges. I am bothered any time a P value is used to assert confidence in any small clinical outcome. I appreciate that this outcome probably did not occur by chance alone. But I am not as convinced as the author that I know what overcame chance to produce the small difference the author detected. Therein lies the tension; whether to ground one’s clinical confidence in the magnitude of the P value or the magnitude of the effect. Professional statisticians are aware of this tension and even warn of overinterpretation of such studies. Yet the articles pour forth. Think of all those interventions for cardiovascular disease that offer a 2% or less improvement in the likelihood of surviving another year or 5 years. Think of all those thousands enrolled in observational or experimental trials seeking survival benefits of screening for cancer, exercise, a vitamin, some dietary constituent, or aspirin. We are all aware that these investigations, as is true for all scientific experiments, are published so that they might be replicated or challenged by subsequent investigations. But the subsequent investigations require at least as much time and funding as the original. While we wait, early results begin to have a life of their own; the small effect is applied to the greater good and, prodded by the P value, the tentative assertion appears at the bedside. However, the results of “small effect, compelling P value studies” are tentative. Not “tentatively correct,” just tentative. They may be wrong!
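
The tension between the magnitude of the P value and the magnitude of the effect can be made concrete with a small calculation. This is only a sketch under stated assumptions: it uses the usual normal approximation for comparing two proportions, equal-sized arms, and hypothetical survival figures of 90% versus 92%. The function name and all numbers are invented for illustration and come from no particular study.

```python
import math


def two_proportion_p_value(p_control: float, p_treated: float, n_per_arm: int) -> float:
    """Two-sided P value for a difference in proportions (normal approximation).

    Assumes two arms of equal size, so the pooled proportion is the simple mean.
    """
    pooled = (p_control + p_treated) / 2.0
    std_err = math.sqrt(pooled * (1.0 - pooled) * 2.0 / n_per_arm)
    z = abs(p_treated - p_control) / std_err
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # The same hypothetical 2-point survival difference at two enrollments.
    for n_per_arm in (500, 5000):
        p_value = two_proportion_p_value(0.90, 0.92, n_per_arm)
        print(f"{n_per_arm} per arm: P = {p_value:.4f}")
```

The effect is identical in both runs; only the enrollment changes. With 500 per arm the difference is unremarkable, and with 5000 per arm it carries a P value below .001. The P value speaks to sample size as much as to clinical importance, and it says nothing about what produced the difference.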

To seek a small clinical effect, consideration of statistical power demands that the study population be sizable. It is assumed that all hidden confounding variables will distribute randomly or at least will counterbalance if the study population is sufficiently large. Here is the rub; a little randomization error in a crucial variable can burn you. Take the setting of coronary artery disease. Perhaps there is a single gene with two equally prevalent allelic forms that differentially influence the formation of collateral vessels in the ischemic myocardium. Studies are designed assuming that the alleles distribute randomly between the control and experimental groups. However, what if there is a bit of skewness? Let us imagine a survival study with a few more individuals with the allele that promotes collaterals in a group subjected to angioplasty than in the control group—a distribution of 55/45 instead of 50/50 is not too unlikely (P > .01). This unmeasurable influence on outcome can account for a small difference in survival. The investigator has the right to be confident of the P value, but no right to be comparably confident that the angioplasty was responsible. Furthermore, extrapolation to very similar populations is on the shakiest of grounds; generalization to all with coronary artery disease is delusional.
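
How likely such a skew is depends on the size of each arm, which the example above leaves unspecified. The following sketch computes the exact binomial probability that at least a given number of patients in one arm carry the favorable allele, assuming the two alleles are equally prevalent and that patients are allocated independently. The arm sizes and the function name are hypothetical.

```python
from math import comb


def prob_carriers_at_least(n_arm: int, k: int) -> float:
    """Exact P(at least k of n_arm patients carry the allele) at 50% prevalence.

    Each patient is assumed to carry the allele independently with probability 1/2,
    so every carrier pattern is equally likely and the tail is a ratio of counts.
    """
    favorable_patterns = sum(comb(n_arm, i) for i in range(k, n_arm + 1))
    return favorable_patterns / 2 ** n_arm


if __name__ == "__main__":
    # A 55/45 split or worse in one arm, for three hypothetical arm sizes.
    for n_arm, k in ((20, 11), (100, 55), (1000, 550)):
        p = prob_carriers_at_least(n_arm, k)
        print(f"arm of {n_arm}: P(at least {k} carriers) = {p:.4f}")
```

Under these assumptions a 55/45 split in an arm of 100 arises by chance in roughly one allocation in five or six. The same proportional skew becomes rare in an arm of 1000, although smaller skews remain common at any size, and no enrollment guarantees balance on a variable that nobody measured.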

This is not to say that the clinical questions that drive these investigations are trivial, to the contrary. If only we could improve the survival of patients with cancer or ischemic heart disease even a bit, or palliate inflammation even in a few, or effect even a small decrement in the intensity of angina. Such results should be possible, but it is not possible to establish the effect, or to balance the risk, today. It is disconcerting, if not daunting, to realize that grasping at a “small effect, compelling P value” solution may even be harmful. The more I am forced to contend with the assertions from so many of these “small effect, compelling P value” studies at the bedside, the more convinced I become that clinical investigation is degenerating.6,29

It is the promise of science that someday soon we will mark those at greatest likelihood of benefit on the basis of genetic or other individual differences. Then clinical investigations seeking larger effects in fewer subjects will become feasible. Then even I will be comfortable ascribing differences detected to measured variables. Then we will be able to derive the small increments in insight that lead to small advances in therapy that actually benefit patients. Until then, let us modulate the influence of “small effect, compelling P value” studies on health policy and clinical judgment. And while we are at it, let us lay to rest the frenzy to promulgate “me-too” pharmaceuticals, technologic gimmicks, and other examples of New Age hubris at best hiding behind imperious P values.


Confounding

Confounding is epidemiology-speak for being blindsided. As will be discussed in Chapter 10, until this decade the bulk of the systematic clinical epidemiology relating to occupational musculoskeletal disorders considered a single exposure: some aspect of work content. Associations were inconsistent and small. That is because the investigators were not prepared to consider other coexisting exposures that might confound their ability to discern an association. In this case it turns out that there are several, particularly exposures that relate to work context that associate more strongly and usually overwhelm the influence of work content. More recent studies are not so naive. For example, no one would explore whether duration of exposure to some aspect of work is associated with a health effect without taking into consideration whether just aging alone would be an explanatory confounder.


Posted Jul 21, 2016, in Orthopedic
