Designing Clinical Trials in Amyotrophic Lateral Sclerosis




Clinical trials in amyotrophic lateral sclerosis have evolved significantly over the last decade. New outcome measures have been developed that reduce the sample size required compared with survival studies. There has been increasing recognition that dose-ranging studies are crucial to the full evaluation of experimental agents. While the requirements of late-stage trials have not changed, many new designs have been suggested for earlier-phase development. Although no design achieves a perfect balance of sensitivity and efficiency, clinical trialists continue to work toward smaller and shorter trials so that more compounds can be studied concurrently.


In recent years, much has been learned about pathogenic mechanisms of amyotrophic lateral sclerosis (ALS), leading to a proliferation of new targets for disease modification. Mitochondrial dysfunction, glutamate toxicity, protein misfolding, and microglial activation are just a few of the mechanisms that have been proposed. For each proposed mechanism, pharmacologic manipulation is possible. Targeted drug discovery programs can lead to new compounds, and re-evaluation of existing drugs may reveal properties not previously investigated. Recently, a collaborative effort jointly funded by the National Institute of Neurological Disorders and Stroke and the ALS Association tested more than 1,000 available compounds in 29 different assays to determine activity against a variety of aspects of neurodegeneration. Although the full results of this effort have not yet been published, individual laboratories have further investigated drugs identified by this screening program, with the first of these (ceftriaxone) entering clinical trials in 2006. At this writing, at least nine different compounds are either in human trials or about to enter them, each targeting a different aspect of the neurodegenerative process.


Given the plethora of potential therapeutic agents for a rare disease such as ALS, it is crucial that studies are performed in an efficient and effective manner. A well-designed trial can speed time to approval of a drug and identify efficacy, with the minimum number of patients exposed for the shortest possible time. In contrast, a poorly designed trial can fail to show an effect of an effective drug, can mistakenly suggest efficacy when there is no effect, can lead to patient distress and harm, or may simply delay time to approval. This article discusses a range of issues that relate to clinical trials. These include the choice of dose to be studied, the way in which disease progression is assessed, and the formal structure of the trial.


Drug dose


Of all the issues surrounding the clinical investigation of an experimental therapeutic agent, dose would seem to be one of the least problematic. In fact, however, decisions about dose are not simple, and poor dose choices have led to problems in many previously published ALS trials. Dose-finding studies should start at the bench; in preclinical assays, drug activity should be assessed from a no-effect level to a point at which the drug causes clear toxicity. Although drug concentrations achieved in vitro do not translate directly into doses in either in vivo disease models or human trials, they do provide target tissue concentrations at which initial dosing studies should be aimed. Animal models should be used to establish the maximum tolerated dose (MTD); disease-related activity must be assessed in models that match the disease itself to the greatest extent possible. In ALS, the most commonly used model is the superoxide dismutase 1 (SOD1) transgenic mouse. However, other models have been employed, including the progressive motor neuronopathy mouse, models of nerve injury such as facial axotomy, and viral diseases that result in motor neuron loss.


For many agents that have reached human trials, full dose exploration was not performed in animal models. Full dose ranging was not performed with topiramate in the SOD1 mouse model, nor were celecoxib, creatine, or ceftriaxone studied at multiple doses. In some cases, failure to find efficacy may have been the result of incorrect dose choice; in others, efficacy was demonstrated but the most effective dose may not have been found. Failure to study a full range of doses at this level of investigation has led to trials that may have reached erroneous conclusions.


Assuming the MTD has been established in experimental models, along with a range of doses that show activity against the target disease mechanism, dose ranges for the initial studies in human beings can be appropriately chosen. In ALS, this step has been problematic. For a number of compounds previously studied in efficacy trials, the MTD has not been established. Thus, negative results have been reported for creatine and celecoxib, but the lack of a known MTD leaves open the question of whether higher doses of either drug could have demonstrated efficacy. In other studies, attempts were made to study compounds at doses close to the MTD, but lower doses were not studied as well. This may have contributed to the fact that patients treated with topiramate at 800 mg per day progressed faster than placebo patients, and may also account for similar results in the recently reported minocycline trial.




Stages of drug development


The choice of trial design depends on the stage of development of a given drug. Phase I trials are performed to determine the appropriate dose range and dosing schedule, to learn how the drug is metabolized or excreted, and to identify any acute or high-frequency adverse events. Usually, phase I trials are performed in healthy volunteers. However, if there is reason to believe that specific aspects of a disease may affect the tolerability of a drug, phase I studies may also be performed in patients with the disease. This is often the case in ALS; for example, a drug that causes moderate dizziness in normal subjects may result in frequent and serious falls in ALS patients. Usually, phase I studies are small and involve administration of the drug for only a short time period. Thus, low-frequency events and the effects of chronic administration are not measured. An important goal of phase I studies is to identify a dose above which adverse events preclude use, not to pick up events that will occur in a minority of patients.


Phase II trials are performed with the goal of gathering further safety information, especially related to long-term use of the drug. Pharmacokinetic evaluation of drug accumulation with long-term use is also commonly performed, and different schedules of drug dosing may be evaluated. Although not always the primary goal, some assessment of potential efficacy is often incorporated into phase II trials. In diseases associated with markers of activity (for example, CD4 counts or viral load in HIV), the effect of differing doses on these markers can be used to determine the dose choices for a phase III trial. In ALS, no such markers have been identified, so that attempts to gauge efficacy must be based on the outcomes typically employed in larger phase III trials. For this reason, the line between phase II and phase III trials is often blurred in ALS.


Decisions about dose are often made after phase II trials, so it is essential that multiple doses be evaluated. This has often not been done in phase II ALS trials, and when dose ranging is done, it is often inadequate. In a recently reported trial of pentoxifylline in ALS, subjects were treated with either 1.2 g of pentoxifylline or placebo. The trial showed increased mortality in the treated group. However, without any other tested doses, it is unknown whether a lower dose might have improved mortality. Similarly, topiramate was tested in a phase II study at a dose of 800 mg per day. Although there was no statistically significant effect on mortality, treated subjects lost an average of 10 lb more weight than placebo-treated subjects and performed more poorly on functional and respiratory measurements. From the assessment of adverse events, it was clear that at this dose topiramate was quite difficult to tolerate, and the results reported could easily have been a function of each subject's weight loss and other adverse events. The lack of a lower-dose group precludes any conclusion about the true efficacy of topiramate. The most recent example of the same difficulty is the recently reported trial of minocycline in ALS. In both the phase I/II studies and the phase IIb study, more adverse events and poorer performance on outcome measures were noted in the treated group. Minocycline was intended to modulate a novel disease mechanism not previously tested in ALS; the fact that too high a dose may have been chosen means that it remains unclear whether this target is appropriate for further study.


Phase III trials are performed to determine the efficacy of a drug in treating the disease in question. Longer-term safety data are also gathered. In many diseases, phase III trials are expected to be positive, as they are performed only after clear evidence of efficacy has been gathered from disease markers in phase II studies. This is obviously not the case in ALS research: more than 10 years have elapsed and at least a dozen negative trials have been reported since the efficacy of riluzole was demonstrated in two phase III trials. Depending on the preceding phase II studies, phase III trials may or may not involve multiple dose groups; however, negative studies that test only one dose level may not adequately test the hypothesis that the drug is effective in treating ALS.





Choice of outcome measures


As previously mentioned, no tissue-based biomarkers currently exist to determine drug activity in ALS. Thus, clinical assessment of efficacy is based on measurement of a variety of aspects of the disease. The gold-standard outcome for ALS trials currently remains survival. Survival is obviously clinically meaningful and straightforward to measure. However, there are several reasons why other measures are being sought and why many current trials use outcomes other than survival. First, survival can be manipulated by many interventions that do not clearly alter the progression of the underlying disease. Good nutrition and early use of percutaneous endoscopic gastrostomy clearly prolong life. Respiratory support with noninvasive positive pressure ventilation has been less well studied, but likely also prolongs life. Beyond these clearly defined interventions, there is emerging evidence that patients cared for at multidisciplinary ALS clinics survive longer than community-based controls. As these interventions may not be applied uniformly across all sites in a clinical trial, conclusions based on survival may be confounded by these variables. Many trials stratify along certain treatment variables, but stratification can reduce the power of a trial to find a significant drug benefit.


In addition to the issues raised above, the use of survival as an endpoint mandates large trials that treat patients for long periods. Unless patients are chosen late in the disease course, survival at 1 year ranges from 82% to 91% in recent studies. Thus, very few patients will experience the event being measured. To have an 80% chance of detecting a 25% difference in mortality rate requires approximately 600 patients studied over 18 months (Schoenfeld, 2007, personal communication). This sample size is consistent with what was required to demonstrate the efficacy of riluzole. In the current environment, with many drugs to test in a limited population on a limited budget, outcome measures that may demonstrate efficacy with smaller sample sizes are obviously desirable. Several of the measures currently in use are reviewed below. Table 1 summarizes the changes in these measures over time from several recently reported clinical trials.
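The arithmetic behind such survival sample-size estimates can be sketched with the standard Schoenfeld approximation for a two-arm log-rank comparison. The hazard ratio and event probability below are illustrative assumptions only; the exact assumptions behind the 600-patient figure quoted above were not published.

```python
from math import log, ceil
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.80):
    """Deaths required for a two-arm log-rank test (Schoenfeld
    approximation, 1:1 allocation, two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2)

def total_patients(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Convert required deaths into enrollment, given the probability
    that a patient dies during follow-up (averaged over both arms)."""
    return ceil(schoenfeld_events(hazard_ratio, alpha, power) / event_probability)

# A hazard ratio of 0.75 at 80% power needs 380 deaths; if only ~30% of
# an ALS cohort dies during follow-up, enrollment balloons, which is why
# low 18-month event rates make survival trials so large.
events = schoenfeld_events(0.75)       # 380
patients = total_patients(0.75, 0.30)  # 1267
```

The sketch makes the key trade-off explicit: because the required number of deaths is fixed by the effect size, a low event rate inflates enrollment directly.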



Table 1

Rate of change of commonly used outcome measures in ALS

Measure                                    Mean change per month
ALS functional rating scale                0.77–1.07
%Forced vital capacity                     1.04–2.46
Manual muscle testing                      1.18
Maximum voluntary isometric contraction    0.075–0.99
Motor unit number estimation               2.20–2.35


Muscle strength


Muscle strength is a clinically relevant measure of disease progression in ALS. A variety of methods are available to measure muscle strength. Both quantitative (maximum voluntary isometric contraction, or MVIC) and qualitative (Medical Research Council, or MRC, muscle grading) measures have been employed in past trials. Previous studies using MVIC employed an apparatus initially designed by Munsat and colleagues that required many position changes on the part of the patient, was quite fatiguing, and took about 45 minutes to perform. More recently, hand-held dynamometry (HHD) has been employed; with careful evaluator training, HHD can evaluate many muscles in a short period of time.


MVIC has proven useful as an outcome measure in natural history studies and clinical trials in ALS, and is a valid and reliable measure of disease progression. Rather than evaluating individual muscle strength changes, strength for each muscle is normalized and averaged with other muscles of the same limb. This allows for the averaging of strength of small and large muscle groups, reduced variability, and greater linearity.
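The normalization step described above can be sketched as follows. The reference means and standard deviations here are invented placeholders; actual trials derive them from natural-history or baseline cohort data.

```python
from statistics import mean

# Hypothetical normal reference values (mean, SD, in kg) for three arm
# muscles; these numbers are illustrative, not published norms.
ARM_REFERENCE = {
    "elbow_flexion":   (25.0, 6.0),
    "elbow_extension": (18.0, 5.0),
    "grip":            (35.0, 9.0),
}

def arm_megascore(measured_kg):
    """Average z-score across the muscles of one limb.

    Converting each muscle to a z-score before averaging lets small and
    large muscle groups contribute equally, which is what reduces
    variability and improves linearity."""
    z_scores = [(measured_kg[muscle] - mu) / sd
                for muscle, (mu, sd) in ARM_REFERENCE.items()]
    return mean(z_scores)
```

A patient at exactly the reference means scores 0; one a full standard deviation weak in every muscle scores -1, regardless of how strong each muscle is in absolute terms.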


Intra- and inter-rater reliability of MVIC have been assessed in a number of clinical trials in ALS. With rigorous training of clinical evaluators, the coefficient of variation between and within evaluators is less than 15%. At least seven trials have used MVIC as the primary outcome measure. The rate of decline in MVIC was assessed slightly differently from trial to trial; nonetheless, the rate of change in MVIC was consistent in the placebo groups of these studies. Data from the placebo arms of two recent clinical trials of topiramate and creatine demonstrate that the rate of decline in MVIC is essentially linear, with only a small nonlinear component.
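The reliability figure quoted above is a coefficient of variation, computed as below; the repeated measurements are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV as a percentage: sample standard deviation over the mean."""
    return 100.0 * stdev(values) / mean(values)

# Three evaluators measuring the same muscle (kg, hypothetical data);
# a CV under 15% is the benchmark cited for trained MVIC evaluators.
readings = [9.0, 10.0, 11.0]
cv = coefficient_of_variation(readings)  # 10.0
```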


The above data were acquired using the original apparatus described by Munsat. More recent trials employ HHD and study a larger number of muscle groups. HHD has been directly validated against MVIC in ALS patients and shown to change at a similar rate, with variability only slightly greater than that of MVIC. For both upper and lower extremity muscles, correlations between MVIC and HHD measurements ranged between 0.84 and 0.92, and test-retest variability was extremely similar as well. The only setting in which HHD and MVIC did not correlate well was the assessment of very strong muscles; while problematic in testing normal subjects, this is unlikely to be a problem in an ALS clinical trial.


Manual muscle testing using the MRC grading scale has also been used in a number of ALS clinical trials. It involves measurement of muscle strength by a trained evaluator using standardized patient positioning. It was recently demonstrated that if enough muscles are tested, a decline in average grade can be detected early in the disease, and the variability of measurement approximates that of MVIC. The advantages of manual muscle testing are speed, low cost, and the lack of specialized equipment. However, MRC grading is by nature nonlinear: the difference in isometric strength between grades 1 and 3 is a small fraction of the difference between grades 3 and 5. Grade 4 spans the bulk of the isometric range, so that large changes in strength are not reflected in changes in muscle grade. Thus, MRC strength grading has lower face validity than MVIC.


Pulmonary function


Respiratory failure is the primary cause of death in ALS, so its assessment has obvious clinical relevance. Vital capacity and maximal inspiratory and expiratory mouth pressures are the most commonly used measures. These measures are widely available, noninvasive, and portable. However, patients with significant bulbar involvement show great variability of measurement, and the tests require maximal respiratory muscle activation. Bulbar or facial weakness can prevent the formation of a tight lip-seal around a mouthpiece, so a facemask or other seal must be used. Vocal cord spasm, excessive saliva, and gagging can also interfere with test performance.


The forced vital capacity (FVC) measures the volume of air forcefully expired in one breath. Usually, the FVC is reported as a percentage of a predicted vital capacity based on the subject's height, gender, and age. The FVC declines over time in patients with ALS and is a sensitive measure of disease progression. Both the baseline FVC and the rate of decline in FVC are predictive of survival.
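A minimal sketch of the two computations described here: expressing a measured FVC as a percentage of predicted, and estimating the monthly rate of decline from serial visits by least squares. The visit data are hypothetical.

```python
def percent_predicted(measured_l, predicted_l):
    """Measured FVC as a percentage of the predicted value, which in
    practice comes from published reference equations for height,
    gender, and age."""
    return 100.0 * measured_l / predicted_l

def monthly_decline(visits):
    """Points of %FVC lost per month, from (month, %FVC) pairs, via an
    ordinary least-squares slope (negated so decline is positive)."""
    n = len(visits)
    mean_m = sum(m for m, _ in visits) / n
    mean_v = sum(v for _, v in visits) / n
    num = sum((m - mean_m) * (v - mean_v) for m, v in visits)
    den = sum((m - mean_m) ** 2 for m, _ in visits)
    return -num / den

# Hypothetical patient: 90% predicted at baseline, falling 2 points per
# month, in line with the 1.04-2.46 range shown in Table 1.
rate = monthly_decline([(0, 90.0), (3, 84.0), (6, 78.0)])  # 2.0
```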


Functional rating scales


Clinical rating scales that assess activities of daily living are useful both in natural history studies of ALS and in clinical trials of experimental agents. Early examples include the Norris scale, the ALS severity scale, and the Appel ALS rating scale. However, the scale that has achieved the widest acceptance is the ALS functional rating scale in its revised form (ALSFRS-R). The ALSFRS-R is now employed as a primary outcome measure in most ALS trials that do not use survival as the primary endpoint.


The ALSFRS-R is a quickly administered ordinal rating scale used to determine a patient's assessment of their capability and independence in 12 functional activities. It assesses bulbar and respiratory functions, upper extremity functions (cutting food and dressing), lower extremity functions (walking and climbing), dressing and hygiene, and the ability to turn in bed. The instrument is easily administered, and the patient's response is recorded as the closest approximation from a list of five choices. Each item is scored from 0 to 4. The total score ranges from 48 (normal function) to 0 (unable to attempt the task).
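The scoring arithmetic is simple enough to state as code; a sketch of how the 12 item responses combine into the total:

```python
def alsfrs_r_total(item_scores):
    """Total ALSFRS-R score: the sum of 12 items, each rated 0-4,
    so totals run from 48 (normal) down to 0."""
    if len(item_scores) != 12:
        raise ValueError("the ALSFRS-R has exactly 12 items")
    if any(not 0 <= score <= 4 for score in item_scores):
        raise ValueError("each item is scored 0 to 4")
    return sum(item_scores)

# A fully independent patient scores 4 on every item.
assert alsfrs_r_total([4] * 12) == 48
```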


Initial validity was established by documenting that in ALS patients change in ALSFRS scores correlated with change in strength over time as measured by MVIC, was closely associated with quality of life measures, and predicted survival. With appropriate training, the ALSFRS can be administered with high inter-rater and test-retest reliability. The test-retest reliability is greater than 0.88 for all test items.


The ALSFRS-R can be administered by phone, again with good inter-rater and test-retest reliability, obviating the need for some in-person visits for disease assessment. In addition, the ALSFRS-R can be administered to the patient directly while the patient is verbal; as communication becomes more difficult, caregivers provide increasing assistance in providing responses. The equivalency of caregiver and patient responses has not, however, been established.


The ALSFRS was revised in 1999 to add assessments of respiratory dysfunction, including dyspnea, orthopnea, and the need for ventilatory support. The revised scale (ALSFRS-R) was demonstrated to retain the properties of the original and to show strong internal consistency and construct validity.


Motor unit number estimation


Nerve conduction studies and needle electromyography are essential for confirming lower motor neuron involvement in the initial diagnosis of motor neuron disease. However, they do not permit accurate measurement of motor neuron loss and compensatory reinnervation. Motor unit number estimation (MUNE) quantifies the number of surviving motor units in the living human subject and has emerged as an important potential marker in ALS and other motor neuron disorders.


All techniques for counting motor units rely on the same basic premise. A maximum muscle response is generated to an electrical stimulus. Most often, the response measured is electrical, but force measurements have also been used. Then, the response amplitude of a single motor unit is estimated. Once a single motor unit amplitude estimate is made, this value is divided into the maximum response to yield an estimate of the number of units contributing to the response.
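That division can be sketched directly; the amplitudes below are hypothetical values in millivolts.

```python
def motor_unit_number_estimate(max_cmap_mv, single_unit_amplitudes_mv):
    """Divide the maximal compound muscle action potential (CMAP) by the
    mean single-motor-unit potential amplitude to estimate how many
    units made up the maximal response."""
    mean_smup = sum(single_unit_amplitudes_mv) / len(single_unit_amplitudes_mv)
    return round(max_cmap_mv / mean_smup)

# A 10 mV maximal response built from single units averaging 0.05 mV
# implies roughly 200 surviving motor units.
estimate = motor_unit_number_estimate(10.0, [0.04, 0.05, 0.06])  # 200
```

As motor neurons are lost and surviving units enlarge through reinnervation, the CMAP falls and the mean single-unit amplitude rises, so the estimate declines on both counts.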


MUNE has been employed as a secondary outcome measure in two multicenter ALS trials. Good intra-rater reliability was demonstrated, as was a reliable decline in MUNE over time. However, the method employed was laborious and required the use of a particular EMG machine. A well-quantified, easily performed method has since been developed and is now being employed in two ongoing trials. Its coefficient of variation is less than 10%, significantly lower than that of other MUNE methods.

Apr 19, 2017 | Posted in Physical Medicine & Rehabilitation
