Interpreting the Medical Literature: Applying Evidence-Based Medicine in Practice

Chapter 8 Interpreting the Medical Literature


Applying Evidence-Based Medicine in Practice













Building Clinical Evidence from Published Research


Evidence-based medicine (EBM), which involves asking clear, relevant clinical questions, finding appropriate studies, critically appraising the literature, and implementing changes in practice behavior, has become an essential part of medical care. Most busy physicians do not have the time or the background to critically answer the questions that arise in practice. Primary care physicians identify 2.4 clinical questions for every 10 encounters (Barrie and Ward, 1997), but they spend less than 15 minutes on average with each patient. Evidence about common primary care problems is accumulating at an overwhelming pace, and the broad scope of family medicine presents important challenges. Other barriers to the use of EBM include a lack of evidence that is pertinent to an individual patient, a lack of quick access to information at the point of care, and concern about potentially negative impacts on the art of medicine (McAllister et al., 1999). How can diligent physicians narrow the gap between their current behaviors and best practices?


In this chapter, hormone replacement therapy (HRT) for postmenopausal women is used as a case example to understand the evolution of medical practice and the changing landscape of evidence and to review concepts important to interpreting the medical literature. These concepts form the basis for practical EBM tools that family physicians can use to answer important clinical questions.


In Chapter 10, information from the Women’s Health Initiative (WHI) about HRT is considered, and similar epidemiologic and statistical issues are covered. Chapter 10 emphasizes the importance of how risk data are framed or presented to the patient and the primacy of patient preference (i.e., ability to make informed decisions about therapy). Unfortunately, evidence concerning the ideal manner of presenting information to patients and clinicians and promoting informed decision-making is still scant. This chapter and Chapter 10 take a slightly different approach to similar clinical questions, and together, these two chapters provide the background for the motivated family physician to better understand concepts of risk and probability and to foster enhanced physician-patient decision-making.


Evidence for interventions such as HRT usually begins with observational studies, including unblinded case series, case-control studies, and cohort studies, and it culminates in randomized, controlled trials (RCTs) (Figure 8-1). To better understand how we arrived at the current clinical understanding of HRT and its effects on heart disease, we review the progression of research studies and evidence over the past 30 years. A series of observational studies in the 1970s and 1980s led to regular prescribing of HRT to prevent a number of significant health conditions in postmenopausal women.





Cohort Studies


Cohort studies are often the next step in building the strength of evidence regarding an association between an exposure and an outcome. Cohort studies typically look forward in time (i.e., prospective studies) and are generally more expensive and take longer to complete than case-control studies. However, they provide a more accurate estimate of the relative risk for women who take HRT compared with those who do not. A cohort study is also an observational study: one that observes outcomes in groups but does not assign participants to a particular exposure or treatment. In a cohort study of HRT and coronary heart disease (CHD), a researcher would identify a group of women taking HRT and a similar group of women who have chosen not to take HRT, and the researcher would then follow both groups over time and count the number of CHD events in each. Because outcome events may be uncommon in each group and may take many months to occur, cohort studies often require large numbers of participants and long follow-up periods to show significant differences between groups.


The primary statistical measure from a cohort study is relative risk. This is the rate of CHD events among women who choose to take HRT divided by the rate among women who choose not to take HRT. A common form of bias in cohort studies related to prevention is the healthy user bias, which arises when participants who choose one preventive measure (e.g., HRT) also tend to make healthier lifestyle decisions (e.g., diet, exercise) that may also prevent the measured outcome (i.e., CHD).
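The relative risk calculation described above can be sketched in a few lines of Python. The function name and the cohort counts are hypothetical, invented purely for illustration; they are not data from any HRT study.

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Rate of events in the exposed group divided by the rate in the unexposed group."""
    rate_exposed = events_exposed / n_exposed
    rate_unexposed = events_unexposed / n_unexposed
    return rate_exposed / rate_unexposed


# Hypothetical cohort (illustrative numbers only, not from any real study):
# 5,000 women who chose HRT had 20 CHD events;
# 5,000 women who did not choose HRT had 40 CHD events.
rr = relative_risk(20, 5000, 40, 5000)
print(f"Relative risk = {rr:.2f}")
```

In this made-up cohort the relative risk falls below 1.0, which would suggest a protective association. A result like this cannot distinguish a true effect of HRT from healthy user bias, because the women chose their own exposure.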


Beginning with case-control studies and then using larger cohort studies, observational research showed that HRT might reduce the incidence of CHD, fractures, and colorectal cancer. These observational studies also suggested that the same therapy might cause harm, with a slightly increased risk of breast cancer, stroke, and venous thromboembolism. On balance, however, even a small positive impact of HRT on preventing CHD was thought to far outweigh the potential adverse effects of HRT.




The Power of Randomized, Controlled Trials


In RCTs, study participants are randomly allocated to two or more groups and then assigned to receive an intervention such as HRT or to receive no active treatment (i.e., placebo or to continue with their usual care). RCTs greatly add to the confidence of measured results because the structure of an RCT helps to eliminate many of the biases inherent in observational studies. For example, in cohort studies of HRT and CHD, it is hypothesized that women who choose to take HRT are generally healthier and have healthier lifestyle practices than women who do not choose to take HRT. Because participants in an RCT are randomly assigned to treatment and control groups, the groups are less likely to differ in other factors that might prevent or promote heart disease.


The decreased likelihood of a healthy user bias in an RCT may explain why HRT appeared to be protective in cohort studies but later proved to be harmful. Because RCTs have this inherent ability to remove many important potential forms of bias (but are not immune to biases themselves), a physician can have more confidence that they reflect the true association between the HRT treatment and CHD outcomes. Despite decades of work, dozens of observational studies, and structured reviews that strongly suggested a protective effect of HRT for CHD, a single, large RCT trumped them all and caused a sudden reversal in physicians’ prescribing behavior.


The results of the WHI study, released in 2002, sent a shock wave through the medical community. For the first time, a large, randomized trial showed that HRT—given to otherwise fairly healthy postmenopausal women—caused a statistically significant increase in CHD events. Within days of the release of the WHI primary results, many women called their physicians to decide whether they should continue with HRT. Many physicians drastically changed their prescription of HRT based on the WHI; within 9 months, prescriptions of the most popular formulation of HRT decreased by as much as 61% (Majumdar et al., 2004). Perhaps more than any other single study in modern medical history, the WHI report dramatically changed a widespread, common medical practice.



Understanding the Statistical Significance of Study Results


Reports from RCTs such as the WHI study frequently include relative risk as a summary measure of differences between the treatment and placebo groups (Table 8-1). To arrive at the relative risk, the researcher first measures the incidence rate of an outcome in each of the two study groups (i.e., treatment and placebo). The incidence rate for each group is a ratio of the number of new outcome events, such as CHD events, divided by the number of patients at risk for the outcome in that group over a specific period. In multiyear studies, the average annual incidence rate is often reported as a summary measure. In a placebo-controlled RCT, the relative risk is then calculated as a ratio of the incidence rate for the treatment group divided by the incidence rate for the placebo group (Table 8-2).


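Table 8-1 Understanding Study Results

Typical summary rates from randomized, controlled trials:
\[ \text{Incidence rate} = \frac{\text{number of new outcome events}}{\text{number of patients at risk for the outcome over a specific period}} \]
\[ \text{Relative risk (RR)} = \frac{\text{incidence rate in the treatment group}}{\text{incidence rate in the placebo group}} \]
Summary measures that may be more meaningful for clinicians:
\[ \text{Attributable risk (risk difference)} = \text{incidence rate in the treatment group} - \text{incidence rate in the placebo group} \]
\[ \text{Number needed to treat (NNT) or number needed to harm (NNH)} = \frac{1}{\text{risk difference}} \]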

Table 8-2 Examples of Summary Rates from the Women’s Health Initiative (WHI) Study

The following equations show how to take a summary rate commonly reported in published studies (i.e., relative risk) and calculate a summary measure (e.g., number needed to treat, number needed to harm) that may be more useful in describing the results to clinicians and patients. The example considers the average annual incidence rates and relative risk for coronary heart disease (CHD) events in the WHI study on the effects of hormone replacement therapy (HRT):
\[ \text{Incidence rate (HRT)} = \frac{37 \text{ CHD events}}{10{,}000 \text{ women per year}} = 0.37\% \text{ per year} \]
\[ \text{Incidence rate (placebo)} = \frac{30 \text{ CHD events}}{10{,}000 \text{ women per year}} = 0.30\% \text{ per year} \]
\[ \text{Relative risk (adjusted estimate, as reported)} = 1.29 \]
The relative risk describes a relative 29% increase in CHD events. It may be more useful to consider the absolute difference in incidence rates between the two groups to understand the magnitude of the potential risk for a given patient:
\[ \text{Risk difference} = 0.37\% - 0.30\% = 0.07\% \text{ per year} \]
The number needed to harm (NNH) can be calculated to describe, on average, how many women must be treated for 1 year to cause one additional CHD event attributable to HRT:
\[ \text{NNH} = \frac{1}{0.0007} = \frac{10{,}000}{7} \approx 1430 \]

Data from Ebell MH, Messimer SR, Barry HC. Putting computer-based evidence in the hands of clinicians. JAMA 1999;281:1171-1172.


How can a physician determine whether the reported relative risk from a study is significant enough to influence clinical decisions? Typically, the statistical significance of the summary measure (in this case, relative risk) is reported. Statistical significance is usually summarized in published studies by a p value for a given summary measure. The p value describes the probability that a difference at least as large as the one observed between the groups could have arisen by chance alone if there were truly no difference. A p value of less than 0.05 is the arbitrary cutoff most often used for “statistical significance.” A “p <0.05” means that there is less than a 1 in 20 (5%) probability that a difference as large as that observed would have occurred by chance alone; a p = 0.04 means a 1 in 25 (4%) probability; a p = 0.06 means roughly a 1 in 17 (6%) probability.


Although frequently used, p values provide only limited information: the probability that a difference as large as the one found would arise from random error alone. A p value alone gives no indication of the clinical significance of a finding and provides no information regarding the likelihood that a finding of “no difference” is itself the result of chance, or random error.
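To make the idea of a p value concrete, the following Python sketch computes one for a difference between two event rates. It uses a two-sided z-test for two proportions, which is only one of several tests a study might use; the chapter does not say which test any particular study applied, and the function name and counts are hypothetical.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value from a z-test comparing two proportions.

    Assumes samples large enough for the normal approximation to hold.
    """
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    # Pooled rate under the assumption that there is truly no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    # Probability of a standard normal value at least as extreme as z.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical trial (illustrative numbers only):
# 60 events among 1,000 treated patients vs. 40 among 1,000 on placebo.
p_example = two_proportion_p_value(60, 1000, 40, 1000)
print(f"p = {p_example:.3f}")
```

With these invented counts the result falls just under the conventional 0.05 cutoff. The p value by itself says nothing about whether a difference of 2 percentage points matters clinically.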


Confidence intervals are much more informative than p values. When relative risk is reported as the summary result of a study, the 95% confidence interval (CI) is often used to give an indication of the precision of the estimated relative risk. The 95% CI is a range calculated so that, if the study were repeated many times, about 95% of such intervals would contain the true relative risk (RR); informally, we can be 95% confident that the true RR lies within the range. An RR of 1.0 indicates no difference. For example, if a study reported an RR of 2.5 with a 95% CI of 2.3 to 2.7, we could be reasonably certain (95% certain) that the true RR was no less than 2.3 and no greater than 2.7. Our conclusion would be that the estimated RR of 2.5 is fairly precise. However, if RR was reported as 2.5 with a 95% CI of 1.1 to 5.0, the true RR could be as low as 1.1 (almost no difference) or as high as 5.0 (a fivefold difference), an obviously imprecise estimate of the relative risk.


Confidence intervals also provide a better measure than p values of the precision for concluding that there is no difference in a relative risk. Any 95% CI that includes RR = 1.0 indicates that there may be “no difference.” However, an RR of 1.1 with a 95% CI of 0.99 to 1.11 is almost certainly a finding of no difference (i.e., a narrow confidence interval), whereas an estimated RR = 1.4 with a 95% CI of 0.99 to 1.7 is much less precise (i.e., a wide confidence interval). In the second case, even though the 95% CI contains 1.0, there may still be a true difference that this study simply did not detect.
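The link between sample size and the width of a confidence interval can be illustrated with a short Python sketch. It uses the standard large-sample (log-based) approximation for the 95% CI of a relative risk; the function name and all counts are hypothetical, and published trials often use other methods, such as adjusted models.

```python
import math


def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with an approximate 95% CI (log method).

    Requires at least one event in each group.
    """
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of ln(RR) for two independent groups.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Two hypothetical studies with the same event rates (15% vs. 10%)
# but very different sizes (illustrative numbers only).
small_study = relative_risk_ci(15, 100, 10, 100)
large_study = relative_risk_ci(1500, 10000, 1000, 10000)

for label, (rr, lower, upper) in [("small", small_study), ("large", large_study)]:
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both invented studies estimate the same RR, but the interval from the small study is wide and includes 1.0, whereas the interval from the large study is narrow and excludes it. This is the sense in which a CI conveys precision that a p value alone does not.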



Interpreting Study Results: Statistical and Clinical Significance


Although the WHI showed a statistically significant increase in the relative risk of CHD events among women who were randomly assigned to take HRT, it is important to consider the absolute difference in CHD events between the two groups to understand the strength of the association and to discuss the risk of HRT treatment with individual patients. Calculating absolute risk (in addition to relative risk) is a helpful way to understand the level of risk that HRT may add for a group of women who are at risk for CHD events (see Table 8-2).


In the WHI study, the relative risk of CHD for participants who took HRT was 1.29, with a 95% confidence interval that did not cross 1.0 (95% CI, 1.02 to 1.63). This figure (RR = 1.29) can generally be interpreted as HRT being associated with a 29% increase in CHD events. This summary measure was reported widely in medical journals and the mainstream press.


When reported in terms of relative risk, the association between HRT and CHD sounds ominous (i.e., a 29% increase). However, in terms of absolute risk attributable to HRT treatment, a less portentous picture emerges (see Table 8-2). In the WHI study, women taking HRT had an average rate of CHD events of 0.37% per year, or 37 events per 10,000 women each year, and those in the placebo group had an annual rate of 0.30%, or 30 events per 10,000 women each year. Although the adjusted RR of CHD is 1.29 (the crude ratio of these rounded rates, 0.37 divided by 0.30, is approximately 1.23), the attributable risk or risk difference between the two groups is only 0.07 percentage points (0.37% minus 0.30%). In other words, approximately seven additional cases of CHD occurred per 10,000 women using HRT during each year of the study. The attributable risk can be summarized as the number needed to harm (NNH) or, if a study reports a beneficial effect, the number needed to treat (NNT). In this case the NNH was approximately 1430 (i.e., the inverse of the risk difference, 1 divided by 0.0007, or 10,000 divided by 7); on average, for every 1430 women treated with HRT for 1 year, one additional CHD event occurred (see Table 8-2). The NNH or NNT is often a more understandable and useful summary of study outcomes when physicians and patients weigh the risks and benefits of a particular therapy.
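The arithmetic in this paragraph can be checked with a brief Python sketch. The annual rates (37 and 30 CHD events per 10,000 women) are the figures quoted in the text; the variable names are illustrative only.

```python
# Annual CHD event rates quoted in the text for the WHI study.
hrt_rate = 37 / 10_000       # 0.37% per year among women taking HRT
placebo_rate = 30 / 10_000   # 0.30% per year among women taking placebo

# Absolute (attributable) risk: the difference between the two rates.
risk_difference = hrt_rate - placebo_rate

# Number needed to harm: the inverse of the risk difference.
nnh = 1 / risk_difference

# Crude ratio of the two rounded rates, for comparison with the adjusted RR.
crude_ratio = hrt_rate / placebo_rate

print(f"Risk difference: {risk_difference * 10_000:.0f} extra events per 10,000 women per year")
print(f"Number needed to harm: about {round(nnh, -1):.0f} women treated for 1 year")
print(f"Crude rate ratio: {crude_ratio:.2f}")
```

The crude ratio of the two rounded rates works out to about 1.23, slightly below the adjusted figure of 1.29 reported for the study. The NNH of roughly 1430 per year of treatment conveys the same finding on a scale that is often easier to discuss with patients.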
