Orthopaedic Research
Wayne E. Moschetti, MD, MS
Marcus P. Coe, MD, MS, FAOA
Dr. Moschetti or an immediate family member is a member of a speakers’ bureau or has made paid presentations on behalf of DePuy, A Johnson & Johnson Company; serves as a paid consultant to or is an employee of DePuy, A Johnson & Johnson Company; has received research or institutional support from DePuy, A Johnson & Johnson Company; has received nonincome support (such as equipment or services), commercially derived honoraria, or other non-research-related funding (such as paid travel) from Medacta and Omni Life Science; and serves as a board member, owner, officer, or committee member of the New England Orthopaedic Society. Dr. Coe or an immediate family member serves as a paid consultant to or is an employee of DePuy, A Johnson & Johnson Company and has received research or institutional support from Ferring Pharmaceuticals.
ABSTRACT
This chapter reviews the basics of orthopaedic research with regard to levels of evidence, tools for evaluating study quality, the use of clinical practice guidelines, grades of recommendations, and ethical considerations. The chapter also covers the use of orthopaedic registries, along with their value and limitations.
Keywords: clinical practice guidelines; ethical considerations; grades of recommendations; levels of evidence; orthopaedic registries; orthopaedic research; tools for evaluating study quality
Introduction
Understanding the basics of conducting orthopaedic research is paramount when interpreting the literature, analyzing the results of studies, and applying orthopaedic research to clinical practice. Several tools are available to aid in determining the level of evidence and the quality of research, both of which should affect how research influences one’s clinical practice. Clinical practice guidelines are an attempt to synthesize the most valuable clinical evidence available for wide distribution. Registries capture large populations of patients with similar diagnoses or similar implants and are a powerful tool for following longitudinal results. Finally, all research must rest on a firm foundation of ethical study design and ethical treatment of subjects.
Levels of Evidence
Orthopaedic research—whether therapeutic, diagnostic, prognostic, or economic—aims to expand knowledge about the field and inform clinical decision-making. To be most effective, research must reduce bias. Biases are systematic factors, other than the intervention being studied, that influence the results of a study; in other words, bias can deviate a study from the underlying truth it aims to reveal. The design of a study can help reduce bias, as can the quality of its execution (discussed in the next section). As the level of evidence of a study becomes higher (level 1 representing the highest level of evidence and larger numbers representing progressively lower levels), so does the study’s inherent ability to decrease bias.
The following represent the most common types of studies, listed in order from the generally accepted highest level of evidence (level 1) to the lowest (expert opinion) based on the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence Working Group1 (Figure 1).
Systematic Reviews
Systematic reviews and meta-analyses aggregate the findings of multiple published works in a systematic way, while taking into account the quality of each individual study, to produce the highest level of evidence.2 In a systematic review, an a priori detailed and comprehensive search strategy is executed with the goal of reducing bias by identifying, reviewing, and analyzing all relevant studies on a particular topic. A meta-analysis utilizes statistical methods to combine the data from multiple studies (commonly those identified by a systematic review) into a larger sample, which allows the creation of a single quantitative pooled estimate.3-5 These studies are
thus regarded as a fundamental source of the highest quality evidence and are frequently utilized for clinical practice guidelines and recommendations.2,6 Systematic reviews of high-quality randomized controlled trials provide a level of evidence above that of high-quality randomized controlled trials alone, but they are not without flaws, and one must be careful when interpreting their findings.7,8
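To make the idea of a quantitative pooled estimate concrete, the short Python sketch below pools three hypothetical effect estimates using fixed-effect inverse-variance weighting. The study values and the choice of a fixed-effect (rather than random-effects) model are illustrative assumptions only and are not drawn from any cited review.

```python
import math

# Hypothetical study results: (effect estimate, standard error).
# All numbers are illustrative only, not taken from any real trial.
studies = [(0.42, 0.15), (0.35, 0.20), (0.50, 0.12)]

# Fixed-effect (inverse-variance) pooling: each study is weighted by
# 1/SE^2, so more precise studies contribute more to the pooled estimate.
weights = [1.0 / se ** 2 for _, se in studies]
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

# 95% confidence interval under a normal approximation.
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled estimate: {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```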
Randomized Controlled Trials
Considered the benchmark for therapeutic trials, randomized controlled trials aim to minimize bias by evenly distributing confounding factors across a study group and a control group. Confounding factors are prognostic variables that have the potential to affect the outcome of the study. Randomization assigns subjects to either the study or the control group based solely on chance, so that any factors not accounted for in the selection criteria, even those that may be unknown, affect the outcomes of both groups evenly. For randomization to be most effective, as many parties as practical must be blinded to the intervention (and at times the outcome). Ideally, neither the patient, the caregiver, nor anyone evaluating the patient or their outcomes should know whether the patient is in the intervention or the control group. In this manner, the evaluation of outcomes is free from bias. Practically and ethically, blinding all parties is not always possible, particularly in studies involving a surgical intervention.
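As a simple illustration of how chance alone can drive group assignment, the following Python sketch implements permuted-block randomization, one common allocation scheme. The function name, block size, and group labels are hypothetical choices for illustration, not part of any particular trial protocol.

```python
import random

def block_randomize(n_patients, block_size=4, seed=None):
    """Assign patients to 'treatment' or 'control' in permuted blocks.

    Permuted blocks keep the two groups nearly equal in size throughout
    enrollment, while the order within each block is left to chance.
    """
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_patients:
        block = (["treatment"] * (block_size // 2)
                 + ["control"] * (block_size // 2))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]

# Example: allocate 10 patients. In practice the allocation sequence is
# concealed from everyone who enrolls or assesses patients.
print(block_randomize(10, seed=42))
```

Permuted blocks are shown here because simple coin-flip randomization can, by chance, produce unequal group sizes in small trials; blocking keeps enrollment balanced throughout the study.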
Cohort Studies
Cohort studies longitudinally compare a study group with a control group. Cohort studies can be either prospective (following patients from one point in time forward) or retrospective (looking back at previously treated patients on whom data have already been collected). Because patients are not randomized into the study and control groups, bias can affect how patients end up in one group or the other, and confounders are not evenly distributed. Oftentimes randomization is not feasible or ethical, and cohort studies are the best way to evaluate a specific intervention. Ideally, inclusion criteria should limit the effect of prognostic factors on the outcomes (eg, by including only nondiabetics, the effect of diabetes on an outcome is removed).
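Because a cohort study follows defined groups forward in time, it can estimate absolute risks directly, and the usual summary measure is the relative risk. A minimal Python sketch with entirely made-up counts:

```python
# Illustrative 2x2 cohort table (all counts are made up):
#                outcome   no outcome
#   exposed         a          b
#   unexposed       c          d
a, b = 30, 170  # exposed:   30 of 200 develop the outcome
c, d = 15, 185  # unexposed: 15 of 200 develop the outcome

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
relative_risk = risk_exposed / risk_unexposed

print(f"Risk in exposed:   {risk_exposed:.3f}")    # 0.150
print(f"Risk in unexposed: {risk_unexposed:.3f}")  # 0.075
print(f"Relative risk:     {relative_risk:.2f}")   # 2.00
```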
Case-Control Studies
Case-control studies retrospectively compare a group of patients with a condition to a group of patients without that particular condition. These are observational studies and not interventional studies. Usually these
studies are designed to identify risk factors for a disease or outcome, though causality cannot be proven.
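Because case-control studies sample patients on the basis of outcome status, absolute risks cannot be estimated; the association between exposure and outcome is instead summarized with an odds ratio. A minimal Python sketch with entirely made-up counts:

```python
# Illustrative case-control counts (all values are made up):
#                cases   controls
#   exposed        a        b
#   unexposed      c        d
a, b = 40, 20
c, d = 60, 80

# The cross-product ratio gives the odds of exposure among cases
# relative to the odds of exposure among controls.
odds_ratio = (a * d) / (b * c)
print(f"Odds ratio: {odds_ratio:.2f}")  # (40*80)/(20*60) = 2.67
```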
Case Series
Case series represent simple collections of patients who have undergone a specific intervention or have a specific condition. These are descriptive studies, and the absence of a comparison group limits the conclusions that can be drawn from them. Case series can, however, often identify avenues for further study.
Expert Opinion
Expert opinions represent the thoughts and experiences of one or more persons. Although anecdote and life experience can be useful “food for thought,” they do not represent a high level of evidence unless systematically studied in one of the ways described above.
Tools for Evaluation of Study Quality
While the design of a study influences its ability to determine the truth, so does the quality with which it is conducted. Quality can be a difficult thing to define when evaluating a clinical trial. There is a balance between limiting bias and study feasibility. Though there are numerous methods available to help assess study quality, studies are not routinely given a quality rating at the time of publication.8 Randomized controlled clinical trials, thought to produce the highest level of clinical data, can be flawed in their execution, rendering them
less useful than a well-executed study of an inherently weaker design. A review paper may provide a compelling overview of a topic, but it can be driven by opinion and carry the same potential for bias as expert opinion. A well-done cohort study may yield more valuable clinical information than a randomized controlled trial with high crossover rates and substantial loss to follow-up despite good study design. Thus, tools exist to evaluate the quality of various types of studies.
The Consolidated Standards of Reporting Trials (CONSORT) guidelines aim to improve the reporting of randomized controlled clinical trials.10 These guidelines were created through consensus and collaboration among clinical trial experts with the goal of enabling readers to understand a trial’s design, analysis, and interpretation. Readers are thus better equipped to assess the validity of a trial’s results. CONSORT achieves this using a checklist that enumerates the items deemed necessary for standardized reporting of randomized clinical trials. Authors are expected to include a flow diagram outlining how the study population was recruited and handled throughout the course of the study10 (Figure 3).
Systematic reviews collect and synthesize numerous studies, but as the phrase “garbage in, garbage out” illustrates, the strength of a systematic review is predicated on the strength of the studies it synthesizes. Executing and completing a systematic review can be a challenging endeavor. The search strategy must include all relevant studies from all sources (common sources include MEDLINE, Embase [Excerpta Medica database], CINAHL [Cumulative Index to Nursing and Allied Health Literature], PubMed, abstracts, references, and the Cochrane Library databases). The systematic review should be registered on an international database, which
attempts to catalog all systematic reviews and standardize their reporting (such as the International Prospective Register of Systematic Reviews—PROSPERO).11 Finally, the studies themselves must be reviewed. More than one reviewer should assess each paper, applying predefined inclusion and exclusion criteria that ensure appropriate study selection and data reporting. Methodologic and reporting quality is crucial to generating unbiased results and should be presented in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement12 (Figure 4). For systematic reviews of observational studies, use of the Meta-analysis of Observational Studies in Epidemiology (MOOSE) guideline is preferred.13
As the number of systematic reviews continues to increase, they are subject to a range of biases, and it is important to distinguish high-quality systematic reviews from low-quality ones. The Assessment of Multiple Systematic Reviews (AMSTAR) index was developed to evaluate systematic reviews of randomized trials. This checklist of important attributes and practices has been adapted into the AMSTAR 2 index, which also covers systematic reviews that include nonrandomized studies. As more weight is placed on these types of
studies for clinical and policy decisions, AMSTAR is a practical appraisal tool that allows for the reproducible assessments of the quality of systematic reviews.14
When evaluating individual studies, one must consider both the external and the internal validity of the study. External validity can be thought of simply as whether the study asks an appropriate research question that can generate applicable results. Internal validity goes a step further and relates to whether the research question was answered correctly, that is, do the results represent what is “true,” and were they obtained in a manner that minimizes bias?15 High-quality studies are less prone to bias and thus more likely to have “true” results. Therefore, evaluation of study quality should always include an assessment of internal validity.
Regardless of study quality and methodological strategy, studies are always susceptible to bias, which, as discussed above, is a systematic deviation from true study results.16 The most common forms of bias are selection bias, information bias, confounding bias, and publication bias. Selection bias can lead to a false association between exposure and outcome through a systematic error in patient enrollment.16 Patients in both the treatment and control groups should be drawn from similar populations, and efforts should be made to minimize loss to follow-up. Information bias can result from flawed data collection. Patients may not remember an exposure or outcome (recall bias), may be more likely to report complications if they have multiple medical comorbidities (reporting bias), and data may be collected inaccurately if the person collecting the data knows which treatment the study participant received (measurement bias). Standardized prospective data collection, as well as patient and provider blinding, can limit the influence of the predetermined beliefs of the patient and the assessor and thereby limit bias.

Confounding bias arises when a third, commonly unmeasured variable influences both the exposure and the outcome, so that a noncausal relationship between the two may be inappropriately inferred. For example, a study may demonstrate a higher infection rate in patients undergoing total hip replacement with metal femoral heads compared with those with ceramic femoral heads. If there are a greater number of diabetics in the metal head group and diabetics have higher rates of infection, then diabetes could act as a confounder, biasing the results. Lastly, studies with positive and favorable results are more likely to be published. Smaller studies, studies with negative results, and studies with results perceived to be uninteresting are less likely to be published, may be delayed in publication after several rejections, or may be published in lower impact journals, leading to publication bias.
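To make the femoral head example concrete, the following Python sketch uses entirely fabricated counts in which infection rates are identical within the diabetic and nondiabetic strata, yet the crude comparison makes metal heads appear worse simply because the metal group contains proportionally more diabetics:

```python
# Fabricated counts for the femoral head example above:
# (head material, diabetic) -> (infections, patients)
data = {
    ("metal", True): (8, 80),    ("metal", False): (2, 120),
    ("ceramic", True): (2, 20),  ("ceramic", False): (3, 180),
}

# Crude comparison ignores diabetes and suggests metal heads are worse.
for head in ("metal", "ceramic"):
    infections = sum(data[(head, dm)][0] for dm in (True, False))
    patients = sum(data[(head, dm)][1] for dm in (True, False))
    print(f"{head}: crude infection rate = {infections / patients:.3f}")

# Stratifying by diabetes shows identical rates within each stratum; the
# crude difference exists only because diabetics (who are at higher risk)
# make up 40% of the metal group but only 10% of the ceramic group.
for dm in (True, False):
    for head in ("metal", "ceramic"):
        infections, patients = data[(head, dm)]
        print(f"diabetic={dm}, {head}: rate = {infections / patients:.3f}")
```

Stratified (or regression-adjusted) analysis of this kind is the standard way to detect and control for a measured confounder; randomization remains the only protection against unmeasured ones.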