Leveraging Human Genetics to Develop Future Therapeutic Strategies in Rheumatoid Arthritis




The purpose of this article is to place these genetic discoveries in the context of current and future therapeutic strategies for patients with RA. More specifically, this article focuses on (1) a brief overview of genetic studies, (2) human genetics as an approach to identify the Achilles heel of disease pathways, (3) humans as the model organism for functional studies of human mutations, (4) pharmacogenetic studies to gain insight into the mechanism of action of drugs, and (5) next-generation patient registries to enable large-scale genotype-phenotype studies.


Despite decades of research, the biologic pathways that initiate rheumatoid arthritis (RA) are unknown. Without knowing the specific pathways that lead to RA, it is very difficult to develop novel therapies. Because genetic mutations are inherited before disease onset, human genetics provides prima facie evidence that a pathway is important in pathogenesis. Moreover, because human genetic strategies can be applied genome wide, they offer an unbiased search of the human genome for insight into RA pathogenesis.


Since the 1970s, more than 20 RA risk loci have been identified (Table 1). The first locus associated with RA risk was the major histocompatibility complex (MHC) locus, identified by mixed lymphocyte cultures comparing patients with RA and controls. Subsequently, Gregersen and colleagues advanced the hypothesis that the multiple RA risk alleles within the HLA-DRB1 gene share a conserved amino acid sequence. This is now widely known as the “shared epitope” hypothesis, and the risk alleles are known as shared epitope alleles. With the sequence of the human genome and improved understanding of human genetic diversity came many additional genetic discoveries. Between 2003 and 2005, common alleles within the PADI4, PTPN22, and CTLA4 genes were found to be reproducibly associated with risk of RA. Genome-wide association studies (GWASs), which in their contemporary form test hundreds of thousands of single nucleotide polymorphisms (SNPs) across the genome, have been systematically performed in large case-control collections. These GWASs have identified more than 20 common alleles (where an allele is 1 of the 2 alternative nucleotides at an SNP) that confer a 10% to 20% increase in disease risk per copy of the risk allele. Collectively, these risk alleles explain approximately 15% to 20% of the overall disease burden.



Table 1

Known RA SNP associations

| SNP | Locus | Candidate Gene | OR | Allele Frequency | References |
|---|---|---|---|---|---|
| rs3890745 | 1p36.2 | TNFSF14 | 0.920 | 0.320 | Raychaudhuri et al, 2008 |
| rs2240340 | 1p36.13 | PADI4 | 1.40 | 0.373 | Suzuki et al, 2003 |
| rs2476601 | 1p13.2 | PTPN22 | 1.750 | 0.100 | Begovich et al, 2004 |
| rs11586238 | 1p13.1 | CD2, IGSF2, CD58 | 1.120 | 0.227 | Raychaudhuri et al, 2009 |
| rs7528684 | 1q23.1 | FCRL3 | 1.2 | 0.35 | Kochi et al, 2005 |
| rs12746613 | 1q23.2 | FCGR2A | 1.100 | 0.124 | Raychaudhuri et al, 2009 |
| rs3766379 | 1q23.3 | CD244 | 1.31 | 0.53 | Suzuki et al, 2008 |
| rs10919563 | 1q31.3 | PTPRC | 0.900 | 0.132 | Raychaudhuri et al, 2009 |
| rs13031237 | 2p16.1 | REL | 1.207 | 0.340 | Gregersen et al, 2009 |
| rs934734 | 2p14 | SPRED2 | 1.13 | 0.51 | Stahl et al, 2010 |
| rs10865035 | 2q11.2 | AFF3 | 1.140 | 0.460 | Barton et al, 2009 |
| rs7574865 | 2q32.3 | STAT4 | 1.320 | 0.180 | Remmers et al, 2007 |
| rs1980422 | 2q33.2 | CD28 | 1.100 | 0.238 | Raychaudhuri et al, 2009 |
| rs3087243 | 2q33.2 | CTLA4 | 1.136 | 0.560 | Plenge et al, 2005 |
| rs13315591 | 3p14 | PXK | 1.13 | 0.08 | Stahl et al, 2010 |
| rs874040 | 4p15 | RBPJ | 1.18 | 0.30 | Stahl et al, 2010 |
| rs6822844 | 4q27 | IL2/IL21 | 1.389 | 0.710 | Zhernakova et al, 2007 |
| rs6859219 | 5q11 | ANKRD55 | 0.85 | 0.22 | Stahl et al, 2010 |
| rs26232 | 5q21 | C5orf13 | 0.93 | 0.32 | Stahl et al, 2010 |
| rs2395175 (and others) | 6p21.32 | MHC | 1.000 | 0.021 | Gregersen et al, 1987 |
| rs548234 | 6q21 | PRDM1 | 1.100 | 0.322 | Raychaudhuri et al, 2009 |
| rs10499194 | 6q23.3 | TNFAIP3 | 1.220 | 0.220 | Plenge et al, 2007 |
| rs6920220 | 6q23.3 | TNFAIP3 | 1.333 | 0.610 | Thomson et al, 2007 |
| rs5029937 | 6q23.3 | TNFAIP3 | 1.34 | 0.04 | Orozco et al, 2009 |
| rs394581 | 6q25.3 | TAGAP | 0.930 | 0.286 | Raychaudhuri et al, 2009 |
| rs3093023 | 6q27 | CCR6 | 1.11 | 0.43 | Stahl et al, 2010 |
| rs10488631 | 7q32 | IRF5 | 1.25 | 0.10 | Stahl et al, 2010 |
| rs2736340 | 8p23.1 | BLK | 1.122 | 0.243 | Gregersen et al, 2009 |
| rs2812378 | 9p13.3 | CCL21 | 1.100 | 0.355 | Raychaudhuri et al, 2008 |
| rs951005 | 9p13.3 | CCL21 | 0.87 | 0.15 | Stahl et al, 2010 |
| rs3761847 | 9q33.1 | TRAF1 | 1.100 | 0.440 | Plenge et al, 2007; Kurreeman et al, 2007 |
| rs2104286 | 10p15.1 | IL2RA | 0.92 | 0.28 | Thomson et al, 2007; Kurreeman et al, 2009 |
| rs706778 | 10p15.1 | IL2RA | 1.11 | 0.40 | Stahl et al, 2010 |
| rs4750316 | 10p15.1 | PRKCQ | 0.910 | 0.183 | Raychaudhuri et al, 2008; Barton et al, 2008 |
| rs540386 | 11p12 | RAG1, TRAF6 | 0.920 | 0.144 | Raychaudhuri et al, 2009 |
| rs1678542 | 12q13.3 | KIF5A | 0.890 | 0.351 | Barton et al, 2008; Raychaudhuri et al, 2008 |
| rs4810485 | 20q13.12 | CD40 | 0.910 | 0.231 | Raychaudhuri et al, 2008 |
| rs3218253 | 22q12.3 | IL2RB | 1.110 | 0.730 | Barton et al, 2008 |
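To make the OR column concrete, the following minimal Python sketch shows how an allelic odds ratio can be computed from case-control allele counts, and how a per-allele OR translates into relative odds for carriers of 1 or 2 risk alleles. The function names and the allele counts are hypothetical and chosen only for illustration, and the multiplicative (log-additive) per-allele model is an assumption of this sketch rather than something stated for each locus in the table.

```python
def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio from a 2x2 table of allele counts (risk allele vs. other allele)."""
    return (case_risk * control_other) / (case_other * control_risk)


def genotype_odds_multiplier(per_allele_or, copies):
    """Relative odds for 0, 1, or 2 risk alleles, assuming a multiplicative model."""
    if copies not in (0, 1, 2):
        raise ValueError("copies must be 0, 1, or 2")
    return per_allele_or ** copies


# Hypothetical allele counts (not data from any study cited in Table 1).
example_or = allelic_odds_ratio(300, 700, 250, 750)

# 1.32 is the per-allele OR listed for rs7574865 (STAT4) in Table 1;
# the two-copy value below holds only under the assumed multiplicative model.
stat4_two_copies = genotype_odds_multiplier(1.32, 2)

print(f"example allelic OR: {example_or:.3f}")
print(f"relative odds with 2 copies at OR 1.32: {stat4_two_copies:.3f}")
```

An OR below 1 in the table simply means that the listed allele is the less risky of the 2 alleles; the same arithmetic applies.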




Brief overview of human genetics: from SNP to causal allele


There are approximately 10 million common SNPs in the human genome. A fundamental challenge in human genetics is to systematically test each of these 10 million common SNPs for its role in disease. Advances in genomic technology have made this feasible. Contemporary GWASs directly genotype several hundred thousand SNPs across the entire human genome, most of which are common (minor allele frequency >5%) in the general, healthy population. To test the remaining more than 9 million common SNPs, the GWAS approach relies on the correlation structure of nearby SNPs. That is, nearby SNPs tend to be highly correlated: in a group of 10 neighboring SNPs, genotyping 1 can serve to tag the remaining 9. This correlation between nearby alleles is known as linkage disequilibrium (LD).
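The strength of LD between 2 SNPs is commonly summarized by the squared correlation coefficient r². As a minimal sketch, the Python below computes r² from haplotype and allele frequencies using the standard definition D = p_AB − p_A·p_B and r² = D² / [p_A(1 − p_A)·p_B(1 − p_B)]. The function names, the example frequencies, and the 0.8 tagging threshold are illustrative assumptions, not values taken from any study.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles A and B at two SNPs.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_tagged(r2, threshold=0.8):
    """True if a genotyped SNP is treated as a proxy for the other SNP.

    The 0.8 default is an assumed, commonly used cutoff.
    """
    return r2 >= threshold


# Hypothetical frequencies: alleles that always travel together.
perfect = ld_r_squared(p_ab=0.30, p_a=0.30, p_b=0.30)

# Hypothetical frequencies: alleles that are only partly correlated.
partial = ld_r_squared(p_ab=0.25, p_a=0.30, p_b=0.35)

print(f"perfect LD r^2: {perfect:.3f}, tagged: {is_tagged(perfect)}")
print(f"partial LD r^2: {partial:.3f}, tagged: {is_tagged(partial)}")
```

In the first case, genotyping either SNP captures the other completely; in the second, a substantial share of the information about the ungenotyped SNP is lost.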


But the properties of LD that make it powerful for gene mapping also underscore the challenges that remain once an SNP is associated with disease risk. It is unknown whether the SNP genotyped (and associated with risk in the genetic study) is the actual causal allele or whether the genotyped/associated SNP is simply in LD with the causal allele. Here, the causal allele is the single genetic mutation that is responsible for disrupting gene function and giving rise to the phenotype of interest. Given the sheer number of common alleles, the genotyped/associated SNP is most likely just a proxy for the actual causal allele. An example of the correlation structure of an RA risk locus is shown in Fig. 1.




Fig. 1


Case-control association results and LD structure in the TRAF1-C5 locus. Panel A shows results for SNPs genotyped across 1 Mb as part of the original genome-wide association scan in samples from 1522 case subjects with anti–CCP-positive RA and 1850 control subjects. Each diamond indicates a genotyped SNP; the color of each diamond is based on the correlation coefficient (r²), in the CEPH (Centre d’Etude du Polymorphisme Humain) from Utah (CEU) HapMap, with the most significant SNP in the study (rs3761847). The blue diamond indicates the P value for all samples in the study (the original scan plus replication samples) as determined by the Cochran-Mantel-Haenszel method in the samples of both the North American Rheumatoid Arthritis Consortium (NARAC) and the Epidemiological Investigation of Rheumatoid Arthritis (EIRA). The recombination rate (in cM/Mb) from the CEU HapMap is shown in light blue along the x axis; the red arrow indicates the block of LD shown in panel B. The blue arrows indicate gene location. Panel B shows the LD structure across 200 kb of the TRAF1-C5 locus, based on pairwise r² in the CEU HapMap. The intron-exon structure of each gene is at the top of the figure. Putative functional SNPs in LD with either rs3761847 or rs2900180 are indicated by hatched bars in which red indicates r² >0.80 and pink indicates r² = 0.20 to 0.80; the specific SNPs, frequency, pairwise r² in the CEU HapMap, and the putative annotated function are listed at the bottom of the figure. CpG denotes cytidine and guanosine joined by a phosphodiester bond.

Modified from Plenge RM, Seielstad M, Padyukov L, et al. TRAF1-C5 as a risk locus for rheumatoid arthritis—a genomewide study. N Engl J Med 2007;357(12):1199–209; with permission.


There is often more than 1 gene in the region of LD that harbors the genotyped/associated SNP, which makes it difficult to pinpoint definitively which gene is the causal gene. Conversely, there may be no gene at all in the region of LD. Here, the causal gene is the single gene that is altered by a mutation to give rise to the phenotype of interest (eg, risk of RA). For convenience, the strongest biologic candidate gene, based on its known function, is often nominated as the “causal gene.” Fig. 1 illustrates that there are 3 genes in a region of LD at the locus on chromosome 9, 2 of which are very strong biologic candidate genes: TRAF1 (encoding tumor necrosis factor [TNF] receptor–associated factor 1) and C5 (encoding complement component 5). Thus, this locus is referred to as the TRAF1-C5 RA risk locus.


For most of the more than 20 RA risk alleles shown in Table 1, the causal mutation and the causal gene are yet to be identified. Outside of the MHC, the 1 exception is PTPN22, in which the associated mutation alters protein structure and function. For the other loci, although it may be reasonable to nominate the most likely biologic candidate gene as the causal gene, direct evidence is not yet available.


There are at least 2 reasons why it is important to identify (or “fine map”) the causal mutation. First, knowing the causal mutation helps guide functional studies. For drug discovery, it is crucial to understand whether the risk allele is a gain-of-function or loss-of-function allele. Second, knowing the causal allele provides more accurate estimates of risk that could facilitate disease prediction. If the associated SNP is highly but not perfectly correlated with the causal allele, then the risk estimated from the associated SNP will understate the risk conferred by the causal allele.
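The second point can be illustrated numerically. The sketch below uses a first-order approximation that is an assumption of this example, not a formula from the studies discussed here: under an additive model with small effects, the log odds ratio observed at a marker SNP is roughly the causal log odds ratio multiplied by the correlation r between the 2 alleles and by a term that accounts for any difference in allele frequency. The function name and all input values are hypothetical.

```python
import math


def apparent_odds_ratio(causal_or, r, p_causal, p_marker):
    """Approximate OR observed at a marker SNP in LD with the causal allele.

    Assumed first-order approximation (additive model, small effects):
        log(OR_marker) ~ log(OR_causal) * r * sqrt(pc(1-pc) / (pm(1-pm)))
    where r is the signed correlation between the two alleles and
    pc, pm are the causal and marker allele frequencies.
    """
    scale = math.sqrt(p_causal * (1 - p_causal) / (p_marker * (1 - p_marker)))
    return math.exp(math.log(causal_or) * r * scale)


# Hypothetical scenario: causal OR of 1.5, marker with r = 0.9 (r^2 = 0.81),
# and equal allele frequencies at the two sites.
observed = apparent_odds_ratio(causal_or=1.5, r=0.9, p_causal=0.2, p_marker=0.2)

print(f"causal OR: 1.50, OR seen at the marker: {observed:.2f}")
```

Even with r² above 0.8, the marker yields a smaller OR than the causal allele, which is why fine mapping can sharpen risk estimates.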


A limitation of contemporary GWASs is that they only test common SNPs. For every common allele (defined as having an allele frequency of >5% in the general population), there is at least 1 rare allele (frequency <5%), and likely many more. In addition, there are other forms of genetic variation besides SNPs, including copy number variants (in which a gene may be duplicated or deleted). Next-generation sequencing and genotyping technologies are required to identify and test rare variants and structural variants.
