Population Genetics and Natural Selection in Rheumatic Disease




Human genetic diversity is the result of population genetic forces. This genetic variation influences disease risk and contributes to health disparities. Natural selection is an important influence on human genetic variation. Because immune and inflammatory function genes are enriched for signals of positive selection, the prevalence of rheumatic disease-risk alleles seen in different populations is partially the result of differing selective pressures (eg, due to pathogens). This review summarizes the genetic regions associated with susceptibility to different rheumatic diseases and concomitant evidence for natural selection, including known agents of selection exerting selective pressure in these regions.


Key points

  • If untreated, rheumatic diseases can diminish reproductive potential and impair the ability to raise offspring that successfully reproduce. Thus, it is likely that the frequency of disease-risk alleles seen in populations around the world is influenced by population-specific natural selection.

  • Both autoimmune and nonautoimmune rheumatic disorders show genetic associations in regions with signatures of selection.

  • The prevalence of rheumatic disease may result, at least partially, from past events of selection that increased host resistance to infection.

  • Many of the complexities of gene effects in different rheumatic diseases can be explained by population genetics phenomena.






Introduction


Rheumatic diseases are a family of more than 100 chronic, and often disabling, illnesses characterized by inflammation and loss of function, especially in the joints, tendons, ligaments, bones, and muscles. They collectively affect more than 20% of US adults, with osteoarthritis, rheumatoid arthritis (RA), spondylarthritides, gout, and fibromyalgia being the most prevalent. Patients often endure lifelong debilitating symptoms, reduced productivity at work, and high medical expenses. Arthritis and related illnesses, as well as back or spine problems, are major causes of disability. Importantly, because many rheumatic diseases present before or during a woman’s reproductive years, they can have effects on fetal and maternal outcomes, such as pregnancy loss in women with systemic lupus erythematosus (SLE) and vasculitis, and infertility in women with RA.


Most rheumatic diseases exhibit marked gender and ethnic disparities. Most predominantly afflict women (eg, RA, SLE, systemic sclerosis, fibromyalgia), but spondyloarthropathies and gout are more common in men. African American individuals are at higher risk than European American individuals for SLE and systemic sclerosis; they also tend to develop these diseases earlier in life and to experience more severe disease. Although prevalence, incidence, and disease severity are known to vary among ethnic groups, little is known about the genetic etiology of these diseases in different populations, and the reasons for the ethnic disparities remain elusive.


Left untreated, most rheumatic diseases can affect the ability to raise offspring that successfully reproduce and result in reduced reproductive fitness. Thus, alternative forces must exist that permit the relatively high frequency of risk alleles. Because immune and inflammatory responses can be highly sensitive to environmental change, evolutionary adaptation to specific environments might have driven selection on immune-related genetic variants, impacting variant frequencies and leaving signatures of selection in the genome. Given that infectious organisms are strong agents of natural selection, it is plausible that alleles selected for protection against infection confer increased risk of autoimmune and inflammatory diseases, as the “hygiene hypothesis” postulates. It is thought that adaptation to pathogen pressure through functional variation in immune-related genes conferred a specific selective advantage for host survival, including protection from pathogens and tolerance to microbiota. However, the emergence of such variation conferring resistance to pathogens also influences immune and inflammatory disease risk in specific populations.


In the past decade, multiple genome scans for signatures of selection on common variation have identified many immune-related loci. Similarly, 90 genome-wide association studies (GWAS) (Table 1) have established rheumatic disease–associated alleles. There is also growing evidence that autoimmune and inflammatory disease–associated variants are under selection. This review expands on our previous work and summarizes the evidence for rheumatic disease–associated loci under selection and the candidate selective pressures. Given that genomic variation can have clinically important consequences, elucidating the patterns of variation and the functional role of the selective pressure might contribute to a better understanding of disease etiology and the development of new therapies for improved disease management.



Table 1

Rheumatic diseases with published genome-wide association studies (GWAS) and respective number of associated loci

| Rheumatic Disease | Number of GWAS | Number of Loci |
|---|---|---|
| ANCA-associated vasculitis (AAV) | 1 | 18 |
| Ankylosing spondylitis (AS) | 3 | 21 |
| Behçet disease (BD) | 5 | 9 |
| Dermatomyositis (DM) | 1 | 1 |
| Gout | 4 | 14 |
| Granulomatosis with polyangiitis (GPA) | 1 | 6 |
| Juvenile idiopathic arthritis (JIA) | 3 | 6 |
| Kawasaki disease (KD) | 6 | 16 |
| Osteoarthritis (OA) | 9 | 16 |
| Osteoporosis (OP) | 3 | 3 |
| Paget disease (PD) | 2 | 9 |
| Psoriasis (PS) | 11 | 60 |
| Psoriatic arthritis (PsA) | 2 | 4 |
| Rheumatoid arthritis (RA) | 19 | 129 |
| Sjögren’s syndrome (SS) | 1 | 4 |
| Systemic lupus erythematosus (SLE) | 16 | 124 |
| Systemic sclerosis (SScl) | 3 | 10 |

Numbers compiled from the NHGRI-EBI Catalog of Published Genome-Wide Association Studies ( https://www.ebi.ac.uk/gwas ). Accessed October 24, 2016.








Shared genetic etiology in rheumatic diseases


The family of rheumatic diseases is remarkable for its heterogeneity and similar underlying mechanisms. The genetic heritability of rheumatic diseases is extremely variable, ranging from very high in ankylosing spondylitis (AS) to almost negligible in systemic sclerosis. GWAS have proved particularly powerful for autoimmune diseases, including many autoimmune rheumatic diseases, which might be due to their immune and inflammatory genetic etiology. Table 1 summarizes the rheumatic diseases with published GWAS and the number of disease-associated loci uncovered from these GWAS.


The common genetic etiology is exemplified by the sharing of associated loci among rheumatic diseases, such as the Human Leukocyte Antigen (HLA), STAT4, TNIP1, TNFAIP3, and BLK. This sharing of risk loci is greater among the groups of diseases characterized by the presence of particular serum autoantibodies (seropositive; such as RA, SLE) than it is between the seropositive and seronegative diseases (those typically characterized as not having associated serum autoantibodies). This supports the consensus that there is a common genetic background predisposing to autoimmunity and inflammation, and that further combinations of more serologically defined and disease-specific variation at HLA and non-HLA genes, in interaction with epigenetic and environmental factors, contribute to disease and its clinical manifestations. It has been suggested that different population genetic factors (eg, natural selection with coevolution with pathogens, random mutation, isolations, migrations, and interbreeding) in similar or distinct environments led to the establishment of the current plethora of loci that predispose to autoimmunity. It is thus plausible that population-level phenomena are a reason behind the complexity of gene effects in different autoimmune and rheumatic diseases.




Population genetics, natural selection, and adaptation


The genetic basis of disease is influenced by individual and population variation. Population-level phenomena, such as mutation, migration, genetic drift, and natural selection, have left an imprint on genetic variation that is likely to influence phenotypic expression in specific populations. Given its role in driving genetic variation, population genetics can help elucidate human genetic diversity and, consequently, disease etiology.


Natural selection is the process by which a trait becomes either more or less common in a population depending on the differential reproductive success of those with the trait. Natural selection drives adaptation, the evolutionary process whereby, over generations, the members of a population become better suited to survive and reproduce in their environment. Negative (or purifying) selection is the most common mechanism of selection and is usually associated with rare Mendelian disorders. Positive selection increases the prevalence of adaptive traits by increasing the frequency of favorable alleles and is often associated with common complex traits. The enrichment for signals of positive selection among genes associated with complex traits is well documented. Balancing selection favors genetic diversity by retaining variation in the population as a result of heterozygote advantage and frequency-dependent advantage. Although balancing selection is rarer, a pertinent example is the HLA (also known as the major histocompatibility complex [MHC]) region, where highly polymorphic loci play a central role in the recognition and presentation of antigens to the immune system. The high levels of polymorphism are the result of pathogen-driven balancing selection. The heterozygote advantage against multiple pathogens contributes to the evolution of HLA diversity, which in turn confers resistance against multiple pathogens and explains the persistence of alleles conferring susceptibility to disease. Nevertheless, there is also recent evidence that positive selection might be acting on specific HLA alleles in a local population due to unique environmental pressures.
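The three modes of selection described above can be illustrated with the textbook deterministic model of selection at a single biallelic locus (random mating, infinite population, no drift or mutation). The sketch below is only an illustration: the function names and the genotype fitness values are assumptions chosen for the example, not estimates taken from any study cited in this review.

```python
def next_freq(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection.

    p is the current frequency of A; w_AA, w_Aa, w_aa are the relative
    fitnesses of the three genotypes (illustrative values only).
    """
    q = 1.0 - p
    # Mean fitness of the population under Hardy-Weinberg proportions.
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # A alleles come from all of AA and half of the Aa heterozygotes.
    return (p * p * w_AA + p * q * w_Aa) / mean_w


def iterate(p0, fitness, generations=1000):
    """Apply next_freq repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, *fitness)
    return p


# Positive selection: a rare favorable allele rises toward fixation.
positive = iterate(0.01, (1.10, 1.05, 1.00))
# Negative (purifying) selection: a deleterious allele is removed.
purifying = iterate(0.50, (0.90, 0.95, 1.00))
# Balancing selection via heterozygote advantage: both alleles persist.
balancing = iterate(0.01, (0.90, 1.00, 0.80))

print(f"positive:  {positive:.3f}")
print(f"purifying: {purifying:.3f}")
print(f"balancing: {balancing:.3f}")
```

Under heterozygote advantage, with homozygote fitnesses written as 1 - s and 1 - t, this model settles at an intermediate frequency t / (s + t); with the assumed values s = 0.1 and t = 0.2 that is about 0.67. This is the sense in which balancing selection can retain an allele in a population even when one homozygous genotype is disadvantaged.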


Natural selection leaves a distinctive molecular signature in the targeted genomic region, and different statistical methods have been developed to detect signatures of selection. It has been hypothesized by Klironomos and colleagues that, in addition to genetic (sequence) variation, heritable epigenetic modifications can affect rates of fitness increase, as well as patterns of genotypic and phenotypic change during adaptation. However, the role of epigenetic variation in the response to natural selection has not been formally assessed, as the methodology to test signatures of natural selection on epigenetic variation is just emerging.




Natural selection in rheumatic disease


Given that, if untreated, rheumatic diseases can diminish reproductive potential and impair the ability to raise offspring that successfully reproduce, some evolutionary process must sustain the relatively high frequency of risk alleles seen in current populations around the world. Because the human genome is shaped by adaptation to environmental pressures at the population level, one plausible reason for the higher frequency of disease-risk alleles may be the direct effect of population-specific natural selection. This hypothesis is supported by the experimental evidence for MHC heterozygote superiority against multiple pathogens, a mechanism that would contribute to the evolution of HLA diversity and explain the persistence of alleles conferring susceptibility to disease.


There is compelling evidence that natural selection is acting on a significant fraction of the human genome. Immune function genes and pathways are consistently reported in tests for natural selection. As a result of several genome-wide scans, more than 300 immune-related genes have been suggested as putative targets of positive selection. Although the challenge in validating the true signals remains, several genes involved in immune-related functions have been shown to be under selection.


A total of 61 regions with evidence for selection and association with at least one rheumatic disease are shown in Table 2. This table includes 35 regions previously reported as being under selection in the literature, plus rheumatic disease–associated loci from current GWAS (Table 1) with evidence of recent positive selection from HapMap phase II data. Specifically, a region published in the GWAS Catalog as associated with a rheumatic disease was considered as exhibiting evidence for natural selection if it contained at least 2 single-nucleotide polymorphisms (SNPs) within 200 kb with an absolute integrated Haplotype Score (iHS) value in the top 0.1% of the genome-wide distribution in one population (Asian, European, or African). A total of 39 regions that met these criteria are included in Table 2, 14 of which were previously reported. These 39 regions with evidence for selection represent approximately 10% of all regions associated with a rheumatic disease in a GWAS: 13% for SLE, 9% for RA, 7% for psoriasis (PS), and 5% for AS. This fraction of disease-associated loci with concomitant evidence for selection is higher than in previous reports focusing on SNPs instead of regions. Notably, when using the top 1% of iHS variants, Raj and colleagues reported that inflammatory diseases (which included AS, RA, and SLE) have 5% of SNPs targeted by positive selection. Limiting comparisons to SNPs instead of regions might miss regions with evidence for disease association and for selection at different SNPs. The numbers of GWAS-associated loci, including those with and without concomitant evidence for recent positive selection, are illustrated in Fig. 1. Among all regions in Table 2, the largest share of signals of selection was found in European (36%) populations, followed by Asian (32%) and African (32%) populations. This is consistent with previous reports of enrichment of inflammatory disease SNPs targeted by positive selection in subjects of European ancestry.
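The region-level criterion described above (at least 2 SNPs within 200 kb whose absolute iHS falls in the top 0.1% of the genome-wide distribution for one population) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for Table 2: the names `Snp`, `extreme_cutoff`, and `region_under_selection` are hypothetical, the data are synthetic, and "within 200 kb" is read here as two extreme SNPs inside the region lying no more than 200 kb apart.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    chrom: str
    pos: int     # base-pair position
    ihs: float   # standardized iHS for one population


def extreme_cutoff(snps: List[Snp], top_fraction: float = 0.001) -> float:
    """Smallest |iHS| that still ranks in the top `top_fraction` genome-wide."""
    ranked = sorted((abs(s.ihs) for s in snps), reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[k - 1]


def region_under_selection(snps: List[Snp], chrom: str, start: int, end: int,
                           cutoff: float, window: int = 200_000,
                           min_hits: int = 2) -> bool:
    """True if the region holds `min_hits` extreme SNPs spanning <= `window` bp."""
    hits = sorted(s.pos for s in snps
                  if s.chrom == chrom and start <= s.pos <= end
                  and abs(s.ihs) >= cutoff)
    # In sorted order, the tightest cluster of min_hits SNPs is always consecutive.
    for i in range(len(hits) - min_hits + 1):
        if hits[i + min_hits - 1] - hits[i] <= window:
            return True
    return False


# Synthetic genome-wide scan: 4995 unremarkable SNPs plus 5 extreme ones.
scan = [Snp("chr1", i * 10_000, ((i % 7) - 3) * 0.3) for i in range(4995)]
scan += [
    Snp("chr1", 1_000_500, 3.5), Snp("chr1", 1_120_500, -3.2),  # 120 kb apart
    Snp("chr1", 30_000_500, 3.1),                               # isolated
    Snp("chr2", 500, 3.0), Snp("chr2", 900_000, -3.4),          # ~900 kb apart
]

cutoff = extreme_cutoff(scan)
clustered = region_under_selection(scan, "chr1", 900_000, 1_300_000, cutoff)
isolated = region_under_selection(scan, "chr1", 29_900_000, 30_100_000, cutoff)
too_far = region_under_selection(scan, "chr2", 0, 1_000_000, cutoff)
print(clustered, isolated, too_far)
```

Requiring two nearby extreme SNPs rather than one makes the call depend on a cluster of signal, which is why a region-based comparison can flag loci that a SNP-by-SNP comparison of association and selection would miss.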



Table 2

Rheumatic disease regions with evidence for selection and implicated agents of selection

| Gene Region | Position | Rheumatic Disease Association | Population | Selective Pressure |
|---|---|---|---|---|
| TNFRSF14, MMEL1* | 1p36.32 | RA | YRI | |
| IL23R | 1p31.3 | AS | | Protozoa |
| MAGI3, PTPN22* | 1p13.2 | RA, SLE | YRI | Protozoa |
| FCGR2B | 1q23.3 | SLE | | Plasmodium falciparum |
| TNFSF4 | 1q25.1 | RA, SS, SLE | | |
| NCF2, RGL1* | 1q25.3 | SLE | ASI | |
| CR1 | 1q32 | SLE | | Plasmodium falciparum |
| TLR5 | 1q41-q42 | SLE | YRI | Salmonella enterica ser. Typhimurium and other exposures |
| PELI1* | 2p14 | KD | ASI | |
| ALMS1P, DGUOK* | 2p13.1 | SLE | CEU | |
| PARD3B* | 2q33.3 | OA | CEU | |
| CNTN6* | 3p26.3 | SLE | ASI | |
| XCR1, CCR3* | 3p21.31 | BD | YRI | |
| CCDC66, ARHGEF3 | 3p14.3 | RA | YRI | |
| BTLA | 3q13.2 | RA | ASI | |
| ARHGAP31, CD80 | 3q13.33 | JIA, SLE | YRI | |
| MRPS22* | 3q23 | KD | ASI | |
| SLC2A9* | 4p16.1 | Gout | YRI | |
| KCNIP4* | 4p15.2 | RA | CEU, YRI | |
| TECRL* | 4q13.1 | KD | CEU | |
| ANTXR2 | 4q21 | AS | | |
| IL2, IL21* | 4q27 | RA | YRI | |
| Intergenic* | 4q28.3 | SLE | ASI | |
| PTGER4 | 5p13.1 | AS | | Protozoa |
| COMMD10, SEMA6A* | 5q23.1 | GPA | ASI, CEU | |
| ALDH7A1* | 5q23.2 | OP | CEU | |
| TNIP1 | 5q33.1 | SLE, SScl, PsA | | |
| PTTG1 | 5q33.3 | SLE | | |
| IRF4* | 6p25.3 | RA | CEU | |
| ITPR3 | 6p21.31 | SLE | YRI | |
| HLA* | 6p22.1-6p21.32 | AAV, AS, BD, GPA, JIA, KD, OA, PS, PsA, RA, SS, SLE, SScl | ASI, CEU, YRI | Bacterial infection |
| SNRPC, UHRF1BP1* | 6p21.31 | SLE | CEU | Mycobacterium tuberculosis |
| VARS, LSM2 | 6p21 | SLE | | |
| CCDC167, MIR4462* | 6p21.2 | SLE | YRI | |
| PRDM1, ATG5* | 6q21 | RA, SLE | YRI | |
| TRAF3IP2* | 6q21 | PS, PsA | YRI | |
| IKZF1 | 7p12.2 | SLE | | |
| GTF2I* | 7q11.23 | SS | ASI | |
| HIP1* | 7q11.23 | SLE | YRI | |
| LSMEM1, NPM1P14* | 7q31.1 | OA | ASI | |
| XKR6, BLK* | 8p23.1 | KD, RA, SS, SLE, SScl | ASI | |
| GRHL2* | 8q22.3 | RA | CEU | |
| KDM4C* | 9p24.1 | SLE | ASI | |
| NTNG2, SETX* | 9q34.13 | SLE | CEU | |
| FAM171A1* | 10p13 | SLE | ASI | |
| CTNNA3* | 10q21.3 | PsA | CEU | |
| CD5 | 11q12.2 | RA | | |
| GRM5* | 11q14.3 | RA | CEU | |
| OSBPL8* | 12q21.2 | SLE | CEU | |
| SH2B3, NAA25 | 12q24.12-q24.13 | RA | | Bacterial infection |
| KIAA0391* | 14q23.1 | PS | ASI | |
| PRKCH, HIF1A* | 14q23.1 | RA | CEU, YRI | |
| CLEC16A, CIITA | 16p13.13 | RA, SLE | | |
| ITGAM, ITGAX | 16p11.2 | SLE | | |
| PRSS54* | 16q21 | SLE | YRI | |
| WWOX | 16q23.2 | OA | CEU | |
| IRF8 | 16q24.1 | RA, SScl | | |
| RABEP1, NUP88* | 17p13.2 | RA | CEU | |
| BCAS3, NACA2* | 17q23.2 | Gout, OA | ASI | |
| TYK2 | 19p13.2 | RA, SLE | | Protozoa |
| PAK7* | 20p12.2 | PS | YRI | |
