Fig. 10.1
Concepts of Mendelian inheritance. (a) In dominant inheritance, one parent carries the disease-causing allele D, with one copy sufficient to show the phenotype. Offspring who inherited the D allele from this parent would show the phenotype. Based on Mendelian genetics, the ratio of inheriting to not inheriting the D allele would be 1:1. (b) In recessive inheritance, both parents have to be carriers of the disease-causing allele r. Only the offspring who inherited two r alleles from both parents would show the phenotype. The Mendelian ratio of non-carrier to carrier to diseased offspring in this example would be 1:2:1
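The 1:1 and 1:2:1 ratios in the caption can be reproduced by enumerating the allele combinations of a Punnett square. The short sketch below is illustrative only; the lowercase "d" (normal allele in the dominant example) and uppercase "R" (normal allele in the recessive example) are labels assumed here, as the caption names only the disease-causing alleles D and r.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Count the equally likely offspring genotypes of two parents.

    Each parent is a two-character string, one character per allele.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort the pair so that "dD" and "Dd" count as the same genotype.
        counts["".join(sorted((allele1, allele2)))] += 1
    return counts


# (a) Dominant: affected heterozygous parent (Dd) x unaffected parent (dd).
dominant = offspring_genotypes("Dd", "dd")
# (b) Recessive: two carrier parents (Rr x Rr); only rr shows the phenotype.
recessive = offspring_genotypes("Rr", "Rr")

print(dict(dominant))
print(dict(recessive))
```

Each count is out of four equally likely combinations, so the counts map directly onto the expected Mendelian ratios.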
There are a number of skeletal abnormalities that follow Mendelian inheritance. Classic examples include osteogenesis imperfecta (OI), a fragile bone condition due to mutations in either of the two genes encoding the α-chains of collagen I (COL1A1 and COL1A2), and spondyloepiphyseal dysplasia (SED), which is characterized by dwarfism and shortened limbs and is caused by mutations in the gene encoding collagen II (COL2A1), affecting the spine and epiphyses of long bones. These are rare disorders resulting from a single rare mutation which seriously affects the proper functioning of the gene product. On the other hand, common genetic variants in these genes can give rise to milder conditions. For example, genetic variations in COL1A1 are associated with reduced bone density and osteoporosis (Grant et al. 1996), while the COL2A1 gene is linked to osteoarthritis (OA) (Vikkula et al. 1993). These conditions, in general, are caused by multiple factors rather than a single genetic mutation, leading to the concept of complex diseases.
10.2.1.2 Complex Diseases
Complex diseases are dependent on multiple factors, which can be due to genes alone, or genes in combination with environmental factors. Taking osteoporosis as an example, apart from COL1A1, genes associated with bone loss include apolipoprotein E (APOE), vitamin D receptor (VDR), and interleukin 6 (IL6); identified environmental risk factors are age, gender, body mass index, smoking, and medications. Moreover, the finding that the association with APOE and COL1A1 is restricted to a subgroup of postmenopausal women not using hormone replacement therapy suggests that environmental factors could modulate the genetic effect on the disease (Mitchell and Yerges-Armstrong 2011). Similarly for OA, multiple genes, including COL2A1, SMAD3, growth differentiation factor 5 (GDF5), and type II iodothyronine deiodinase (DIO2), have shown association with the disease, with age, gender, and body mass index as known risk factors (Valdes and Spector 2011). In these cases, the allelic variations associated with the diseases are common, and carrying these alleles only increases an individual's risk of developing the disease. Therefore, in complex diseases, multiple genetic susceptibility and environmental factors contribute to disease outcome; they may act independently, or they may modify the effect of each other through gene-gene interaction and/or gene-environment interaction (Fig. 10.2). Such effects can be additive, synergistic, or suppressive. Thus, complex diseases no longer present as simple Mendelian inheritance, and intervertebral disc degeneration is another such example.
Fig. 10.2
Concept of complex diseases. Multiple genetic and environmental factors can be involved in a complex disease. They can act independently (e.g., gene A and environmental factor 1). Genetic factors can interact with each other (e.g., gene B and gene C), or they can interact with an environmental factor (e.g., environmental factor 2 and gene C), contributing to the disease
10.2.2 Intervertebral Disc Degeneration Is a Complex Trait
Before genetic studies were undertaken, age, gender, cigarette smoking, and mechanical loading related to occupation and sporting activities had been reported to influence disc degeneration (this topic is discussed in more detail in Chap. 9). These findings place disc degeneration within the category of a complex trait. However, there is no consensus on the contribution of these environmental factors to the degenerative process, which raises the question: what is the contribution of genetics to intervertebral disc disease?
10.2.3 Genetics as a Major Contributing Factor to Intervertebral Disc Degeneration
To decipher the involvement of genetics in intervertebral disc degeneration, a classical heritability test (Box 10.2) was undertaken to identify patterns of familial aggregation. Two such analyses were conducted in 1995. In one study, 20 pairs of Finnish male identical twins were assessed based on magnetic resonance imaging (MRI) for the degree of similarities in degenerative findings, including disc desiccation, disc height narrowing, and disc bulging or herniation. The authors concluded that while smoking and age accounted for at most 15 % of the variability in the degenerative findings, 26–72 % of the variability was explained when the co-twin status was included (Battie et al. 1995a). In another study, 115 pairs of male identical twins with discordant exposures to suspected environmental risks were assessed for degenerative changes of the spine and symptomology. For changes in the upper lumbar region, occupational physical loading explained only 7 % of the variability in the summary score for degeneration; this increased to 16 % when age was included and to 77 % when twin status was added. In the lower lumbar region, leisure physical loading explained only 2 % of the variability in the degeneration scores; this increased to 9 % with the addition of age as a factor and to 43 % with the addition of the twin status (Battie et al. 1995b). These studies provide quantitative evidence for the existence of familial aggregation and potential genetic influences in intervertebral disc degeneration. These findings were later consolidated in a study comparing 86 pairs of monozygotic twins and 77 pairs of dizygotic twins from Australia and Britain for the contributions of genetic and environmental effects on disc degeneration (Sambrook et al. 1999). 
Using an overall degenerative score (summing the grades of disc height, bulge, osteophytosis, and signal intensity), heritability was estimated to be 74 % at the lumbar spine, after adjusting for age, weight, height, smoking, occupation, and exercise (Sambrook et al. 1999). This finding indicated a high degree of genetic involvement in intervertebral disc degeneration that led to the first publication in 1997 on an associated gene, vitamin D receptor (VDR) (Videman et al. 1998), and subsequently other genes.
10.2.3.1 Box 10.2: Heritability Test
Heritability refers to the proportion of phenotypic variance attributable to genetic variance. Estimating it involves observing and statistically analyzing the patterns of phenotypes in close kin with varying levels of shared genetic or environmental background, such as parent-offspring pairs, siblings, and twins (Visscher et al. 2008). In one design, heritability is estimated from identical twins raised in separate environments (adoption studies). In this situation, the genotype is identical but the environmental factors vary, and thus the effects of the two factors can be separated. However, such twins are not easy to recruit, and the age at which they were separated may affect the findings. Another design is to compare monozygotic and dizygotic twin pairs. In this scenario, both types of twin pairs experience similar environmental factors, and thus comparing the phenotypic concordance of the two types of twins allows an estimation of the impact of genetic factors.
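As a rough illustration of the monozygotic versus dizygotic comparison, a classical approximation (Falconer's formula) estimates heritability as twice the difference between the phenotypic correlations of the two types of twin pairs. The sketch below assumes this simple formula and uses made-up scores; it is not necessarily the statistical model used in the twin studies cited in this chapter, and all function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired observations (e.g., twin 1 vs. twin 2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def falconer_heritability(r_mz, r_dz):
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)


# Hypothetical degeneration scores for five MZ and five DZ twin pairs.
mz_twin1, mz_twin2 = [2, 5, 7, 9, 12], [3, 5, 8, 9, 11]
dz_twin1, dz_twin2 = [2, 5, 7, 9, 12], [6, 3, 9, 5, 10]

r_mz = pearson_r(mz_twin1, mz_twin2)
r_dz = pearson_r(dz_twin1, dz_twin2)
h2 = falconer_heritability(r_mz, r_dz)
print(r_mz, r_dz, h2)
```

With so few pairs the estimate is very unstable and can even fall outside the range 0 to 1; real analyses use much larger cohorts and adjust for covariates such as age.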
10.3 Phenotypic Parameters for Assessing the Genetics of Intervertebral Disc Degeneration
10.3.1 What Defines Disc Degeneration?
Knowing that genetic components are involved in intervertebral disc degeneration raises the question: what genes or genetic variations contribute to the “disease”? To address this question, it is first critical to define disc degeneration. A precise phenotype definition is essential for genetic studies, in that the phenotype should be a distinguishable trait and preferably quantifiable (Wagner and Zhang 2011). Disc degeneration is a continuous process throughout life: predictable macroscopic changes include disruption of the highly organized lamellar structure, formation of tears and fissures in the annulus fibrosus, leakage of the nucleus pulposus through the fissures leading to disc bulge and herniation, damage of the end plate, an overall reduction in disc height, dehydration, increased cell death, and cell cluster formation. From a biochemical perspective, the major alterations include the loss and breakdown of collagens and aggrecan, resulting in reduced tensile strength and impaired hydration, respectively. Thus, while ample observations have been made to assess the alterations in disc degeneration, its definition remains indistinct, and there is a poor understanding of the relationship between these changes and their representation in the degenerative process.
10.3.1.1 Box 10.1: Glossary
Allele: An alternative form of the genetic material at a certain position (locus), which may or may not lead to a difference in observable traits (phenotype).
Mutation: A change in the genetic material which can involve a single nucleotide (point mutation) or a segment of genomic sequence (insertion, deletion, duplication, inversion, and translocation). The outcome of a mutation can be harmful leading to diseases, while neutral or beneficial effects are also possible.
Phenotype: An observable trait as a result of the genotypic composition but at the same time can be affected by environmental factors and their interactions with the genetic components.
Dominance: The relationship of the two alleles at a certain locus where one allele can mask the effect of the other to demonstrate the phenotype.
Recessive: The relationship of the two alleles at a certain locus where the effect of one allele is masked by the other and thus two copies of such allele are required to demonstrate the phenotype.
Linkage disequilibrium (LD): A situation when genotypes at two or more loci are not independent of each other and that there would be a difference between the observed and expected frequencies of certain combinations of alleles (haplotype) in a population.
Restriction fragment length polymorphism (RFLP): Genetic variation leading to the creation or destruction of restriction enzyme site and therefore generating different DNA fragments after enzymatic digestion. The variation can be identified simply by detecting the resultant fragment lengths using gel electrophoresis. This is an inexpensive technique for genotyping.
Variable number of tandem repeat (VNTR): A type of genetic variation in which a short nucleotide sequence repeats consecutively. The number of repeats can be variable, and thus in a population, more than two alleles can be present for a particular locus.
Single nucleotide polymorphism (SNP): The most common type of genetic variation which involves only a single nucleotide difference. It occurs throughout the whole genome. Unlike VNTR, each SNP typically contains only two alleles.
Genome-wide association study (GWAS): Examination of a large number of genetic variants distributed across the genome in multiple individuals to identify whether any of them are associated with a trait. The focus is generally placed on common SNPs and on traits such as common diseases.
Epigenome-wide association study (EWAS): A new concept similar to GWAS: examination of a large number of epigenetic variations, such as DNA methylation, across the genome in multiple individuals to identify their associations with a trait.
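The definition of linkage disequilibrium in this glossary, a difference between observed and expected haplotype frequencies, can be made concrete with two commonly used measures: the coefficient D (observed haplotype frequency minus the product of the allele frequencies) and the squared correlation r². The following is a minimal sketch for two biallelic loci; the allele labels A/a and B/b and the haplotype counts are hypothetical.

```python
def linkage_disequilibrium(hap_counts):
    """Return (D, r_squared) for two biallelic loci.

    hap_counts maps the four haplotypes "AB", "Ab", "aB", "ab" to their
    observed counts in a population sample.
    """
    total = sum(hap_counts.values())
    p_ab_hap = hap_counts["AB"] / total                     # observed AB frequency
    p_a = (hap_counts["AB"] + hap_counts["Ab"]) / total     # frequency of allele A
    p_b = (hap_counts["AB"] + hap_counts["aB"]) / total     # frequency of allele B
    d = p_ab_hap - p_a * p_b                                # observed minus expected
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Hypothetical sample in which A tends to travel with B, and a with b.
linked = linkage_disequilibrium({"AB": 40, "Ab": 10, "aB": 10, "ab": 40})
# Hypothetical sample in which the two loci are independent.
independent = linkage_disequilibrium({"AB": 25, "Ab": 25, "aB": 25, "ab": 25})
print(linked, independent)
```

A value of D (and r²) near zero indicates that the alleles at the two loci occur together about as often as expected by chance.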
10.3.2 Parameters Currently Used to Define Disc Degeneration
Theoretically, all the features mentioned above, and in other chapters of this book, could be used as measurable parameters for the study of disc degeneration; however, only a few can be assessed in vivo. Current assessments of disc degeneration rely on radiography and MRI. Radiographs display the density and composition of an object based on the proportion of X-rays being absorbed, providing information such as disc height and osteophyte formation. In contrast, MRI detects the rotating magnetic field of protons (hydrogen nuclei, which are abundant in water), an obvious advantage for the intervertebral disc as the nucleus pulposus is a hydrated tissue. Therefore, MRI can provide information on the hydration status, bulging and herniation, as well as irregularities of the end plates. In particular, an MRI image of a disc that is bright and bloated represents a highly hydrated tissue, and with the progressive loss of water, the image becomes dark. As indicated in Chap. 12, MRI can also assess proteoglycan content in the disc, although improvements in accuracy are still required (Benneker et al. 2005; Marinelli et al. 2009).
There are multiple ways to evaluate the degenerative changes in the intervertebral disc. The first is to provide a definitive notation on the presence or absence of disc degeneration. This method is simple, but its shortcoming is the loss of information on the progressive changes during the degenerative process. Another method is to classify the severity of degeneration based on a scaling system. This is the most widely used method, and a number of classification systems have been developed. For example, the Kellgren scale (Kellgren et al. 1963) combines the features of osteophytes and joint space narrowing based on radiographs and summarizes them in a four-point score, ranging from a score of 1, indicating no or very small osteophytes, to a score of 4, representing large osteophytes and pronounced disc space narrowing; Schneiderman’s grading (Schneiderman et al. 1987) focuses on the signal intensity of the nucleus pulposus on MRI images and classifies it into four grades, from the lowest grade, indicating a hyperintense signal, to the highest grade, indicating a hypointense signal with disc space narrowing; Pfirrmann’s classification (Pfirrmann et al. 2001) also utilizes MRI images to evaluate the homogeneity of disc structure, signal intensity, distinction of nucleus pulposus and annulus fibrosus, as well as disc height. The information is converted into five grades, the lowest grade being homogeneous disc structure, hyperintense bright signal intensity, and normal disc height and the highest grade being inhomogeneous disc structure, hypointense black signal, loss of distinction between nucleus pulposus and annulus fibrosus, as well as collapsed disc space. While these grading systems maintain information regarding the severity of disc degeneration and provide a semiquantitative evaluation of the degenerative status, interpretation of the MRI images is subjective and thus requires multiple experienced observers to perform the grading. 
The third method for assessing disc degeneration is by computational evaluation of the absolute signal intensity values of the MRI images (Videman et al. 1994). This approach can circumvent the potential bias arising from observers’ judgment, but the data generated would likely be composed of a spectrum of values complicating subsequent data analyses.
The methods mentioned above evaluate the status of a single disc, whereas there are multiple disc levels and each may display a different stage of degeneration. This raises the question, should the grades of various levels be combined and a summation score obtained representative of the degree of proneness of an individual to disc degeneration? Alternatively, should each level be treated separately and reported as multilevel disc degeneration? Moreover, should other parameters such as disc bulging, herniation, and end-plate irregularities from the MRI images be treated independently or considered as a part of the degenerative process? This is still uncertain, again due to the limited understanding of disc degeneration.
The variability of the phenotypic parameters used to define the degenerative status of the intervertebral disc confounds genetic studies. This is especially problematic when replication and meta-analysis studies are required to substantiate the research findings, particularly when the effect size is small. Replication provides credibility to initial findings of association, while meta-analysis increases the statistical power to detect associated genes. Both require comparable phenotypes among studies to produce meaningful results (Chanock et al. 2007; Nakaoka and Inoue 2009). Therefore, a unified phenotypic definition for disc degeneration is an absolute requirement for the successful identification of the involved genetic components.
10.3.3 Aging and Intervertebral Disc Degeneration
Disc degeneration is part of the normal aging process. With age, an increasing prevalence and severity of disc degeneration have been observed (Cheung et al. 2009). However, what constitutes “normal progression” remains unclear, as many factors can participate in and modify the degenerative process. Genetics is one of the components that can alter this “normality” by accelerating or decelerating the degenerative process. This change is reviewed elsewhere in the book and includes changes in cell function, such as the expression levels of genes, the stability of mRNA transcripts or proteins, or the binding affinity of proteins for their interacting partners, caused by genetic variations in or near participating genes. In establishing a cohort for the genetic study of disc degeneration, the effect of age must be taken into consideration in terms of subject recruitment and data analyses. Statistically, if sufficient population information is available, then appropriate adjustments can be established. An example would be a sliding window method, in which the degenerative score was logarithmically transformed to reduce skewness and standardized to a mean of 0 and a variance of 1 within each decade of age of the samples (Virtanen et al. 2007). The idea behind such an adjustment is to identify the “normality” within the age band of a certain cohort, such that samples showing “accelerated” or “retarded” degeneration can be highlighted during analysis.
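The age adjustment described above can be sketched as follows. This is an illustrative simplification that uses fixed decade bands rather than a true sliding window, assumes strictly positive scores so that the logarithm is defined, and uses the population standard deviation within each band; the function name and the example data are hypothetical and are not taken from the cited study.

```python
from collections import defaultdict
from math import log
from statistics import mean, pstdev


def standardize_by_decade(subjects):
    """Log-transform scores, then z-standardize them within each age decade.

    subjects is a list of (age_in_years, degeneration_score) tuples with
    score > 0. Returns one z-score per subject, in the input order.
    """
    bands = defaultdict(list)
    for index, (age, score) in enumerate(subjects):
        bands[age // 10].append((index, log(score)))  # e.g., ages 40-49 -> band 4

    z_scores = [0.0] * len(subjects)
    for members in bands.values():
        values = [value for _, value in members]
        mu, sd = mean(values), pstdev(values)
        for index, value in members:
            # A band with no spread carries no information; leave z at 0.
            z_scores[index] = (value - mu) / sd if sd > 0 else 0.0
    return z_scores


# Hypothetical cohort: (age, summary degeneration score).
cohort = [(23, 1.0), (27, 4.0), (45, 2.0), (48, 8.0), (41, 4.0)]
z = standardize_by_decade(cohort)
print(z)
```

A strongly positive z-score marks a subject whose degeneration is advanced relative to peers of the same decade, and a strongly negative one marks a subject with less degeneration than expected for that age band.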
10.4 Moving from Phenotype Information to Identification of Genes Causing or Contributing to Disease
10.4.1 Candidate Genes
This approach utilizes the biological knowledge and etiology of the disease to identify genes of interest and to determine correlations between variants within the genes and the phenotype. The phenotype can provide clues to potential candidate genes, while information gathered from expression studies and animal models can enhance the selection process (Tabor et al. 2002). This method has been successful in identifying the genetic components of several skeletal diseases. For example, although over 90 % of osteogenesis imperfecta cases are due to mutations in the COL1A1 and COL1A2 genes, the causes of the remaining cases remained unknown until recently, when the candidate gene approach helped identify mutations in genes involved in the posttranslational modifications required for collagen folding and stability. One of the modifying complexes, prolyl 3-hydroxylase 1 (LEPRE1) coupled with cartilage-associated protein (CRTAP) and cyclophilin B (PPIB), converts a specific proline residue to 3-hydroxyproline, which is important in the formation of the collagen triple helix. Since Crtap-null mice demonstrated skeletal abnormalities resembling a subtype of OI (Morello et al. 2006), CRTAP was analyzed in patients with this subtype, and mutations associated with the disease were identified (Barnes et al. 2006). Similarly, mutations in LEPRE1 and PPIB were also detected in OI patients (Cabral et al. 2007; van Dijk et al. 2009).
Apart from rare diseases, the candidate gene approach also aids the identification of risk factors in common skeletal diseases, with osteoarthritis being an excellent example. Many genes have been reported to be associated with OA (Valdes and Spector 2011), among which GDF5 is one of the most promising candidates, as its association held up in multiple populations and showed high significance in a meta-analysis (Miyamoto et al. 2007). It was selected as a candidate since it is involved in joint formation (Francis-West et al. 1999), and thus, a common variant in GDF5 could play a role in the disease (Miyamoto et al. 2007). Similarly, genes related to cartilage homeostasis have been studied as candidates, including extracellular matrix components, matrix degrading enzymes, and genes involved in TGF-β and Wnt signaling pathways, inflammation, and apoptosis. A careful selection of candidates with high biological relevance to the diseases of interest is often the key to success. Moreover, the results and downstream functional studies can provide new understanding in connecting molecular mechanisms with the disease state.
10.4.2 Family Linkage Analysis
Unlike the candidate gene approach, classical linkage analysis relies less on biological knowledge of the disease being studied; rather, it utilizes the principle of co-segregation, where the disease-causing alleles, together with nearby markers, are passed on to the next generation within a family. These markers are linked to the disease because recombination events are infrequent within a short stretch of DNA. They can be in the form of restriction fragment length polymorphisms (RFLP) of a candidate gene, where specific patterns of DNA fragments are linked to disease. If there is no clue to any candidate genes, whole-genome scans using microsatellite markers can be used to locate disease susceptibility loci, while further genotyping or sequencing can identify the causative variants. More recently, initial mapping has used high-density whole-genome SNP arrays and was successful in locating several new OI loci for downstream investigation (Alanay et al. 2010; Lapunzina et al. 2010; Martinez-Glez et al. 2011). In general, detection is most successful with large families with multiple generations of affected members showing a clearly defined phenotype. As such, the traditional family linkage approach is more commonly used to identify affected genetic regions of rare disorders.
10.4.3 Case-Control Association Studies
This is the method of choice for common diseases with a complex trait. It does not rely on family data, but rather on a large cohort selected from the general population or on the recruitment of patients with a defined phenotype. The allele or genotype frequencies of the case and control groups are compared, with statistical tests being applied to objectively discern whether they differ (Daly 2009). The chi-square test is commonly used for this purpose, while Fisher’s exact test should be used when sample sizes are small. On the other hand, regression analysis can be used if it is presumed that there are varying degrees of effect, such as an additive influence between different genotypes. A disadvantage of this approach is that if the effect size is small, statistical significance is usually achieved only with very large cohorts (in the thousands) or if a meta-analysis is performed using different cohorts collected from multiple regional and international centers.
10.5 Genes Associated with Intervertebral Disc Degeneration
While different strategies can be used to find genetic risk factors for intervertebral disc degeneration, the predominant approach is through case-control studies and the selection of appropriate candidate genes. These genes are usually chosen based on our current knowledge of the biology of the disc in health and disease, and the logic behind the selection is centered on the integrity of the disc tissue (Fig. 10.3). Collagens and aggrecan, together with other structural proteins, form the basis of the extracellular matrix which is an integral part of the disc. Not only are they the most abundant molecules, their role in providing tensile strength and osmotic pressure is essential for proper disc function. Therefore, it is not surprising that the extracellular matrix genes were prime candidates for study. In addition, tissue homeostasis – a balance between synthesis and degradation – would be equally important, and there are a number of studies focusing on catabolic genes, namely, the aggrecanases and matrix metalloproteinases (MMPs). In fact, a number of these genes are found to have elevated expression levels as well as increased enzymatic activity during disc degeneration (Roberts et al. 2000; Le Maitre et al. 2004). Such changes can be triggered by the proinflammatory cytokines, such as interleukins (Millward-Sadler et al. 2009), which are also expressed at elevated levels in the degenerative process (Le Maitre et al. 2005, 2007), resulting in overall accelerated degradation and posing an adverse effect on matrix integrity. Ultimately, it is the disc cells that are responsible for producing and maintaining the extracellular matrix; thus, genes that affect cell function and survival have also been studied for their associations with disc degeneration. In this section, specific genes will be highlighted, enhancing our understanding of their functions in the disc as well as their roles in the degenerative process.
Fig. 10.3
Current concepts of candidate gene selection. Candidate gene selection is closely related to the biology of the normal disc and to involvement in disc degeneration. (a) The extracellular matrix molecules form an integral part of the disc, providing mechanical strength, water absorption properties, as well as a proper environment for cells. Their functional importance and high abundance make them the major genetic candidates. (b) During degeneration, increased matrix degradation is caused by enzymes such as MMPs and aggrecanases, while the presence of proinflammatory cytokines can accelerate this process; moreover, cell maintenance is affected. This is the current understanding of the degeneration mechanisms, and the molecules involved serve as important candidates for study
10.5.1 Extracellular Matrix Proteins
Aggrecan is the major proteoglycan in the disc, responsible for water absorption and retention through its highly negatively charged chondroitin sulfate (CS) side chains. A variable number of tandem repeat (VNTR) polymorphism, located in exon 12 encoding the CS attachment domain, has been identified; it dictates the length of the aggrecan core protein as well as the potential number of attached CS chains (Doege et al. 1997). The first study showed an association between this aggrecan VNTR and lumbar disc degeneration in a group of 64 young Japanese women aged 20–29 (Kawaguchi et al. 1999). There was an overrepresentation of the shorter alleles of 18 and 21 repeats in multilevel disc degeneration as well as an association with the severity of disc degeneration (Kawaguchi et al. 1999). The relevance of the 21-repeat allele to multilevel disc degeneration was replicated in a study of Koreans in which the age of the subjects analyzed was limited to the fourth decade or younger (Kim et al. 2011). On the other hand, a Turkish study showed that the shorter alleles (13–25 repeats) were overrepresented in young individuals with disc degeneration (Eser et al. 2011). The shorter alleles were also associated with disc herniation in Han Chinese (21 and 25 repeats) (Cong et al. 2010a, b) and Turkish populations (13–25 repeats) (Eser et al. 2011). Together, these studies indicate that individuals carrying the shorter alleles are more susceptible to the severe forms of disc degeneration. Interestingly, the study in Han Chinese also showed a 4.5-fold increase in risk for symptomatic disc degeneration with smoking, suggesting an interaction between this polymorphism and smoking (Cong et al. 2010a, b). It is possible that smoking or nicotine metabolites can alter the metabolism of aggrecan, with a greater effect on the shorter forms of this molecule. 
However, similar association tests have not been conducted in other populations, and the authors also pointed out that the study was limited by the small sample size (132 cases) (Cong et al. 2010a, b). While most studies support an association of the shorter alleles, 25 repeats or fewer, with disc degeneration, a study conducted with a Finnish cohort of 132 males, 40–45 years of age, showed that only the allele with 26 repeats was significantly associated with a dark nucleus pulposus (Solovieva et al. 2007). The differences could be due to ethnic variations, as well as the relatively small sample sizes. Nevertheless, as these studies provided supportive evidence for aggrecan as a genetic risk factor for disc degeneration, it would be worthwhile to perform a meta-analysis.
Collagen I is the predominant collagen in the annulus fibrosus and is responsible for the highly organized lamellar structure that provides the tissue with its tensile strength. It is encoded by two genes, COL1A1 and COL1A2. An SNP located at the binding site of the transcription factor Sp1 (rs1800012) in the first intron of COL1A1 was initially identified and found to be associated with reduced bone mineral density and with increased fracture risk and bone turnover (Grant et al. 1996; Garnero et al. 1998; Uitterlinden et al. 1998). Since it was suggested that there was an inverse relationship between osteoporosis and disc degeneration, the Sp1 binding site polymorphism in the COL1A1 gene was investigated in a Dutch cohort of 517 individuals who were at least 65 years of age (Pluijm et al. 2004). Individuals with the TT genotype were shown to have a 3.6 times higher risk of disc degeneration when compared with those having the GT or GG genotype, after adjusting for age, sex, and body weight (Pluijm et al. 2004). It should be noted that disc degeneration in this study was defined by the presence of osteophytes and articular joint space narrowing based on radiographs, as opposed to the reduced signal intensity seen on MRI images. Despite this limitation, the association of the Sp1 polymorphism was replicated in a small study of 40 young Greek army recruits. Here, the TT genotype was not found in any of the controls but was present in 33.3 % of those with disc degeneration (Tilkeridis et al. 2005). Association studies also showed that the Sp1 polymorphism was involved in hip osteoarthritis (Lian et al. 2005), myocardial infarction (Speer et al. 2006), cruciate ligament ruptures (Khoschnau et al. 2008), and stress urinary incontinence (Skorupski et al. 2006). It remains uncertain how a single polymorphism can participate in all of these dissimilar conditions, but it is generally agreed that increased binding affinity of Sp1 for the T allele leads to an increase in mRNA and protein levels. 
This in turn changes the ratio of α1(I) to α2(I) chains, resulting in altered biomechanical properties (Mann et al. 2001).
Collagen IX is a minor collagen coating the surface of collagen II/XI fibrils and is thought to interact with other matrix molecules to maintain matrix integrity (see Chap. 5). Its importance has been demonstrated in mice carrying a truncated form of α1(IX) (Kimura et al. 1996) or an inactivated Col9a1 gene (Boyd et al. 2008), both of which showed accelerated disc degeneration when compared with their normal counterparts. It is a heterotrimer encoded by three different genes, namely, COL9A1, COL9A2, and COL9A3. Analysis of the COL9A2 gene identified two consecutive SNPs (rs12077871 and rs2228564) in exon 19, leading to a substitution of tryptophan for either glutamine or arginine at residue 326 (Annunen et al. 1999). Interestingly, this tryptophan allele was present in 6 out of 157 Finnish individuals with disc degeneration and associated sciatica, but in none of the 174 controls (Annunen et al. 1999). Since this polymorphism involves a tryptophan (Trp) substitution in the α2(IX) chain, it is named the Trp2 allele. An age-stratified analysis of a group of 804 Southern Chinese (40–49 years) showed a 2.4-fold increase in the risk of developing disc degeneration and end-plate herniation in those carrying the Trp2 allele (Jim et al. 2005). Moreover, affected Trp2 individuals were found to have more severe disc degeneration (Jim et al. 2005). This was confirmed in a study of 84 Japanese patients (under 40 years) undergoing lumbar disc nucleotomy (Higashino et al. 2007). However, another larger-scale Japanese study of 658 controls and 470 cases could not replicate the findings (Seki et al. 2006).
In addition to the Trp2 allele, a similar arginine to tryptophan substitution at residue 103 was detected in exon 5 of the COL9A3 gene (rs61734651) (Paassilta et al. 2001). This Trp3 allele was found in a Finnish cohort at a significantly higher frequency of 12.2 % among 171 cases when compared to 4.7 % among the 321 controls, with a threefold increase in the risk of disc degeneration (Paassilta et al. 2001). A higher proportion of the Trp3 allele was also detected among people with disc degeneration than among controls in a Greek study, but the difference did not reach statistical significance (Kales et al. 2004). There was also the possibility that the Trp3 allele might act synergistically with persistent obesity, an interaction that would serve to increase the risk of disc degeneration (Solovieva et al. 2002). In addition, the interaction of Trp3 with another polymorphism, interleukin-1β (C(3954)-T), was examined. It was noted that carrying the Trp3 allele in the absence of the interleukin-1β 3954T allele resulted in an increase in the risk of a “dark nucleus pulposus” (Solovieva et al. 2006). These results suggest that environmental and genetic factors may interact with the Trp3 allele and modify its effect. Other studies also tested the association of the Trp2 and Trp3 alleles, but the Trp2 allele was absent in a Greek cohort (Kales et al. 2004) and present at a low frequency of only 1.2 % in a German cohort (Wrocklage et al. 2000), while Trp3 was absent in Southern Chinese (Jim et al. 2005) and Japanese (Higashino et al. 2007) cohorts, suggesting substantial ethnic variation.
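The roughly threefold risk quoted for the Trp3 allele can be related to the two reported frequencies through a crude odds ratio. The following is a minimal sketch that treats 12.2 % and 4.7 % simply as proportions in cases and controls; the function name odds_ratio is hypothetical, and the published estimate may rest on a different comparison or an adjusted model, so the result should be read only as an order-of-magnitude check.

```python
def odds_ratio(freq_cases, freq_controls):
    """Crude odds ratio from two proportions (each strictly between 0 and 1)."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Frequencies quoted in the text for the Finnish cohort.
trp3_or = odds_ratio(freq_cases=0.122, freq_controls=0.047)
print(f"crude odds ratio = {trp3_or:.2f}")
```

The crude value lands close to three, which is broadly consistent with the reported increase in risk.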
An association of the COL9A1 gene with disc degeneration was also suggested in a study of 25 selected candidate genes in a cohort of 588 Finnish male monozygotic and dizygotic twins (Videman et al. 2009). A particular SNP (rs696990) located at the 5′ end of the gene was associated with disc signal intensity, and this association survived an empirical threshold for global significance.
The Trp2 and Trp3 allelic products are incorporated into the cross-linked fibrillar network of developing human cartilage (Matsui et al. 2003). Thus, any pathological consequences are likely to be long term and to cause alterations in the tissue's mechanical properties (Matsui et al. 2003). Indeed, among human disc samples carrying the Trp2 allele, altered or compromised swelling pressure and compressive modulus were detected (Aladin et al. 2007). Although the precise mechanism is still unclear, one hypothesis is that the bulky side chain of the Trp residue may interfere with the interaction of collagen IX with other matrix molecules, including collagen II. This would destabilize the matrix and thus affect the biomechanical properties of the disc.
Type XI collagen forms the core of collagen II/IX/XI fibrils and functions to control the diameter of the fibril. It is composed of three α-chains encoded by the COL11A1, COL11A2, and COL2A1 genes (see Chap. 5). An initial screening of these genes identified an SNP, c.4603C → T (rs1676486), in the coding region of COL11A1 that was associated with lumbar disc herniation in a Japanese population (Mio et al. 2007). The association was confirmed by testing all the sequence variations in COL11A1, among which this SNP remained the most significant, as well as by increasing the cohort size to a total of 823 cases and 838 controls. It was suggested that this SNP affects mRNA stability, since the expression level of the T allele is significantly lower than that of the C allele (Mio et al. 2007).
Cartilage intermediate layer protein (CILP) is a non-collagenous matrix component initially found in the middle zone of human articular cartilage (Lorenzo et al. 1998). An SNP, 1184T → C (rs2073711), in the coding region of the gene was found to be significantly associated with disc degeneration in a Japanese cohort of 467 cases and 654 controls (Seki et al. 2005). This SNP is non-synonymous, resulting in the substitution of isoleucine at residue 395 by threonine. The authors demonstrated in vitro that CILP could inhibit TGF-β-induced transcription, and that this inhibition was increased in the presence of the C allelic product (Seki et al. 2005). This SNP might also exhibit a differential gender effect, since a small study in Japanese collegiate athletes showed an association in male but not female athletes (Min et al. 2010). On the other hand, a recent study in a Finnish cohort found an association only in females (Kelempisioti et al. 2011), while replication in a Chinese population cohort failed (Virtanen et al. 2007), suggesting that other factors such as ethnicity, sex, and environment may modulate the effect of this polymorphism.
Asporin (ASPN) belongs to the family of small leucine-rich proteoglycans (SLRPs), members of which include decorin and biglycan (see Chap. 4) (Lorenzo et al. 2001). It contains a stretch of aspartic acid residues at the amino terminus, which are variable in number (Lorenzo et al. 2001). While a repeat of 13 aspartic acid residues (D13) was the most common variant, a repeat of 14 residues (D14) was overrepresented among patients with osteoarthritis in a Japanese population (Kizawa et al. 2005). Since both osteoarthritis and disc degeneration are degenerative “cartilage diseases,” ASPN was hypothesized to be a candidate gene for disc degeneration (Song et al. 2008a). The aspartic acid repeats were tested in Chinese and Japanese cohorts of 1,055 and 1,353 individuals, respectively, and the D14 allele was overrepresented in the case groups. Meta-analysis showed that individuals carrying this allele were at higher risk, with an overall odds ratio of 1.7 (Song et al. 2008a). Increased asporin expression was detected in degenerated discs (Song et al. 2008a; Gruber et al. 2009); in addition, the D14 allelic product showed a greater suppression of TGF-β-mediated transcription than the D13 allelic product in vitro (Kizawa et al. 2005). As discussed elsewhere in this book, TGF-β signaling regulates the expression of key matrix proteins such as collagen II and aggrecan. Indeed, in vitro studies support the notion that the risk allele would have a negative effect on the synthesis of matrix molecules (Kizawa et al. 2005).
Matrilins are a four-member family of multi-subunit extracellular matrix proteins (see Chap. 5). They function as adaptors in the assembly of various matrix molecules including aggrecan, collagen type II, SLRPs, and cartilage oligomeric matrix protein (COMP) (Klatt et al. 2011). In a Rotterdam study, a non-synonymous SNP (rs28939676) of matrilin-3, in which threonine at position 303 is substituted by methionine, was found to be associated with disc degeneration at two or more levels based on radiographs, with a 2.9-fold increase in risk among individuals carrying the T allele (Min et al. 2006). On the other hand, this association was not confirmed in another cohort study of sibling pairs of Dutch origin (Min et al. 2006). While the effect of this polymorphism during disc degeneration remains unknown, it was postulated that the polymorphism weakens the role of matrilin-3 in stabilizing the extracellular matrix molecules (Min et al. 2006). Related to this finding, recent studies demonstrated that in primary human chondrocytes, the presence of matrilin-3 can induce the expression of a number of proinflammatory cytokines including IL6, IL8, and TNFα, as well as the degradative enzymes MMP1, MMP3, and MMP13 (Klatt et al. 2009). These molecules are known to be triggered during disc degeneration, suggesting a possible relationship between matrilin-3 and inflammation.
Thrombospondin-2 (THBS2) belongs to the thrombospondins, a family of multi-subunit extracellular matrix proteins. The protein is thought to be involved in cell-matrix interaction, antiangiogenesis, regulation of collagen fibrillogenesis, and the effective levels of MMP2 and MMP9 (Bornstein et al. 2004). An intronic SNP, IVS10-8C → T (rs9406328), in THBS2 was shown to be significantly associated with lumbar disc herniation in two independent Japanese populations, composed of 847 cases and 896 controls (Hirose et al. 2008). The TT genotype caused a significantly higher rate of exon 11 skipping when compared to the CC genotype, with a reduction of MMP2 and MMP9 binding. These data suggested that THBS2 could be involved in regulating MMP expression in the disc, which in turn participates in the pathogenesis of disc herniation. Moreover, the authors also identified a combinatorial effect with a non-synonymous SNP (rs17576) in MMP9, with an odds ratio of 3.03, indicating a potential gene-gene interaction (Hirose et al. 2008).
10.5.2 Matrix Metalloproteinases and Other Proteases
Matrix metalloproteinases (MMPs) are a large protein family with a wide spectrum of substrates, including extracellular matrix components. Based on their specificity, they can be broadly categorized into subgroups such as collagenases (MMP1, MMP8, and MMP13), gelatinases (MMP2 and MMP9), and stromelysins (MMP3) (Goupille et al. 1998). Details of these enzymes are presented in Chap. 8. A G insertion/deletion (G/D) polymorphism at position −1607 in the promoter region of MMP1 was assessed in a Southern Chinese cohort of 691 individuals. The deletion (D) allele was found to be significantly associated with disc degeneration; this was particularly evident among individuals aged 40 or above (Song et al. 2008b). In another Chinese cohort of 162 cases and 318 controls, an SNP in the promoter region of MMP2, −1306C → T, was shown to be associated with disc degeneration, with the CC genotype being more prevalent in cases of severe degeneration (Dong et al. 2007). This polymorphism was previously found to disrupt an Sp1 binding site, which can lead to a reduction in transcriptional activity (Price et al. 2001).