Transcriptome sequencing of gingival biopsies from chronic periodontitis patients reveals novel gene expression and splicing patterns

Periodontitis is the most common chronic inflammatory disease caused by complex interaction between the microbial biofilm and host immune responses. In the present study, high-throughput RNA sequencing was utilized to systemically and precisely identify gene expression profiles and alternative splicing. The pooled RNAs of 10 gingival tissues from both healthy and periodontitis patients were analyzed by deep sequencing followed by computational annotation and quantification of mRNA structures. The differential expression analysis designated 400 up-regulated genes in periodontitis tissues especially in the pathways of defense/immunity protein, receptor, protease, and signaling molecules. The top 10 most up-regulated genes were CSF3, MAFA, CR2, GLDC, SAA1, LBP, MME, MMP3, MME-AS1, and SAA4. The 62 down-regulated genes in periodontitis were mainly cytoskeletal and structural proteins. The top 10 most down-regulated genes were SERPINA12, MT4, H19, KRT2, DSC1, PSORS1C2, KRT27, LCE3C, AQ5, and LCE6A. The differential alternative splicing analysis revealed unique transcription variants in periodontitis tissues. The EDB exon was predominantly included in FN1, while exon 2 was mostly skipped in BCL2A1. These findings using RNA sequencing provide novel insights into the pathogenesis mechanism of periodontitis in terms of gene expression and alternative splicing.


Introduction
Periodontitis is a chronic inflammatory disease of periodontium, characterized by massive destruction of both soft and hard tissues surrounding the teeth [1]. The current concept for the periodontal diseases involve complex interaction between the microbial biofilm and host immune responses that leads to the alteration of bone and connective tissue homeostasis [2,3]. Understanding the molecular mechanisms underlying the pathogenesis as well as development of efficient therapeutics is furthermore important since periodontitis is linked to other metabolic and/or systemic diseases including diabetes, cardiovascular diseases, and rheumatoid arthritis [4][5][6].
The analysis of transcriptome by microarrays has been a valuable tool to study the changes in gene expression profiles in gingival tissues of periodontitis patients [7][8][9]. However, recent advances in the high-throughput RNA sequencing technology revolutionarily enhanced our understanding on the complexity of eukaryotic transcriptome [10,11]. RNA sequencing has several key advantages over the hybridization-based microarray techniques. First of all, direct sequencing enables an unbiased approach compared with the microarrays that depends on the predetermined genome sequences. Secondly, RNA sequencing is highly accurate in detecting gene expression with very wide dynamic detection ranges with low background. Thus, RNA sequencing is not only useful to precisely determine gene expression profiles but also particularly powerful to detect novel transcription variants via alternative splicing [10].
In the present study, we analyzed the pooled transcriptome from gingival tissues of periodontitis patients and compared with that of healthy patients. The large sum of novel information on the gene expression profiles as well as novel transcripts through alternative splicing would provide not only insights into the pathogenesis of periodontitis but also basis for the development of biomarkers and therapeutic targets.

Periodontitis patient characteristics and gingival tissue samples
Gingival tissue samples were collected from chronic periodontitis patients or healthy individuals. On the basis of clinical and radiographic criteria, the periodontitis-affected site had a probing depth of ≥4 mm, clinical attachment level of ≥4 mm, and bleeding on probing. A total of 10 gingival samples were collected from 9 periodontal healthy patients who visited Kyungpook National University Hospital. Similarly, a total of 10 periodontitis tissue samples were obtained from 4 periodontitis patients with pocket depth of 4~6 mm and 3 severe periodontitis patients with pocket depth of 7 mm or deeper. The patient characteristics are given in Additional file 1: Table S1. All patients were non-smoking and did not have untreated metabolic/ systemic diseases nor associated with infection/autoimmune diseases at the time of tissue collection. The size of 3-mm 2 gingival biopsies were obtained from the marginal gingiva during periodontal flap surgery and immediately stored in RNAlater solution (Thermo Fisher Scientific, Waltham, MA) at −70°C after removal of blood by brief washing in phosphate-buffered saline. The study was approved by the institutional review board of the Kyungpook National University Hospital with informed consent from all patients.

Isolation of RNA and RNA sequencing
Frozen tissues were disrupted in the lysis solution of mirVana RNA isolation kit (Thermo Fisher Scientific) using disposable pestle grinder system (Thermo Fisher Scientific). After RNA extraction, the same amount of total RNA isolated from each individual sample (1 μg) was pooled into 2 groups (healthy and periodontitis) and used for further analysis. The integrity of pooled total RNA was analyzed by Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). After purification of mRNA molecules by poly-T oligo-attached magnetic beads followed by fragmentation, the RNA of approximately 300-bp size was isolated using gel electrophoresis. The cDNA synthesis and library construction was performed using the Illumina Truseq RNA sample preparation kit (Illumina, San Diego, CA), following the manufacturer's protocol. The PCR-amplified cDNA templates on a flow cell was loaded and sequenced in the HiSeq 2000 sequencing system (Illumina) in the pairedend sequencing mode (2 × 101 bp reads).

Sequencing data analysis
All sequencing raw reads were aligned to the human genome reference hg19 using the GSNAP alignment tool (2013-11-27) [12]. Only uniquely and properly mapped read pairs were used for further analysis. The differentially expressed genes between gingival tissues from periodontal healthy patients and periodontitis patients were identified using the DESeq R package [13]. Differentially expressed genes were defined as those with changes of at least 2-fold between samples and at a false discovery rate (FDR) cutoff of 5 % based on DESeq adjusted p values. The analysis of alternative splicing events was performed using MATS software [14]. The differences in the alternative splicing in genes were considered significant when the inclusion difference between samples was equal or greater than 5 % at a 10 % FDR. Each alternative splicing change of the skipped exon vent was manually inspected in UCSC genome browser using the sequencing data. The functional classification analysis of differentially expressed genes was performed using the PANTHER tools (http://www.pantherdb.org). The GO term and KEGG pathway enrichment analysis was performed as described previously [15]. Briefly, the fraction of genes in a test set associated with each GO category was calculated and compared with that of control set comprised of randomly chosen genes of the same number and length of the test genes. The random sampling was repeated 100,000 times for the calculation of empirical p value. The significance of enriched GO terms or KEGG pathways were determined by the p value cutoff, which was 1/total number of GO terms considered.

Validation of differentially expressed genes and alternative splicing events
From the pooled RNA samples, 1 μg of RNA was reversed transcribed using the Superscript II Reverse Transcriptase (Thermo Fisher Scientific). Quantitative real-time PCR analysis was performed by the addition of 1 μg of cDNA and SYBR green master mix in MicroAMP optical tubes using the AB 7500 system (Thermo Fisher Scientific). The expression of genes relative to that of HPRT1 was determined by the 2 -ΔΔCT method [16]. The differential alternative splicing events were confirmed via RT-PCR analysis with the addition of 1 μg of cDNA and Takara premix Taq polymerase (Takara Bio Inc, Shiga, Japan) for 33 cycles of 10 s at 98°C, 30 s at 60°C, and 1 min at 72°C. The primers for the detection of alternative splicing were designed by the PrimerSeq software [17] in order that the PCR product to span the region of exon inclusion/skipping, enabling the differentiation of alternative splicing events by product size. The primer sequences for the realtime RT-PCR analysis of selected genes and those for the RT-PCR detection of alternative splicing events of FN1 and BCL2A1 gene were provided in the supplemental tables (Additional file 2: Table S2 and Additional file 3: Table S3).

RNA sequencing results
Total RNA was extracted from 10 healthy gingival tissue samples and 10 chronic periodontitis-affected gingival tissues as described above. Then, cDNAs synthesized from the pooled RNA samples of both groups were sequenced using the Illumina HiSeq 2000 system, which generated approximately 80 million pairs of reads of 101 bp in size. When compared with the reference sequence of Genome Reference Consortium GRCh37 (hg19), more than 90 % of read pairs were uniquely mapped on the human genome (Table 1). Gene annotation using the Ensembl (release 75) identified that a total of 36,814 genes have at least 1 read mapped on the exonic regions. Among these, 4800 genes were unique to the periodontitis tissue sample, while 2811 transcripts were detected only in healthy gingival sample.
Identification and classification of differentially expressed genes between periodontitis and healthy gingiva The differential expression of genes between periodontitis and healthy gingival samples was analyzed by DESeq package [13]. By applying the cutoff of at least twofold change in the number of reads with 5 % FDR, we found a total of 462 genes differentially expressed between the samples (Fig. 1a, volcano plot). While 400 genes were up-regulated in the periodontitis tissue sample, 62 genes were down-regulated compared with the healthy control (Additional file 4: Table S4). Previously, Davanian et al. reported the discovery of 381 genes up-regulated in the periodontitis-affected gingival tissues by RNA sequencing [18]. Notably, 182 genes among them were also found to be up-regulated in the present study (Additional file 5: Figure S1), demonstrating an overlap between the two sets of gene lists when analyzed by a hypergeometric test (p < 2.2e −16 ) [19].
To classify the differentially expressed genes into functionally related subgroups, we utilized the PANTHER classification system (http://pantherdb.org). As a result, the 462 differentially expressed genes between periodontitis and healthy gingival tissues were segregated into 20 different classes of proteins. When we compared the composition of these protein classes, there was a significant difference in the number of genes between periodontitis and healthy gingival samples in 6 protein classes. In the periodontitis tissue, genes classified as defense/immunity protein, receptor, protease, and signaling molecules were significantly enriched (Fig. 1b). On the other hand, genes in the categories of cytoskeletal protein and structural protein were predominant in healthy tissue sample compared with periodontitis. Furthermore, functional annotation of GO and KEGG pathway enrichment analyses as previously described [15] revealed enhanced immune responses in the periodontal tissues, including NOD-like receptor signaling, cytokine and chemokine activities, response to lipopolysaccharide, Jak-STAT signaling pathway, and B cell receptor signaling pathway (Additional file 6: Table S5 and Additional file 7: Table S6).

Validation of differentially expressed genes between periodontitis and healthy gingiva by quantitative realtime PCR analysis
To validate the differential gene expression results by RNA sequencing analysis, we selected 10 up-regulated or down-regulated genes in periodontal tissue and assessed their expression by quantitative real-time RT-PCR analysis. Figure 1c shows that the examination of differential gene expression by both methods is significantly concordant, with the Pearson's correlation coefficient (R) value of 0.81 (p = 0.005). Since the current study design employed pooling of samples, we further validated the variations in gene expression in individual samples of healthy and periodontitis patients. The real-time RT-PCR analyses for selected genes (Additional file 8: Figure S2) mostly repeated the RNA sequencing results, showing significant reduction in NOS1, CHP2, CDON, and MT4. Similarly, significant

Alternative splicing events in periodontitis and healthy gingival tissues
More than 90 % of human genes are alternatively spliced through different types of splicing [20,21]. To identify the differential splicing events between the healthy and  periodontitis gingival tissues, the inclusion level of alternative spliced exons was compared using the MATS tool [14] based on a statistical model that calculates the difference in the isoform ratio of a gene. The MATS analysis of RNA sequencing data revealed 183 significantly differential alternative splicing events in 155 genes with a cutoff of 5 % inclusion difference and 10 % FDR (Table 4 and Additional file 9: Table S7). The GO and KEGG pathway enrichment analyses for the determination of the biological relevance of those differentially spliced genes showed significant difference in the pathways including RNA splicing regulation, substrate adhesion-dependent cell spreading, response to wound healing, and positive regulation of cell migration (Additional file 10: Table S8 and Additional file 11: Table S9). Among the genes that exhibited prominently novel included exons was FN1 that encodes one of the major extracellular matrix protein fibronectin [22]. Fibronectin structure consists of 2 nearly identical~250-kDa glycoprotein subunits with each monomer composed of repetitive units of type I, II, and III domains [23]. The type III domains contain 2 exons called extra domain A (EDA) and extra domain B (EDB), the latter showed significantly increased inclusion in periodontitis gingival tissues compared with healthy samples (Fig. 2a; left  panel). The preferential formation of EDB-containing isoform in periodontitis was further corroborated by the RT-PCR analysis designed to amplify the included EDB exon regions ( Fig. 2a; right panel). The analysis of alternative splicing events also indicated that BCL2A1 (BCL2-related protein A1) exhibited prominently skipped exon 2 ( Fig. 2b; left panel). RT-PCR analysis designed to amplify the skipped region revealed significantly increased shorter isoform ( Fig. 2b; right panel). The individual variation between healthy and periodontitis tissues for these differences in the alternative slicing events was further confirmed by RT-PCR analyses (Additional file 12: Figure S3). For FN1, the inclusion of EDB exon was  preferentially observed in periodontitis tissues (7/10) compared with healthy tissues (3/8) tested. Similarly, the skipping of exon 2 in BCL2A1 was predominant in periodontitis tissues (9/10), compared with healthy tissues (2/8).

Discussion
Recent developments in the RNA sequencing technology and bioinformatics tools enabled elaborate analysis of gene expression in numerous human diseases. However in periodontitis research, most RNA sequencing studies have focused on the identification of microbiome that constitutes periodontal biofilm, with little attention to the host responses against such microbial challenge. The current study provides extensive information on gene expression as well as alternative splicing in periodontitis gingival tissues, which is crucial for the understanding the pathogenesis and development of biomarkers and therapeutic targets. The gene expression analysis revealed 62 down-regulated and 400 up-regulated genes in periodontitis tissues, suggesting the effectiveness of mRNA sequencing as a tool to scrutinize the differential gene expression during the development of periodontitis. Davanian et al. previously reported a series of up-regulated genes as well as enriched biological pathways in periodontitis [18]. When we compared these results with ours, the current results only partially overlap in terms of differential gene expression, possibly originated from the difference in the Fig. 2 Differential alternative splicing of FN1 and BCL2A1. a In the left panel, a read distribution plot for FN1 with differential isoform expression due to the inclusion of EDB domain in periodontitis tissues was shown. The black boxes in the annotated isoforms illustrated below the read distributions indicate the exons. Arrows indicated the location of EDB exon, which was magnified in the dotted box in the upper right panel. In the lower right panel, a reverse transcription-PCR analysis was performed to detect included EDB exon. b A read distribution plot for BCL2A1 with differential isoform expression due to the skipping of exon 2 (arrows) in periodontitis tissues was shown in the left panel. In the right panel, a reverse transcription-PCR analysis was performed to detect skipped exon 2. M molecular weight marker, H healthy gingival tissues, and P periodontitis-affected gingival tissues ethnic group of the subjects as well as in the methods to eliminate individual fluctuations in the gene expression. For example, Davanian et al. used healthy gingival tissue of the same periodontitis-affected individual as healthy control tissue. However, in the current study, the healthy and periodontitis tissues were pooled, allowing the dilution of individual differences in the gene expression. Indeed, the RNA sequencing analysis of pooled samples proved effective, since the expression levels of genes (except IL6 and IL19) identified as differentially expressed by RNA sequencing were also significantly different between healthy and periodontitis samples, when we confirmed by real-time PCR analysis of individual samples (Additional file 8: Figure  S2). Most of the top 20 up-regulated genes in periodontitis tissue ( Table 2) were associated with inflammation and tissue degradation. Notably, serum amyloid A isoforms consisted 3 of 20 most up-regulated mRNAs, supporting the notion that these can serve as biomarkers for periodontitisassociated acute as well as chronic inflammation [24]. Until recently, gene expression analyses mostly focused on the genes whose expression was significantly increased in periodontitis. In line with this, 18 of top 20 upregulated genes were associated with periodontal disease at least once by previous studies. The current study revealed 2 novel genes highly overexpressed in periodontitis tissues compared with healthy control. MAFA is a subgroup member of the basic leucine-zipper family transcription factor prominently known for its role in glucoseresponsive insulin secretion [25]. CLDN10 is an ion channel-forming member of claudin family, which is a constituent of tight junction [26]. The role of these genes in periodontitis is of great interest and requires further investigation.
In contrast to the highly expressed genes in periodontitis, fewer highlights have been drawn on the genes down-regulated in periodontal diseases. In accordance, most of the top 20 down-regulated genes (Table 3) have not been studied with regard to periodontitis, although investigating the role of those genes in periodontitis compared with that in normal tissues would greatly enhance our knowledge regarding the pathogenesis of periodontal diseases. Notably, keratin (KRT2, KRT27, and KRT1) and late cornified envelope (LCE3C, LCE6A, LCE1B, and LCE2D) genes constituted significant part of the down-regulated genes, suggesting the loss of epithelial barrier [27]. The causal relationship between the loss of these genes and the development of periodontal diseases requires further investigation.
It has long been suggested that different sites in the same individual exhibit different patterns of disease progression, morphology, and often response to therapy [28]. In addition, the oral microbiota responsible for the induction of periodontal diseases is distinct from site-tosite in the same individual [29,30]. Accordingly, it is recommended to design clinical studies based on individual sites rather than individual person [31]. In agreement of this notion, the analysis of gene expression in individual sites by real-time RT-PCR (Additional file 8: Figure S2) revealed site-specific variation. In different sites from the same periodontitis patients (P2: P3, P7: P8, and P9: P10), it was clearly noticeable that MMP3, MMP13, and LBP expressions differ in a site-specific manner. An individual RNA sequencing study with larger number of patients is ongoing, which will further provide detailed information on the site specificity of periodontitis.
The gene ontology and KEGG pathway enrichment analyses revealed both innate and adaptive immune responses in the periodontal tissues, including NOD-like receptor signaling, response to lipopolysaccharide, cytokine and chemokine activities, and B cell receptor signaling pathways (Additional file 6: Table S5 and Additional file 7: Table S6). The NOD1 and NOD2 have been suggested to mediate the sensing of periodontal bacteria [32]. In addition, NOD2 has been linked to the P. gingivalis-induced bone resorption, since NOD2 knockout mice were protected from bone loss in a periodontitis model [33]. Bellibasakis and Johansson showed that a periodontal pathogen A. actinomyceptemcomitans regulated NLRP3 and NLRP6 expression in human mononuclear cells [34]. Considering the existence of 22 human NOD-like receptor protein members and their crucial functions in immune diseases, it will be of great interest and importance to elucidate the involvement of these receptors in the pathogenesis of periodontitis.
In the periodontitis lesions, it has been estimated that more than 75 % of infiltrating immune cells are plasma cells and B cells, suggesting the importance of these cells in adaptive immunity during the development of periodontitis [35]. In accordance, molecules involved in B cell activation including CD79, CD19, Lyn, and CR2 were significantly increased in periodontitis tissue. An increasing body of evidence indicates that B cells with autoreactive propensities might be linked to tissue destruction in periodontitis [36,37]. Indeed, recent reports demonstrated that B cell-deficient mice were protected from alveolar bone loss in experimental periodontitis [38,39].
Numerous studies attempted to delineate the role of T helper (Th) cell subsets in human periodontitis by examining the cytokine mRNA levels by RT-PCR, flow cytometry, and immunohistochemistry. However, those studies are incoherent in terms of Th1 and Th2 cytokine expression, although the Th17 cytokines are consistently increased [37]. The current study revealed that the levels of Th1 cytokines IFNG and IL12 did not change between healthy and periodontitis-affected gingival samples while that of TNF slightly increased in periodontitis (Additional file 13: Table S10). The Th2 cytokines IL10 and IL33 remained unaltered in periodontitis patients. Interestingly, Th17 cytokines IL6, IL23A, and IL17C significantly increased in gingival tissues from periodontitis patients compared with those of healthy control, supporting the concept of Th17 cells as crucial mediators of inflammation, although it is still controversial whether these cells contribute to tissue destruction or protection in periodontitis [40,41].
Alternative splicing of genes contribute to the diversity of proteome as well as genome evolution, control of developmental processes, and physiological regulation of various biological systems [42]. Not surprisingly, dysregulation of alternative splicing is often linked to various human diseases such as cancer, metabolic, neurological, and skeletal diseases [43][44][45][46][47]. However, alternative splicing events in the context of periodontitis has rarely been investigated. The current study uncovered significant differential alternative splicing events in BCL2A1 and FN1. BCL2A1 is a target gene of NF-kB, implicated in the survival of leukocytes thereby inflammation [48]. However, the role of alternative splicing on the activity of the protein has not been suggested until the present. Interestingly, recent discovery showed that BCL2A1 was increased not only in periodontitis but also in systemic diseases such as cardiovascular diseases and ulcerative colitis [49]. Therefore, research regarding the multiple layers of regulatory mechanisms including mRNA expression and alternative splicing of BCL2A1 are required to fully understand the role of this gene during the pathogenesis of periodontitis.
Parkar et al. previously suggested that FN1 is differentially spliced in periodontitis [50]. Interestingly, the authors reported exon skipping of both EDA and EDB domain in periodontitis, while the current study showed conspicuously increased inclusion of EDB domain. Although whether these differences originated from the use of periodontal ligament [51] versus gingival tissues (the present study) yet to be cleared, it would be of great interest to fully identify the role of fibronectin isoforms in the pathogenesis of periodontitis considering the suggested role of EDA-and EDB-containing isoforms of fibronectin during embryonic development and tissue repair [23,51].
In conclusion, the current study presented novel gene expression profiles as well as alternative splicing in gingival tissues from periodontitis patients by RNA sequencing experiments. Considering its effectiveness for whole transcriptome analysis, the use of RNA sequencing in periodontitis research would facilitate the elucidation of pathogenesis.