LncRNA expression profile, functional analysis and potential biomarkers in children with pancreaticobiliary maljunction


 Background

The present study was aimed to investigate the expression profile of long non-coding RNAs (lncRNAs) and its potential as biomarker for pancreaticobiliary maljunction (PBM).
Methods

The differential expression of lncRNA and messenger RNA (mRNA) from 15 pediatric patients with PBM and 15 control subjects were analyzed with microArray and validated with quantitative reverse-transcription polymerase chain reaction (qRT-PCR). Gene Ontology enrichment and Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis were used to investigate the biological functions of these genes. The area under curve (AUC) of receiver operating characteristic (ROC) curve was used to predict the biomarker of lncRNA in PBM.
Results

There were 2915 mRNAs and 173 lncRNAs upregulated, 2121 mRNAs and 316 lncRNAs downregulated in PBM cases, respectively. The enriched GO associated with differentially expressed mRNA were extracellular matrix, extracellular region and kinetochore. The most enriched pathway of Protein digestion and absorption was associated with cancer and PI3K-Akt signaling. Cis- and Trans-target gene analysis indicated that mRNAs were predicted to be regulated by one lncRNA and one mRNA corresponded to several lncRNAs. The results showed that the expression trends of NR_110876, NR_132344, XR_946886 and XR_002956345 were consistent with the microarray results, and the difference was statistically significant NR_132344, XR_946886 and XR_002956345 (P < 0.05).The AUC of ROC curve was significant only for XR_946886 (0.837, P < 0.001).
Conclusion

The present study indicated that lncRNAs were involved the pathogenesis of common bile duct in PBM and XR_946886 would be biomarkers for PMB.


Abstract Background
The present study was aimed to investigate the expression pro le of long non-coding RNAs (lncRNAs) and its potential as biomarker for pancreaticobiliary maljunction (PBM).

Methods
The differential expression of lncRNA and messenger RNA (mRNA) from 15 pediatric patients with PBM and 15 control subjects were analyzed with microArray and validated with quantitative reverse-transcription polymerase chain reaction (qRT-PCR). Gene Ontology enrichment and Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis were used to investigate the biological functions of these genes. The area under curve (AUC) of receiver operating characteristic (ROC) curve was used to predict the biomarker of lncRNA in PBM.

Results
There were 2915 mRNAs and 173 lncRNAs upregulated, 2121 mRNAs and 316 lncRNAs downregulated in PBM cases, respectively. The enriched GO associated with differentially expressed mRNA were extracellular matrix, extracellular region and kinetochore. The most enriched pathway of Protein digestion and absorption was associated with cancer and PI3K-Akt signaling. Cis-and Trans-target gene analysis indicated that mRNAs were predicted to be regulated by one lncRNA and one mRNA corresponded to several lncRNAs. The results showed that the expression trends of NR_110876, NR_132344, XR_946886 and XR_002956345 were consistent with the microarray results, and the difference was statistically signi cant NR_132344, XR_946886 and XR_002956345 (P < 0.05).The AUC of ROC curve was signi cant only for XR_946886 (0.837, P < 0.001).

Conclusion
The present study indicated that lncRNAs were involved the pathogenesis of common bile duct in PBM and XR_946886 would be biomarkers for PMB.

Background
Pancreaticobiliary maljunction (PBM) is a rare congenital anomaly observed in clinical practice, in which the junction of pancreatic and the biliary ducts is located outside the duodenal wall [1,2,3]. Due to two-way re ux of bile and pancreatic juice in PBM, bile retention with cholangitis, pancreatitis and malignancies are commonly associated complications in PBM [4,5,6].
Cyst excision and Roux-en-Y hepaticojejunostomy is the rst choice for PBM treatment in clinic [7,8,9]. However, the postoperative problems in bile duct, such as cholangitis, hepatolithiasis and carcinogenesis at the residual bile ducts (8, 10,11,12,13) are worthy of attention. Thus, it is urgent to investigate the biomarkers that link PBM to the related complications of bile duct, in order to early detect the complications in PBM and develop effective diagnostic and therapeutic methods.
Long noncoding RNAs (lncRNAs), a group of non-coding RNAs longer than 200 nt, play an important role in regulating the occurrence and progression of many diseases, such as infectious and malignant tumors [14,15,16], through modulating biological processes, such as cellular proliferation, motility, immune and in ammation response to diseases [17,18,19]. In addition, some lncRNAs has been identi ed as biomarkers for diagnosis and prognosis of diseases [20,21]. However, the genome-wide expression and functional roles of lncRNAs in PBM is unclear.
To the best of our knowledge, there has been no publication focused on lncRNAs in common bile duct of PBM and PBM-linked biologic markers. Here, we investigated the expression pro les, functional analysis and potential biomarkers of lncRNAs in the common bile duct of patients with PBM. Our ndings would provide basis for developing novel diagnostic and therapeutic targets for PBM.
Material And Methods

Study subjects
The present study was approved by the institutional review board of the Children's Hospital of Soochow University. Written informed consent was signed by the guardian of the subjects before surgery. A total of 15 subjects with PBM and 15 normal subjects as control were enrolled in the present study. The expression of lncRNA and mRNA was analyzed with microarrays and veri ed with qRT-PCR. The clinical and pathological characteristics of the 15 patients were summarized in Table 1.

RNA Microarray analysis
Microarray experiments to investigate the differentially expressed genes between PBM and control group were performed by Western SCI Biotech Company, Chong qing, China (http://www.westernsci.cn). The Agilent Human lncRNA Microarray 2018Version (4*180k, Design ID: 085630) was used in this experiment.
The tissue of common bile duct of PBM cases were collected and stored at − 80 °C before RNA extraction. Total RNA was isolated using TRIZOL (Invitrogen, Carls-bad, CA, USA) according to the manufacturer's protocol and quanti ed by the NanoDrop ND-2000 (Thermo Scienti c). The RNA integrity was assessed using Agilent Bioanalyzer 2100 (Agilent Technologies). The sample labeling, microarray hybridization and washing were performed based on the manufacturer's standard protocols. Brie y, total RNA was transcribed to double strand cDNA, then synthesized into cRNA and labeled with Cyanine-3-CTP. The labeled cRNAs were hybridized onto the microarray. After washing, the arrays were scanned by the Agilent Scanner G2505C (Agilent Technologies).
For data analysis, the Feature Extraction software (version10.7.1.1, Agilent Technologies) was used to analyze the array images to get raw data. Genespring (version 13.1, Agilent Technologies) were employed to nish the basic analysis with the raw data. To begin with, the raw data was normalized with the quantile algorithm. The probes that at least 1 out of 2 conditions have ags in "P" were chosen for further data analysis. Differentially expressed genes or lncRNAs were then identi ed through fold change. The threshold set for up-and down-regulated genes was a fold change>= 2.0. Afterwards, gene ontology (GO) analysis and KEGG analysis were applied to determine the roles of these differentially expressed mRNAs and lncRNAs. Finally, Hierarchical Clustering was performed to display the expression pattern of distinguishable genes among different samples.

LncRNA function prediction
For function prediction of lncRNAs, we adopted the method from previous study [22]. In brief, the co-expressed mRNAs for each differentiated lncRNAs were calculated and subjected to functional enrichment analysis. The enriched functional terms were used as the predicted functional term of a given lncRNA. The co-expressed mRNAs of lncRNAs were identi ed by calculating the Pearson correlation with correlation P-value <0.05. The Hypergeometric cumulative distribution function was used to calculate the enrichment of functional terms in annotation of coexpressed mRNAs. The relationship between lncRNA and function prediction terms", the top 200 and top 500 prediction relationships with the highest prediction reliability were selected. The frequency of each functional prediction term was counted, and the GO or pathway term with more functional annotation was counted to re ect the overall situation of the functional distribution of different lncRNA. The top 20 signi cant lncRNAs were selected to draw a bar chart.

Fine mapping of genome coexpression of lncRNAs and adjacent coding genes
The Cis-regulatory regions were identi ed by the following procedures. For each lncRNAs, the mRNAs were identi ed as "Cis-regulated mRNAs" when: (1) the mRNAs loci were within 300k windows up-and down-stream of the given lncRNA, (2) the Pearson correlation of lncRNA-mRNA expression was signi cant (P-value of correlation <=0.05).
For each differentially expressed lncRNAs, the co-expressed genes and the enrichment signi cance of differentially expressed gene in each transcription factor (TF) entry was calculated by hypergeometric distribution test. The calculated results will return a p value of enrichment signi cance, and a small P value indicates that the gene is enriched in the TF entry.
The intersection between the coding gene set co-expressed by lncRNAs and the target gene set of TF/chromatin regulatory complex was calculated, and the enrichment degree of the intersection was calculated by using hypergeometric distribution test. This calculation obtained the TF signi cantly related to lncRNAs and identi ed the TF/chromatin regulatory factors that may play a regulatory role with lncRNAs. And the network diagram visualized using the analysis results of hypergeometric distribution.
According to hypergeometric distribution calculation, multiple lncRNA-TF pairs were obtained from each lncRNA. Each lncRNA-TF pair was the result of multiple gene enrichment.

Quantitative Reverse Transcription-Polymerase Chain Reaction (qRT-PCR) Assay
In order to verify the microarray results, we randomly selected a total of 6 differently expressed lncRNAs that associate with carcinogenesis or chronic in ammation in common bile duct in PBM cases and measured the lncRNA expression levels of these genes by qRT-PCR. qRT-PCR was performed using SYBR Green PCR Kit (Applied BI) according to the manufacturer's instructions. Gene expression was normalized to β-actin lncRNA. The relative expression of gene transcript was calculated using the 2 −ΔΔ Ct method.
. Statistical Analysis SPSS 20.0 software was used for statistical analysis. The results of measurement data were expressed as mean ± SD. The comparison between the groups was performed using the rank sum test. A fold change of 2.0 (P < 0.05) was used as a cutoff for differential expression of lncRNAs and mRNAs screened by microarray. Receiver operator characteristic (ROC) curve analysis was conducted and area under the ROC curve (AUC) was calculated to evaluate the diagnostic accuracy of selected genes in PBM patients. P < 0.05 was considered statistically signi cant.

General information
A total of 15 pediatric subjects with PBM and 15 control subjects were included in the study. Demographic and clinical data were summarized in Table 1. There was no signi cant difference between 2 groups in body weight, age, gender or other general information.

Functional analysis of differentially expressed genes
To avoid omission of any PBM-related genes, we performed functional analyses for all the differentially expressed mRNAs. It was found that the most enriched GOs associated with the differentially expressed transcripts were extracellular matrix (p=2.9E-12 up-regulated), extracellular region (p=0.000427, up-regulated), and kinetochore (p=2.29E-8, up-regulated) (Figure 2). The most enriched pathway was Protein digestion and absorption which could be related to 250 differentially expressed genes. Some pathways were associated with cancer, such as pathways in cancer (associated with 11 genes), PI3K-Akt signaling pathway (associated with 15 genes, Figure 3).

Co-expression analysis and target prediction
The functions of lncRNAs were performed by inter-acting with their targets. In the present study, the potential cis-and trans-acting were predicted. The mRNAs 100 kb upstream and downstream of the lncRNAs were searched for cis analysis, and we found that 5 lncRNAs were related with cis target genes ( Figure 4). The trans analysis of lncRNAs was performed by constructing co-expression networks (Fig. 5) of dysregulated mRNAs and lncRNAs based on the expression correlation coe cient (Pearson correlation P-value <0.05). The data showed that most lncRNAs acted in a trans-acting manner. The top 200 prediction relationships with the highest prediction reliability were selected to count the frequency of each TF, and the TF with more functional annotation was counted to re ect the overall functional distribution of different lncRNAs (Figure 6). With the detected 964 lncRNAs that corresponded to 9358 mRNAs, more than one mRNAs were predicted to be regulated by one lncRNA and one mRNA corresponded to several lncRNAs.

qRT-PCR Validation
The up-regulated genes (NR_110876, NR_132344, XR_946886, XR_002956345, NR_135295 and XR_002957935) that were associated with carcinogenesis or chronic in ammation in common bile duct were veri ed by qRT-PCR. The results showed that the expression trends of NR_110876, NR_132344, XR_946886 and XR_002956345 were consistent with the microarray results, and the difference was statistically signi cant NR_132344, XR_946886 and XR_002956345 (P < 0.05). However, the expression trends of NR_135295 and XR_002957935 were not consistent with the microarray results ( Figure 6). To further assess the accuracy of NR_110876, NR_132344, XR_946886, XR_002956345 as biomarker in PBM, analysis of the ROC curve was performed with the AUC. The results indicated that only the AUC of XR_946886 was signi cantly different (0.837, p<0.001) (Figure 8).
In PBM cases, the re ux of pancreatic juice into the bile duct could damage the bile duct wall and lead to bile duct dilatation, cholestasis, stone formation, epithelial hyperplasia and malignant transformation in the biliary tract [4,5,6,7]. The biological effects of re ux of pancreatic juice and its relationship with chronic in ammation and carcinogenesis have attracted great attention in recent years. Previous studies regarding the transcriptomes of congenital dilation of common bile duct (CDC) have discovered numerous novel transcriptomic alterations. Kaneko et al. identi ed several downregulated genes in the gallbladder of children with PBM which may contribute to the pathophysiology and found some upregulated noncoding RNAs that may be important for biliary carcinogenesis [23]. In our previous study [24], we found 876 differentially expressed genes in children with PBM, of which 530 genes were up-regulated and 346 genes were down-regulated. Wong et al. identi ed 21 damaging de novo variants (DNV) in 31 cases with CDD, 6 DNV-carrying genes associated with human developmental disorders and 4 DNVcarrying genes linked with cholangio-and hepatocellular carcinomas [25]. However, there is rare report about lncRNA in PBM patients.
In the present study, we identi ed 489 lncRNA and 5036 mRNA transcripts that were found differentially expressed in the bile duct tissues of patients with PBM. These differentially expressed lncRNAs may affect many pathways, including those involved in Protein digestion and absorption, ECM-receptor interaction, Focal adhesion, PI3K-Akt signaling, carcinogenesis and in ammation, which were consistent with previous studies [21][22][23]. These ndings, together with our qPCR data, suggest that lncRNAs may contribute to PBM associated complication, such as carcinogenesis and chronic in ammation in common bile duct. Among these differentially expressed lncRNAs, the upregulation of NR_110876, NR_132344, XR_946886 and XR_002956345 was con rmed by qPCR; however, only XR_946886 showed signi cant difference in the AUC of ROC. The ndings suggested that XR_946886 might be involved in the development of complication of PBM and could become new biomarkers for carcinogenesis or chronic in ammation in common bile duct in PBM.
The potential functions of lncRNAs were commonly predicted by their target genes. In the present study, the relationships between lncRNAs and their target genes collocated (within 100 kb) were analyzed. We found that 5 cis target genes were related with differentially expressed lncRNA and that most lncRNAs acted in a trans manner. For example, we found that E2F4 was closely associated with lncRNA and target genes, which is consistent with previous reports that E2F4 is associated with the development and progression of cancers [26,27].
There were some limitations in the present study. The sample size was relatively small and only six related lncRNAs identified in microarray experiments were validated via qRT-PCR. Future studies should be conducted with larger sample sizes and in-depth verification of more candidate genes. In addition, the potential mechanisms of XR_946886 and its clinical values in PBM still require further studies.
In summary, the present study provided an overall analysis of lncRNAs in common bile duct in PBM. By further exploring potential functions and pathways in which the lncRNAs were involved, we anticipate that lncRNAs such as XR_946886 will be biomarkers or even potential therapeutic targets for PMB treatment.
Abbreviations long non-coding RNAs=lncRNAs; pancreaticobiliary maljunction=PBM; messenger RNA=mRNA; qRT-PCR= quantitative reverse-transcription polymerase chain reaction; ROC= receiver operator characteristic; GO= Gene Ontology; KEGG= Kyoto Encyclopedia of Genes and Genomes; Declarations Ethics approval and consent to participate Informed written consent was obtained from all participants' parents. The study protocol was approved by the Ethics Committee of the Children's Hospital of Soochow University, Suzhou, China (No. 20170506016).

Consent for publication
Written informed consent for publication of the case was obtained for each participant's parents.

Availability of data and materials
All data generated or analyzed during this study are included in this article

Competing interests
The authors declare that they have no competing interests.        5 lncRNAs were related with cis target genes in PBM.

Figure 5
The trans-target prediction analysis of differentially expressed lncRNAs in PBM.

Figure 6
The functional distribution of differentially expressed lncRNAs with TF analysis.

Figure 7
Validation of the microarray test by qPCR.