Expression profile of circular RNAs in infantile hemangioma detected by RNA-Seq

Abstract Background: Circular RNAs (circRNAs) have emerged as a novel class of widespread non-coding RNAs, and they play crucial roles in various biological processes. However, the characterization and function of circRNAs in infantile hemangioma (IH) remain elusive. Methods: In this study, we used RNA-Seq and circRNA prediction to study and characterize the circRNAs in IH tissue and a matched normal skin control. Specific circRNAs were verified using real-time polymerase chain reaction. Results and Conclusion: We found that of the 9811 identified circRNAs, 249 candidates were differentially expressed, including 124 upregulated and 125 downregulated circRNAs in the IH group compared with the matched normal skin control group. A set of differentially expressed circRNAs (in particular, hsa_circRNA001885 and hsa_circRNA006612 expression) were confirmed using qRT-PCR. Gene ontology and pathway analysis revealed that compared to matched normal skin tissues, many processes that were over-represented in IH group were related to the binding, protein binding, gap junction, and focal adhesion. Specific circRNAs were associated with several micro-RNAs (miRNAs) predicted using miRanda. Altogether, our findings highlight the potential importance of circRNAs in the biology of IH and its response to treatment.


Introduction
Infantile hemangioma (IH) is the most common vascular tumor in infants. [1] It is characterized by an initial proliferation during infancy, followed by spontaneous involution over the next 5 to 10 years. [2] Some IHs show such a severe progression that they lead to tissue and organ damage and in some cases become lifethreatening. [3] The cause of IH is reported to be highly associated with fetal hypoxic stress. [4] However, the exact atiopathogeny underlying IH is still to be fully understood. [4] It is therefore necessary to deeply explore the regulatory processes for a better understanding IH pathogenesis. Circular RNAs (circRNAs) comprise a novel class of widespread non-coding RNAs that are predominantly generated by back-splicing of exons in eukaryotic genomes. [5] Acting as microRNA sponges, circRNAs regulate multiple biological processes such as cancer, heart development, and liver diseases. [6][7][8] CircRNAs have been reported to be abundant, conserved and stable in the cytoplasm. Specific circRNAs are correlated with hypoxia-regulated vascular cells, [9] indicating that circRNAs may be involved in angiogenesis. The circRNA_000203 enhances the expression of fibrosis-associated genes by derepressing targets of miR-26b-5p, Col1a2, and CTGF in cardiac fibroblasts. [10] Recently, circRNA profiles of IH were elucidated by microarray analysis, [11] exhibiting the prospect of studying circRNAs in IHs.
In this study, we use RNA sequencing (RNA-Seq) and circRNA prediction to examine the expression profile of circRNA in 3 IH skin samples and a matched normal skin control. Subsequently, gene ontology and pathway analysis showed that compared to matched normal skin, many processes over-represented in IH are related to immune system processes, extracellular regio,n and molecular transducer activity. Our study may help expand the understanding of the roles of circRNAs in the mechanisms that underlie IH development and may provide new research directions. skins were collected from patients who underwent surgery with their parents' written consent.

Tissue samples
Proliferating capillary IH and matched normal skin tissues were obtained from 3 different patients who were admitted to the Affiliated Obstetrics and Gynecology Hospital of Nanjing Medical University for IH removal. A diagnosis of proliferative IH was confirmed by routine pathological examination. The collected skin samples were immediately frozen in liquid nitrogen for the preparation of total RNA. Patient information is listed in Table 1.

Total RNA isolation
Total RNA was extracted from biopsy samples using the Qiagen miRNeasy Mini Kit (Qiagen, Valencia, CA). RNA purity was checked using the NanoDrop 2000 (Thermo Fisher, MA), and RNA concentration was measured using the Qubit 3.0 Fluorometer (Life Technologies, CA). RNA integrity was assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies, CA).

Library preparation, RNA-Sequencing
The transcriptome library for sequencing was generated using the VAHTS Total RNA-Seq (H/M/R) Library Prep Kit for Illumina (Vazyme Biotech Co., Ltd, Nanjing, China) following the manufacturer's recommendations. The details of the library construction were as follows: first, ribosomal RNA was removed by target-specific probes, RNase H and DNA polymerase I. Following purification, the RNA was fragmented into small pieces using divalent cations at elevated temperature. The cleaved RNA fragments were copied into the first strand cDNA using reverse transcriptase and random primers, followed by secondstrand cDNA synthesis using DNA polymerase I, RNase H, and dNTPs (dUTP, dATP, dGTP, and dCTP). To these cDNA fragments were added a single 'A' base, and the adapter was subsequently ligated. To select the appropriate cDNA fragment size for sequencing, the library fragments were selected with VAHTS DNA Clean Beads. The UDG enzyme was used to digest the second strand of cDNA. PCR amplification was performed, and the desired products were purified. After cluster generation, the libraries were sequenced on an Illumina Hiseq X10 platform, and 150-bp paired-end reads were generated.

Quality control
Raw reads in FASTQ format were first processed using in-house Perl scripts. Clean reads were obtained by removing reads with adapters, reads in which unknown bases were more than 5% and low-quality reads (ie, for which the percentage of low quality bases was over 50% in a read; we defined a low-quality base to be the base whose sequencing quality was at or below 10). At the same time, Q20, Q30, and GC content was calculated for the clean reads. All downstream analyses were based on the clean reads.

Mapping to the reference genome
The reference genome and gene model annotation files were downloaded directly from the UCSC (hg38). The reference genome index was built using Bowtie (v2.1.0), [12] and paired-end clean reads were aligned to the reference genome using TopHat (v2.1.1). [13] 2.7. Raw data filtering The raw reads were filtered by removing reads containing adapter, ploy-N and low-quality reads for subsequent analysis. The steps of sequencing data filtering were as follows: removing reads containing adapter; removing reads containing poly-N (ie, unrecognized bases), reads with a ratio greater than 5%; and removing low-quality reads (eg, for which the number of bases with Q10 is more than 50% of the entire read). All downstream analyses were based on clean data of high quality.

circRNA prediction
circRNA prediction was performed with circRNA_Finder (https://github.com/bioxfu/circRNAFinder). [14] First, the clean reads were aligned to the reference genome using Bowtie2 (http:// bowtie-bio.sourceforge.net/bowtie2/manual.shtml). [12] Then, for unmapped reads, the junctions were selected using a back-splice algorithm and circRNAs were verified with the sub-module of circRNAFinder. Finally, circRNAs were annotated and abstracted with the circRNAAnno of circRNAFinder, which were considered the reference sequence for further analysis.

Differentially expressed circRNAs
The expression level of circRNAs was measured by "Transcripts Per Million" (TPM). [15] Differentially expressed circRNAs were analyzed using the DESeq package based on the negative binomial distribution test. The thresholds of differentially expressed genes were FDR0.05 and ølog2Fold changeø ≥1.  To confirm the RNA-Seq data, the expression of randomly selected circRNAs was tested in another 12 IH patients using qRT-PCRs with the SYBR green method on an Applied Biosystems ViiA Dx instrument (Life Technologies). Patient information is listed in Table 1. The sequences of specific PCR primer sets used for the qRT-PCR are listed in Table 2. The circRNA expression was normalized to the internal control gene, glyceraldehyde 3-phosphate dehydrogenase (GAPDH), using the 2 (-△△Ct) method. [16] 2.11. GO enrichment and pathway analysis of origin genes of circRNAs Using the statistical model, we performed gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Briefly, GO enrichment analysis of the origin genes of differentially expressed circRNAs was implemented with the GOseq R package, with gene length bias corrected. The resulting P-values were adjusted using Benjamini and Hochberg's approach for controlling the false discovery rate. GO terms with corrected P < .05 were considered significantly enriched among the differentially expressed genes. The top 10 GO terms are shown. We further used the KOBAS software program to test for the statistical enrichment of the origin genes of differentially expressed circRNAs among the KEGG pathways. The top 20 KEGG pathways are presented.

Statistical analysis
Data were analyzed using the SPSS 20.0 software package (SPSS, Chicago, IL) with an independent-sample t test between the 2 groups. All values were represented as the mean±standard deviation (SD) from at least 3 independent experiments. Statistical significance was defined as P < .05.

CircRNA expression profile detected by RNA-Seq in IH compared to matched normal skin
Skin samples were collected from 15 IH patients without previous treatment in the study. As shown in Table 1, the mean (SD) age of the IH patients was 7 months 6 days (2 months 15 days). To profile the differentially expressed circRNAs in IH, we performed RNA-Seq on 3 randomly selected IH samples and matched the normal skin. The length of circRNAs in all samples is shown in Figure 1 A. Most circRNAs were below 500 nucleotides. In all, 249 circRNAs were differentially expressed, including 124 upregulated and 125 downregulated circRNAs in IH compared to the matched normal skin control. Hierarchical clustering showed a list of the 249 differentially expressed circRNAs among the samples (Fig. 1 B). Table 3 shows part of the 249 differentially expressed circRNAs in the IH samples compared to the matched normal skin group (fold change ≥ 2, P .05).

Real-time quantitative PCR validation
To validate the RNA-Seq data, we randomly selected 6 differentially expressed circRNAs (bold in Table 3). Real-time quantitative PCR (qRT-PCR) analysis was performed on an additional 12 independent IH skin samples ( Table 1). The results revealed that similar upregulation or downregulation was observed in both RNA-Seq and qRT-PCR samples for the 6 circRNAs (Fig. 2, bold in Table 3). Therefore, our RNA-Seq data were reliable and stable. Among the 6 circRNAs, hsa_circRNA001885, and hsa_circRNA006612 expression in IH was 12.33-and 7.13-fold higher, respectively, than in matched normal skin.

GO and KEGG pathway analysis of origin genes of circRNAs in IH compared to matched normal skin
Recent advances have revealed that circRNAs can regulate the expression of parental genes at the transcription level. [18,19] Therefore, we analyze the origin genes of differentially expressed circRNAs through GO and KEGG pathway analysis. The GO results reveal that the origin genes of differentially expressed circRNAs are mostly involved in cellular component organization or biogenesis, binding, protein binding, and intracellular organelle parts (Fig. 3). The KEGG pathway analysis indicated that gap junction, focal adhesion, and adherens junction were implicated for the origin genes of differentially expressed circRNAs (Fig. 4).

Interaction between circRNA and miRNA
Accumulated evidence indicates that circRNAs can function as miRNA sponges. [20,21] The competitive endogenous RNAs (ceRNAs) contain shared miRNA response elements (MREs), such as circRNAs, messenger RNAs (mRNAs) and long noncoding RNAs (lncRNAs), and can compete for miRNA binding. [22] Therefore, we use miRanda to screen the MREs in the 6 circRNAs validated. The results displayed several miRNAs associated with specific circRNAs (Table 4). A total of 63 miRNAs (the highest amount) could potentially bind with hsa_circRNA000227 (Fig. 5), 15 miRNAs could bind with hsa_circRNA001885 and 36 miRNAs could bind with hsa_-circRNA006612.

Discussion
As a vascular neoplasm, IH is one of the most common tumors diagnosed in young children. [2] The pathogenesis of hemangioma has been widely studied, and several theories have been proposed, among which endothelial progenitor cell theory, Folkman Klagsbrun placental theory, angiogenesis theory, and hypoxia theory are the most accepted. [23] To date, circRNA profiles by microarray analysis in the IHs have been reported, [11] and 234 up-and 374 downregulated circRNAs were identified in IH by microarray. [11] In this study, based on the RNA-Seq technique, we found that 249 circRNAs are dysregulated in IH including 124 upregulated and 125 downregulated circRNAs.
CircRNAs have been reported abundant, conserved and stable in cytoplasm, [24] and originate from back splicing or exon skipping of linear RNA templates. [8] Numerous articles describe thousands of circRNAs throughout RNA-Seq and back-splicing junction discovery to quantify circRNAs. [24] A recent study proposed that the microarray is more efficient than RNA-Seq for circRNA profiling. [25] Our circRNA profile in IHs was not completely consistent with the microarray data. [11] According to the origin genes, we found that 47 origin genes of the 249 dysregulated circRNAs detected using RNA-Seq could be found in the circRNA microarray data (fold change ≥ 2, P < .5).
CircRNAs have recently emerged as novel star molecules that play crucial roles in the regulation of numerous biological or pathological processes. [26,27] A recent study revealed that exonintron circRNAs predominantly localize in the nucleus, interact with U1 snRNP and enhance the expression of their parental genes in cis. [19] Our study shows that the origin genes of the 249 deregulated circRNAs are related to binding, protein binding, intracellular organelle part, gap junction, focal adhesion and adherens junction, indicating that circRNAs may function in these biological processes, molecular functions, and signaling pathways. Further elucidating the underlying mechanism of the function of circRNAs in IH would be helpful in revealing the biological aetiology and could provide useful information for IH evaluation and treatments.
For the origin genes of the validated 6 circRNAs (Table 3), we found that TM7SF3 and GUCY1A2 also could be found in the microarray data lists. [11] Thus, hsa_circRNA002136 and hsa_circRNA001885 may be the most promising research Table 4 The interaction of circRNA and miRNAs was predicted using miRanda.
The functional interactions of circRNAs, miRNAs, and mRNAs could lead to a new explanation for the pathogenesis of diseases. Therefore, further elucidating the underlying mechanism of the function of miRNA, circRNAs, and mRNAs in IH would be helpful in revealing biological aetiology and potentially providing useful information for IH evaluation and treatments.