Genome-wide analysis of circular RNAs in goat skin fibroblast cells in response to Orf virus infection

Orf, caused by Orf virus (ORFV), is a globally distributed zoonotic disease responsible for serious economic losses in the agricultural sector. However, the mechanism underlying ORFV infection remains largely unknown. Circular RNAs (circRNAs), a novel type of endogenous non-coding RNAs, play important roles in various pathological processes but their involvement in ORFV infection and host response is unclear. In the current study, whole transcriptome sequencing and small RNA sequencing were performed in ORFV-infected goat skin fibroblast cells and uninfected cells. A total of 151 circRNAs, 341 messenger RNAs (mRNAs), and 56 microRNAs (miRNAs) were differently expressed following ORFV infection. Four circRNAs: circRNA1001, circRNA1684, circRNA3127 and circRNA7880 were validated by qRT-PCR and Sanger sequencing. Gene ontology (GO) analysis indicated that host genes of differently expressed circRNAs were significantly enriched in regulation of inflammatory response, epithelial structure maintenance, positive regulation of cell migration, positive regulation of ubiquitin-protein transferase activity, regulation of ion transmembrane transport, etc. The constructed circRNA-miRNA-mRNA network suggested that circRNAs may function as miRNA sponges indirectly regulating gene expression following ORFV infection. Our study presented the first comprehensive profiles of circRNAs in response to ORFV infection, thus providing new clues for the mechanisms of interactions between ORFV and the host.


INTRODUCTION
Circular RNAs (circRNAs), a new member of non-coding RNAs, are generally produced by back-splicing of pre-mRNA (Barrett & Salzman, 2016;Barrett, Wang & Salzman, 2015). Because of their unique circular structure, circRNAs can resist the activity of RNA exonuclease digestion. Thus, they are more stable and have longer half-lives than their linear transcripts in vivo (Jeck et al., 2013;Suzuki et al., 2006). Moreover, circRNAs are highly abundant and are conserved among a variety of species (Barrett & Salzman, 2016; multiplicity of infection of one. Following incubation for 1 h at 37 • C, the virus suspension was removed and cells were cultured for a further 6 h in standard medium.

RNA extraction, library construction, and sequencing
Total RNA from both ORFV infected GSF samples (OV) and uninfected GSF samples were isolated using an Ambion mir Vana miRNA Isolation Kit (Thermo Fisher Scientific, Waltham, MA, USA). The quality of total RNA were analyzed by Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA), and the concentration of the total RNA was quantified using a NanoDrop 2000 (Thermo Fisher Scientific, Lafayette, CO, USA). In this study, circRNA libraries were constructed as below: Ribosomal RNA was removed from 5 µg aliquots of total RNA using an Epicentre Ribo-Zero Gold Kit (Illumina, San Diego, CA, USA). Then, the RNA fractions were fragmented and were reverse transcribed using an mRNA-Seq Sample Preparation Kit (Illumina, San Diego, CA, USA). The cDNA libraries were then paired-end sequenced using an Illumina HiSeq 4000 platform (Lc-bio, Hangzhou, China). The raw and processed data have been deposited into the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE121725. (2) Small RNA libraries: About 1 µg total RNA of each sample was used for cDNA library construction with the TruSeq Small RNA Preparation Kit (Illumina, San Diego, CA, USA) following the manufacturer's protocol. Then, the cDNA libraries were single-end 50 bp (SE50) sequenced with an Illumina HiSeq 2500 platform (Lc-bio, Hangzhou, China). The raw and processed data have been deposited into the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE121726. Firstly, Cutadapt (Martin, 2011) was utilized to remove reads containing undetermined bases, adaptors, and low quality bases. Then sequence quality was verified using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Bowtie2 (Langmead &Salzberg, 2012) andTophat2 (Kim et al., 2013) were used to map reads to the Capra hircus (goat) reference genome (RefSeq assembly accession: GCF_001704415.1). The matched reads of each sample were assembled using StringTie (Pertea et al., 2015). StringTie and Ballgown (Frazee et al., 2015) were utilized to evaluate the expression levels of all transcripts by Fragments per kilobase per million reads (FPKM). The dysregulated mRNAs were selected with |log 2 (fold change) | ≥ 1 and p ≤ 0.05 by R package Ballgown (Frazee et al., 2015).

Identification and differential expression analysis of circRNAs and mRNAs, and miRNAs
Any unmapped reads were individually mapped to the goat reference genome by TopHat-Fusion (Kim & Salzberg, 2011). Then, reads mapped to the goat genome using TopHat-Fusion were analyzed by CIRCexplorer to identify candidate circRNAs (Zhang et al., 2016;Zhang et al., 2014b). The criteria were as follows: (1) GU/AG must occur at both ends of splice sites; (2) less than two mismatches; (3) more than one back-spliced junction read in at least one sample of GSF or OV group; (4) two splice sites are no more than 100 kb apart on the genome. The expression of circRNAs was calculated by the number of reads spanning back-splicing junction and FPKM was used to normalize the expression level of circRNAs. CircRNAs with |log 2 (fold change) | ≥ 1 and p ≤ 0.05 were regarded as differentially expressed by R package-edgeR (Robinson, Mccarthy & Smyth, 2014).
For miRNA analysis, ACGT101-miR (LC Sciences, Houston, Texas, USA) was used to acquire clean reads. Then, unique sequences containing 18 to 26 nucleotides were mapped to miRBase 21.0 by BLAST search to identify known and novel miRNAs. The expression level of miRNAs was normalized based on the read counts to tags per million counts (TPM). The significance standard was p ≤ 0.05.

Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis
GO analysis was used to determine significantly enriched GO terms (p ≤ 0.05) by hypergeometric test (Beißbarth & Speed, 2004;Gene Ontology Consortium, 2004). KEGG pathway analysis was used to explore significantly enriched pathways (p ≤ 0.05) by hypergeometric test (Kanehisa et al., 2017;Kanehisa et al., 2012).
N , Total number of circRNA-hosting genes; n, The number of circRNA-hosting genes with differential expression; M , The number of circRNA-hosting genes annotated to the GO term; m, The number of circRNA-hosting genes with differential expression annotated to the GO term.

Bioinformatics analysis and ceRNA network construction
The circRNA-miRNA and miRNA-mRNA interactions were predicted using TargetScan 7.0 and miRanda software. TargetScan 7.0 predicts the targets of miRNAs based on seed region homologies (Agarwal et al., 2015) while MiRanda is mainly based on the combination of free energy generated by miRNAs binding to their target genes (Betel et al., 2008). The lower the free energy, the stronger the binding. TargetScan score percentiles ≥ 50, Miranda max free energy values <−10 and Miranda score >140 were defined as the cutoff points for targets predicti. To further explore the functional role of circRNAs, a ceRNA network was constructed using Cytoscape 3.6.0 software (Shannon et al., 2003).

Validation of miRNAs
Six differentially expressed novel miRNAs (PC-3p-8215_174, PC-5p-406_14064, PC-5p-2253_1210, PC-5p-5127_361, PC-3p-10316_124 and PC-3p-4306_468) with mean TPM >30 in GSF samples or OV samples were selected for qPCR validation. First, the total RNA was reverse-transcribed using a miRNA 1st Strand cDNA Synthesis kit (Vazyme, Nanjing, China) with specific stem-loop primers (Table S1). Then, qRT-PCR assays were performed using miRNA Universal SYBR qPCR Master Mix (Vazyme, Nanjing, China) with specific forward primers (F) and the universal reverse primer (Table S1) with an ABI 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). U6 snRNA was used as an internal control for normalization of the expression level of these miRNAs. All experiments were conducted independently three times.

Validation of circRNAs by qRT-PCR and Sanger sequencing
Differentially expressed circRNAs with mean FPKM ≥ 10 (relatively high expression ) in OV samples or GSF samples as well as possessing more than one back-spliced read in at least two replicates of OV samples or GSF samples were selected for confirmation by qRT-PCR. Thus, six down-regulated circRNAs: circRNA998, circRNA1000, circRNA1001, circRNA1684, circRNA3127, circRNA4287 and three up-regulated circRNAs: circRNA5112, circRNA7880, circRNA8565 were obtained. First, total RNA was reversely transcribed to cDNA using a Revert Aid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific). Next, qPCR assays were performed with divergent primers using the ABI 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) with 2× SYBR qPCR Mix (Aidlab, Beijing, China). 2 − Ct method was used to calculate the relative expression level of circRNAs with goat glyceraldehyde-3-phosphate dehydrogenase (GAPDH ) serving as an internal control. All experiments were conducted in triplicate. The PCR products from cDNA samples were ligated into pMD-19T (Takara, Dalian, China) for Sanger sequencing to determine the back-splicing junctions.
Moreover, genomic DNA(gDNA) was extracted from GSF cells using a Blood/Cell/Tissue Genome DNA Extraction Kit (Tiangen, Beijing, China). Both cDNA and gDNA were used as templates for PCR amplification using specific convergent and divergent primers ( Table 1). The PCR products were examined using 1.5% agarose gel.

Statistical analysis
Statistical significance analysis was performed by Student's t -test, and p ≤ 0.05 was considered statistically significant.

Properties of circRNAs in ORFV-infected and uninfected GSF cells
We first performed circRNA sequencing, using the ribosomal RNA-depleted method, of three uninfected GSF samples (GSF samples) and three ORFV-infected samples (OV samples) on the Illumina HiSeq 4000 platform. We acquired an average of 86 million and 93 million raw reads for the GSF and OV groups, respectively. Clean reads, which accounted for >98.5% of the raw reads, were obtained after the removal of the low quality raw reads. The Q30 of each sample was ∼93%. Most reads (∼85%) were linearly mapped to the goat reference genome. Among the remaining unmapped reads, approximately 1% of reads from each sample were identified as back-spliced junction reads. Furthermore, we aligned the reads unmapped to goat genome from each sample to Orf virus reference genome (Orf virus NA1/11 strain under GenBank number KF234407.1). The results indicated that a total of 125,619 reads, 165,878 reads and 157,779 reads were mapped to Orf virus genome in OV-1, OV-2, and OV-3 sample, respectively. There were nearly zero reads aligned to Orf virus genome in uninfected GSF samples (Table 2).
Finally, 9,979 and 10,844 circRNAs with more than one back-spliced read existed in at least one sample of GSF or OV group were identified, respectively (Table S2). Among these circRNAs, 4,649 were shared between the groups, while 5,330 and 6,195 circRNAs were uniquely expressed in the GSF and OV samples, respectively (Fig. 1A). Table 1 List of convergent and divergent primers used for circRNAs qRT-PCR validation.
Additionally, an average of 40,882 transcripts were detected in GSF samples while 40,778 transcripts were identified in three OV samples, respectively. Following miRNA sequencing, 695 mature miRNAs in GSF samples and 674 miRNAs in OV samples were acquired, respectively. The mapping overview of miRNA sequencing is provided in Table S3.

GO and KEGG analysis for host genes of differentially expressed circRNAs
Due to the fact that the biological functions of circRNAs may be associated with their corresponding parental transcripts, GO and KEGG analyses for host genes of dysregulated circRNAs were conducted in the study. The top 20 GO terms significantly enriched (p ≤ 0.05) in molecular function, cellular component, and biological process are presented in Fig. 3A. The top six enriched GO terms in cellular component were proteinaceous extracellular matrix, intercalated disc, voltage-gated sodium channel complex, plasma membrane, T-tubule and extracellular region. The top six enriched GO terms in molecular function were growth factor binding, voltage-gated sodium channel activity, neuropilin binding, semaphorin receptor binding, chemorepellent activity and integrin binding. Moreover, regulation of inflammatory response, epithelial structure maintenance, negative regulation of insulin secretion, positive regulation of cell migration, positive regulation of ubiquitin-protein transferase activity, regulation of ion transmembrane transport were significantly enriched in the biological process subgroup (Fig. 3A, Table S7). Next, we conducted KEGG enrichment analysis for circRNA-hosting genes. Nine pathways: Neuroactive ligand-receptor interaction, Tight junction, Rheumatoid arthritis, Transcriptional misregulation in cancers, Focal adhesion, Vascular smooth muscle contraction, Mismatch repair, Other types of O-glycan biosynthesis and Adherens junction were significantly enriched (p ≤ 0.05). Furthermore, pathways such as Platelet activation, Fc gamma R-mediated phagocytosis, Endocytosis, Cytokine-cytokine receptor interaction, and TGF-beta signaling pathway were also enriched (p > 0.05). Fc gamma R-mediated phagocytosis and cytokine-cytokine receptor interaction were involved in host immune response to pathogen infection (Fig. 3B, Table S8).

Validation of miRNAs and circRNAs
The qRT-PCR results showed all six novel miRNAs could be specifically amplified. The expression levels of PC-3p-4306_468, PC-5p-5127_361 and PC-3p-10316_124 were consistent with the results of small RNA sequencing while the remaining three miRNAs did not show significant differential expression (Fig. S1). The relative expression levels of nine circRNAs were validated by qRT-PCR using specific divergent primers. The results indicated that circRNA1001, circRNA1684 and circRN3127 were down-regulated while circRNA7880 was up-regulated, which were consistent with the RNA-seq data (Fig. 5A). As expected, the results of the agarose gel electrophoresis demonstrated that divergent primers could only amplify circRNAs from cDNA samples, while the convergent primers amplified products from both gDNA and cDNA. (Fig. 5B). Back-splicing junctions of circRNAs from cDNA samples were further validated by Sanger sequencing (Fig. 5C).

DISCUSSION
Transcriptional profiling is a powerful tool for researching the host-virus interactions during infection. Recently, researchers performed RNA-seq of sheep oral mucosa in response to Orf virus infection (Jia et al., 2017). They found that multiple differentially expressed genes were enriched in GO terms such as immune response, inflammatory response and apoptosis, indicating that the host could defend virus invasion through immune response and induce cell apoptosis to block viral proliferation. Another study reported alterations of transcriptional profiles in human foreskin fibroblast cells following ORFV infection (Chen et al., 2017). A variety of genes involved in immune response, apoptosis, cell cycle, etc were differentially expressed. These studies provided new insights into the mechanisms of infection by orf virus. Accumulating evidence indicated that circRNAs, a new type of non-coding RNAs, played key roles in diverse diseases (Li et al., 2015;Tian et al., 2017). Li et al. (2015) reported that circular RNA ITCH had inhibitory effect on ESCC by suppressing the Wnt/β-catenin pathway. Tian et al. revealed that hsa_circ_0043256 could inhibit cell proliferation and induce apoptosis by acting as a miR-1252 sponge. Although the mechanisms underlying ORFV-host interactions were thoroughly investigated, whether circRNAs were involved in the interactions between ORFV and its host yet remained unknown. Hence, in the current study, we conducted deep circRNA sequencing and small RNA sequencing to identify differentially expressed circRNAs, miRNAs and mRNAs and expected to find the potential circRNA-miRNA-mRNA network existing in interactions between ORFV and its host. After circRNA sequencing, R package-psych was used to perform PCA (Principle Component Analysis) statistics investigating the correlations coefficient among the three virus-infected samples and the three mock samples. According to the values of each sample in the first principal component (PC1) and the second principal component (PC2), a two-dimensional coordinate map is made (Fig. S2). The results indicated that the three mock samples were close to each other and they were clearly distinct from the three virusinfected samples. There existed a good similarity among samples in the same group and the obvious distinction between the virus-infected samples and the mock samples. Based on the existing criteria for identifying candidate circRNAs, approximately ten thousand circRNAs were detected in uninfected and ORFV-infected GSF cells. Previous studies have also demonstrated that circRNAs are abundant in mammalian cells (Guo et al., 2014;Salzman et al., 2012). However, if we only include the circRNAs detected in at least two of the three replicates in uninfected or ORFV-infected GSF samples, the number of candidate circRNAs decreased to a large extent. There were only 2,935 circRNAs detected in uninfected GSF samples while 2,359 circRNAs were detected in OV samples (Tables S9, S10). This would significantly reduce false positives of candidate circRNAs identified in GSF samples or OV samples. Furthermore, we found that >98% of circRNAs were ecirRNAs, which differs from Hu's finding in which 86% of circRNAs were ciRNAs while only 13% were ecircRNAs and 1% were exon-intron circRNAs derived from the pre-ovulatory ovarian follicles of goats (Tao et al., 2017). These discrepancies further demonstrate that the expression patterns of circRNAs are tissue specific and cell specific.
ORFV infection influenced the circRNA expression profile of the host cells. Compared with the GSF samples, 151 differentially-expressed circRNAs derived from 90 parental genes were identified. Our findings showed that a single gene locus could produce one, two, or several circRNAs through alternative splicing (Table S4). Diverse circRNA isoforms derived from the same cognate linear gene were differentially expressed in ORFV-infected GSF cells, indicating that their parental genes played significant roles in regulating the temporal expression of circRNAs. Fang et al. (Fang et al., 2018) recently reported that circ-Ccnb1 derived from its parental gene CCNB1, a regulator of cell mitosis, had an inhibitory effect on breast cancer cell proliferation and survival. The authors suggested that the biological functions of circRNAs might be closely associated with its parental gene. Next, we performed GO and KEGG analyses for the cognate linear isoforms of the differentially expressed circRNAs to explore the biological functions of circRNAs in response to ORFV infection. In the biological process oncology, regulation of inflammatory response, negative regulation of insulin secretion, positive regulation of cell migration, positive regulation of ubiquitin-protein transferase activity, regulation of ion transmembrane transport were significantly enriched with p ≤ 0.05 (Fig. 3A, Table S7). CircRNA12709, circRNA14794 and circRNA14795, enriched in GO term ''regulation of inflammatory response'', were different isoforms of their parental gene TNIP (TNFAIP3-interacting protein (1) which were upregulated in OV samples compared to GSF samples during ORFV infection. TNFAIP3 (tumor necrosis factor α-induced protein (3) also called A20 encoded a ubiquitin-editing protein which was an inhibitor of NF-κB. TNIP1 was shown to play a role in NF-κB inhibition by interacting with A20 (Aya et al., 2010). Pathways such as Tight junction, Rheumatoid arthritis, Transcriptional misregulation in cancers, Focal adhesion, Vascular smooth muscle contraction, Mismatch repair and other types of O-glycan biosynthesis were significantly enriched (p ≤ 0.05). These findings indicated that differential expression of circRNAs may be involved in many biological processes and cellular response to ORFV infection, providing us some valuable clues about the functions of circRNAs. Given that circRNAs could function as miRNA sponges regulating gene expression. We constructed a ceRNA network to explore the potential functions of differentially expressed circRNAs during ORFV infection. For example, circRNA302, circRNA1684, circRNA2565, circRNA4319, circRNA7192, circRNA8828 and circRNA10352 were predicted to sponge chi-miR-92a-5p. Also, circRNA131, circRNA302, circRNA9787, circRNA11661 potentially bound chi-miR-122. These results revealed that a potential ceRNA regulatory network existed in the host-ORFV interaction, although the exact regulatory mechanism requires further investigation.

CONCLUSION
In conclusion, we identified 9,979 and 10,844 circRNAs in GSF cells before and after ORFV infection. A total of 151 circRNAs (59 circRNAs up-regulated and 92 circRNAs downregulated) 341 mRNAs, and 56 miRNAs were differentially expressed following ORFV infection. Four circRNAs: circRNA1001, circRNA1684, circRNA3127 and circRNA7880 were validated by qRT-PCR and Sanger sequencing. Host genes of differentially expressed circRNAs were significantly enriched in many biological processes including regulation of inflammatory response, positive regulation of cell migration, and regulation of ion transmembrane transport. A potential circRNA-miRNA-mRNA regulatory network exists during ORFV infection. Our study is the first to present the expression profiles of circRNAs in GSF cells in response to ORFV infection and may provide new insights into the mechanism underlying ORFV pathogenesis.