Alternative Splicing of RNF180 Genes in Different Species Based on Comparative Genomics Analysis

Alternative splicing (AS) influences gene regulation, cell differentiation, and tissue development and is involved in many human diseases. The ring finger protein 180 (RNF180) gene, a tumor suppressor gene, has four known transcripts and seven predicted transcripts. However, the role of alternatively spliced products in the function and regulation of the RNF180 gene remains unknown. We used a comparative genomics approach to investigate RNF180 AS in different species. Conserved coding sequences, alternative splicing expression profiles, and intron sequences were compared, and evolutionary selection pressure analyses of exons were performed. We found that the RNF180 zinc finger structure, which is related to its major ubiquitination functions, was highly conserved and that exon 5 was absent in many species. Comparison of human exon 5 with the corresponding intron region in these species revealed high sequence similarity. In the exon selection pressure analysis, exon 6, which corresponds to the zinc finger domain, was under purifying selection, while exons 7 and 8 were under positive selection. Finally, the four human transcripts showed multiple alternatively spliced forms of RNF180. These results suggest that complex alternative splicing of the RNF180 gene occurs in multiple species. Some splice variants had evolutionarily conserved regions and deletions of functional regions.


Introduction
Alternative splicing (AS) is an important basic regulatory mechanism in eukaryotes, in which spliceosomes recognize different splice sites on a pre-mRNA. The pre-mRNA of a single gene is thus processed into different exon combinations, resulting in different mature transcripts and proteins. Recent studies have shown that AS has an important influence on gene regulation, cell differentiation, and tissue development and is closely involved in many human diseases. Although AS patterns differ between species, splicing patterns are largely uniform in conserved regions. By comparing the gene sequences of different species, the role of conserved sequences and AS sequences in the evolutionary process may be elucidated.
The RNF180 (ring finger protein 180) tumor suppressor gene is a member of the RNF/Rines gene family and is located at position 5q12.3 [1]. RNF180 expression has been detected in the endoplasmic reticulum membrane of cultured mammalian cells. The RNF180 product (Rines) is an E3 ubiquitin ligase with protein ubiquitination activity. Rines contains a RING finger domain (431-472), a basic coiled-coil domain (351-400), a novel conserved domain DSPRC (83-132), and a C-terminal hydrophobic region that is predicted to be a transmembrane domain (564-586) [2]. The ubiquitin-proteasome system (UPS) is a highly specific enzyme cascade involving the E1 ubiquitin-activating enzyme, E2 ubiquitin-conjugating enzyme, and E3 ubiquitin-protein ligase and plays a vital role in cell proliferation, differentiation, apoptosis, and other cellular processes [3,4]. The RNF180-encoded protein is expressed in many tissues, with the highest expression reported in brain and the lowest expression in lung and thymus tissues. Methylation of the RNF180 DNA promoter is also involved in cell proliferation, cell, the cell cycle, tumor invasion, and tumorigenicity [5]. There are four known RNF180 transcripts (Fig. 1) and seven predicted RNF180 transcripts in the NCBI database (www.ncbi.nlm.nih.gov). Transcripts 2, 3, and 4 each lack different exons relative to transcript 1. Comparison of RNF180 gene sequences from different species allows us to analyze AS evolution. Moreover, clearly defining the conservation of the deleted exons across species during evolution can provide insight into gene function, tissue distribution, and differential expression in normal and tumor tissues. These comparisons may also provide clues for elucidating the molecular mechanisms of RNF180-related tumor development and progression.
Comparative genomics is a new approach for studying AS. Comparing exon sequences, AS types, the relative expression of variants, and potential splice variants in two or more genomes may lead to the identification of splicing regulatory elements and uncover the evolutionary conservation of spliced exons. Alternative splicing information and conserved RNA sequences can be integrated by comparative genomics to reveal RNA motifs that might have functions [6]. In this study, we used genomic data, including multiple sequence alignment and evolutionary selection pressure analyses, to explore the evolution of RNF180 AS in different species. Using this approach, we sought to clarify the function of RNF180 AS. Here, we describe the relationship between RNF180 splicing, gene regulation, and the resulting functional changes.

Materials And Methods
Comparison of the amino acid sequences of RNF180 genes from different species.
Ten evolutionarily representative species were selected from the 73 vertebrate species included in the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/downloads.html), and the amino acid sequences of their RNF180 proteins were downloaded. Interspecies evolution was analyzed using ClustalX [7] and MEGA7 [8] software.
The evolutionary selection pressure (dN/dS) of RNF180 exons.
The exon sequences of 23 vertebrate RNF180 genes were downloaded from the UCSC genomic database (http://genome.ucsc.edu) (Table.1). Each exon sequence was re-aligned by codon using MEGA7 software and manually corrected based on the corresponding encoded amino acid sequence. Using an online server (http://www.datamonkey.org/), we performed exon selection pressure analysis by SLAC (single-likelihood ancestor counting). The ratio of the non-synonymous substitution rate (dN) to the synonymous substitution rate (dS), ω = dN/dS, has been widely adopted as a measure of selective pressure [9]. Here, dS is defined as the average number of synonymous substitutions per synonymous site and dN as the average number of non-synonymous substitutions per non-synonymous site. An excess of non-synonymous substitutions (dN/dS > 1) can be interpreted as positive selection, suggesting that replacement substitutions increase fitness. A paucity of replacement changes (dN/dS < 1) indicates that negative (purifying) selection is working to remove such substitutions from the gene pool. dN/dS = 1 indicates neutral selection.
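The interpretation rule for ω can be illustrated with a minimal Python sketch. The function name `classify_omega` and the tolerance `tol` are hypothetical choices made for this illustration; the sketch applies only the threshold rule stated above and does not reproduce the statistical test performed by SLAC on the Datamonkey server. The three ω values are the exon-level values reported in the Discussion of this paper.

```python
def classify_omega(omega: float, tol: float = 1e-6) -> str:
    """Label a dN/dS ratio using the threshold rule from the Methods.

    This is a plain threshold comparison (an assumption for illustration),
    not a significance test such as the one SLAC performs.
    """
    if omega < 0:
        raise ValueError("omega = dN/dS cannot be negative")
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "positive" if omega > 1.0 else "purifying"


# Exon-level omega values reported in the Discussion of this paper.
reported = {"exon 6": 0.160536, "exon 7": 1.49825, "exon 8": 1.23845}
labels = {exon: classify_omega(w) for exon, w in reported.items()}

for exon, label in labels.items():
    print(f"{exon}: omega = {reported[exon]} -> {label}")
```

A point estimate of ω above or below 1 does not by itself establish selection; a site-level or exon-level significance test is still needed before drawing conclusions.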
Analysis of RNF180 AS products in different species.
The NCBI (http://www.ncbi.nlm.nih.gov) and Ensembl (http://asia.ensembl.org/index.html) genome databases were searched for RNF180 AS variants, and the splice sites and splice modes of these variants were compared and analyzed.
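One check that is useful when comparing splice variants is whether a variant's coding sequence contains an in-frame stop codon before its final codon. The following Python sketch shows one way to perform such a scan. The function names and the toy sequences are hypothetical and are not RNF180 sequences; the sketch assumes that the input starts at the first base of the reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds: str, frame: int = 0):
    """Return (index, codon) of the first in-frame stop codon, or None."""
    seq = cds.upper()
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None


def has_premature_stop(cds: str) -> bool:
    """True if an in-frame stop codon occurs before the last complete codon."""
    hit = first_in_frame_stop(cds)
    if hit is None:
        return False
    position, _ = hit
    last_complete_codon_end = len(cds) - (len(cds) % 3)
    return position + 3 < last_complete_codon_end


# Toy sequences for illustration only (not RNF180 sequences).
toy_truncated = "ATGGCCTGAGGGTAA"   # ATG GCC TGA GGG TAA
toy_full_length = "ATGGCCGGGTAA"    # ATG GCC GGG TAA

print(first_in_frame_stop(toy_truncated))
print(has_premature_stop(toy_truncated), has_premature_stop(toy_full_length))
```

In practice the reading frame would be taken from the annotated CDS start of the reference transcript, since a frame shift caused by exon skipping changes which triplets are read as codons.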

RNF180 intron sequence comparison in different species.
Based on the RNF180 gene annotation in the UCSC genome database (http://genome.ucsc.edu), RNF180 genomic sequences from various species were downloaded from the Ensembl genome database. Exons were compared, and the corresponding intron sequences were obtained from each species and compared using ClustalX (Table.1).
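Sequence conservation between two aligned sequences is commonly summarized as percent identity. The Python sketch below shows one simple way to compute it from a pairwise alignment. The function name `percent_identity` and the toy sequences are hypothetical, and the convention of ignoring columns that contain a gap is an assumption made for this illustration; ClustalX and other tools may use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences for illustration only (not RNF180 sequences).
toy_exon = "ATGCCGTA-A"
toy_intron = "ATGACGTAGA"
identity = percent_identity(toy_exon, toy_intron)
print(f"identity = {identity:.1f}%")
```

Because the result depends on how gaps are treated, the gap convention should be stated whenever identity percentages from different tools are compared.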

Results
Comparison of RNF180 amino acid sequences in different species.
Ten evolutionarily representative organisms, from lower vertebrates to higher mammals, were used for RNF180 amino acid sequence comparisons. These species were Zebrafish, Xenopus tropicalis, Chicken, Zebra finch, Cow, Rat, Mouse, Orangutan, Chimpanzee, and Human. Our results showed that RNF180 sequence conservation gradually increased from lower vertebrates to primates, and the RNF180 sequence similarity between Human and Chimpanzee was 99.0% (Table.2). With the exception of Cow, the RNF180 sequence alignment results revealed an evolutionary relationship consistent with that described by the evolutionary tree (Fig. 2). The RNF180 C-terminal amino acid sequence was highly conserved, as were those of exons 3, 6, and 7. Exon 1 and exon 8 were not translated in most of the examined species. Exon 2 (1-45) was missing in Zebra finch and was partially deleted in Chicken and X. tropicalis. Exon 4 (78-398) was not well conserved in Zebrafish, and Chicken, X. tropicalis, and Zebrafish were devoid of exon 5 (399-411) (Supplementary Fig. S1). Further analysis using the COBALT online multiple sequence alignment analyzer revealed considerable variability at amino acid sites 151, 175, 221, 254, 361, 400, and 531.
Analysis of alternative splicing of the RNF180 gene in different species.
We downloaded the standard reference RNF180 genes of a range of species, from lower vertebrates to higher mammals, using the NCBI database and the Ensembl Gene Browser. From these data, we predicted AS isoforms in Human, Chimpanzee, Mouse, Rat, Cow, Chicken, X. tropicalis, and Zebra finch. The Orangutan, Guinea pig, and Zebrafish sequences did not match our query transcripts (Table.4). The Human RNF180 genome annotation was provided by the UCSC Genome Browser. The longest transcript, transcript 1, had eight exons. In transcript 2, a premature TGA termination codon was introduced in exon 5. In transcript 3, exon 4 was skipped, so that exon 3 was joined directly to exon 5, and a premature TGA termination codon was introduced in exon 5. A premature TAA termination codon was also observed in exon 7 of transcript 4. The CDS sequences of the human RNF180 gene were used as a reference. Comparison with RNF180 sequences of other species revealed that the main splicing patterns in these species were consistent with those of human. Exons 7 and 8 terminated prematurely, exon 4 skipping was observed only in Human, and exon 5 was missing in multiple species (Supplementary Fig. S2 and Table.5). Additionally, RNF180 splicing in Chicken, X. tropicalis, and Zebra finch was complicated and requires further study.

tropicalis, Zebrafish, Zebrafish, and other species) using ClustalX. We found that Human and Gorilla had the most conserved intron sequences, and that exon 5 was absent in Chimpanzee, Cow, Cat, Dog, Chicken, X. tropicalis, Zebrafish, Armadillo, Elephant, and Guinea pig. Therefore, we sought to determine the evolutionary origin of the human exon 5 sequence by comparing it with the corresponding intron region sequences in other species. Our results showed that the conservation between Human and Gorilla, Cow, Cat, Armadillo, Elephant, and Dog was 100%, 96%, 94%, 91%, 88%, and 86%, respectively (Table.6).
In silico analysis revealed that most of the frequent mutations were predicted to be deleterious (Supplementary Table S1).

Discussion
AS is a tightly regulated process whereby a single gene can encode multiple distinct transcripts, and it provides an essential means of expanding the proteome [10]. Comparative genomics is an indispensable approach for studying AS [11]. More recently, investigations of new transcription products have relied on high-throughput sequencing and EST sequencing technologies [12]. Here, we used comparative genomics to explore RNF180 and found that the functional domain (zinc finger structure) was highly conserved. We also found that exon 5 was absent in many species. Comparison with the corresponding intron in other species revealed high similarity to Human exon 5. Exon selection pressure analysis revealed that exon 6, which corresponds to the zinc finger domain, was subjected to purifying selection. This analysis also showed that exons 7 and 8 were subjected to positive selection. Meanwhile, different splicing patterns were observed in the four human RNF180 transcripts, which resulted in different mature transcripts, with transcripts 2 and 3 being shorter. In conclusion, this work helps to further understand the structure of the human RNF180 gene and the effect of AS on its function.
In this study, we compared RNF180 amino acid sequences from orthologous species and discovered that the zinc finger and the C-terminus of the basic coiled-coil domain were highly conserved. Furthermore, exon 5 was absent in Chimpanzee, Cow, Cat, Dog, Chicken, X. tropicalis, Zebrafish, Armadillo, Elephant, and Guinea pig, while exons 1 and 2 were deleted in lower vertebrates, suggesting that AS is closely related to RNF180 gene evolution.
AS is an important mechanism for accelerating genome evolution [13,14] and plays a prominent role as a source of functional innovation [15]. Synonymous substitutions have no impact on protein composition, but non-synonymous substitutions may directly affect protein function. Xing's exon analysis of AS in human and mouse transcripts revealed that alternatively spliced exons had higher dN/dS values than did constitutive exons [16-19]. These findings suggest that AS plays a role in promoting gene evolution [20]. Lu et al. predicted AS events under strong selection for 345 human genes and 262 mouse genes by combining multiple genomic alignments with RNA selection pressure analysis. Here, dN/dS analysis showed that RNF180 exons 7 and 8 had high dN/dS values (1.49825 and 1.23845, respectively), suggesting that they are undergoing adaptive selection. It is not clear which domain exon 7 belongs to. Exon 8 is located in the transmembrane domain and may be related to the subcellular localization of the RNF180 protein or may act as a signal peptide. Exons 7 and 8 are undergoing positive selection, which may accelerate RNF180 evolution and allow the acquisition of new functions. The dN/dS value of exon 6 was 0.160536 (< 0.25), indicating purifying selection. The deduced amino acid sequence of exon 6 is located in the RNF180 ring finger domain (431-472). The RING domain is a conserved domain rich in Cys residues, which binds two zinc ions through highly conserved Cys and His residues. It mediates ubiquitin transfer to substrates (including itself) and plays a major role in ubiquitination as a scaffold that brings E2-Ub and substrate proteins into close proximity [21,22]. These data suggest that splicing of exons 7 and 8 may represent functional AS.
Exons and the AS patterns of orthologous genes in different species are in constant evolution [23,24], mainly involving the deletion and retention of exons. Differences in exon splicing patterns and gene expression levels have been reported between species [25]. Potential splice variants can be identified by comparisons between two or more genomic sequences [26]. In this study, we identified the RNF180 splice isoforms and compared them with those of the human RNF180 reference sequence. Our results showed that all species examined have corresponding RNF180 transcripts. Longitudinal comparison of AS in different species revealed that RNF180 exons 3, 6, and 7 are highly conserved and that exon 5 is absent in various species. Moreover, we found that termination codons are frequently introduced in exons 7 and 8, exon 2 is skipped in lower vertebrates, and human exon 4 is skipped. Transverse comparisons of transcripts from multiple species revealed specific splicing patterns in different species. Exon splicing in Chicken, Zebra finch, and X. tropicalis (1-5) is complex and requires further study. These results suggest that RNF180 AS is much more complex than previously known. Eleven human RNF180 transcripts, including predicted transcripts, are listed in the NCBI database. These specific splice variants are generally expressed at low abundance and may not have a major effect on the physiological function of the gene. However, the accumulation of specific splice variants may have a subtle effect on gene function and may indirectly affect the interspecific characteristics of the gene. A comparison of the four human transcripts showed that transcripts 2 and 3 do not contain the conserved exons 6 and 7. This would result in deletion of the ring finger structure and may affect RNF180 E3 ubiquitin ligase function, leading to protein degradation.
E3 ubiquitin ligase is involved in specific binding to ubiquitin, which is a key step in identifying the proteasome substrate and can be investigated by studying different AS forms. COBALT online multiple sequence alignment analysis of RNF180 amino acid sequences from 10 species showed that amino acid sites 400 and 531 were located at splice boundaries. Many exon-intron boundaries contain auxiliary cis-acting elements for splice site recognition, and amino acid changes in these regions may affect splice site recognition.
Species evolution is both a simple and a highly complex process. The information carried by the genome continues to change through selection and elimination, leading to the gain of new features. The evolution of AS introns plays an important role in the gain of gene function [27,28]. Comparisons of RNF180 gene sequences from different species revealed that exon 1 is located in the 5' untranslated region (5'-UTR) and is absent in lower vertebrates. This exon emerged through evolution but is not translated. Approximately 30% of young cassette exons appear in the 5'-UTR. A new exon may be stored in the non-coding region to avoid affecting translation and may be incorporated into the open reading frame as its function gradually emerges [29,30]. RNF180 exon 5 is deleted in many species, especially in lower organisms, and comparison of exon 5 with the corresponding introns in species lacking exon 5 revealed that exon 5 sequence conservation in Gorilla, Cow, Cat, Armadillo, Elephant, and Dog reached 100%, 96%, 94%, 91%, 88%, and 86%, respectively. These data suggest that exon 5 may be derived from the original intron sequences. The Smart and Uniprot databases suggest that exon 5 is an unknown region, and it does not match other domains in the NCBI blast tool. Therefore, the effect of exon 5 on RNF180 gene function remains unclear and requires further study. The functional evolution of RNF180 is affected by AS, and new exons are introduced through AS. Frequently, these exons are derived from the original intron sequences, and these novel exons may be involved in the translation of protein sequences, thereby increasing the function of the original protein. To understand RNF180 gene structure and the effect of AS on its function, we used bioinformatics methods including multiple sequence alignment and selection pressure analysis to explore RNF180 AS in different species.
In summary, our findings showed that complex RNF180 AS occurred in different species, including species-specific AS. Well-designed future investigations are required to elucidate the function and regulatory mechanisms of AS in RNF180 and thus to explore its clinical utility.

Data availability
All the datasets and software used in this work are publicly available.

Other species reference sequence of RNF180 gene.

Figure 5
Results of clustering analysis of exon 5.

Supplementary Files
Supplementary files associated with this preprint: Supplementaryinformation.pdf