Genome Characteristics of Two Ranavirus Isolates from Mandarin Fish and Largemouth Bass

Ranaviruses are promiscuous pathogens that threaten lower vertebrates globally. In the present study, two ranaviruses (SCRaV and MSRaV) were isolated from two fishes of the order Perciformes: mandarin fish (Siniperca chuatsi) and largemouth bass (Micropterus salmoides). Both ranaviruses induced cytopathic effects in cultured cells from fish and amphibians and have the typical morphologic characteristics of ranaviruses. The complete genomes of the two ranaviruses were then sequenced and analyzed. The genomes of SCRaV and MSRaV are 99,405 and 99,171 bp long, respectively, and both contain 105 predicted open reading frames (ORFs). Eleven of the predicted proteins differ between SCRaV and MSRaV, of which only one (79L) shows a relatively large difference. A comparison of the six sequenced ranaviruses from the two fish species worldwide revealed that the sequence identities of six proteins (11R, 19R, 34L, 68L, 77L, and 103R) were related to the place where the virus was isolated. However, protein sequence identities between the two viruses and iridoviruses from other hosts were markedly lower, with more than half below 55%. Notably, 12 proteins of the two isolates had no homologs in viruses from other hosts. Phylogenetic analysis revealed that ranaviruses from the two fishes clustered in one clade. Further genome alignment showed five groups of genome arrangements of ranaviruses based on locally collinear blocks, in which the ranaviruses from the two fishes, including SCRaV and MSRaV, constitute the fifth group. These results provide new information on the ranaviruses infecting fishes of Perciformes and are also useful for further functional genomics research on this type of ranavirus.

It has been reported that aquaculture has become the fastest-growing agricultural production industry in the world, and a major contributor is China [16][17][18].
For genomic DNA sequencing, the insertion libraries were constructed with SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, Menlo Park, CA, USA) according to the manufacturer's instructions and sequenced using a PacBio Sequel II instrument (CCS; The Beijing Genomics Institute, Beijing, China).

Genome Annotation and Analysis
The DNA composition, structure, nucleotide, and amino acid sequences were analyzed with the DNASTAR program (Lasergene, Madison, WI, USA) as described previously [23]. The open reading frames (ORFs) were predicted using SnapGene software (version 6.1.1) and NCBI ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/, accessed on 12 December 2022). The following criteria were applied during ORF prediction: (1) the length was at least 120 bp, (2) the predicted ORF was not located in another larger ORF, and (3) overlapping ORFs should have homologs in other sequenced iridoviruses [6]. Homologous sequences among different viruses were compared using BLAST programs (blastn for DNA sequences and blastp for protein sequences). All ranavirus protein-coding sequences were collected from GenBank. Multiple sequence alignments were conducted with ClustalX 1.83, and sequence identities were calculated with the MegAlign program. For a detailed comparison of the ORFs between SCRaV, MSRaV, and other ranaviruses, nine ranavirus strains were selected, including the four previously isolated from mandarin fish and largemouth bass and five others representing different genomic types of ranaviruses.
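The first two ORF-prediction criteria can be illustrated with a small filter. This is a minimal sketch under stated assumptions, not the workflow used in the study: the `Orf` type, the `filter_orfs` function, the 1-based inclusive coordinates, and the reading of "located in another larger ORF" as full coordinate containment on either strand are all hypothetical choices. Criterion (3) depends on homology searches and is not modeled.

```python
from dataclasses import dataclass

MIN_ORF_LENGTH_BP = 120  # criterion (1) in the text


@dataclass(frozen=True)
class Orf:
    """A candidate ORF; coordinates are 1-based and inclusive (an assumption)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def is_nested(inner: Orf, outer: Orf) -> bool:
    """True if `inner` lies entirely inside a strictly longer `outer` ORF."""
    return (
        outer.length > inner.length
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


def filter_orfs(candidates: list[Orf]) -> list[Orf]:
    """Apply criteria (1) and (2); criterion (3) is not modeled here."""
    long_enough = [orf for orf in candidates if orf.length >= MIN_ORF_LENGTH_BP]
    return [
        orf for orf in long_enough
        if not any(is_nested(orf, other) for other in long_enough)
    ]


# Hypothetical candidates, not ORFs from SCRaV or MSRaV.
candidates = [
    Orf("a", 1, 300, "+"),
    Orf("b", 50, 200, "-"),   # nested inside "a"
    Orf("c", 400, 500, "+"),  # 101 bp, shorter than 120 bp
    Orf("d", 600, 800, "+"),
]
kept = [orf.name for orf in filter_orfs(candidates)]
print(kept)
```

In practice, overlapping candidates that survive this filter would still need the homology check of criterion (3) before being accepted.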
For phylogenetic analysis, the 26 iridovirus core proteins from SCRaV, MSRaV, and other completely sequenced iridoviruses were identified based on homology comparison, collected, and concatenated separately; note that Shrimp hemocyte iridescent virus and Cherax quadricarinatus iridovirus have only 24 core proteins. The MUSCLE program in MEGA software (version 11.0.11) was used to make the alignment, and a phylogenetic tree was constructed by the Neighbor-Joining method with default parameters. Multiple genome alignment, including all 6 isolates from mandarin fish and largemouth bass (SCRaV, MSRaV, mandarin fish ranavirus strain NH-1609, largemouth bass virus strain Alleghany, largemouth bass virus strain GDOU, and largemouth bass virus strain Pine), RGV, FV3, ADRV, CMTV, ATV, epizootic hematopoietic necrosis virus (EHNV), SGIV, and grouper iridovirus (GIV), was performed with the progressive Mauve plugin in Geneious software (version 2023.0.2) [24].
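The concatenation step can be sketched as follows. This is an illustration only: it assumes "concatenated separately" means one concatenated sequence per virus in a fixed core-gene order, and that viruses lacking some core proteins simply contribute fewer segments. The function name, protein labels, and toy sequences are hypothetical.

```python
def concatenate_core_proteins(core_order, proteins_by_virus):
    """Join each virus's core proteins in a fixed order, skipping absent ones."""
    concatenated = {}
    for virus, proteins in proteins_by_virus.items():
        present = [proteins[name] for name in core_order if name in proteins]
        concatenated[virus] = "".join(present)
    return concatenated


# Hypothetical three-protein "core set" with toy sequences.
core_order = ["MCP", "DNApol", "ATPase"]
proteins_by_virus = {
    "virus_with_all": {"MCP": "MSSV", "DNApol": "MKLT", "ATPase": "MEIQ"},
    "virus_missing_one": {"MCP": "MSSI", "ATPase": "MEVQ"},
}
concatenated = concatenate_core_proteins(core_order, proteins_by_virus)
print(concatenated)
```

The concatenated sequences would then be passed to an aligner and a tree-building method; those steps are not reproduced here.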

Virus Isolation and Identification
Tissue extracts from the diseased largemouth bass and mandarin fish both induced cytopathic effect (CPE) in several cultured cell lines, including SCSC, EPC, and GSTC. Infection of the cells with supernatants from the infected cells still caused typical CPE. A representative CPE in the three cell lines is shown in Figure 1. Infection with either virus induced the lysis or detachment of cells. In the fibroblast-like SCSC cells, the infected cells lysed or detached rapidly, and only about half of the cells remained attached to the culture surface at 24 hpi, forming a discrete distribution. At 48 hpi, most of the SCSC cells had lysed, and the remaining cells became round, indicating their death. For the epithelioid EPC and GSTC cells, a few plaques formed at 24 hpi, and the plaques enlarged with infection time due to the lysis and detachment of infected cells. The CPE in SCSC cells seemed more serious than in the other two cell lines. Infection of GSTC with ADRV, a previously identified ranavirus, was used as a control and showed CPE similar to that of the two viruses.
Ultrastructural observations were performed with SCRaV-infected SCSC cells and MSRaV-infected GSTC cells, respectively. As shown in Figure 2, serious cytoplasmic vacuolation was observed in SCRaV-infected SCSC cells, which caused difficulties in finding cellular organelles (Figure 2A). Cell shrinkage was observed in MSRaV-infected GSTC cells with a compacted and deformed nucleus (Figure 2B). Several regions that were full of mature or immature viral particles can be found in the cells (cytoplasm of GSTC). Intact virions in the ultrathin section are hexagonal or approximately circular, with a diameter of about 160 nm. Paracrystalline arrays that were formed by virion accumulation can be observed in a small number of cells (Figure 2C).

Architecture and General Features of the Two Virus Genomes
The complete genome sequences of the two viruses were determined. The genome of SCRaV consists of 99,405 bp with 105 potential ORFs, and the genome of MSRaV consists of 99,171 bp, also with 105 potential ORFs. Detailed information about the predicted ORFs and comparisons with their homologs in other ranaviruses, including the four other ranavirus isolates (MFRV, LMBV-G, LMBV-A, LMBV-P) from mandarin fish and largemouth bass worldwide, are shown in Table 1 and Table S2. The predicted proteins of both viruses (SCRaV and MSRaV) range from 49 to 1354 aa in length. The proteins of the two viruses share very high sequence identities. Most of them (94/105) are 100% identical to their homologs in the other virus. Ten proteins have sequence identities ranging from 92.5% to 99.9% with their homologs. Only one protein (79L), a predicted neurofilament triplet H1-like protein, has a sequence identity lower than 90% between the two viruses.
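The per-protein comparison above amounts to computing a percent identity for each homolog pair and binning the results. The sketch below is illustrative and does not reproduce the MegAlign calculation: it assumes identity is the number of matching columns divided by the number of aligned columns in which at least one sequence has a residue, and the function names, bin labels, and toy inputs are hypothetical. Only the counts 94, 10, and 1 come from the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where at least one sequence has a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column with gaps in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def summarize(identities: dict) -> dict:
    """Count proteins that are identical, 90% to below 100%, or below 90%."""
    summary = {"identical": 0, "90_to_below_100": 0, "below_90": 0}
    for value in identities.values():
        if value >= 100.0:
            summary["identical"] += 1
        elif value >= 90.0:
            summary["90_to_below_100"] += 1
        else:
            summary["below_90"] += 1
    return summary


# Toy alignment: one gap column out of five, so 4 of 5 columns match.
toy_identity = percent_identity("MKV-L", "MKVAL")

# Toy identity values for three hypothetical proteins.
toy_summary = summarize({"p1": 100.0, "p2": 99.5, "p3": 85.0})

# Counts reported in the text: 94 identical, 10 between 92.5% and 99.9%, 1 below 90%.
reported_total = 94 + 10 + 1
print(toy_identity, toy_summary, reported_total)
```

The three reported categories add up to the 105 predicted proteins, which is consistent with the eleven non-identical proteins mentioned for the two viruses.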
The genomes and encoded proteins of SCRaV and MSRaV were then compared with those of the four previously sequenced ranaviruses from mandarin fish and largemouth bass worldwide. The genome sequence identity between SCRaV and MSRaV was 99.92%, and identities between SCRaV and the other four isolates ranged from 98.68% to 99.88% (Table 1). Most of the proteins encoded by the six ranaviruses isolated from the two fishes were highly conserved, with identities of more than 96% among homologs. The four isolates from China were more similar to each other in genome sequence and encoded proteins than to the two from the USA (Table S2), especially in six proteins (11R, 19R, 34L, 68L, 77L, and 103R), of which 11R and 68L contain domains of LPXTG-anchored collagen-like adhesin and 77L contains a domain of a DNA polymerase III subunit.
However, the sequence identity between the two viruses and ranaviruses from other hosts is not high. Although the sequence identity of the major capsid protein (MCP) between the two viruses and other ranaviruses can exceed 83%, more than half of the proteins of the two viruses share a sequence identity of less than 55% with their homologs in ranaviruses from other hosts. Several proteins have sequence identities lower than 30% (the lowest was 22.3%) with their homologs, and 12 proteins have no homologs in iridoviruses from other hosts.
The schematic diagrams of the genome organization of SCRaV and MSRaV are shown in Figure 3. The two viruses have the same genome organization and gene composition. Combined with function analysis, the predicted genes were clustered as genes encoding structural proteins, nucleotide metabolism-related genes, DNA replication- and transcription-related genes, virus-host interaction-related genes, and unknown genes. Detailed information about the genes is given below. Because of the high sequence identity between the two viruses, the gene and protein descriptions are mainly based on SCRaV.

Figure 3. Schematic diagrams of the genome organization of SCRaV and MSRaV. The scale is in kilobase pairs. Arrows indicate the size, location, and orientation of the ORFs. The iridovirus core genes and SCRaV/MSRaV-specific genes are shown in black and blue, respectively. There are 12 SCRaV/MSRaV-specific genes that have no homologs in viruses infecting other animals, including a TNFR-like protein encoded by 17L.

Structural Proteins
SCRaV 104R was predicted to encode the major capsid protein (MCP), which contains 463 aa. Among the viral proteins, the MCP of SCRaV and MSRaV has the highest sequence identity with its homologs in ranaviruses infecting other animals. For example, it has a sequence identity of 84% with ADRV MCP and 83.6% with RGV MCP. SCRaV 1L and 16R encode two myristylated membrane proteins corresponding to ADRV 2L/RGV 2L and ADRV 58L/RGV 53R, respectively, which belong to the core genes of iridoviruses and have been identified as envelope proteins of ranaviruses [25,26]. SCRaV 1L and 16R have sequence identities ranging from 70.5% to 75.2% and from 55.4% to 63.7%, respectively, with their homologs in the last five ranaviruses in Table 1. Several other predicted proteins contain transmembrane domains (SCRaV 5R/8R/9R/56L/86R/98R) and may include envelope proteins.

Nucleotide Metabolism Related Genes
There are 4 predicted proteins that could be involved in nucleotide metabolism. SCRaV 71L encodes a protein of 141 aa, which contains domains of the deoxyuridine 5'-triphosphate nucleotidohydrolase (dUTPase) family. SCRaV 69L (387 aa) and 94L (562 aa) are two homologs of ribonucleotide reductase (RNR) subunits, which could catalyze the synthesis of the deoxyribonucleotides used as precursors for DNA synthesis. SCRaV 99L (189 aa) contains the domain of deoxyribonucleoside kinase (dNK) or thymidine kinase (TK), which is a key enzyme in the salvage of deoxyribonucleosides. All four proteins have homologs in other ranaviruses.

DNA Replication- and Transcription-Related Genes
For the proteins that could be involved in DNA strand replication, SCRaV 66R encodes a homolog of DNA polymerase, which has a length of 1004 aa and contains a 3'-5' exonuclease domain and a B-family DNA polymerase domain. SCRaV 37R encodes a protein of 955 aa, which contains a domain of primase and the D5_N family. SCRaV 12L (261 aa) is a homolog of the p31K protein of ranaviruses, which has been identified as the viral single-stranded DNA binding (SSB) protein [27]. SCRaV 100L encodes a protein of 242 aa, whose homologs in other ranaviruses have been considered homologs of proliferating cell nuclear antigen (PCNA) [28]. In addition, SCRaV 77L (284 aa) contains a domain of DNA polymerase III subunits gamma/tau. SCRaV 82L (566 aa) contains a DNA polymerase III subunit gamma/tau domain and an SAP domain. SCRaV 75L (356 aa) encodes a putative RAD2 family DNA repair protein, which could be involved in ranavirus DNA recombination and repair [29]. SCRaV 31L (173 aa) contains a domain of Holliday junction resolvases.
For the proteins that could be involved in genome transcription, there are 3 putative subunits of DNA-directed RNA polymerase (RNAP) II. SCRaV 45R encodes a protein of 1354 aa, which is the putative largest subunit of RNAP (Rpb1). SCRaV 74R has a length of 1094 aa and could be the

SCRaV-and MSRaV-Specific Genes
Sequence analysis also revealed 12 putative genes (4R, 15L, 17L, 25R, 34L, 36R, 43L, 46R, 53R, 54L, 81L, and 93L) whose encoded proteins have no homologs in viruses of other hosts, and these could be considered specific genes of SCRaV and MSRaV (or SCRaV/MSRaV-like viruses) (Figure 3 and Table 1). It should be noted that 16 genes of SCRaV/MSRaV, including these 12, have no homologs in the compared viruses (ADRV, RGV, FV3, EHNV, and SGIV) in Table 1, but 4 of them (19R, 20R, 44L, and 49L) have homologs in other ranaviruses that are not listed in the table. Most of the specific genes encode hypothetical proteins in which no conserved domains or motifs can be found. Only two proteins contain known domains. The 4R protein contains an N-terminal immunoglobulin (Ig)-like domain, and the 17L protein contains a domain of tumor necrosis factor receptor (TNFR), which could be involved in virus-host interactions.
In addition, ORF prediction and analysis showed that SCRaV/MSRaV encodes five putative proteins (11R, 32L, 33L, 67L, and 68L) that contain domains of LPXTG-anchored collagen-like adhesins. The five predicted proteins are 245, 240, 257, 243, and 288 aa long, respectively. Sequence alignment and motif search showed that they all contain variable-length regions full of Gly-X-X repeats, which is a characteristic of LPXTG-anchored collagen-like adhesins. Although homologs of the five proteins can be found in some ranaviruses, the sequence identity between the five proteins and their homologs is low, and most of the homologs do not contain the LPXTG-anchored collagen-like adhesin domain. Therefore, the five proteins can also be considered SCRaV/MSRaV-specific proteins.
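A run of Gly-X-X repeats of the kind described above can be located with a simple pattern search. This is a minimal sketch, not the motif search used in the study: the function name, the minimum of five consecutive triplets, and the toy sequences are assumptions chosen for illustration.

```python
import re


def find_gxx_runs(protein: str, min_repeats: int = 5):
    """Return (start, end, n_repeats) for each run of consecutive G-X-X triplets.

    Positions are 0-based and the end is exclusive. The `min_repeats`
    threshold is an arbitrary illustrative choice.
    """
    pattern = re.compile(r"(?:G[A-Z]{2}){%d,}" % min_repeats)
    runs = []
    for match in pattern.finditer(protein.upper()):
        n_repeats = (match.end() - match.start()) // 3
        runs.append((match.start(), match.end(), n_repeats))
    return runs


# Toy sequences, not taken from SCRaV or MSRaV proteins.
collagen_like = "MSK" + "GPPGEKGDA" * 2 + "LLT"  # six G-X-X triplets in a row
no_repeats = "MKVLAAGSTGLL"                      # only two triplets in a row

runs_found = find_gxx_runs(collagen_like)
runs_absent = find_gxx_runs(no_repeats)
print(runs_found, runs_absent)
```

A search like this only flags candidate regions; assigning a collagen-like adhesin domain would still rely on a conserved-domain database search.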

Phylogenetic Analysis
A phylogenetic tree was constructed based on the core proteins of 56 iridoviruses, including 35 ranavirus isolates (Figure 4). All the ranavirus isolates clustered in one large branch, which could be divided into smaller branches, including FV3/RGV-like, CMTV/ADRV-like, EHNV/ATV-like, largemouth bass virus (LMBV)/SCRaV-like, and SGIV-like viruses. The two viruses, MSRaV and SCRaV, clustered with the other largemouth bass virus and mandarin fish ranavirus isolates, which indicated that they belong to the LMBV-like viruses.

Genome Comparison
We tried to perform a dot plot analysis to determine the degree of genome similarity between the two viruses and other ranaviruses, but no obvious collinearity could be found, possibly because of the low sequence identity between the genomes of the two viruses and those of other ranaviruses. A genome-wide alignment was then carried out and revealed the genomic arrangement of the aligned ranaviruses (Figure 5). The genomes of the 14 ranaviruses can be divided into more than 20 locally collinear blocks (LCBs), which are indicated by different colors in the figure. Based on the arrangement of LCBs, 5 types of genomic arrangement can be observed among the aligned ranavirus genomes. All the ranaviruses isolated from mandarin fish and largemouth bass, including SCRaV and MSRaV, have the same genomic arrangement and belong to the first type, named SCRaV/MSRaV/LMBV-like or Santee-Cooper ranavirus (SCRV). RGV and FV3 share the second type of genomic arrangement, and ADRV and CMTV possess the third type.
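Grouping genomes by LCB arrangement can be sketched by encoding each genome as an ordered list of signed block identifiers, where a negative sign marks an inverted block. The example below is a hypothetical illustration, not the Mauve output or the arrangements in Figure 5: the virus names and block orders are invented, and treating a genome and its reverse complement as the same arrangement is an assumption.

```python
from collections import defaultdict


def reverse_complement(arrangement):
    """Read the same genome from the other strand: reverse order, flip signs."""
    return tuple(-block for block in reversed(arrangement))


def canonical(arrangement):
    """Pick one representative of an arrangement and its reverse complement."""
    arrangement = tuple(arrangement)
    return min(arrangement, reverse_complement(arrangement))


def group_by_arrangement(arrangements):
    """Group genome names that share the same canonical LCB arrangement."""
    groups = defaultdict(list)
    for name, arrangement in arrangements.items():
        groups[canonical(arrangement)].append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical genomes with three LCBs each.
toy_arrangements = {
    "virusA": (1, 2, 3),
    "virusB": (-3, -2, -1),  # same arrangement as virusA, opposite strand
    "virusC": (1, -2, 3),    # block 2 inverted relative to virusA
}
arrangement_groups = group_by_arrangement(toy_arrangements)
print(arrangement_groups)
```

With real data, each distinct canonical arrangement would correspond to one genomic arrangement type.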

Discussion
Fish ranaviruses, such as those infecting fishes of the order Perciformes, are receiving more and more attention with the development of the aquaculture industry. However, a detailed analysis of the genome architecture of ranaviruses from Perciformes fish and a comparison with other ranaviruses was lacking. In the present study, based on two newly isolated ranaviruses from mandarin fish and largemouth bass, the genomic characteristics of this type of ranavirus were analyzed.
Sequence comparison showed high sequence identity between SCRaV and MSRaV, which indicated that the two viruses should belong to one species. Among the eleven proteins that differ between the two viruses, 79L (a predicted neurofilament triplet H1-like protein) had an identity lower than 90%, which hinted that these proteins, especially 79L, could determine the distinct characteristics of the two viruses. We also observed that the proteins of the SCRaV/MSRaV-like viruses isolated in China shared higher sequence identity with each other than with those of the isolates from the USA, and vice versa, especially for six proteins, including a DNA polymerase subunit, which indicated that these proteins may be associated with the regional divergence and replication efficiency of the viruses.
Sequence divergence between this type of ranavirus and other ranaviruses (e.g., FV3/RGV-like, ATV/EHNV-like, CMTV/ADRV-like, and SGIV-like) is relatively high, which indicated that the ranaviruses isolated from mandarin fish and largemouth bass have their own characteristics. To date, there are few reports on the gene functions of this type of ranavirus. The MCP of SCRaV and MSRaV has the highest sequence identity with its homologs in other ranaviruses, which indicated the high conservation of MCPs among ranaviruses. In contrast, several proteins with low homology to those of other ranaviruses were found. The viral proteins that could be involved in virus-host interactions all belonged to the low-homology proteins, which suggested adaptation to a specific host.
Genome-wide recombination, deletion, insertion, and inversion have been reported in ranaviruses [6,10,14,30]. Our genome alignment showed sequence inversions and insertions among different types of ranaviruses. The inversions and insertions may be an adaptation of viruses to different hosts or environments, which can be used as a basis to classify different types of ranaviruses and would also help in the identification or prediction of emerging and re-emerging ranaviruses. Taken together, the results from sequence identity comparison, genome-wide alignment, and phylogenetic analysis show that SCRaV and MSRaV, or SCRV-like viruses, constitute a unique type/group in ranaviruses.
NCLDVs usually encode their own proteins to conduct DNA replication and transcription. Our previous study with ADRV and RGV revealed the replication and transcription machinery of ranaviruses [27]. For DNA replication, the viral DNA polymerase, helicase/primase, PCNA, and SSB should be key components of the replisome. These four proteins were identified among the proteins encoded by SCRaV and MSRaV (SCRaV 66R, 37R, 100L, and 12L), which indicated that the core components of the replisome of SCRaV and MSRaV are similar to those of ranaviruses infecting amphibians. Interestingly, domain/motif search showed that two proteins of SCRaV (77L and 82L) contain domains of DNA polymerase III subunits. DNA polymerase III is the main enzyme in bacterial DNA replication [31]. Whether the two proteins participate in ranavirus DNA replication needs to be investigated in the future. For transcription, there are 3 predicted RNAP subunits (45R, 74R, and 28R) and 3 possible transcription factors (22L, 40L, and 7R) among the SCRaV-encoded proteins, but this number is lower than that needed for a complete RNAP in eukaryotes [32,33]. Host factors are therefore likely involved in the genome transcription of SCRaV-like viruses, as occurs in ADRV and RGV [27].
To facilitate infection, viruses usually encode multiple proteins to regulate cellular processes [34]. Immune responses are important host strategies to resist virus infection. It has been reported that two proteins of ranaviruses, the homologs of RNase III and eIF2α, have the ability to regulate the activation of host interferon responses [35][36][37][38]. Both proteins were identified among the SCRaV-encoded proteins (23R and 26R), although the eIF2α homolog of SCRaV has a sequence identity of only about 30% with the corresponding homologs of other ranaviruses. Other cellular processes include inflammation and apoptosis. SCRaV encodes homologs of LITAF, TNFR, and apoptosis regulator (61L, 72L, 41L, and 70L), which could function in the regulation of cell death and inflammation and promote virus infection, as reported in other ranaviruses [39][40][41]. Interestingly, SCRaV-like ranavirus was found to encode a homolog of insulin-like growth factor (SCRaV 95L). Among ranaviruses, its homolog was found only in SGIV, in which it could modulate cell proliferation and apoptosis [42]. In vitro synthesized viral insulin-like peptides have activities in mammalian cells [43]. However, its function in SCRaV-like viruses in vivo needs to be investigated in the future.
It should be noted that there are 5 predicted proteins containing characteristics of LPXTG-anchored collagen-like adhesins, which are mainly found in Enterococci and function as virulence factors [44]. Whether they also contribute to viral virulence in SCRaV and MSRaV infection remains unknown.
In conclusion, the present study provides a complete genome analysis of SCRaV/MSRaV/LMBV-like ranaviruses, especially their genome architecture and variations compared with other ranaviruses. These results provide new information for understanding the genetic evolution of ranaviruses from fish species and other animals and may also facilitate the early warning of fish ranavirus epidemics.

Data Availability Statement:
The complete genome sequences of SCRaV and MSRaV have been submitted to NCBI GenBank. The accession number of SCRaV is OQ267588, and that of MSRaV is OQ267587. Data are contained within the article or supplementary material. The raw data are available upon reasonable request from the corresponding author.