Viruses of Rhodophyta: lack of cultures and genomic resources pose a threat to the growing red algal aquaculture industry

ABSTRACT There is a growing global demand for algal-derived products as they offer alternatives that can help to mitigate climate change, support coastal biodiversity and reach food sovereignty. Red algae (Rhodophyta) have been cultivated for hundreds of years and currently support an economically important industry worldwide. However, pathogen outbreaks pose a major potential threat to this growing industry, in some cases already causing more than $1 million loss annually, as in Pyropia (Bangiaceae) farms. Some of the most destructive algal pathogens, although poorly understood, are viruses. They are highly diverse and abundant, exerting strong pressure on the life cycle of their hosts. Knowledge of red algal viruses has developed at a much slower pace than for their green algal counterparts, even though it was in a red algal species, Sirodotia tenuissima (Batrachospermaceae), that the first observation of viruses in eukaryotic algae was made. Furthermore, it was only in 2016 that the first Rhodophyta viruses were isolated, and to date only three have sequenced genomes, all RNA viruses, isolated from the hosts Delisea pulchra, Chondrus crispus and Pyropia suborbiculata. Given that viruses are prevalent and constitute major threats to all groups of life and that genomic resources are still lacking, they could pose a serious menace to ongoing aquaculture endeavours. Therefore, here we report a thorough and comprehensive (though not exhaustive) review of studies on Rhodophyta viruses, suggesting a set of propositions to advance knowledge in the field and to encourage more active focussed investigation. This will benefit not only red algal aquaculture but also shed light on the associated viral diversity, evolution and viral impacts on the life history of their red algal hosts.


Introduction
Rhodophyta, the red algae, is a highly diverse group of photosynthetic eukaryotes that comprises unicellular, foliose, cylindrical and crustose species. More than 7,000 species of Rhodophyta have been described (Guiry & Guiry, 2022), but only a few are used commercially, for example in polysaccharide extraction or for direct consumption. Several metabolites from red macroalgae have antibiotic, antioxidant, and anti-inflammatory activities (Cardozo et al., 2007; Dos Santos Amorim et al., 2012; Gressler et al., 2010). The study of these algae is also critical for understanding eukaryote evolution since secondary endosymbiosis events probably occurred via engulfment of a unicellular rhodophyte (Yoon, Hackett, Ciniglia, Pinto, & Bhattacharya, 2004). Also, due to adaptation to extreme environments, these organisms may have undergone intense genome reduction followed by genomic expansion through horizontal gene transfer and the activity of transposable elements, as their compact genomes display only 5,000 to 10,000 genes (Bhattacharya et al., 2018; Brawley et al., 2017; Qiu, Yoon, & Bhattacharya, 2016). However, red algal genomics is still in its infancy, with as few as 16 genomes published up to 2021, some with incomplete and/or unavailable annotated sequences. There is thus a real need for more genomes of this group, not only to gain a complete and deeper view of their evolution (Bhattacharya et al., 2015) but also to understand their metabolic potential, which could be exploited for novel materials for the food and drug industries.
Pathogen outbreaks pose a major potential threat to this rising and promising industry, in some cases already causing large economic losses, as in Pyropia (Bangiaceae) farms, where Green-spot disease (GSD) caused a 10% loss of total sales in 2013 (Kim, Klochkova, Lee, & Im, 2016; Kim et al., 2014). Due to the rapid expansion of these farms, new diseases are reported each year (Kim et al., 2014), mostly in Asian countries, where many types of mariculture are also affected by other diseases (Im et al., 2019).
In the environment, viruses are among the most destructive algal pathogens and, at the same time, among the least known. They constitute the most abundant biological entities on the planet and are considered major evolutionary drivers of microbial life (Suttle, 2007). More than 40 algal viruses have been isolated so far, from more than 60 species of algae in planktonic and nonplanktonic lineages (Coy, Gann, Pound, Short, & Wilhelm, 2018; Mirza et al., 2015), though for many the specific host is unknown, or they lack the complete genomic information needed to define their phylogenetic placement and evolutionary history, as well as their specific metabolism. While most are lytic viruses with double-stranded DNA (dsDNA) genomes, representing giant and large viruses, some representatives of RNA viruses were recently described (Charon, Marcelino, Wetherbee, Verbruggen, & Holmes, 2020). Despite this progress in algal virology, it is still essential to increase the number of algal viral cultures and, in general, to create an aquatic virus collection that is readily accessible to the scientific community (Nissimov, Campbell, Probert, & Wilson, 2020). Furthermore, the sampling, cultivation and isolation of algal viruses, as well as genomic resources, remain biased towards green and brown algal hosts.
In Rhodophyta, it appears that viral discoveries were mostly incidental, made with methods ranging from electron microscopy to genomics, and it is surprising that a group so diverse, ancient and economically important as the red algae has generated so few studies regarding pathogen or virus interactions. We believe that one explanation is that most of the work performed in this group, although pioneering, was carried out during a pre-genomic era and scattered across journals that were not virus or protist focused. These studies mostly described micrograph observations of viral-like particles (VLPs), that is, structures resembling viral capsids or putative viral inclusions but without biochemical or genetic evidence of their viral nature. The lack of studies of these viruses could be attributed to reasons such as (1) presence of VLPs in certain life stages only, (2) low concentration and/or low number of VLPs per cell and (3) low numbers of infected individuals in culture collections. Another possibility is that some of these viruses possess life cycles that differ markedly from those of other known algal viruses, most of which are lytic, destroying the host cells, and are therefore easier to observe. Rhodophyta viruses could have latent or lysogenic life cycle strategies, becoming silent infections and coexisting with the host through generations without destroying its cells. Finally, these viruses could contain sequences divergent from those of sequenced algal viruses, so that they are not retrieved by searches based on primary sequence similarity alone.
Currently, much of the evidence for viruses reported for this group is in the form of endogenous viral elements (EVEs) identified in the genomes and transcriptomes of red algal hosts, and also observations of VLPs (Table 1). More cytological investigations and the detailed biochemical structure of these viruses are critical needs for red algal aquaculture. It is suggested that with the increase in red algal cultivation, more viruses and other parasites will come to light (Kim et al., 2014). Viruses could cause intense damage to hatcheries, where several different strains are maintained, so understanding these viruses is of great importance in maintaining seaweed biodiversity and food security. Thus, a more directed research programme to survey for red algal viral associations is highly desirable and needs urgent implementation. As abiotic factors can also contribute to a higher infestation by pathogens, anthropogenic global warming might also increase viral diseases in red seaweed in the near future (Ward et al., 2020).

A timeline of discoveries of Rhodophyta viruses
Although the first reports of a non-bacterial lysogenic agent in eukaryotic algae putatively suggested as a virus date back to 1966 (Tikhonenko & Zavarzina, 1966), the first clear microscopic evidence linking a host cell to putative viruses came from a red alga, Sirodotia tenuissima, in 1971 (Lee, 1971) (Figure 1a). In this freshwater species from the family Batrachospermaceae, class Florideophyceae, VLPs ranging from 50 to 60 nm in size were observed in the host cytoplasm, exhibiting a polygonal shape and organized in crystalline arrays. Ultrastructural observations on other green and brown algal species (Pickett-Heaps, 1972; Toth & Wilce, 2008), together with abnormal features of these algal cells, reinforced the suggestion of the viral nature of such VLPs. In 1973, Chapman & Lang (1973) observed cytoplasmic and nuclear inclusions in Porphyridium purpureum (Porphyridiophyceae), tentatively named "concentrosomes". Investigations on a different strain also found small (40 nm) circular and polygonal particles, some organized in arrays, although without proof of infectivity (Figure 1b).
In 1976, gall structures were observed in Gracilaria verrucosa (Florideophyceae) (Tripodi & Beth, 1976), and were associated with "caterpillar-like" and fusiform bodies organized in rows. The hypothesis that these structures were related to viruses was discussed, suggesting that infective viruses could result in gall formation in algae, as observed in other species when infected by some bacteria (McBride, Kugrens, & West, 1974). Tumour-like and gall structures are well studied in plants and are generally associated with pathogen infection by insects and viruses as well as bacteria. These structures are characterized by localized abnormal tissue growth (Goecke et al., 2012). When caused by insects they can be as highly organized and complex as a normal plant organ, although the precise mechanisms behind their formation are not well understood, much less when caused by viruses (Schultz, Edger, Body, & Appel, 2019). In macroalgae, galls are morphologically characterized by host cell hyperplasia and hypertrophy producing an abnormal callus-like unorganized cell proliferation (Apt, 1988). In fact, these structures were first studied in detail in a red algal genus, Prionitis, family Halymeniaceae (McBride et al., 1974), class Florideophyceae, and as early as in 1970, viral particles were conjectured to be the cause of the observed wartlike protuberances (Chiang, 1970).
During the 1980s, despite a growing body of literature showing diverse VLPs and their putative effects in several other eukaryotic algal groups, to the best of our knowledge there were no VLP reports for any Rhodophyta species, apart from a brief mention of unpublished observations of VLPs associated with tumour-like growth in Gracilaria epihippisora (Apt, 1988), although these were not suggested as the direct causative agent of the tumours. These gall structures (Figure 1d) were composed of cells 20-40 µm in diameter, pigmented, with convoluted and irregular morphology (Apt & Gibor, 1991). In Acrochaetium (Audouinella) saviana, family Acrochaetiaceae, class Florideophyceae, a different kind of VLP inclusion was reported (Pueschel, 1995), unlike previous observations in red algae. These were long, moderately electron-dense rods of ~30 nm, clustered within dilated endoplasmic reticulum cisternae, but there was no proof of infectivity for this or for any other red algal VLP described by then.
In the ultrastructural description of post-fertilization development of Cryptopleura ruprechtiana (family Delesseriaceae), ~3 µm crystalline inclusions of VLPs, together with darkly staining material, were observed outside the nucleus in the cytoplasm of auxiliary cells (Delivopoulos, 2003), which in red algae initiate the generation of the carposporophyte in their triphasic life history (Papenfuss, 1957). Francki, Milne, & Hatta (1985) suggested that VLPs observed in their study showed some resemblance to plant viruses, in this case Lettuce necrotic yellows virus (LNYV). However, it was unclear whether there was any link between VLPs and developmental stages in this species.
After these microscopical observations, there was a decadal gap until the next report of red algal VLPs, an intriguing study by West and collaborators (West et al., 2013). These authors described several galls in cultured isolates of the filamentous red algae Bostrychia spp. (Rhodomelaceae), suggesting the possibility of viruses being the causative agents, since VLPs were observed in these tissues. They hypothesized that a virus may be latent in some Bostrychia spp., since they experimentally observed that gall formation was induced by low temperature. Furthermore, since a higher level of gall formation was observed on males and bisexual thalli, they suggested that the frequent release of spermatia from spermatangia on male branches would make these open surface areas more prone to attachment by parasites such as viruses.
Hosts when infected by latent viruses do not show any visible disease symptoms, the viruses remaining "silent" in the host cell when copies of the viral genome persist as an episome in the host cytoplasm or become integrated into the host genome. However, these viruses can be activated when triggered by environmental stresses, such as nutrient depletion, high UV electromagnetic radiation or abrupt changes in temperature (Lawrence, Wilson, Davy, Davy, & Lindell, 2014;Roossinck, 2005;Speck & Ganem, 2010;Takahashi, Fukuhara, Kitazawa, & Kormelink, 2019). This leads to a disease phenotype of the host or the lysis and destruction of the cell in unicellular organisms, and the "silent" strategy can be exploited by both DNA and RNA viruses, observed in bacterial, plant, animal and algal viruses. One of the most dramatic and profound consequences of viral infections, generally but not only when latent, is the integration of complete or partial viral gene sequences into the host genome through horizontal gene transfer events (HGTs). When there is a complete viral genome, these sequences are classified as a "provirus", or when near complete or partial genes, they are generally denoted as "endogenous viral elements" (EVEs).
The endogenization event is believed to have resulted mostly from accidental integration mediated by functions encoded by the viral genome or by the host, such as DNA repair or retrotransposition (Feschotte & Gilbert, 2012; Katzourakis et al., 2010). Once these viral genes or genomes become integrated into chromosomes, they can be inherited with host alleles. They may be eliminated from the host gene pool in a small number of generations, but can also increase in frequency and reach fixation (Katzourakis et al., 2010), especially if the EVEs provide a fitness gain for the host. Expression of EVEs may be beneficial to their recipient host, for example, by causing persistence/latency or conferring resistance to infective exogenous viruses (Takahashi et al., 2019). However, the biological roles of EVEs are not entirely clear or understood in the majority of cases, especially in algal genomes, where giant virus EVEs were recently found populating several Chlorophyte green algal genomes, in some cases amounting to 10% of the open reading frames (ORFs) of the recipient genomes, impacting genome composition and potentially the evolution of these green algae (Moniruzzaman, Weinheimer, Martinez-Gutierrez, & Aylward, 2020).
In fact, the very first genomic evidence for viruses in Rhodophyta came to light in the form of EVEs (Wang et al., 2014), where in total 39 EVE copies were found in genomes and transcriptomes of 16 red algal species, such as in the genome of the extremophile red alga Cyanidioschyzon merolae (6 EVEs), and in transcriptomes of Pyropia yezoensis (8 EVEs), Gracilaria blodgettii (4 EVEs), Grateloupia chiangii (3 EVEs), Chondrus crispus (2 EVEs) and several others (full list in Table 1). The viral taxonomy at the genus level of these EVEs was Chlorovirus (44 EVEs), Phaeovirus (23 EVEs), and Coccolithovirus (1 EVE), from the Phycodnaviridae family, which are generally found infecting green and brown algal hosts. Although these individual numbers are low, these findings could be an underestimation of the true scale of viral integration events, given the limited number of sequenced algal genomes and transcriptomes at the time, especially for red algae. Ancient integration events may also go undetected: once an exogenous fragment is inserted into a novel recipient genome and is inherited, sequence evolution tends to erase the signatures of its original donor, so that it becomes much more similar to the recipient genome, a process known as amelioration, which was first described in bacterial genomes (Lawrence & Ochman, 1997; Ravenhall, Škunca, Lassalle, Dessimoz, & Wodak, 2015).
The year 2016 saw many viral discoveries in red algae, compared to previous years, starting with the first description of the viral community associated with the marine red macroalga Delisea pulchra (Bonnemaisoniaceae) through virome sequencing and transmission electron microscopy (TEM) (Figure 1c) (Lachnit, Thomas, & Steinberg, 2016). This species dominates subtidal seaweed communities in temperate and subtropical Australia. It has a complex microbiome that, when disrupted, results in a bleaching disease (Fernandes et al., 2011). TEM showed a range of VLP morphotypes, including icosahedral particles of 30 and 40 nm and coiled pleomorphic to bacilliform particles (Lachnit et al., 2016). Their virome analysis focused only on describing the RNA viral fraction, given its pathogenic importance (Roossinck, 2012). Of the 143 contigs in their sampling that matched viruses, ~20% were taxonomically assigned to ssRNA viruses and ~80% to dsRNA viruses. The majority of ssRNA viruses showed the highest sequence similarity to Heterosigma akashiwo RNA virus and Chaetoceros sp. RNA virus, both infecting Ochrophyta host algae from the Stramenopiles group. The most dominant virus in the virome showed highest sequence similarity to Asterionellopsis glacialis RNA virus (Agla RNA virus), a diatom-infecting virus, and it was possible to assemble a near complete viral genome. Finally, only a few sequences were related to plant viruses. Regarding affiliation at lower taxonomic levels, D. pulchra dsRNA viruses showed a phylogenetic relationship with the families Partitiviridae and Totiviridae, where the RdRp gene formed multiple, distinct clusters within the genus Totivirus, while the ssRNA sequences clustered within the Picornavirales, with the closest phylogenetic relationship to A. glacialis RNA virus.
Further indication of the existence of HGT events between red algae and viruses came later in 2016. By sequencing the plasmid genomes from five red algal species, including Gelidium elegans (family Gelidiaceae), Sporolithon durum (family Sporolithaceae), and Pyropia (Porphyra) pulchra (family Bangiaceae), Lee et al. (2016) suggested the possibility of viruses mediating HGTs as the origin of some of these plasmid ORFs. These authors demonstrated that diverse circular DNA viruses were phylogenetically grouped with these red algal plasmid ORFs, as well as the replicase gene from P. pulchra, which was also detected in both nuclear and organellar genomes. They had previously been shown to contain conserved motifs of and phylogenetic affiliations to geminiviruses.
Also in 2016, the first link between red algal diseases and viruses was reported, and a novel virus was isolated and characterized that infected species of Pyropia (family Bangiaceae): P. yezoensis, P. tenera, and P. dentata. This virus, named Pyropia-infecting virus 1 (PyroV1), is suggested as the causative agent of the Green-spot disease (GSD) (Figure 1e). Symptoms include the development of numerous holes on the infected blades, where the infected area grows concentrically as a pinkish border around a green centre, which may lead to lysis of the whole blade (Fujita, 1990; Im et al., 2019). Viral particles were isometric, apparently spherical, ~100 nm with a distinct electron-dense core, and around 400 viral particles were observed in the infected host cell. This disease, reported in Japanese Pyropia farms, is one of the most common and serious diseases in Korean farms, and about half of the Pyropia sold in the markets showed some traces of these infections, causing significant economic losses (Kim et al., 2014). Additionally, these authors tested the virus infectivity in other red algal species such as Pyropia plicata, Bostrychia tenuissima, and Griffithsia monilis without any sign of infection, suggesting that the virus was only infectious to Pyropia and specific to Korean species, since P. plicata and Porphyra lucasii were from New Zealand and Australia, respectively. Interestingly, no infection was observed in the diploid stages (conchocelis) of the susceptible Pyropia species. Although these authors did not extract PyroV1 genomic material, they suggested that, since it infected the chloroplast, like plant RNA viruses, and its infectivity was stable, it could be an RNA virus.
Finally, a serendipitous discovery was made during the Chondrus crispus (family Gigartinaceae, class Florideophyceae) genome project, when a gel band of an unusual size of ~6,000 bp was observed in the RNA purification step, revealing a partial RNA viral genome directly extracted from this seaweed (Rousvoal et al., 2016). Through sequencing and bioinformatic analysis, this band revealed partial sequence similarities with dsRNA viruses from the totivirus group, which was known to infect fungal and protist hosts, but not algae. These authors did not find evidence for VLPs by inspecting transmission electron micrographs and were unable to purify any viral particles, suggesting that C. crispus did not have viruses under normal culture conditions. However, by searching for the presence of CcV sequences in the C. crispus nuclear genome, they found a non-identical copy of the gag gene. The expression of this gene in RNAseq data suggests the possibility of this viral DNA being integrated into the red algal host genome through HGT. Furthermore, these authors tested several other C. crispus strains, in order to exclude the contamination hypothesis and to verify whether this finding was specific to the strain in which it was first discovered. They found that this prominent 6,000 bp band was also present in eight other species, such as Laurencia pinnatifida and Porphyra/Pyropia sp. (Bangiophyceae), among others (full species list in Table 1). These viral-like sequences were present in various life history stages and strains of C. crispus, and in other red algal species, indicating that the presence of these viruses is much more widespread than previously thought. Since these viruses apparently have no effects on their hosts, they are difficult to observe.
Im et al. (2019) investigated the molecular mechanisms and genetic defences against pathogens in Pyropia. They interrogated the transcriptomic profiles of P. tenera against its most common parasites, such as the GSD (green-spot disease) agent PyroV1 virus, searching for genes associated with host/virus interactions. They found some uniquely up-regulated genes when P. tenera was experimentally infected with PyroV1, such as serine/threonine protein kinases, which in plants are known to have a role in defending the host against viral infections (Zhang et al., 2021). These infection profiles also showed up-regulation of more than 20 genes involved in DNA/RNA metabolism that, although not directly related to pathogenicity, could have been exploited by infecting viruses, such as retrovirus-related Pol polyprotein and RNA reverse transcriptase. They suggested that this exploitation strategy, which hijacks host cellular machinery, could explain the prevalence of this virus in Korean Pyropia sea farms. Other up-regulated genes were related to senescence, a stress response of the host that could protect it against pathogens and also prevent virus replication (Seoane, Vidal, Bouzaher, El Motiam, & Rivas, 2020). Conversely, genes encoding respiratory burst oxidase, which is involved in one of the earliest cellular responses following successful pathogen recognition (Torres, Jones, & Dangl, 2006), as well as alternative oxidases, thioredoxin, and manganese and copper/zinc superoxide dismutases, were down-regulated in PyroV1 infection. Although superoxide dismutases are known to provide defence against pathogens, this indicated that reactive oxygen species (ROS) generation, which protects the hosts from pathogen invasion, may not be effective enough to stop the progress of infection in Pyropia (Kang et al., 2014).
Altogether, this study showed that Pyropia possesses a set of genes, homologous to those known in other organisms, that are involved in pathogen recognition and defence, and highlighted the need to understand the genetics and metabolism exploited by parasites such as PyroV1. This knowledge may help in the development of counter and preventive measures to fight these diseases, such as the selection and/or engineering of resistant strains for crops.
By collecting samples of marine macroalgae in the Inland Sea, Japan, extracting RNA, and applying the fragmented and primer ligated dsRNA sequencing (FLDS) method, Chiba et al. (2020) obtained the putative complete genome sequences of novel RNA viruses associated with red algae, but also with diatoms and brown algae. The FLDS method is able to retrieve full-length viral RNA genome segments, from both dsRNA and ssRNA viral genomes. In the case of the red alga identified as Pyropia suborbiculata, these researchers found a dsRNA band of 3 kbp, suggesting the presence of an RNA virus, which after sequencing and gene prediction showed similarities with non-segmented dsRNA viruses from the Totiviridae family. This virus, named Red algae totivirus 1 (RaTV1), although presenting a typical genomic structure of Totiviridae (Jamal et al., 2019), occupies a distinct position from the Totivirus previously identified in the red alga D. pulchra (Lachnit et al., 2016). Furthermore, this virus did not cluster phylogenetically with any previously established genera in Totiviridae, suggesting that it represents a novel viral genus, probably specific to this red algal group.
Transcriptomic libraries can also be used to search for the presence of infecting and/or latent viruses with both DNA and RNA genomes and, in the case of high-quality assemblies, even indicate the presence of EVEs, as long as they are flanked in the same contig by host-specific non-viral genes, thus indicating integration and not contamination (infection for instance). Using this approach to explore the diversity of RNA viruses associated with microalgae, Charon, Murray, & Holmes (2021) selected the RNA-dependent RNA polymerase (RdRp), the most conserved protein of RNA viruses, and performed sequence and structural-based searches in 570 transcriptomes from 19 major microalgal lineages sequenced in the Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) (Keeling et al., 2014). Evidence for RNA viruses was found in eight of the 19 major groups of unicellular algae, mostly Bacillariophyta, Dinophyceae, and Haptophyceae (given that the majority of transcriptomes were from these groups), but also in a few other taxa such as Rhodophyta, in this case Rhodosorus marinus (family Stylonemataceae) and Rhodella maculata (family Glaucosphaeraceae). For R. marinus (identifier MMETSP0011), two viral genomes were found: a partial genome, Aethusa amalga-like virus, in the Amalgaviridae family, with similarities to the fungal virus Zygosaccharomyces bailii virus Z, and a full-length genome, Otus toti-like virus, that clustered phylogenetically with the Delisea pulchra totivirus IndA (Lachnit et al., 2016). For R. maculata (identifier MMETSP0167), a full-length genome, Despoena mito-like virus, was found, with a best sequence hit to the plant virus Soybean leaf-associated mitovirus 1.
This study not only extended the knowledge of RNA virus hosts and diversity in other unicellular algal taxa, taking advantage of the existence of large transcriptomic databases, some underexplored, but also provided important evidence for viruses infecting unicellular rhodophytes, where evidence beyond VLP micrographs was lacking.
Lastly, Nelson et al. (2021) sequenced 107 new microalgal genomes spanning 11 phyla and screened them for EVEs, shedding light on the nature of virus-microalgal evolution given the acquisition of viral gene families (VFAMs). These viral elements could be interpreted as records of past viral infection, ancient when shared by several species within a group, and could even resolve questions regarding their environmental sources and specific hosts. In the case of Rhodophyta, among the novel genomes that were sequenced, they found evidence for VFAMs in Cyanidium caldarium, Galdieria sulphuraria, Porphyridium cruentum and Porphyridium sp., but they also found VFAMs in the already sequenced Cyanidioschyzon merolae, Chondrus crispus, and Porphyridium purpureum. Multiple other microalgal lineages were found to contain remnants of infections represented as VFAMs, as well as distinct VFAM collections, although for the Rhodophyta VFAM cluster, these profiles seem much more conserved in most algal lineages, indicating that viral integration contributed to and shaped the life history of these algae from early on. Furthermore, by using long-read sequencing, it was possible to rule out contamination from infective viruses, since VFAMs were found interspersed with known algal genes in these contigs. This work demonstrates the possibility that many more already sequenced red algal genomes could bear EVEs or other viral HGT signatures, revealing ancient interactions with hidden viruses.
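The flanking criterion mentioned above (viral genes sharing a contig with host genes indicate integration rather than contamination) can be illustrated with a minimal sketch. The function name, the "host"/"viral" labels and the output categories are hypothetical and are not taken from any of the cited studies; the sketch assumes that each gene on a contig has already been assigned an origin by a prior annotation step.

```python
from typing import List


def classify_contig(origins: List[str]) -> str:
    """Classify one contig from the ordered origins of its annotated genes.

    `origins` lists, in contig order, the assumed origin of each gene:
    "host" or "viral" (hypothetical labels from an upstream annotation).
    """
    if not origins:
        return "unannotated"
    viral_idx = [i for i, o in enumerate(origins) if o == "viral"]
    host_idx = [i for i, o in enumerate(origins) if o == "host"]
    if not viral_idx:
        return "host only"
    if not host_idx:
        # No host gene on the contig: a free (infecting or contaminating)
        # virus cannot be excluded.
        return "viral only"
    # Flanked: at least one viral gene lies between two host genes.
    flanked = any(host_idx[0] < i < host_idx[-1] for i in viral_idx)
    return "candidate EVE (flanked)" if flanked else "candidate EVE (one side)"


print(classify_contig(["host", "viral", "host"]))
print(classify_contig(["viral", "viral"]))
```

A real screen would also need to consider assembly chimeras and the quality of the underlying annotation, which this sketch ignores.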

The genomes from red algal viruses
To date, there are only three red algal viruses with sequenced genomes: Delisea pulchra virus (Lachnit et al., 2016), Chondrus crispus virus (CcV) (Rousvoal et al., 2016), and Red algae totivirus 1 (RaTV1), isolated from Pyropia suborbiculata (Chiba et al., 2020), all of which are RNA viruses. Next, we briefly discuss and compare their genome characteristics (in chronological order of description), genetic architecture and similarities (Figure 2), and present a phylogenetic tree built from their coding genes (Figure 3; sequence information in supplemental material online) to infer the relationships between these viruses and other viral representatives.

Figure 2. Genomic architecture of red algal viruses. Genome maps from the available red algal virus genomes, showing the length of the coding sequences for the RdRp (RNA-dependent RNA polymerase)/non-structural polyprotein (arrows in yellow for the AglaRNAV diatom virus and red for red algal viruses) and the coat/capsid/structural polyprotein (black arrows). Amino acid sequence identity is represented in the key (right), given local blastP similarities. Since the architecture of the virus CcV is unknown, we joined each pair of sequences in the same group. The CcV sequences showed the highest amino acid similarities within their own grouping (47.984% between CcV4 and CcV3 for the capsid). Delisea pulchra virus showed the highest similarity to AglaRNAV, from 35.465% (for the coat/structural protein) to 43.24% (for the RdRp/non-structural polyprotein). Even though these viruses are all from red algae, they displayed very low similarity to one another, probably reflecting that they each represent divergent and highly novel viral groups with different genomic architectures. Scale in 1 kb. Abbreviations: AglaRNAV (Asterionellopsis glacialis RNA virus), D. pul RNAV (Delisea pulchra virus), RaTV1 (Red algae totivirus 1), CcV1-3 (Chondrus crispus virus). Genome maps were drawn in genoPlotR (Guy, Roat Kultima, & Siv, 2010) in the R environment (version 3.4.2) with final manual curation.

Figure 3. Colour coded circles on the right represent host taxa affiliation. None of the red algal viruses clustered in the same group in both trees, with the exception of CcV, which probably represents a quasispecies entity. Most red algal viruses, however, clustered with diatom viruses (Bacillariophyta), but also with fungal and metazoan viruses. Amino acid sequences were subjected to blastP online to retrieve best hits, which were later downloaded from GenBank (full species names and accession numbers in supplemental material online). Sequences were aligned with MAFFT v7.123b (Katoh & Standley, 2013) using the L-INS-i algorithm, and poorly aligned regions were removed with trimAl version 1.2 (method automated1) (Capella-Gutiérrez, Silla-Martínez, & Gabaldón, 2009). Maximum likelihood phylogenetic trees were built in IQ-TREE v2.0.3 (Nguyen, Schmidt, von Haeseler, & Minh, 2015) with automated best fit model selection and branch support assessed through 1000 ultrafast bootstrap replicates. Phylogeny was visualized and plotted using the package ggtree (Yu et al., 2017) in the R environment (version 3.4.2) with final manual curation.
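The alignment, trimming and tree-building steps described above can be sketched as a short Python helper that assembles the corresponding command lines. This is an illustrative reconstruction, not the authors' actual script: the file names, the `phylo` working directory and the function name are assumptions, and the exact options used in the original analysis are not given in the text. The flags shown are the standard ones for MAFFT L-INS-i (`--localpair --maxiterate 1000`), trimAl (`-automated1`) and IQ-TREE 2 (`-m MFP` for model selection, `-B 1000` for ultrafast bootstrap).

```python
from pathlib import Path
from typing import List


def phylogeny_commands(fasta: str, workdir: str = "phylo") -> List[List[str]]:
    """Return the alignment, trimming and tree-inference command lines.

    `fasta` is a protein FASTA file (hypothetical name chosen by the caller);
    output names are derived from it and placed under `workdir`.
    """
    out = Path(workdir)
    stem = Path(fasta).stem
    aligned = (out / f"{stem}.aln.fasta").as_posix()
    trimmed = (out / f"{stem}.trim.fasta").as_posix()
    return [
        # MAFFT L-INS-i; MAFFT prints the alignment to stdout, so the
        # caller is expected to redirect it into `aligned`.
        ["mafft", "--localpair", "--maxiterate", "1000", fasta],
        # trimAl with its automated1 heuristic for removing poor columns.
        ["trimal", "-in", aligned, "-out", trimmed, "-automated1"],
        # IQ-TREE 2: best-fit model selection plus 1000 ultrafast bootstraps.
        ["iqtree2", "-s", trimmed, "-m", "MFP", "-B", "1000"],
    ]


for cmd in phylogeny_commands("rdrp.fasta"):
    print(" ".join(cmd))
```

The helper only builds the commands; running them (for example with `subprocess.run`) requires the three tools to be installed and the MAFFT output to be written to the expected alignment file.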
Delisea pulchra virus is a single-stranded RNA (ssRNA) virus. It shows the highest sequence similarity to Asterionellopsis glacialis RNA virus (Agla RNA virus), a member of the Marnaviridae family that infects the diatom (Bacillariophyta) Asterionellopsis glacialis. The near-complete genome of D. pulchra virus is 9581 bp in length (GenBank accession number KT455464) and encodes two CDSs: the replication-associated polyprotein, which contains the RNA-dependent RNA polymerase (RdRp) gene, and the structural polyprotein, which contains the putative capsid. Compared with the polyproteins of Agla RNA virus, it shows protein sequence similarities of 29% and 23% for the replication-associated polyprotein and the structural polyprotein, respectively. The replication-associated polyprotein consists of an RNA helicase or replication-associated protein AAA, transmembrane proteins and the RdRp, while the structural polyprotein contains the protein domains Rhv, DicistroVP4, Calici coat, and CRPV capsid.
C. crispus virus (CcV), probably a double-stranded RNA (dsRNA) virus, is more similar to fungus-infecting viruses such as Xanthophyllomyces dendrorhous virus (XdV) from the Totiviridae family, suggesting that CcV belongs to this viral family. The organization of the CcV genome resembles that of other totiviruses, with two overlapping ORFs: a 5'-ORF containing a partial capsid (gag) gene and a 3'-ORF encoding an RdRp. Phylogenetically, the RdRp genes from CcV clustered closely with XdV and within other totivirus-related viruses, and the gag (capsid) gene also clustered with XdV. Despite numerous efforts to obtain the full-length sequence, the methods of Rousvoal et al. (2016) retrieved only new sequences rather than a complete one. The authors therefore suggested that they were observing quasispecies or mutant swarms of CcVs, given that 20-65% amino acid identity was found between these sequences. Quasispecies formation, in which a large number of viral genomes form a population within the same host, is known to occur in RNA viruses because RdRp-mediated replication has a high error rate (Domingo & Perales, 2019).
Finally, red algal totivirus 1 (RaTV1), from P. suborbiculata (accession numbers LC521321-LC521329), is suggested to be a non-segmented dsRNA virus, although it is described as three segments (RNA1-3). It presents typical features of the Totiviridae family, such as an overlap between the coat protein (capsid) and RdRp gene regions. Segment 1 is 5031 bp in length, while segments 2 and 3 are 2627 and 2623 nt, respectively. However, the predicted amino acid sequences from segments 2 and 3 have no matches with other known proteins in the NCBI nr and Pfam databases (we checked again in GenBank in November 2021). Chiba et al. (2020) suggested that segments 2 and 3 could be structures similar to the M satellite RNAs of some totiviruses (Tipper & Schmitt, 1991). Phylogenetic analysis of RaTV1 showed that it could not be clustered with any genus of totiviruses, including the D. pulchra totivirus, suggesting the possibility of a novel genus of viruses within Totiviridae hosted by red algae.

Strategies and propositions to advance studies of red algal viruses
To conclude, here we present a non-exhaustive set of proposals, particularly using genomic methods, to advance knowledge of Rhodophyta viruses, mostly by exploring already available datasets.

Screening for non-target sequences (in this case, viruses) in raw sequence data and pre-filtered contigs from red algal genome and transcriptome projects
Since complete decontamination of a target host organism during a genome project is challenging, genome and transcriptome reads and filtered contigs could hide valuable information about organisms that live externally, attached or not to the cell surface, or intracellularly, as mutualistic, parasitic or commensal endosymbionts. These sequences are usually discarded before or after assembly because their guanine-cytosine (GC) content differs strongly from that of the target genome, because a contig contains non-target genes, or because they are highly similar to non-target species; they are thus flagged as contaminants and binned. We suggest using known algal virus sequences to perform similarity searches, together with compositional methods such as GC or tetranucleotide skew, during the acquisition of novel Rhodophyta genomes, and also to search previous datasets retroactively and exhaustively for such sequences. These methods and others (Kumar, Jones, Koutsovoulos, Clarke, & Blaxter, 2013) have proven successful in discovering previously unnoticed known and novel bacterial taxa within host genome or transcriptome projects, such as in bees (Gerth & Hurst, 2017), Caenorhabditis (Fierst, Murdock, Thanthiriwatte, Willis, & Phillips, 2017), Drosophila (Salzberg et al., 2005), human tissues (Olarerin-George & Hogenesch, 2015), and plants (Chialva et al., 2020; Gathercole et al., 2021). Novel viruses have also been discovered with these strategies, with particularly interesting examples in Symbiodiniaceae dinoflagellates (Brüwer, Agrawal, Liew, Aranda, & Voolstra, 2017) and cnidarians (Brüwer & Voolstra, 2018; Lewandowska, Hazan, & Moran, 2020).
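As an illustration of the compositional screening suggested above, the following is a minimal sketch, not a method taken from any of the cited studies. It computes GC content and tetranucleotide frequencies per contig and flags contigs whose GC content deviates strongly from the assembly-wide mean. The function names, the z-score criterion and the cutoff of 2.0 are assumptions made for illustration; a real screen would combine such signals with coverage and similarity searches.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def tetranucleotide_freqs(seq: str) -> dict:
    """Relative frequency of each of the 256 tetranucleotides,
    counted over overlapping windows; windows with ambiguous bases are ignored."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


def flag_gc_outliers(contigs: dict, z_cutoff: float = 2.0) -> list:
    """Names of contigs whose GC content lies more than z_cutoff standard
    deviations from the assembly mean (z_cutoff is an illustrative assumption)."""
    gc = {name: gc_content(seq) for name, seq in contigs.items()}
    mu = mean(gc.values())
    sd = pstdev(gc.values())
    if sd == 0:
        return []
    return [name for name, value in gc.items() if abs(value - mu) / sd > z_cutoff]


# Toy example: nine contigs with balanced composition and one GC-rich contig.
toy_contigs = {f"contig_{i}": "ATGC" * 50 for i in range(9)}
toy_contigs["contig_odd"] = "GGCC" * 50
candidates = flag_gc_outliers(toy_contigs)
```

Flagged contigs are only candidates: they would still need to be compared against known algal virus sequences before being interpreted as viral rather than bacterial or organellar.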

Search for viral horizontal gene transfer (HGT) events and EVEs in contigs from red algae genome projects
As noted above, HGT has been an important generator of novelty in Rhodophyta genome evolution, boosting niche specialization and adaptation. At least 1% of genes in cyanidiophycean genomes derive from HGT (the so-called "1% rule") (Etten & Bhattacharya, 2020), 5% in G. sulphuraria, and up to 9% in Porphyridium purpureum (Schönknecht et al., 2013). However, most reported HGTs in red algae occurred between bacteria and other eukaryotes, with the exception of the already cited work that found viral HGTs (Nelson et al., 2021; Rousvoal et al., 2016; Wang et al., 2014). Thus, since at least three viral genomes have now been described for this group, a thorough and systematic search could be conducted in the available red algal genome projects, and also in transcriptomes from the MMETSP (Keeling et al., 2014) and 1KP (Matasci et al., 2014) projects (with care, given that these sequences are less contiguous), using the known genes from CcV (Chondrus crispus virus), RaTV1 (Red algae totivirus 1) and Delisea pulchra virus, together with a comprehensive viral database that is not restricted to algal hosts and includes environmental sequences. HGTs must be validated beyond sequence similarity matches, using phylogenetic methods in which candidates cluster topologically with viruses and ensuring that candidate viral HGT sequences are contiguous with and nested among native host genes (Richards & Monier, 2016). In addition, intron structure and other eukaryotic features should be present and, preferably, candidates should be found in more than one genome or transcriptome from independent datasets. These methods were applied effectively to the genome of the chlorarachniophyte alga Bigelowiella natans (Cercozoa), reporting novel viral HGTs and indicating putative present and past infections: a novel group of EVEs, the provirophages, was found integrated in the algal nuclear genome, hinting at past interactions with giant viruses (Blanc, Gallot-Lavallée, & Maumus, 2015).
Also, at least one of the five conserved giant virus core genes was found integrated in 66 eukaryotic genome or transcriptome datasets, mostly from protists (Gallot-Lavallée & Blanc, 2017), and past infections by giant viruses were also suggested at some point in the genomic evolution of land plants such as S. moellendorffii and P. patens, given the integration of such genes (Maumus, Epert, Nogué, & Blanc, 2014).
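One of the validation criteria above, that a candidate viral HGT should be nested among native host genes on the same contig, can be sketched as a simple filter. This is an illustrative sketch only: the `Gene` record, its `origin` label (assumed to come from an upstream phylogenetic classification) and the function names are hypothetical. It does not cover the other criteria (phylogenetic placement, intron structure, support from independent datasets).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record for one predicted gene."""
    contig: str
    start: int
    end: int
    origin: str  # "host" or "viral", assumed to be assigned upstream


def is_nested_in_host(candidate: Gene, genes: list) -> bool:
    """True if host genes lie both upstream and downstream of the
    candidate on the same contig."""
    host_genes = [
        g for g in genes
        if g.contig == candidate.contig and g.origin == "host"
    ]
    upstream = any(g.end < candidate.start for g in host_genes)
    downstream = any(g.start > candidate.end for g in host_genes)
    return upstream and downstream


def shortlist_viral_hgt(genes: list) -> list:
    """Viral-like genes that are flanked on both sides by host genes."""
    return [
        g for g in genes
        if g.origin == "viral" and is_nested_in_host(g, genes)
    ]


# Toy annotation: one viral-like gene flanked by host genes (contig c1)
# and one alone on its contig (c2), which could be contamination instead.
toy_genes = [
    Gene("c1", 100, 900, "host"),
    Gene("c1", 1000, 2000, "viral"),
    Gene("c1", 2100, 3000, "host"),
    Gene("c2", 50, 800, "viral"),
]
nested_candidates = shortlist_viral_hgt(toy_genes)
```

A viral-like gene that sits alone on a short contig is not excluded as an HGT by this filter. It simply cannot be distinguished from contamination or a free viral genome on contiguity grounds.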

Exploration of red algal viral marker genes, such as RdRp and capsid, in publicly available metagenomic datasets
Known marker genes from the red algal virus genomes described so far, and from sequences reported in the future, could be used as seeds to search large databases such as Tara Oceans and Global Ocean Sampling (GOS), especially through the user-friendly web application Ocean Gene Atlas (Villar et al., 2018), as well as other publicly available metagenomic datasets. This would make it possible not only to reveal the biogeographical boundaries and global distribution patterns of known red algal viruses but also to find near homologs of these viruses, which would improve phylogenetic resolution and better delineate their genera and families. Furthermore, novel associations could be predicted using co-occurrence networks and HGT methods focusing on red algal hosts and putative viral sequences (Roux, Hallam, Woyke, & Sullivan, 2015; Schulz et al., 2020).
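The co-occurrence idea can be illustrated with a deliberately simple sketch. It assumes that, for each viral marker and each candidate host, the set of metagenomic samples in which it was detected is already known. It then links pairs whose Jaccard index exceeds a threshold. The sample and taxon names, the use of the Jaccard index and the 0.5 threshold are all assumptions for illustration. Published co-occurrence analyses typically rely on abundance correlations with statistical correction rather than on presence/absence alone.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sample sets: shared samples / all samples."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence_edges(presence: dict, min_jaccard: float = 0.5) -> list:
    """Return (name_1, name_2, score) for every pair of entries whose
    sample sets overlap with a Jaccard index of at least min_jaccard."""
    names = sorted(presence)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(presence[first], presence[second])
            if score >= min_jaccard:
                edges.append((first, second, score))
    return edges


# Hypothetical detections: sample identifiers in which each entity was found.
toy_presence = {
    "viral_RdRp_hit": {"s1", "s2", "s3"},
    "red_alga_A": {"s1", "s2", "s3", "s4"},
    "diatom_B": {"s5"},
}
edges = cooccurrence_edges(toy_presence)
```

An edge in such a network is only a hypothesis of association. Shared environmental preferences can produce the same pattern, so candidate virus-host pairs would still need independent support, for example from HGT signals.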

Building partnerships and collaborations with private and public red algal farm growers
Here, we propose that established or new viral research groups should develop active collaborations with people directly involved in red algal cultivation and processing for food and other products, in order to perform intensive genomic surveillance in these algal farms and to monitor disease phenotypes that could be caused by viruses. This could be accomplished through discussion in aquaculture workshops, bringing into public discourse the urgent need for more red algal viral cultures and genomes. If species-specific viruses are discovered and successfully propagated, they could even be used to solve other problems, such as treating epiphytic infections, mainly by other rhodophytes such as Polysiphonia, Hypnea and Melanothamnus (Ward et al., 2020), but also by bacteria, in the case of specific bacteriophages associated with red algal holobionts. The development of modern technologies to study hatchery physiology (Alves-Lima, Teixeira, Hotta, & Colepicolo, 2018) and advances in molecular biology (Alves-Lima et al., 2017; Vorphal & Bunster, 2016) are also of great interest for the selection of pathogen- and stress-resistant strains.

Concluding remarks
Further knowledge of, and resources for, red algal virology could be achieved by using established red algal culture collections for classic virus plate assay isolation, by generating broad viromes focused on wild and cultured samples and, most importantly, by acquiring high-quality genomic and transcriptomic resources from viral isolates. It is of course also desirable to generate novel genomes and transcriptomes from red algal hosts, in the case of transcriptomes particularly by sequencing non-polyadenylated transcripts, and to use long-read technologies to access repetitive chromosomal regions that may hide viral integrations. This will not only reward growing red algal aquaculture endeavours with a set of potential viruses and precise diagnostics of viral diseases and viral agents in the field but will also cast light on red algal associated viral diversity, biogeographical distribution, evolution and impacts on the life history of their hosts. Consequently, it will certainly benefit red algal studies as a whole, from genomics and ecology to applied research and sustainable parasite-free cultivation.