Spatiotemporal Dynamics of Coastal Viral Community Structure and Potential Biogeochemical Roles Affected by an Ulva prolifera Green Tide

ABSTRACT The world’s largest macroalgal green tide, caused by Ulva prolifera, has resulted in serious consequences for coastal waters of the Yellow Sea, China. Although viruses are considered to be one of the key factors in controlling microalgal bloom demise, understanding of the relationship between viral communities and the macroalgal green tide is still poor. Here, a Qingdao coastal virome (QDCV) time-series data set was constructed based on the metagenomic analysis of 17 DNA viromes along three coastal stations of the Yellow Sea, covering different stages of the green tide from Julian days 165 to 271. A total of 40,076 viral contigs were detected and clustered into 28,058 viral operational taxonomic units (vOTUs). About 84% of the vOTUs could not be classified, and 62% separated from vOTUs in other ecosystems. Green tides significantly influenced the spatiotemporal dynamics of the viral community structure, diversity, and potential functions. For the classified vOTUs, the relative abundance of Pelagibacter phages declined with the arrival of the bloom and rebounded after the bloom, while Synechococcus and Roseobacter phages increased, although with a time lag from the peak of their hosts. More than 80% of the vOTUs reached peaks in abundance at different specific stages, and the viral peaks were correlated with specific hosts at different stages of the green tide. Most of the viral auxiliary metabolic genes (AMGs) were associated with carbon and sulfur metabolism and showed spatiotemporal dynamics relating to the degradation of the large amount of organic matter released by the green tide. IMPORTANCE To the best of our knowledge, this study is the first to investigate the responses of viruses to the world’s largest macroalgal green tide. It revealed the spatiotemporal dynamics of the unique viral assemblages and auxiliary metabolic genes (AMGs) following the variation and degradation of Ulva prolifera. 
These findings demonstrate that viral assemblages were tightly coupled with prokaryotic and eukaryotic abundances, all of which were influenced by the green tide.

be due to the growth of some microalgae after the green tide (25,26), resulting in an increase in chlorophyll concentration in seawater. The temperature and salinity gradually increased with the bloom, while pH values increased after the arrival of the bloom (on day 165) and then gradually decreased starting on day 181 (Fig. 1c; Table S1). There was a similar temporal pattern in nutrient concentrations at all three stations; peak values of NH4, SiO3, and NO2 were found on day 208. Unlike other nutrients, nitrate also showed a clear upward trend after the bloom (on day 271). Viral and bacterial abundances increased severalfold from day 165 to peak values on day 201 and then decreased on day 208. The virus-to-bacteria ratio (VBR) increased from day 165 to day 193 and then decreased to minimum values on day 201 (Fig. 1e). The Qingdao coastal virome (QDCV) data set contained 17 DNA virome libraries with a total of 176.85 Gb of sequence data. A total of 236,478 contigs longer than 3 kb were assembled. Combined with the prediction outputs of VirSorter, VirFinder, and Contig Annotation Tool (CAT), a total of 40,076 viral contigs were detected and clustered into 28,058 viral operational taxonomic units (vOTUs; Table S2). Most (84.8%) vOTUs were shared among the three stations, and only 6.2% of vOTUs (1.7% in ZhanQiao [ZQ], 0.9% in May 4th Square [MS], and 3.6% in XiaoGang [XG]) were found at a single station (Fig. S1b). However, the vOTUs of the QDCV changed with the arrival (on day 165) and termination (on day 208) of the U. prolifera bloom (Fig. S1a), and the viral diversity indices, the Shannon index (Shannon's H) and Pielou's J, increased from the bloom period (on days 165, 181, 193, and 208) to the postbloom period (on days 243 and 271) (Fig. S1d).
The canonical correspondence analysis (CCA) between vOTUs and environmental factors showed that the virome samples could be divided into three groups (Fig. 2a), which is similar to the principal-component analysis (PCA) results (Fig. S1c). The first two CCA axes explain 65.73% of the virome variations. The virome samples on days 165, 181, 193, and 208 were clustered together (group 1) and positively correlated with pH, temperature, salinity, bacterial abundance, and NH4 (Fig. 2a). The virome samples on day 243 (group 2) were positively correlated with NO3, and the virome samples on day 271 (group 3) were positively correlated with NO2. The virome samples from both days 243 and 271 were negatively related to pH, temperature, salinity, bacterial abundance, and NH4 (Fig. 2a).
More than 84% of the vOTUs could not be taxonomically classified at the family level (Fig. S2b), which was similar to the result of the gene-sharing network analysis (Fig. S3a). From the gene-sharing network of the viral genomes (≥10 kbp) of the QDCV, which was influenced by the U. prolifera bloom, and the other environmental viromes from the Integrated Microbial Genomes/Viruses (IMG/VR) v3 data set, a total of 4,025 viral clusters (VCs) were predicted, of which 525 VCs contained U. prolifera bloom-associated vOTUs in the QDCV data sets (Fig. 3a). Overall, most of the VCs in the QDCV (326 VCs, 62.1%) were endemic viruses, and only 199 (37.9%) were clustered with viral sequences from other habitats, including 113 VCs clustered with marine-derived vOTUs, 11 VCs clustered with freshwater-derived vOTUs, 7 VCs clustered with terrestrial-derived vOTUs, 2 VCs clustered with wastewater-derived vOTUs, and 5 VCs clustered with all other habitat-derived vOTUs (Fig. 3b).
Spatiotemporal dynamics of the viral community structure. To draw a picture of the succession of viruses during the different stages of U. prolifera blooms at the three stations, the temporal dynamics of the peak abundance vOTUs are presented (Fig. 2b). Briefly, vOTUs with single peaks in relative abundance after the bloom (on days 243 and 271) and wide peaks during the bloom (on days 165, 173, 181, and 208) were the most abundant (Fig. 2b and c). In addition, 53 abundant vOTUs with an abundance greater than 1% were selected. Most (29 vOTUs) of the abundant vOTUs were previously unknown, and 10 vOTUs were not classified. Of the classified vOTUs (15 vOTUs), 10 were classified as Pelagibacter phage HTVC027P (Fig. 2d).
For the 21% of vOTUs classified at the family level, most of them (ca. 94% to 95%) were classified into the Caudovirales order (Fig. S2b), followed by the families of nucleocytoplasmic large DNA viruses (NCLDV, including Phycodnaviridae and Mimiviridae) and virophages (Lavidaviridae) (Fig. S2b). Although the viral community structure of the three stations at the family level was similar, the relative abundance of different viral families varied after the arrival of U. prolifera (Fig. 4a). Viruses infecting Pelagibacter (SAR11) were the dominant group, but these decreased with the arrival of the bloom and then rebounded after the bloom (Fig. 4b). In contrast, the viruses infecting Cyanobacteria (Synechococcus phage and Prochlorococcus phage), Verrucomicrobia (Verrucomicrobia phage), Roseobacter (Roseobacter phage), and Vibrio (Vibrio phage) increased from during the bloom to after the bloom (Fig. 4b). In addition, Phycodnaviridae infecting eukaryotic algae increased after the bloom (Fig. 4a). Lavidaviridae, such as the Yellowstone Lake virophage, Organic Lake virophage, and Chrysochromulina parva virophage, increased during the bloom and decreased after the bloom. Viruses infecting protists, such as Terrestrivirus sp., Monosiga MELD virus 2, and Klosneuvirus, and viruses infecting eukaryotic algae, such as Pleurochrysis sp. Polinton-like virus, decreased with the arrival of the bloom (Fig. 4b). In addition, the spatiotemporal dynamics of vOTUs (Fig. 1c) were generally similar to the dynamics of VCs (Fig. S3b) and were tightly coupled to the dynamics of the community structure of the prokaryotes and eukaryotes (Fig. S4a and
Host prediction and lineage-specific virus-host relationships. Using sequence similarity, tRNA sequences, and CRISPR spacers, putative hosts were predicted for 278 of the 12,768 QDCV vOTUs (≥5 kb; Fig. 5a).
Most vOTUs with predicted hosts had narrow host ranges, with only 12 potentially exhibiting a broader host range across several classes. Predicted prokaryotic hosts spanned 25 bacterial classes, with Gammaproteobacteria (33.1% of virus-host pairs) and Alphaproteobacteria (13.3%) as the most frequently predicted (Fig. 5a). The relative abundance of vOTUs linked to Alphaproteobacteria decreased from the arrival of the bloom (on day 165) to after the bloom (on day 271) at all three stations, whereas the vOTUs linked to Gammaproteobacteria increased with the arrival of the bloom (on day 165) and peaked 2 months after the bloom (on day 271) at the ZQ station and on day 193 at the MS and XG stations (Fig. 5b).
Spatiotemporal dynamics of AMGs and relationships with environmental variables. Overall, QDCV data sets were found to encode AMGs for carbohydrate, amino acid, cofactor/vitamin, and energy metabolism based on VIBRANT annotations (Fig. S5a) (27). Six and three AMGs were affiliated with sulfur metabolism and the sulfur relay system, respectively (Fig. 6a). The AMGs relating to the sulfur relay system, such as thiF, moeB, and mec, were more abundant after the bloom, whereas the AMGs relating to sulfur metabolism, such as cysK, cysH, and msmA, were more abundant during the bloom (Fig. 6a). CAZyme genes, such as the "lysozyme," "α-1,2-fucosyltransferase," "glycosyl transferase," and "glycoside hydrolase" families, were abundant during the bloom, whereas "phage lysozyme," "photosystem II protein D1," and "predicted chitinase" were abundant after the bloom (Fig. 6a). In addition, the "polygalacturonase," "peptidoglycan-binding protein," "lytic murein transglycosylase," and LmbE family proteins were detected during the bloom. Pearson correlation analysis showed that several dominant CAZyme genes and genes related to sulfur metabolism and the sulfur relay system were correlated with temperature, pH, NO2, NO3, NH4, SiO3, and chl. a (P < 0.05) (Fig. 6b).

DISCUSSION
Although the world's largest macroalgal green tide, caused by U. prolifera in the Yellow Sea, has resulted in serious social and environmental consequences and has had serious impacts on the coastal microbial community structure and metabolic activity (9,10,28), the impacts on and responses of coastal viromes are still unknown. This study has shown that green tides significantly influence the spatiotemporal dynamics of the viral community structure, diversity, and potential function and that viruses may play important roles in the biodegradation of the very large amount of organic matter released by the massive macroalgal green tide.
Spatiotemporal patterns of the known and unknown viruses in the QDCV and a coupled dynamic with their putative hosts. After comparing the QDCV with the NCBI Viral RefSeq and IMG/VR v3.0 (29), it was found to contain a large fraction of unknown viruses (Fig. 2 to 4; Fig. S2 and S3), suggesting that marine viruses in general are still largely uncharacterized (30–34). In general, the spatiotemporal dynamics of abundance, community structure, and diversity of the viral assemblages (Fig. 1e, 2, and 4; Fig. S1 and S3) were coupled with those of their putative hosts (Fig. 4 and 5; Fig. S4), along with the accumulation of massive amounts of algal-derived DOM after the arrival of U. prolifera (9,10,28). Massive algal-derived labile DOM promoted the rapid growth of bacteria and might have driven a rise in the corresponding abundance of viruses (Fig. 1e). By day 208, most of the labile DOM may have been consumed by bacteria (9,10), and the bacterial and viral abundances gradually decreased (Fig. 1e). After the bloom, the bacterial-derived DOM (including the remnants of dead bacteria and bacterial secretions) made an important contribution to the DOM pool (9,10) and may have driven an increase in viral diversity (Fig. S1d) and dominant vOTUs (Fig. 2b). Due to the large proportion of unknown viruses, dynamic patterns of all vOTUs at the different stages of the bloom were characterized. Both the known (taxonomically classified) and unknown (including the unclassified and previously unknown) viruses had either a single or a wide peak of abundance on one or successive Julian days. Only 7% to 11% of vOTUs had several peaks (Fig. 2b and c). This suggests that most viruses were influenced by the U. prolifera bloom, although there were a few that may have had a relatively stable abundance. Abundant viruses with a peak after the bloom (single 3) or during the bloom (single 1, wide 1, and wide 2), together with the CCA results (Fig.
2a), may indicate differences in the viral communities during and after the bloom. Based on the results of viral annotations and host prediction, it is speculated that the dynamics of these known viruses, although only a minority, may also be closely related to their hosts at different stages of the bloom. The most abundant known Pelagibacter phages were more abundant after the bloom (Fig. 4b), showing a similar temporal pattern to that of their hosts, Candidatus Pelagibacter (SAR11 clade) (Liu et al., submitted; Table S5). It is speculated that the eutrophic environment during the bloom was not suitable for the growth of the oligotrophic SAR11 clade (35). Some phages whose potential hosts were associated with degradation of algae-derived organic matter were found, such as phages of Sulfitobacter (36), Cycloclasticus (37), Vibrio (10,38,39), Bacilli (40), and Roseobacter (41) (Tables S3 and S4). These phages either had similar temporal distribution patterns to their hosts or showed a time delay between the peak in relative abundance of the virus OTUs and that of the putative host OTUs. Synechococcus phages were also abundant after the bloom (Fig. 4b), whereas their putative hosts had a peak abundance during the bloom (Liu et al., submitted; Table S5). Synechococcus are well known to grow in relatively high nutrient conditions and are abundant in coastal and estuarine waters (42–45). The covariation in abundance of Synechococcus spp. and cyanophages has been found in some estuarine ecosystems (42,46). Eutrophic environments in this study may also have promoted the growth of Synechococcus, which correspondingly caused an increase in Synechococcus phages after the bloom. Verrucomicrobia has been reported to possess numerous pathways for the assimilation of cyanobacterial extracellular polymeric substances during cyanobacterial blooms (47) and has also acquired a complex machinery for the degradation of brown macroalgal polysaccharide fucoidans (48).
In this study, the increased abundance of a Verrucomicrobia phage (Fig. 4b) and its putative hosts after blooms (Liu et al., submitted; Table S5) may be due to the polysaccharide released by cyanobacteria or U. prolifera.
Due to limitations in the current virus databases and metagenomic analysis techniques, most vOTUs are unknown. However, their dominance in Qingdao coastal environments suggests that they may infect some abundant bacterial and eukaryotic populations that have not yet been identified. Since unknown viral populations account for a large portion of the viruses, and since their potential hosts and ecological roles remain largely unknown, it is clearly necessary to better understand these cryptic viral groups.
Potential virus-mediated sulfur and carbon metabolism of the organic matter released by green tides. As viruses carry and express some AMGs to mediate the metabolism of the host cells, such as genes involved in photosynthesis and in carbon, sulfur, and nitrogen metabolism, viruses can indirectly affect biogeochemical cycles (14). In this study, AMGs for sulfur metabolic cycles and CAZyme genes were found in the QDCV database (Fig. 6), which suggests that the viruses may assist their hosts with the degradation of the DOM from the U. prolifera blooms during the infection process and thus provide energy for their own reproduction. cysC, cysK, and cysH are predicted to participate in assimilatory sulfate reduction, and msmA is capable of degrading methanesulfonate (49); all of these AMGs have a wide geographical distribution (50). The moeB-related sulfur relay system genes were most prevalent in archaeal viral genomes and are involved in the ubiquitination process (51). Through post-translational modification of proteins, this system regulates several cellular processes, making it an ideal target for viruses to facilitate viral replication (52). Mec is a sulfur carrier protein that participates in L-cysteine biosynthesis and has not been found in other viral genomes. However, some other viral-encoded sulfur carrier proteins, such as dsrC, tusE-like, and soxYZ, have been found in the Integrated Microbial Genomes/Viruses (IMG/VR v2.1) database, and phage sulfur carriers were found to be more abundant than catalytic subunits due to the greater need for sulfur carriers to drive dissimilatory sulfur transformations (53). The vOTU-encoded AMGs related to sulfur metabolism and the sulfur relay system were more abundant during the bloom and postbloom stages, respectively (Fig. 6a), suggesting their potential roles in the degradation of sulfated DOM (3).
Most AMGs were affiliated with carbohydrate metabolism, indicating that viruses can influence the prokaryotic metabolisms of carbon cycling and organic carbon decomposition during the different stages of the bloom (Fig. 6a). The high abundance of vOTUs encoding glycosyltransferase families (including α-1,2-fucosyltransferase and other unclassified GT families) and glycoside hydrolase families (including polygalacturonase, lytic murein transglycosylase, and other unclassified GH families) during the bloom stage might indicate that the viruses encoding AMGs can assist bacteria in degrading the large amount of DOC from U. prolifera (Fig. 6a) (3,9); this is supported by the presence of AMGs related to central carbon metabolism and viral replication in other marine environments (54). Virus-encoded CAZyme genes have also been proposed to enhance the breakdown of complex carbohydrates to promote the metabolism and energy production of the host during viral infection (55,56). In this study, several novel AMGs related to carbohydrate metabolism were detected, such as α-1,2-fucosyltransferase, polygalacturonase, and lytic murein transglycosylase (Fig. 6a).
Massive macroalgal blooms can cause the death of other chitin-rich organisms, including protozoa and invertebrates, through the development of hypoxia (3,9). In this study, the relative abundance of vOTUs encoding the chitinase and peptidoglycan-binding protein increased at the TB stage or after the bloom (Fig. 6a), suggesting that the viruses may be related to chitin-degrading bacteria (57). Viral-encoded chitinase has been detected in flavobacterial phages, whose hosts are usually abundant during algal blooms, and may function in degrading bacterial peptidoglycan (58,59).
Conclusion. This study, for the first time, used a metagenomic analysis to investigate the community structure, diversity, life strategy, and AMGs of DNA viruses in coastal waters during a U. prolifera bloom. A large proportion of the viruses identified were novel, and the viral community structure (both of the known and unknown viruses) and AMGs showed a clear succession during the different stages of the bloom, which was consistent with the spatiotemporal dynamics of their putative hosts. For the known viruses, viruses linked to hosts associated with the degradation of algal-derived organic matter were found, such as Sulfitobacter, Cycloclasticus, Vibrio, Bacilli, and Roseobacter phages. AMGs related to sulfur (such as cysC, cysK, cysH, and msmA) and carbon (such as polygalacturonase, α-1,2-fucosyltransferase, and chitinase) metabolism were also discovered, and they may assist the hosts in degrading the algal-derived and other DOM. Whether the viruses will assist the host in degrading the labile, semilabile, and refractory DOM requires further study. This study has enhanced our understanding of the interactions between viruses and their hosts and created a preliminary view of the ecological roles of viruses in coastal waters affected by the largest U. prolifera green tide in the world.
(Fig. 1a). This included six time points (Julian days 165, 181, 193, 208, 243, and 271) and covered the bloom stages from the arrival of the bloom to 2 months after the bloom in 2017. All of the seawater samples were filtered through 200-, 3.0-, and 0.2-μm filters (Millipore, Burlington, MA, USA) to remove particles and larger cellular microorganisms. The 3.0- and 0.2-μm filters were used for 18S and 16S sequencing analysis, respectively (Liu et al., submitted). Seawater was collected with Niskin bottles for nutrient (SiO3, NH4, NO3, and NO2) and chl.
a concentrations, which were determined by a four-channel AA3 continuous flow AutoAnalyzer (60) and the extractive fluorescence method on a Turner Designs 10-AU field fluorometer (61), respectively. Triplicate samples (1.5 mL) for viral and bacterial abundance were fixed with glutaraldehyde (final concentration, 0.5%), frozen in liquid nitrogen, and stored at −80°C until analysis (62).

MATERIALS AND METHODS
Viral and bacterial abundance. Viral abundance was estimated following previous methods (62, 63) with some modifications. The fixed and frozen samples (−80°C) were thawed at 37°C, diluted 10- to 100-fold with 0.02-μm-filtered Tris-EDTA buffer (pH 8) (Sigma-Aldrich, Hamburg, Germany), and stained with SYBR green I (final concentration of 0.5 × 10⁻⁴ of the Molecular Probes stock solution; Thermo Fisher, USA) in the dark at 80°C. Then, the incubated samples were cooled at room temperature for 5 min and analyzed with a CytoFLEX flow cytometer (Beckman, Shanghai, China) at a total volume of 30 μL sample⁻¹. For bacterial enumeration (63,64), the thawed samples were diluted 10-fold with 0.02-μm-filtered Tris-EDTA buffer, stained with SYBR Gold (final dilution of 10⁻⁴ of the commercial stock solution) for 15 min in the dark, and analyzed with the CytoFLEX flow cytometer (Beckman) for 1 min at a delivery rate of 60 μL min⁻¹.
Viral concentration, DNA extraction, and metagenomic sequencing. The FeCl3-mediated flocculation method (65) was used to enrich the viruses. In brief, 2.5 mL of 10 g/liter Fe stock (FeCl3) solution was added to the 0.2-μm-filtered seawater and mixed thoroughly. After incubating at room temperature for at least 40 min, the sample was passed through a polycarbonate membrane filter (0.8 μm; Millipore), and the filter was stored at 4°C until analysis. The filter membranes with concentrated viral particles were resuspended in 0.1 M EDTA, 0.2 M MgCl2 buffer (pH 6.0) at 4°C. The viruses were concentrated to about 400 mL by centrifugal ultrafiltration (membrane package: 122 Pellicon 2 Cassette, Biomax, 50 kDa; polyethersulfone). Total DNA was extracted using the QIAamp DNA minikit (Qiagen) according to the manufacturer's instructions. Library construction was conducted using a NEBNext Ultra DNA Library Prep kit (New England Biolabs, Ipswich, MA, USA), and high-throughput sequencing of the viral DNA was performed using the Illumina NovaSeq 6000 (paired-end sequencing, 2 × 150 bp) platform (Novogene Bioinformatics Technology Co., Ltd., Nanjing, China).
Quality control, assembly, and identification of viral contigs. Adapters were removed from the raw reads using Cutadapt (66), and high-quality paired-end reads were retained by Perl scripts using the following conditions: (i) no more than 20% of bases with a quality score less than 20 and (ii) no more than 30% of bases with a quality score less than 30 (67). High-quality reads in each sample were assembled using metaSPAdes (v3.12.0) (68). Contigs shorter than 3 kb were removed. The quality of these metagenomic assemblies was evaluated using MetaQUAST (v5.0.2) (69). Then, VirFinder and VirSorter (v1.0.6) were used to identify the viral contigs (70,71). The identification criteria for viral contigs were as follows: (i) VirFinder with a score ≥ 0.9 and P < 0.05; (ii) VirSorter in categories 1, 2, 4, and 5; (iii) VirFinder with a score ≥ 0.7 and P < 0.05 and VirSorter in categories 1 to 6; and (iv) contigs detected as virus by CAT, which utilizes the last common ancestor (LCA) of conservative open reading frames (ORFs) to determine the taxon (72).
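The four identification criteria above can be expressed as a small decision function. The following is a minimal sketch only: it assumes that a contig is retained when any one of the criteria is satisfied (the text lists the criteria without stating how they are combined), and the `ContigCall` record and its field names are hypothetical containers for per-contig tool outputs, not part of VirFinder, VirSorter, or CAT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContigCall:
    """Hypothetical per-contig summary of the three tool outputs."""
    virfinder_score: Optional[float] = None   # None if VirFinder gave no call
    virfinder_p: Optional[float] = None
    virsorter_category: Optional[int] = None  # 1 to 6, None if not reported
    cat_is_virus: bool = False                # CAT LCA placed the contig in viruses


def is_viral(call: ContigCall) -> bool:
    """Return True if the contig meets any of criteria (i) to (iv).

    Assumption: the criteria are alternatives (logical OR).
    """
    score = call.virfinder_score if call.virfinder_score is not None else 0.0
    significant = call.virfinder_p is not None and call.virfinder_p < 0.05

    crit_i = significant and score >= 0.9
    crit_ii = call.virsorter_category in {1, 2, 4, 5}
    crit_iii = (significant and score >= 0.7
                and call.virsorter_category in {1, 2, 3, 4, 5, 6})
    crit_iv = call.cat_is_virus
    return crit_i or crit_ii or crit_iii or crit_iv


if __name__ == "__main__":
    examples = {
        "high VirFinder score": ContigCall(0.95, 0.01),
        "moderate score plus VirSorter cat. 3": ContigCall(0.75, 0.01, 3),
        "moderate score only": ContigCall(0.75, 0.01),
        "CAT only": ContigCall(cat_is_virus=True),
    }
    for label, call in examples.items():
        print(label, "->", is_viral(call))
```

Under this reading, criterion (iii) matters mainly for VirSorter categories 3 and 6, which are not accepted on their own by criterion (ii).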
The viral taxonomic classifications of the vOTUs were annotated by CAT (72); the ORFs were predicted using Prodigal and then searched against the NCBI-nr database (version of 18 June 2020), and the viral taxonomic classifications were generated based on the LCA scores of ORFs. The NCBI database was searched manually to correct the taxonomic information when necessary. The lifestyle of the viruses (lytic or lysogenic) was determined by VIBRANT (v1.2.1) using the default parameters (27).
Functional annotation and abundance of ORFs and AMGs. ORFs were predicted using metaProdigal (80). VIBRANT annotations were performed on viral contigs, and the categories "metabolic pathways" and "sulfur relay system" were considered potential AMGs (27). To further identify the protein domains, PfamScan was applied by comparing viral ORFs to the Pfam database (E value < 1 × 10⁻⁵; bit score > 30; version Pfam33.1) (81). To detect the AMGs of viral contigs, the viral ORFs were searched by BLASTp against the CAZymes database (version of 30 July 2020) to find carbohydrate metabolism-related genes using DIAMOND (E value < 1 × 10⁻⁵; bit score > 50) and searched for sulfur and nitrogen metabolic genes using GhostKOALA against the KEGG GENES database (https://www.kegg.jp/ghostkoala/, version 2.2). The relative abundance of viral ORFs was calculated for each sample by summing the relative abundance of each ORF and normalizing to the abundance of the vOTUs in which it was encoded.
Co-occurrence network analysis. To cluster and place the vOTUs (≥10 kb) in the context of known viruses, the predicted proteins were clustered with predicted proteins from viral sequences in public databases (v201, version of 31 July 2021). All proteins were compared using all-versus-all DIAMOND (v0.9.29.130) BLASTp (E value ≤ 1 × 10⁻⁵; query coverage ≥ 50%; identity ≥ 25%) (82). A similarity score for each pair of genomes was calculated using vConTACT2 (https://bitbucket.org/MAVERICLab/vcontact2, accessed 8 May 2020) as the negative logarithm of the hypergeometric similarity P value multiplied by the total number of pairwise comparisons. For comparison with other environmental viromes, we also placed the vOTUs (≥10 kb) with the viral reference (4,086 vOTUs) and high-quality (21,148 vOTUs) predicted complete viral genomes (vOTUs) from marine, freshwater, terrestrial soil, wastewater environment, and algae-host-associated genomes in the IMG/VR v3 data set (29) using vConTACT2.
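The similarity score described above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the vConTACT2 implementation: it assumes that the hypergeometric P value is the upper-tail probability of two genomes sharing at least the observed number of protein clusters by chance, and that a base-10 logarithm is used. The function names and the toy numbers are illustrative only.

```python
import math


def hypergeom_upper_tail(shared: int, a: int, b: int, total: int) -> float:
    """P(X >= shared) when genomes with a and b protein clusters are drawn
    from a pool of `total` clusters (hypergeometric distribution)."""
    denominator = math.comb(total, b)
    numerator = sum(
        math.comb(a, k) * math.comb(total - a, b - k)
        for k in range(shared, min(a, b) + 1)
    )
    return numerator / denominator


def similarity_score(shared: int, a: int, b: int, total: int,
                     n_comparisons: int) -> float:
    """Negative log10 of (P value x number of pairwise comparisons)."""
    expected = hypergeom_upper_tail(shared, a, b, total) * n_comparisons
    if expected == 0.0:
        # The P value underflowed; treat the score as unbounded.
        return math.inf
    return -math.log10(expected)


if __name__ == "__main__":
    # Toy example: two genomes with 3 protein clusters each, all 3 shared,
    # out of 10 clusters in total, and a single pairwise comparison.
    print(similarity_score(shared=3, a=3, b=3, total=10, n_comparisons=1))
```

Multiplying by the number of comparisons acts as a multiple-testing correction, so the same degree of gene sharing yields a lower score in a larger network.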
The network structure was used to determine the potential interactions between viral and host communities based on the Pearson index between viral clusters (VCs) and host OTUs. Only robust (r > 0.8 or r < −0.8) and statistically significant (P < 0.05) correlations were included in the network analysis.
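The correlation-strength part of this filter can be sketched as follows. The example computes the Pearson coefficient between abundance profiles and keeps only pairs with r > 0.8 or r < −0.8. It deliberately omits the significance test (P < 0.05), which would additionally be required to reproduce the stated criteria, and the profile names and values are invented for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; NaN if either profile is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def strong_edges(vc_profiles: Dict[str, List[float]],
                 host_profiles: Dict[str, List[float]],
                 threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Return (VC, host, r) for every pair with |r| above the threshold.

    Note: no P value is computed here; this covers the strength filter only.
    """
    edges = []
    for vc, x in vc_profiles.items():
        for host, y in host_profiles.items():
            r = pearson_r(x, y)
            if not math.isnan(r) and abs(r) > threshold:
                edges.append((vc, host, r))
    return edges


if __name__ == "__main__":
    # Hypothetical abundance profiles over six sampling days.
    vcs = {"VC_1": [1, 2, 3, 4, 5, 6]}
    hosts = {
        "host_A": [2, 4, 6, 8, 10, 12],  # rises with VC_1
        "host_B": [6, 5, 4, 3, 2, 1],    # falls as VC_1 rises
        "host_C": [1, 3, 2, 3, 1, 2],    # unrelated
    }
    for vc, host, r in strong_edges(vcs, hosts):
        print(vc, host, round(r, 3))
```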
Statistical analysis. All statistical analyses were performed in R. α-Diversity and β-diversity of viral communities were calculated using the vegan package v2.5. PCA and CCA based on Bray-Curtis dissimilarities were generated from vOTU tables with viral abundances (TPM) using the vegdist function (method "bray"). For analysis of vOTU and VC dynamics, an abundance (TPM) higher than the mean value for a given vOTU/VC was considered to belong to a peak of abundance (83). The vOTUs/VCs had three kinds of peaks in relative abundance: (i) single peaks (including vOTU/VC peaks of relative abundance on days 165 and 181, single 1; on days 193 and 208, single 2; on days 243 and 271, single 3; and on days 165 and 271, single 4), (ii) wide peaks, which means that the peak extends over 1 or 2 adjacent Julian days on the basis of the single types (again, including all four subtypes), and (iii) several peaks (other than the single- and wide-peak types) of abundance.
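The α-diversity indices and the above-mean peak rule can be illustrated with a short calculation. This is a minimal sketch in Python rather than the R/vegan workflow used in the study. It assumes the natural logarithm for Shannon's H and defines Pielou's J as H divided by the logarithm of the number of vOTUs with nonzero abundance. The function names and the example abundances are illustrative only, and the sketch does not attempt the single/wide/several classification.

```python
import math
from typing import List, Sequence


def shannon_h(abundances: Sequence[float]) -> float:
    """Shannon's H = -sum(p_i * ln p_i) over vOTUs with nonzero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_j(abundances: Sequence[float]) -> float:
    """Pielou's evenness J = H / ln(S), with S the number of nonzero vOTUs."""
    richness = sum(1 for a in abundances if a > 0)
    if richness < 2:
        return float("nan")  # evenness is undefined for fewer than two vOTUs
    return shannon_h(abundances) / math.log(richness)


def peak_days(days: Sequence[int], abundances: Sequence[float]) -> List[int]:
    """Days on which the abundance exceeds the mean across all days."""
    mean_abundance = sum(abundances) / len(abundances)
    return [d for d, a in zip(days, abundances) if a > mean_abundance]


if __name__ == "__main__":
    sample = [10, 10, 10, 10]          # four equally abundant vOTUs
    print(shannon_h(sample), pielou_j(sample))

    days = [165, 181, 193, 208, 243, 271]
    tpm = [1, 1, 1, 1, 10, 10]         # hypothetical postbloom vOTU
    print(peak_days(days, tpm))
```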
Availability of data. The raw reads data reported in this paper were previously deposited in the NCBI database (under Bioproject number PRJNA797266).

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.