Giant Virus Infection Signatures Are Modulated by Euphotic Zone Depth Strata and Iron Regimes of the Subantarctic Southern Ocean

ABSTRACT Viruses can alter the abundance, evolution, and metabolism of microorganisms in the ocean, playing a key role in water column biogeochemistry and global carbon cycles. Large efforts to measure the contribution of eukaryotic microorganisms (e.g., protists) to the marine food web have been made, yet the in situ activities of the ecologically relevant viruses that infect these organisms are not well characterized. Viruses within the phylum Nucleocytoviricota (“giant viruses”) are known to infect a diverse range of ecologically relevant marine protists, yet how these viruses are influenced by environmental conditions remains under-characterized. By employing metatranscriptomic analyses of in situ microbial communities along a temporal and depth-resolved gradient, we describe the diversity of giant viruses at the Southern Ocean Time Series (SOTS), a site within the subpolar Southern Ocean. Using a phylogeny-guided taxonomic assessment of detected giant virus genomes and metagenome-assembled genomes, we observed depth-dependent structuring of divergent giant virus families mirroring dynamic physicochemical gradients in the stratified euphotic zone. Analyses of transcribed metabolic genes from giant viruses suggest viral metabolic reprogramming of hosts from the surface to a 200-m depth. Lastly, using on-deck incubations reflecting a gradient of iron availability, we show that modulating iron regimes influences the activity of giant viruses in the field. Specifically, we show enhanced infection signatures of giant viruses under both iron-replete and iron-limited conditions. Collectively, these results expand our understanding of how the water column’s vertical biogeography and chemical surroundings affect an important group of viruses within the Southern Ocean.
IMPORTANCE The biology and ecology of marine microbial eukaryotes are known to be constrained by oceanic conditions. In contrast, how viruses that infect this important group of organisms respond to environmental change is less well known, despite viruses being recognized as key microbial community members. Here, we address this gap in our understanding by characterizing the diversity and activity of “giant” viruses within an important region in the sub-Antarctic Southern Ocean. Giant viruses are double-stranded DNA (dsDNA) viruses of the phylum Nucleocytoviricota and are known to infect a wide range of eukaryotic hosts. By employing a metatranscriptomics approach using both in situ samples and microcosm manipulations, we illuminated both the vertical biogeography and how changing iron availability affects this primarily uncultivated group of protist-infecting viruses. These results serve as a foundation for our understanding of how the open ocean water column structures the viral community, which can be used to guide models of the viral impact on marine and global biogeochemical cycling.

Algavirales, and Pimascovirales, and that the Mesomimiviridae family made up >80% of all giant virus transcripts across the samples and was prevalent from the surface to a 200-m depth. Giant virus metabolic gene transcripts were abundant throughout the euphotic zone, suggesting they may contribute to key metabolic processes such as photosynthesis and nutrient acquisition. Finally, we found evidence for enhanced infection signatures under iron-limited and iron-replete conditions by leveraging metatranscriptomes generated from incubations across a range of iron availabilities.

RESULTS
Physicochemical status of the water column at the SOTS. A temporal and depth-resolved metatranscriptomic data set was generated at the Southern Ocean Time Series (SOTS), located alongside the northern edge of the subantarctic zone (SAZ) near the subtropical front (STF). An overview of the cruise track has been previously described by Schallenberg et al. (31), noting frequent intersection during our expedition with an eddy that had surface temperatures of >14°C. Sea surface temperatures during the first three dates ranged from 10.5 to 11.5°C, whereas the final time point sampled was ~13°C, indicating potential interference with this eddy during the March 17 time point. Across all depths sampled, water column temperature ranged between 9 and 13°C and salinity ranged from 34 to 35 up to a 500-m depth (Fig. S2 and Table S1). For the March 5, 7, and 9 time points only, NO3 varied between 5 and 15 µmol L⁻¹, NO2 ranged from 0 to 0.4 µmol L⁻¹, NH4 from 0 to 0.5 µmol L⁻¹, PO4 between 0.6 and 1.2 µmol L⁻¹, and Si from 0.5 to 5 µmol L⁻¹ to 500 m deep (Fig. S2 and Table S1). Dissolved iron concentrations in the upper 50 m ranged between 0.15 and 0.35 nmol L⁻¹, and their variability mirrored salinity and temperature in this stratum. A peak in relative chlorophyll a fluorescence occurred at around 25- to 50-m depth on March 5, 7, and 9 and at around 50- to 100-m depth on March 17 (Fig. S2).
Composition of the microeukaryote community. Taxonomic distribution of metatranscriptomic reads in the assembly revealed a dominance of Dinophyceae (dinoflagellates), Prymnesiophyceae (haptophytes), and Bacillariophyta (diatoms) in the transcriptionally active pool of microeukaryote protists across the spatiotemporal in situ sampling (Fig. S3). Reads were also assigned to Pelagophyceae (pelagophytes), Spirotrichea (ciliate protozoa), and Mamiellophyceae (chlorophytes), which were present at proportionally lower transcript abundance throughout the in situ samples (Fig. S3). The proportions of each taxon remained relatively consistent throughout dates and depths; however, we saw decreased proportions of haptophyte reads and/or increased proportions of dinoflagellate and diatom reads with depth (~100 to 200 m, Fig. S3).
Detection of the active giant virus community. We detected 100 unique, transcriptionally active giant virus genomes/metagenome-assembled genomes (MAGs) across the sampling set (Fig. S4), contributing 0.0005 to 0.0335% of the total reads. Of these genomes, half had >50% of their total genes detected and four had >80% of their genes detected (Fig. S4). The total normalized reads across these genomes were each proportional to their mean gene count (Fig. S4).
Defining the vertical biogeography of active giant viruses. To resolve vertical patterns associated with giant virus phylogeny across depth strata sampled at SOTS, […] Fig. S6). Examining the proportions of transcripts assigned, summarized by family, showed patterns with depth sampled as well (Fig. 2). Generally, transcripts assigned to Mesomimiviridae family genomes dominated the proportion of the transcript pool from the surface (5 to 15 m) to 200-m depth across the sampling dates (Fig. 2A). Transcripts assigned to genomes within the IM_09 family displayed a higher proportion at 100 m (March 5), 125 m (March 7), 150 m (March 9), and 150 m (March 17) compared to their proportions at shallower depths (Fig. 2A). Genomes assigned to the proposed family Prasinoviridae dominated the transcript pool within the order Algavirales, and generally AG_04 transcripts decreased in proportion with depth (Fig. 2A). Depth-specific patterns for transcripts assigned to PM_01 genomes were not as prominent due to their low contribution to the transcript pool across all depths and dates; however, their highest contribution occurred at ~90 m depth on March 7 (Fig. 2A). Summed transcripts across all giant virus MAGs/genomes were highest in the surface layer of the water column and decreased with increasing depth (Fig. 2B). The shifting distribution of nutrients in the upper water column measured during our sampling period correlated with the transcript abundance patterns of giant viruses (Fig. 3C). Where the nutrients nitrate, phosphate, and silicate increased at ~100 to 200 m, there was a decrease in relative transcript abundance across all giant virus families (Fig. 2C). This decrease in giant virus transcripts also coincided with a decreased abundance of total transcripts assigned to the gene encoding the eukaryotic DNA-directed RNA polymerase large subunit marker protein (RPB1, Fig. 2C).
Five giant virus genomes of the Mesomimiviridae family were positively correlated with chlorophyll a fluorescence (Spearman's r ≥ 0.8, P adj < 0.1), one was positively correlated with NH4 (r ≥ 0.8, P adj < 0.1), and two were negatively correlated with NO3 (r ≤ −0.8, P adj < 0.1) (Fig. S7). One IM_07 genome was also negatively correlated with NO3 (r ≤ −0.8, P adj < 0.1) and one IM_06 genome was positively correlated with chlorophyll a fluorescence (r ≥ 0.8, P adj < 0.1) (Fig. S7).
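The correlation screen described above can be illustrated with a minimal Python sketch that computes Spearman's rank correlation between a genome's transcript abundance and an environmental variable, and then adjusts a set of P values. The Benjamini-Hochberg adjustment is an assumption, because the method behind P adj is not stated in this section. The function names, the depth-profile values, and the raw P values are hypothetical and are not data from this study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_r(x, y):
    """Spearman's r: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (assumed adjustment method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical depth profile for one genome (illustrative values only).
chl_fluorescence = [0.9, 1.4, 1.1, 0.5, 0.2, 0.1]
genome_transcripts = [120.0, 310.0, 150.0, 40.0, 12.0, 3.0]
nitrate = [5.0, 5.5, 6.0, 9.0, 12.0, 15.0]

r_chl = spearman_r(genome_transcripts, chl_fluorescence)
r_no3 = spearman_r(genome_transcripts, nitrate)
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])  # hypothetical raw P values

print(f"r (chlorophyll a) = {r_chl:.3f}, r (NO3) = {r_no3:.3f}")
print("adjusted P values:", [round(p, 4) for p in p_adj])
```

Under these assumptions, a genome would be reported as correlated with a variable when |r| ≥ 0.8 and its adjusted P value is below 0.1.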
Transcribed giant virus structural and metabolic genes with depth. We assessed depth-specific patterns of the pool of transcribed functional metabolic genes harbored within the giant virus genomes to gain insight into other metabolic consequences of infection across the depth strata. Here, transcripts for the genes encoding "core" giant virus proteins used to generate the concatenated phylogeny ( (Fig. 3). Transcripts for the giant virus MCP were present across all depths and dates, and MCP was the most abundant of the core proteins (Fig. 3A).
Transcripts recruited to genes harbored within the giant virus genomes and categorized as "Cytoskeleton," "Glycolysis/Gluconeogenesis," "Light Harvesting/Energy Production," "Nutrient Metabolism," "Oxidative Stress," "Pentose Phosphate Pathway," "Citric Acid Cycle," "Transcription," and "Transport" were queried (Fig. 3). Transcripts of giant virus genes encoding "Light Harvesting/Energy Production" proteins (chlorophyll a binding protein, bacteriorhodopsin, and cytochrome b6-f complex [PetC]) were present throughout all depths and dates (Fig. 3A). Specifically, transcripts for the gene encoding chlorophyll a binding protein were most prevalent at the surface depths (15 to 50 m) but could also be detected at 200 m depth (March 5) and 70 to 90 m depth (March 17, Fig. 3A). Transcripts for genes encoding bacteriorhodopsin-like proteins were consistently present at the surface depths (15 to 40 m, excluding March 9), as were transcripts for PetC (excluding March 7, Fig. 3A). Transcripts recruited to genomes with genes encoding "Light Harvesting/Energy Production" proteins were phylogenetically assigned to the Mesomimiviridae, IM_06, IM_12, and Prasinoviridae families (Fig. 3B). Transporters indicating nutrient acquisition were prevalent across the genomes (Fig. 3). Abundant transporters included an ABC transporter detected from 15 to 90 m depth (March 5 and 17), an ammonium transporter detected from 15 to 90 m depth, and a phosphate transporter detected from 15 to 90 m depth (Fig. 3A). Other transporters (Co/Mg transporter, drug/metabolite transporter, sulfite transporter, ion transporter, and a type-2 periplasmic binding transport component) had transcripts detected sporadically across all depths and dates (Fig. 3A). All giant virus families, except IM_06, IM_08, IM_12, and PM_01, had transporters encoded within their genomes that recruited transcripts (Fig. 3B).
Viruses Infecting Protists in the Southern Ocean mSystems
Genes encoding various nutrient metabolism pathways were also transcribed across the water column (Fig. 3A). The most prominent of these include the genes encoding glutamine synthetase (present from 15- to 90-m depth) and the phosphate starvation-inducible protein (PhoH) (present at 15-m depth on March 7 and 40- to 90-m depth on March 17, Fig. 3A). These nutrient metabolism genes were encoded by the Prasinoviridae, Mesomimiviridae, and IM_07 giant virus families (Fig. 3B).
Transcripts of genes within central carbon metabolism pathways (citric acid cycle, pentose phosphate pathway, and glycolysis/gluconeogenesis) were less prevalent throughout the water column. Transcripts for the phosphoglycerate mutase 2 and malate synthase genes were detected from 15 to 200 m deep across the sampling series (Fig. 3A). Detection of transcripts for other central carbon genes was more sporadic across depth and date (Fig. 3A). The Mesomimiviridae and IM_07 giant virus families harbored genes with detected transcripts belonging to all three central carbon metabolic pathways (Fig. 3B).
Transcripts for genes encoding heat shock proteins (HSP70, HSP90), thioredoxins, and cold shock proteins (CSP) within the "Oxidative Stress/Transcription" categories were prevalent and abundant throughout the depths and dates sampled (15- to 100-m depth, Fig. 3A). All giant virus families, except the Mimiviridae and PM_01, harbored genes related to oxidative stress that had transcripts detected (Fig. 3B). The Mesomimiviridae, IM_06, IM_07, and IM_09 families harbored CSP genes with detected transcripts within the data set (Fig. 3B).
Impact of iron availability on giant viruses revealed by on-deck iron incubations. Because the Southern Ocean is known to be seasonally limited by the trace micronutrient iron (29), we investigated the response of active giant viruses across a gradient of iron availability using previously published experimental incubations (33). Out of the 754 total dereplicated MCPs from the co-assembly and the giant virus genomes/MAGs (Fig. 4, Table S3), a total of 47 giant virus MCPs had significantly different (P ≤ 0.05, Mann-Whitney U test) normalized transcript values (variance stabilizing transformation [vst]) when comparing desferrioxamine-B (DFB)-added versus Fe-added incubations. There were 30 MCPs that had significantly (P ≤ 0.05) higher normalized transcript values across the DFB-added incubations than in the FeCl3-added incubations (Fig. 4). Out of these, seven MAG MCPs were assigned to the Mesomimiviridae, two of them within the g343 genus, which contains the cultured isolate reference genomes CeV and PgV (Fig. 4, Table S3). The genome MCP with the highest relative transcript values across the incubations and the in situ t = 0 h sample was assigned to the Mesomimiviridae g335 genus (no cultured isolates) and was significantly (P ≤ 0.05) higher within the DFB-added condition (Fig. 4, Table S3). There were 17 MCPs that had significantly (P ≤ 0.05) higher vst values across the FeCl3-added incubations than in the DFB-added incubations (Fig. 4). Only one phylogenetically assigned giant virus MAG was found within this group; it was assigned to the IM_12 g300 genus, which contains the chlorophyte-infecting Pyramimonas orientalis Virus (PoV01), Tetraselmis virus 1, and Dishui Lake large algae virus 1 isolates (Fig. 4).
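The per-MCP comparison described above can be sketched in Python as a two-sided Mann-Whitney U test on the vst values of one MCP. This is only an illustration: the software used for the test, the use of an exact rather than an asymptotic P value, and the pooling of all concentrations within each amendment into a single group are assumptions, and the vst values below are invented rather than taken from this study.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided P value from all relabelings of the pooled observations.

    Feasible only for small groups: 8 versus 8 bottles gives 12,870 relabelings.
    """
    pooled = list(x) + list(y)
    n_x, n_total = len(x), len(pooled)
    expected_u = n_x * (n_total - n_x) / 2  # mean of U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - expected_u)
    at_least_as_extreme = 0
    relabelings = 0
    for chosen in combinations(range(n_total), n_x):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(n_total) if i not in chosen_set]
        relabelings += 1
        if abs(mann_whitney_u(group_x, group_y) - expected_u) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / relabelings


# Hypothetical vst values for one MCP: 4 concentrations x 2 replicates per amendment.
dfb_vst = [7.9, 8.4, 8.1, 8.8, 8.6, 9.0, 8.3, 8.7]
fe_vst = [6.8, 7.2, 7.0, 7.5, 7.1, 7.4, 6.9, 7.3]

u_stat = mann_whitney_u(dfb_vst, fe_vst)
p_value = exact_two_sided_p(dfb_vst, fe_vst)
direction = "higher in DFB" if u_stat > len(dfb_vst) * len(fe_vst) / 2 else "higher in Fe"
print(f"U = {u_stat}, P = {p_value:.5f}, {direction}")
```

Repeating this test for each of the 754 dereplicated MCPs and keeping those with P ≤ 0.05 would split the significant set by direction, as in the 30 DFB-elevated and 17 Fe-elevated MCPs reported above.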

DISCUSSION
In this study, we describe the presence and patterns across environmental gradients of active viruses within the phylum Nucleocytoviricota (giant viruses) within a high-nutrient, low-chlorophyll (HNLC) Southern Ocean system. Samples of opportunity collected across various depths and […] community observed previously (26), we proposed that giant viruses were important contributors to top-down viral predation on the protist community in the Southern Ocean.
As in a study performed within a coastal marine system (30), we found that the phylogenetically rich virus orders Imitervirales and Algavirales were the predominant types within the detected giant viruses at SOTS. We detected only 7 genomes with relatively low transcript abundances in our data set that were assigned to the Algavirales proposed family, the Prasinoviridae, and "AG_04." However, the proportion of transcripts assigned to Imitervirales, specifically the Mesomimiviridae (IM_01) family, which contains giant viruses of Chrysochromulina (35) and Phaeocystis (36), made up >80% of all giant virus transcripts, whereas a more equal contribution of Algavirales-assigned viruses was observed in the coastal system (30). This suggested that the Mesomimiviridae may contribute primarily to protist-virus infection dynamics within the SAZ during this season, as haptophytes like Phaeocystis spp. have been shown to be the dominant protist taxon at SOTS (34). Further, one giant virus genome within the Mesomimiviridae g329 proposed genus contributed up to 15% of the total giant virus transcripts. Because only one genus within the Mesomimiviridae contains cultured representatives, it was difficult to infer the specific host range of these dominant giant viruses, and thus, further characterization of this diverse family of viruses is necessary. Regardless, we have evidence that a community of active giant viruses, primarily those that infect marine protists, like the Mesomimiviridae, may play a significant role in viral infection dynamics within this HNLC region and across systems of diverging productivity (30). That this family has been detected as the dominant active member in coastal and HNLC marine systems warrants further investigation of the host range of this phylogenetically rich group of viruses.
Using this phylogeny-guided approach, we tracked the depth-dependent patterns of giant viruses to infer taxon-specific lifestyles and putative host ranges. The Mesomimiviridae family dominated the proportion of giant virus transcripts within all surface samples. Furthermore, several genomes within the Mesomimiviridae were positively correlated with chlorophyll a fluorescence. This suggests that some members of the Mesomimiviridae primarily infect hosts residing at the sunlit surface, such as phototrophs. However, we also observed representation of Mesomimiviridae in samples collected up to a 200-m depth, suggesting active infection of hosts past the surface layer. This further illustrates the unknown dynamics of uncultivated Mesomimiviridae-like viruses as it may indicate infection of a wide variety of metabolically diverse hosts (photo-, hetero-, or mixotrophic) that persist with depth. Despite this, the observation of signatures of active infection of giant viruses past the sunlit surface layer is notable because most of our knowledge of virus-host interactions in the environment originates from surface-derived samples.
In contrast to the widespread contribution of active Mesomimiviridae giant viruses at SOTS, we saw taxon-specific localization with depth. Namely, the IM_09 viral family, which contains the isolated pelagophyte-infecting virus Aureococcus anophagefferens Virus (37), had increased transcript representation around the 100- to 150-m zone. This could signify a potential depth "hot spot" of the widespread algal group, Pelagophyceae (38), or related hosts around these depths. It is known that certain pelagophytes are adapted to living at attenuated light levels, specifically within deep chlorophyll maxima (DCM [39]). We did not observe a prominent DCM around this zone; however, the small peaks in relative chlorophyll a fluorescence around these depths could indicate smaller or less abundant fluorescent cells. Because total giant virus transcripts decreased in abundance past ~90 m, it is likely that this host community is present in low abundance around this zone. Indeed, we saw an overall reduction of host signatures concurrent with increased nutrients (e.g., nitrate, phosphate, and silicate) with depth (suggesting reduced drawdown by the community). We thus hypothesize that the decreased proportions of active giant viruses with depth are more tightly linked to host abundance and physiological state and that these viruses are most infective within the productive surface layers. This is in contrast to the observation of subsurface viral particle maxima seen frequently in stratified oceanic systems (40), where harmful UV radiation may play a role in modulating bacteriophage distributions in the euphotic zone.
Giant viruses carry genes indicative of cellular metabolic pathways (12,41) that may allow hijacking of host metabolism for viral replication or even provide an ecological advantage to the host under certain environmental conditions (42). Thus, insight into the presence and contribution of giant virus metabolic genes with depth can provide evidence of the putative host ranges and potential biogeochemical impacts of viral infection in the open ocean environment aside from cellular lysis and death. We observed signatures of cellular metabolism contributed by giant viruses transcribed from 15- to 200-m depth at SOTS. For example, transcripts of genes related to photosynthetic processes were abundant within the Mesomimiviridae, and these transcripts were detected from the surface down to ~90 m depth, where we found signatures of this giant virus family, providing further evidence of a host range that includes phototrophs. In contrast, we did not observe transcripts of energy-generating processes encoded in the genomes of IM_09, a family of viruses which putatively infect pelagophyte-like hosts, whereas these transcripts were detected in abundance during the coastal study (30). This could reflect the depth localization of this group of viruses (and/or hosts) in which they persist at attenuated light levels, or it may be due to low levels of viral infection that occur with depth. Likewise, transcripts belonging to central carbon metabolic processes (citric acid cycle, pentose phosphate pathway, glycolysis/gluconeogenesis) were sparsely detected in the depth profiles compared to those in a coastal study where these genes were consistently expressed throughout the diel time series (30). This could be a feature unique to open-ocean environments compared to more productive coastal systems or because virally encoded central carbon metabolic genes are not as readily detected within bulk metatranscriptomic data.
Interestingly, we detected numerous transcripts indicative of nutrient acquisition processes. The most prevalent were assigned to ammonium and phosphate transporters, which may indicate elevated virocell nutrient requirements for viral production or a response to nutrient availability in the water column (43,44). Additionally, the presence of abundant transcripts assigned to an "ABC-transporter" indicates active transport of some unknown substrate. Although we cannot currently determine the specific substrate, it is possible that giant viruses contribute to the active uptake of ferric iron sources in the water column, as we have shown active uptake of ferric iron and putative iron-siderophores by cyanobacteria, indicating iron-limited conditions during the expedition in the SAZ (33). In addition to nutrient acquisition processes, we saw transcription of genes encoding internal nutrient regulators such as PhoH and glutamine synthetase (45,46). The detection of these metabolic genes in our data set suggests that giant viruses contribute to host nutrient-cycling processes throughout the water column, and thus raises the question of how an infected virocell contributes differently than an uninfected cell to water column biogeochemical cycles in the open ocean. However, we recognize that we lack a more direct understanding of the actual/realized contributions of giant virus-encoded metabolic genes to their host, so further laboratory experiments confirming the functionality of these genes are needed to interpret these observations.
One of the defining biogeochemical features of the Southern Ocean is the emergence of a seasonally iron-limited microbial community in HNLC regions (29). Indeed, an increase in the photosynthetic health of the phytoplankton community in response to iron additions via on-deck incubations indicated that the community in the SAZ during the time of sampling was partly iron-limited (31,33). Here, we were also able to stress the community for iron by adding the iron chelator desferrioxamine-B, resulting in profound physiological and transcriptomic responses within the cellular community (33,47). To date, the impact of the trace metal iron on the virus community has been predominantly characterized for dsDNA bacteriophage, demonstrating the importance of iron to viral reproduction and the regeneration of bioavailable iron (21,23,48-51). Iron limitation has been previously shown to negatively impact protist-infecting viruses, such as diatom-infecting RNA viruses (52) and two cultured giant virus-host model systems (28). Therefore, we tested the hypothesis that the availability of iron in the SAZ would negatively influence giant virus infection signatures using the metatranscriptomes generated from the iron incubations. Interestingly, we saw divergent responses to iron availability across the giant virus community, where one subset of the lytic marker gene encoding the structural protein MCP increased in transcript abundance under low-iron conditions (+DFB) and another subset increased under high-iron conditions (+Fe). This "high-iron" subset included an MCP originating from a giant virus MAG within the IM_12 family that clusters with isolated chlorophyte-infecting viruses.
This increase in transcripts for MCP under high-iron conditions was expected, because it was hypothesized that viral infection would increase under iron-replete conditions due to the improved physiological state of the hosts, particularly photoautotrophs, which require large amounts of iron for optimal photosynthetic functioning (28,53). Another study hypothesized that a change in iron redox state and heightened antioxidant production in diatoms undergoing iron limitation was linked to reduced RNA virus infection (52). The mechanisms underlying the elevated signatures of giant virus infection under iron-replete conditions remain unclear, but these signatures suggest that at least a subset of the community responds positively to enhanced iron availability in the SAZ region of the Southern Ocean.
The observation of elevated giant virus MCP transcripts under low-iron conditions was unexpected but provides an alternate explanation as to why iron additions trigger bloom formation and sustained abundances of protists living in iron-limited environments (29). Indeed, all of the MCPs elevated under iron-stressed conditions which we were able to phylogenetically classify belonged to the Mesomimiviridae family, whose putative hosts, the haptophytes, dominate the protist community in the SAZ (34). We hypothesize that the physiological history and state of the host during infection play an important role in determining the outcome of virus replication under different iron regimes. For example, a lab-culture-based study which reduced available iron to the host-virus systems Micromonas pusilla-MpV and Phaeocystis globosa-PgV showed a reduction in burst size under iron-stress, although the infectivity of the PgV was unaffected (28). It was hypothesized that P. globosa displayed more efficient virus production during these conditions because it could sustain growth at lower iron concentrations than M. pusilla (28). Because it has been shown that phytoplankton within the SOTS community may be uniquely adapted to chronic low-iron conditions (33,47), the appearance of active giant viruses in situ and under DFB-added incubations suggests that these viruses may also be adapted to the iron regime in the SAZ. Alternatively, it is possible that the timescale of the lytic cycle was extended due to iron limitation, as proposed during a mesoscale iron fertilization experiment, although those findings were primarily in phage populations (49). Lastly, this observation could be the result of a reduced capacity for the host to defend against viral infection under iron-stressed conditions, leading to increased infection under iron limitation and suppressed infection under iron-rich conditions. 
However, because host defenses against viral infections are understudied in protist-virus models, it is unclear which systems were affected.
Overall, the patterns in giant virus transcriptional activity with respect to iron availability at SOTS suggest a potentially important role of key nutrients in altering infection dynamics within this seasonally iron-limited system. We provide evidence of enhanced infection signatures under both iron-limited and iron-replete conditions, although we cannot disentangle whether these are indirect effects due to the host cell physiological status or direct effects on the virus itself. Further research evaluating giant virus infection across a natural iron gradient with experimental incubations should be undertaken in areas of differing iron regimes to test whether host adaptation to varying iron availabilities impacts the response to changing conditions. Because our data reflect the short-term responses in gene expression to changes in iron availability (~72 h), we could only capture the early stages of infection and/or multiple stages of infection in the population. The application of more quantitative approaches (such as quantitative PCR), along with high-resolution time series, is needed to better resolve giant virus infection under iron limitation.
Conclusions. Overall, we identified 100 "active" giant viruses by recruiting metatranscriptomes to giant virus genomes/MAGs collected across a spatiotemporal sampling scheme within an HNLC region in the Southern Ocean. This work highlights the diversity of giant viruses of protists in a nutrient-limited system and provides evidence of active infection and host metabolic rewiring occurring vertically throughout the euphotic zone. Furthermore, we found that iron may play a key role in governing giant virus-host dynamics within this system, which has implications for the response of Southern Ocean microbial communities to changing iron regimes. Since viruses can constrain oceanic biogeochemical cycles and regulate host abundance and ecology, future work should incorporate the isolation and study of giant viruses within the Southern Ocean. Because giant viruses can infect a wide range of protists (e.g., grazers, phototrophs, etc.) that play important roles in the microbial food web, we propose that they should be considered integral components within Southern Ocean microbial communities.

MATERIALS AND METHODS
Water column sample collection and environmental parameters. Water column samples were collected at SOTS in March 2018, onboard the RV Investigator. Water column structure during the autumn season is typically stratified with a shallow mixed layer and contains elevated abundances of dominant microeukaryote groups (34). These "samples of opportunity" were collected along a spatiotemporal gradient during the expedition. Samples were collected prior to sunrise using 12-L Niskin bottles on a conductivity, temperature, and depth (CTD) rosette at 47°00901. […] and ammonium (NH4) for all sampling dates except March 17 were determined for unfiltered samples using a Seal AA3 segmented flow system following previous procedures (54). Samples for RNA were collected by filtering approximately 1 L of water on 0.2-µm pore Sterivex filter units; filters were immediately flash-frozen and stored in liquid nitrogen in the field, and then at −80°C until further processing. Filtration was typically completed within a 10- to 15-min time frame.
Iron incubation experimental design. An on-deck iron incubation experiment was performed ("Growout 1" or GRW1) to elucidate the response of the surface microbial community to a gradient in iron availability. A subset of these data (focusing on cyanobacteria) has been previously published (33). All procedures were performed under trace metal-clean conditions. Briefly, on March 5, unfiltered seawater for GRW1 was collected from 5 m depth using a trace metal-clean pump and poured into 2-L polycarbonate bottles, which had been cleaned and prepared prior to the incubation by soaking with acid (reagent-grade hydrochloric, 10%) and rinsing three times with seawater. This depth was chosen to capture surface-level community iron limitation, as we assumed this layer would typically be iron-depleted due to decreased inputs from deep waters as well as rapid uptake. These bottles were amended with either iron chloride (FeCl3) or DFB to increase or reduce available iron (Fig. S1) (55,56). Here, either FeCl3 or DFB was added to the following final concentrations: 0.25, 0.5, 1.0, or 2.5 nM (Fig. S1). All treatments were performed in biological duplicates. We included multiple concentrations of both FeCl3 and DFB to provide a stepwise gradient of iron availability from iron-stressed to iron-replete. The concentrations were chosen in order to sufficiently stress or replenish the community with iron and were based on a previous iron amendment experiment performed within a similar system near New Zealand (56). An unamended treatment (control) was included and a sample for a t = 0 h time point was collected from the water used for the incubations (Fig. S1). Bottles were incubated at ~33% incident irradiance and at ambient surface temperature (~11°C).
After 72 h, bottles were destructively sampled to collect cells for RNA by filtering 1 L of water through 0.2-µm pore Sterivex filter units; the filters were immediately flash-frozen and stored in liquid nitrogen in the field, and then at −80°C until further processing. The incubation time was chosen to examine the short-term responses of the microbial communities, mainly changes at the transcriptional level. Physiological measurements of the cellular community (chlorophyll a, Fv/Fm) can be found in Gilbert et al. (33).
RNA extraction, library preparation, and metatranscriptome pre-processing. RNA was extracted using a publicly available phenol-chloroform-based protocol (57), and DNA was removed using the Turbo DNA-free kit (Ambion), with several rounds of treatment performed if necessary until all DNA was removed. Due to low total RNA mass (<300 ng total RNA), the following samples were pooled: three depths from [...] from the BBtools packages (58). Within each set of samples (in situ versus GRW1 incubation), trimmed reads were concatenated and co-assembled (assembling multiple libraries together) using MEGAHIT v1.2.9 (59) with the k-mer size parameter set at --k-list 23,43,63,83,103,123. Open reading frames (ORFs) were called from each combined assembly using the gene-finding algorithm MetaGeneMark v3.38 (60) with the metagenome-style model.
Metatranscriptomic characterization of the microeukaryote community. The ORFs were taxonomically annotated to classify the active microeukaryote protist community. Here, the ORFs (amino acid) were aligned to the Marine Microbial Eukaryotic Transcriptome Sequencing Project database (61) using the software package EUKulele v.2.0.1 (62) with default parameters. Trimmed reads were recruited to the assembly using BBMap v38.84 (58) with default parameters, and read counts were tabulated from the ORF coordinates using featureCounts (63). Eukaryotic-like reads were normalized using the transcripts-per-million (TPM) method (64).
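For readers unfamiliar with TPM normalization, the calculation can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function name `tpm` and the use of raw ORF length (as opposed to an effective length) are assumptions made for the example.

```python
def tpm(counts, lengths):
    """Convert read counts to transcripts per million (TPM).

    counts:  mapped read counts, one per ORF (hypothetical input)
    lengths: ORF lengths in nucleotides, in the same order
    """
    # Step 1: length-normalize each count (reads per base).
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        # No mapped reads at all: every ORF gets a TPM of zero.
        return [0.0] * len(rates)
    # Step 2: rescale so that the values in one sample sum to one million.
    return [r / total * 1e6 for r in rates]


if __name__ == "__main__":
    print(tpm([10, 20, 0], [1000, 2000, 500]))
```

Because the values within a sample always sum to one million, TPM expresses each ORF as a proportion of the sampled transcript pool, which is why it is suited to comparing relative rather than absolute expression between libraries.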
Nucleocytoviricota (giant viruses) genome database and detection. To characterize active giant viruses within our data set, we used an approach similar to that of Ha et al. (30) [...] where the genome had "low" contamination (hallmark gene copy number had a deviation of >1.2 from the mean of the superclade), ≥90% of core nucleocytoplasmic virus orthologous genes, <30 contigs, a minimum assembly size of 100 kb, and at least one contig of >30 kb (12). This approach not only allows characterization of transcripts recruited to full-length giant virus genomes, and therefore better phylogenetic resolution, but also captures giant viruses that are missing hallmark genes (65). The genome set was dereplicated using MASH v2.0 (66) with single-linkage clustering at a MASH distance of ≤0.05 (corresponding to ~95% average nucleotide identity [ANI]), and the genome with the highest N50 contig length was chosen as the representative. The representative genomes were decontaminated using ViralRecall v2.0 (67), whereby contigs with negative scores were removed, resulting in a total of 1,177 genomes in the database. ORFs from each genome were called using Prodigal v2.6.3 (68), and trimmed reads from the in situ profiles were mapped to the resultant ORFs (masking low-complexity regions) using CLC Genomics Workbench 10.0 with a 95% similarity fraction and 90% length fraction for more stringent identification of giant virus genomic transcripts (30). In contrast to Ha et al. (30), we mapped reads to the nucleotide ORFs to remain consistent with the 95% ANI genomic dereplication. Next, only genomes that had ≥10% of their genes mapped to transcripts were retained, to avoid those with spurious read recruitment (69), resulting in 100 "detected" genomes. These genomes were functionally annotated using eggNOG-mapper v2.1.4 with a DIAMOND alignment E value threshold of 1e-5. Read counts were normalized using the TPM method.
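The dereplication logic described above (single-linkage clustering at a distance threshold, followed by selection of the genome with the highest N50 as the representative) can be illustrated with a short sketch. It assumes that pairwise distances have already been computed (for example, by MASH) and are supplied as a dictionary; the function name `dereplicate`, the input layout, and the toy genome names are hypothetical and are not taken from the study's scripts.

```python
from collections import defaultdict


def dereplicate(n50, distances, max_dist=0.05):
    """Single-linkage clustering; return one representative per cluster.

    n50:       dict mapping genome ID -> N50 contig length
    distances: dict mapping (genome_a, genome_b) -> pairwise distance
    max_dist:  genomes at or below this distance are linked
    """
    parent = {g: g for g in n50}

    def find(g):
        # Follow parent pointers to the cluster root, shortening the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    # Single linkage: one pair within the threshold merges two clusters.
    for (a, b), d in distances.items():
        if d <= max_dist:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for g in n50:
        clusters[find(g)].append(g)

    # Representative = member with the highest N50 (ties: first encountered).
    return sorted(max(members, key=lambda g: n50[g])
                  for members in clusters.values())


if __name__ == "__main__":
    n50 = {"A": 10_000, "B": 50_000, "C": 20_000, "D": 5_000}
    dist = {("A", "B"): 0.03, ("B", "C"): 0.04, ("A", "C"): 0.08,
            ("A", "D"): 0.30, ("B", "D"): 0.31, ("C", "D"): 0.29}
    print(dereplicate(n50, dist))
```

In the toy input, genomes A and C are more than 0.05 apart but are still placed in one cluster because each is within the threshold of B; this chaining behavior is the defining property of single-linkage clustering.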
A concatenated protein tree was constructed to assign the putative phylogenetic origin of the 100 giant virus genomes, as done previously for phylogenetically benchmarked Giant Virus Orthologous Groups protein markers (32). Here, protein sequences from the 1,177 representative genomes were used to create a concatenated alignment with ncldv_markersearch.py v1.1 (github.com/faylward/ncldv_markersearch). The tree was constructed using IQ-TREE (70) with the LG+F+I+G4 model, using 1,000 ultrafast bootstraps to calculate support values (71). Delineation of phylogenetic groups was done according to Aylward et al. (67).
Assessing giant virus activity in relation to iron availability. The gene encoding the Nucleocytoviricota major capsid protein (MCP) was queried to assess the magnitude of infection in relation to iron availability, using the GRW1 incubation metatranscriptomes to expand upon the patterns seen in situ with the giant virus genome/MAG approach. We built upon MCPs found within the MAG functional annotations by also targeting giant virus MCP sequences within the GRW1 co-assembly. To do this, a database of MCP protein sequences from cultured giant virus isolates (NCBI RefSeq, manually curated from Moniruzzaman et al. [72]) was aligned against the contigs from the GRW1 co-assembly using DIAMOND BLASTx v2.6.01 with an E value threshold of 1e-10 (73). The resultant hits were aligned to the NCBI nonredundant database (downloaded May 2021) using DIAMOND BLASTx v2.6.01 with an E value threshold of 1e-10, and only those with top hits to a virus were retained. These MCPs were then combined with MCPs from the 100 detected giant virus genomes/MAGs and clustered at 95% amino acid identity using CD-HIT (74) to remove redundancy between the MCPs originating from the genomes/MAGs and those from the assembly. Reads were then mapped competitively to the nucleotide sequences of the final MCPs at 95% nucleotide sequence similarity and 90% read length in CLC Genomics Workbench 10.0. Here, we normalized counts using the DESeq2 variance-stabilizing transformation (75) because, as iron availability changes, the transcript abundance of one MCP can be affected by changes in the transcript abundances of others. Because the MCP alone is a weak indicator of phylogenetic origin (32), we did not perform a phylogenetic analysis of the MCP sequences originating from the co-assembly to determine taxonomy, and we thus present them as "unassigned" giant viruses.
Statistical analyses. All statistical analyses and data visualizations were performed using the R statistical software (76). To assess overall trends across the dates and depths sampled, a principal-component analysis (PCA) was performed using the prcomp function in R on log-transformed TPM values of the eukaryotic gene encoding the DNA-directed RNA polymerase large subunit protein (RPB1), a gene used as a marker to assess the relative presence of a eukaryotic taxon within metatranscriptomes (72,77). A PCA was also performed on log-transformed TPMs summarized across each giant virus genome/MAG. To correlate environmental variables with giant virus expression, a Spearman correlation (cor function in R, method = "spearman") followed by P value adjustment for multiple comparisons using the Holm method ("rcorr.adjust" from the RcmdrMisc package v2.7.1, type = "spearman" [78]) was computed between TPM counts summed across each giant virus genome/MAG and the environmental variables. Because there were no nutrient data for the March 17 sampling date, this date was omitted from nutrient-giant virus comparisons. To identify giant virus MCP transcripts (assembly + genome/MAG) that differed significantly between FeCl3-added and DFB-added incubations in GRW1, a Mann-Whitney U test was performed on the vst-normalized transcripts of each MCP (wilcox.test).
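The Holm adjustment applied to the correlation P values is a step-down procedure that is simple to state explicitly. The sketch below is an illustration in Python only; the analyses in this study were run in R, and the function name `holm_adjust` is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P values in the original input order."""
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value (k = rank + 1) is multiplied by
        # (m - k + 1) and capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value is never smaller than
        # the adjusted value of any smaller raw P value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(holm_adjust([0.01, 0.04, 0.03]))
```

The smallest P value is penalized most heavily (multiplied by the total number of tests), and the multiplier decreases by one at each step, which makes the procedure less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.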
Data availability. Raw and processed data for the combined assemblies are publicly available through the JGI Data Portal (https://data.jgi.doe.gov) under Project ID no. 1260737 (in situ profiles) and 1260740 (GRW1). To access individual assemblies, see Table S4 at https://zenodo.org/record/7457861#.Y6BrIuzML0o for Project ID numbers. The final set of representative genomes used for giant virus read recruitment is available for download on Zenodo (https://zenodo.org/record/6382754#.Y-Pxw-zML0s).

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.