Unmasking cellular response of a bloom-forming alga to viral infection by resolving expression profiles at a single-cell level

Infection by large dsDNA viruses can lead to a profound alteration of host transcriptome and metabolome in order to provide essential building blocks to support the high metabolic demand for viral assembly and egress. Host response to viral infection can typically lead to diverse phenotypic outcome that include shift in host life cycle and activation of anti-viral defense response. Nevertheless, there is a major bottleneck to discern between viral hijacking strategies and host defense responses when averaging bulk population response. Here we study the interaction between Emiliania huxleyi, a bloom-forming alga, and its specific virus (EhV), an ecologically important host-virus model system in the ocean. We quantified host and virus gene expression on a single-cell resolution during the course of infection, using automatic microfluidic setup that captures individual algal cells and multiplex quantitate PCR. We revealed high heterogeneity in viral gene expression among individual cells. Simultaneous measurements of expression profiles of host and virus genes at a single-cell level allowed mapping of infected cells into newly defined infection states and allowed detection specific host response in a subpopulation of infected cell which otherwise masked by the majority of the infected population. Intriguingly, resistant cells emerged during viral infection, showed unique expression profiles of metabolic genes which can provide the basis for discerning between viral resistant and susceptible cells within heterogeneous populations in the marine environment. We propose that resolving host-virus arms race at a single-cell level will provide important mechanistic insights into viral life cycles and will uncover host defense strategies.


Introduction
Marine viruses are recognized as major ecological and evolutionary drivers and have immense impact on the community structure and the flow of nutrients through marine microbial food webs [1][2][3][4][5]. The cosmopolitan coccolithophore Emiliania huxleyi (Prymnesiophyceae, Haptophyta) is a widespread unicellular eukaryotic alga, responsible for large oceanic blooms [6,7]. Its intricate calcite exoskeleton accounts for~1/3 of the total marine CaCO 3 production [8]. E. huxleyi is also a key producer of dimethyl sulfide [9], a bioactive gas with a significant climate-regulating role that seemingly enhances cloud formation [10]. Therefore, the fate of these blooms may have a critical impact on carbon and sulfur biogeochemical cycles. E. huxleyi spring blooms are frequently terminated as a consequence of infection by a specific large dsDNA virus (E. huxleyi virus, EhV) [11,12]. The availability of genomic and transcriptomic data and a suite of host isolates with a range of susceptibilities to various EhV strains, makes the E. huxleyi-EhV a trackable host-pathogen model system with important ecological significance [13][14][15][16][17][18][19][20].
Recent studies demonstrated that viruses significantly alter the cellular metabolism of their host either by rewiring of host-encoded metabolic networks, or by introducing virus-encoded auxiliary metabolic genes (vAMG) which convert the infected host cell into an alternate cellular entity (the virocell [21]) with novel metabolic capabilities [22][23][24][25][26][27]. A combined transcriptomic and metabolomic approach taken during E. huxleyi-EhV interaction revealed major and rapid transcriptome remodeling targeted towards de novo fatty acid synthesis [18] fueled by glycolytic fluxes, to support viral assembly and the high demand for viral internal lipid membranes [28,29]. Lipidomic analysis of infected E. huxleyi host and purified EhV virions further revealed a large fraction of highly saturated triacylglycerols (TAGs) that accumulated uniquely within distinct lipid droplets as a result of virus-induced lipid remodeling [27]. The EhV genome encodes for a unique vAMG pathway for sphingolipid biosynthesis, never detected before in any other viral genome. Biochemical characterization of EhV-encoded serine palmitoyl-CoA transferase (SPT), a key enzyme in the sphingolipid biosynthetic pathway, revealed its unique substrate specificity which resulted in the production of virus-specific glycosphingolipids (vGSLs) composed of unusual hydroxylated C17 sphingoid-bases [30]. These viralspecific sphingolipids are essential for viral assembly and infectivity and can induce host programmed cell death (PCD) during the lytic phase of infection [14,31]. Indeed, EhV can trigger hallmarks of PCD, including production of reactive oxygen species (ROS), induction of caspase activity, metacaspase expression, changes in ultrastructure features and compromised membrane integrity [32][33][34].
The high metabolic demand for building blocks required to support synthesis, replication and assembly of large viruses with high burst size as EhV [34][35][36] point to high dependence of viruses on their host metabolic state for optimal replication [21,37]. Consequently, heterogeneity in host metabolic states as a result of complex interactions between nutrient availability and stress conditions may affect the infection dynamics. However, almost all of our current understanding of the molecular mechanisms that govern host-virus interactions in the ocean, is derived from experiments carried out at the population level, assuming synchrony and uniformity of the cell populations and neglecting any heterogeneity. Additionally, averaging the phenotypes of a whole population hinders the investigation of essential life cycle strategies to evade viral infection that can be induced only by rare subpopulations [38]. Understanding microbial interactions at a single-cell resolution is an emerging theme in microbiology. It enables the detection of complex heterogeneity within microbial populations and has been instrumental to identify novel strategies for acclimation to stress [39][40][41]. The recent advancement of sensitive technologies to detect gene expression from low input-RNA allows quantification of heterogeneity among cells by analyzing gene expression at the single cell level [42,43]. High-throughput profiling of single-cell gene expression patterns in mammalians and plant cells led to the discovery of new cell types, detection of rare cell subtypes, and provides better definition and cataloging of developmental phases in high resolution [44][45][46][47][48]. Importantly, the role of cell-to-cell communication and variability in controlling infection outcomes has only been recently demonstrated in cells of the mammalian immune system in response to bacterial pathogens [49][50][51][52]. Cell-to-cell variability in host response to viral infection was observed in several mammalian viruses and was attributed to several factors, including intrinsic noise (e.g. stochasticity of biochemical interactions involved in the infection process), the number of viral genomes initiating the infection process and the specific cell-state before the infection [52][53][54][55][56][57][58]. The existence of cell-to-cell variability during infection suggests that key events in host response are masked by conventional bulk cell expression profiling and that detection of gene expression on single cell resolution may uncover hidden host responses. Recently, simultaneous detection of host and pathogen gene expression profile was suggested as a powerful tool used to gain a better understanding of the molecular mechanisms underlying the infection process and to identify host defense responses [21,[59][60][61].
Here, we used multiplex single cell qPCR to quantify the dynamics of host and virus gene expression profiles of individual cells during infection of E. huxleyi populations. We provide strong evidence for heterogeneity within the population and discern between cells at different infection states based on their viral gene expression signatures. We unravel an unrecognized phase of early host response that preceded viral gene expression within infected cells. We suggest that examining host and virus gene expression profiles at the single cell resolution allows to infer the temporal dynamic of the infection process, thereby it serves as an attractive approach to decipher the molecular mechanism underlying host-virus interaction.

Results and discussion
To examine the variability within infected E. huxleyi cells, we measured the expression levels of selected host and viral genes over the course of infection at a single-cell resolution. Cells were isolated during infection of E. huxleyi CCMP2090 at different phases, at 0, 2, 4, 24 hours post infection (hpi) (Fig 1, Fig A in S1 Text). We used the C1 single-cell Auto Prep System to sort and extract RNA from single E. huxleyi cells during viral infection by EhV201. The presence of a single cell captured in an individual isolation chamber was confirmed by microscopic inspection of emitted chlorophyll auto-fluorescence (Fig 2A). In order to detect variability in viral infection states, we conducted simultaneous measurements of expression profiles of host and virus genes at a single-cell level by using multiplexed qPCR. We selected viral genes encoding for sphingolipid biosynthesis as well as gene markers for early and late infection [18,62]. Selected genes involved in host metabolic pathways were targeted based on previous reports which demonstrated their functional role during infection, including primary metabolism (glycolysis, fatty acid biosynthesis), sphingolipid and terpenoid metabolism, autophagy and antioxidant genes [18,27,33,34]. In addition, we examined the expression of host genes associated with life cycle [63], meiosis and PCD [32] that exhibited induction during infection. (In total expression of 107 host genes and 10 viral genes was measured, see S1 Table for primers  list).
To test for the sensitivity in detection of gene expression on a single cell level, we spiked-in, to each C1 well, a set of External RNA Controls Consortium (ERCC) molecules that span a wide range of RNA concentrations (from~0.5 to~100 molecules per well). We subsequently quantified their concentration using similar qPCR amplification setup as used for the host and virus genes. Pairwise correlation between spike concentrations and Et (Et = 30-Ct) values obtained from the qPCR was >0.98 (Pearson correlation coefficient, p-value = 4.2_10 −12 , Fig  2B). We found a highly sensitive level of detection with 40% probability to detect an RNA spike that is at a level of 1 molecule per sample (Fig 2C), similar to the detection level reported for mammalian cells [64]. Mean expression of viral and host genes in all examined cells were found to be 11.8 ± 4.0 and 6.96 ± 2.5 (Et values ± SD), respectively ( Fig 2D).
We detected a high variability in viral expression profiles among individual cells within the same infected population. For example, heterogeneity in the expression levels of virus-encoded ceramide synthase (vCerS, EPVG_00014), a key enzyme in sphingolipid biosynthesis [18,30] was detected during early phase of infection (2 and 4 hpi of CCMP2090, Fig 3A). Similar results were obtained for the average expression of 10 viral genes (Fig 3B). At the onset of viral   We further realized that averaging host gene expression over the course of infection might hinder our ability to observe the initial response of the host to viral infection and that singlecell analysis could significantly increase the resolution for sensitive detection of host response at this early stage of infection. We therefore re-ordered infected cells based on their viral infection index (PC1), rather than the actual time of infection (i.e. hpi), resulting in "pseudotemporal" hierarchy of single cells (Fig 4). Intriguingly, we unmasked a fraction of cells that were exposed to the virus but did not exhibit any detectable expression of viral genes. These cells had similar infection index values as control cells, with PC1 values < -10 ( Fig 4A). We found that 33/179 (17%) of infected cells of CCMP2090 were at this distinct "lag phase" of viral infection. These individual cells were analyzed for their respective host gene expression levels based on a sliding window approach (Fig 4B and 4C), as it is less sensitive to technical noise, often observed in single cell data. We also used a statistical model to test for genes that are differentially expressed at these early stages of viral infection. This model incorporates the two types of heterogeneity that usually appear in single cell data, namely, the percentage of cells expressing a gene in a given population (e.g. Et value > 0) and the variability in expression levels in cells exhibiting positive expression values [65]. Up-regulation of several host genes in infected cells was detected in this subpopulation (Fig 4C and S2 Table). An intriguing example is the metacaspase-2 gene (p = 0.0000027) which was previously suggested to be induced and recruited during EhV lytic phase and activation of E. huxleyi PCD [32]. We also found early induction of triosephosphate isomerase (TPI, p = 0.00063) and phospholipid:diacylglycerol acyltransferase (PDAT, p = 0.0018) which are involved in glycolysis and TAG biosynthesis respectively. In addition, genes involve in autophagy [34] and de novo sphingolipid biosynthesis [18,30] were detected in this unique early phase of host response. Since major alteration in these specific metabolic pathways were recently shown to be essential for EhV infection [14,18,20,21,27,30,31,33,34], early induction of these pathways may serve as an effective viral strategy to prime optimal infection. Alternatively, this phase of early host response prior to viral gene expression may represent a newly unrecognized phase of immediate host anti-viral defense response. At the late stages of infection (infection index >10), we observed induction of several meiosis-related genes, including HOP1 and MND, two SPO11 homologues and MYB in CCMP2090 (Fig 4B). These results are in agreement with previous studies that suggested a phenotype switch of E. huxleyi to evade viral infection [38] and propose the induction of meiosis-related genes as part of transcriptomic reprogramming of during infection [63].
Further inspection of the PCA analysis showed the cells exhibiting low to moderate level of PC1 were highly variable in their PC2 level (Fig 3C). To identify the viral genes that contribute to this variability, we further examined the correlation coefficients between the viral gene expression and principal components 2 (PC2). Interestingly, this analysis revealed a positive correlation (r = 0.53) between PC2 and the expression level of viral RNA polymerase gene (EPVG_00062) which was previously reported to be expressed at early-mid phases of infection [18,62], while a negative correlation (r = -0.44) was found for a viral gene (EPVG_00010) that is known to be expressed at late phases of infection. Accordingly, cells with low PC2 levels expressed EPVG_00010 and not EPVG_00062, while cells with high PC2 values exhibited the opposite trend (as compared with Fig 5A and 5B).
To further characterize host gene expression during different phases of infection, we manually clustered CCMP2090 cells according to their infection index (PC1) and the expression of either early or late viral genes (PC2) (Fig 5C) and examined the expression of host metabolic genes in these clusters (Fig 5D). This analysis showed that induction of most of host metabolic genes occurred in cells that expressed predominantly late viral genes (Fig 5D, CL5, Alga-virus interactions at single-cell resolution -10<PC1<10, PC2>-5) and in cells with moderate expression of viral genes (Fig 5D, CL6,  10<PC1<36). Down-regulation of many host genes was found in cells exhibiting high viral expression (Fig 5D, CL7, PC1>37), suggesting that these cells were at the final stages of infection.
In order to further explore the link between optimal host metabolic state and efficient viral infection, we infected CCMP2090 stationary culture and subjected single cells to dual gene expression analysis at 2 hpi (Fig 6A). While most of the exponential growing cells exhibited Alga-virus interactions at single-cell resolution viral expression, we detected only moderate viral expression in 3/27 (11%) of the stationary phase cells (Fig 6A), while the rest of the cells had viral expression patterns similar to uninfected cells (control). In parallel, stationary phase cells (either control or infected) exhibited down-regulation of most of the examined host metabolic genes, in contrast to their general up-regulation in infected exponential phase cells (Fig 6B). These data suggest that the cell-tocell variability in host metabolic state may play important role in determining susceptibility to infection by large viruses with high metabolic demand. "Kill the Winner" is a key theory in microbial ecology which suggests that viruses shape diversity of microbial populations by infecting the most dominant proliferative host [66]. We propose that "Kill the Winner" may even act within isogenic populations based on the variability in the metabolic state, which will lead to differential susceptibility to viral infection, forming continuous host-virus co-existence [67]. It is possible that cell-to-cell heterogeneity in the metabolic activity is shaped by the tradeoff between complex abiotic stress conditions (e.g. nutrient availability [68][69][70] and light regime [71]) and biotic interactions (e.g. bacterial pathogenicity [72]), and may result in differential susceptibility to viral infection in the marine environment.
We further investigated whether uninfected susceptible and resistant E. huxleyi cells exhibited altered expression profiles in the host metabolic genes that showed variable expression during infection (S3 Table). We exposed E. huxleyi cultures to viral infection and isolated cells that acquired resistance to subsequent viral infection of diverse EhV isolates (Fig 7A, [63]). We compared the expression profiles of recovered resistant cells (n = 18) to their mother cells that were highly susceptible to viral infection (n = 76). The tendency of resistant cells to aggregate make it difficult to isolate single cells, therefore for these analysis also included doublet cells. Intriguingly, resistant and susceptible cells tend to cluster distinctively along the PC2 dimension (Fig 7B). Among the genes that drive the separation along the PC2 dimension and were differentially expressed in resistant and susceptible cells were TPI, diphosphomevalonate decarboxylase (MVD1) and ceramidase-3 ( Fig 7C) which are key enzymes in glycolysis, terpenoid and sphingolipid metabolism, respectively. Since de novo ceramide biosynthesis is uniquely encoded in the EhV genome, activation of ceramidase may serve as an anti-viral host response [18,30]. Interestingly resistant cells also exhibited high expression of metacaspase2 which was also highly expressed in cells with no viral expression in early phase of infection ( Fig 7C). This data suggests that susceptibility to viral infection has a clear signature in expression profiles of host genes detected on a single-cell level.
Although the mechanism for resistance of E. huxleyi to viral infection requires further investigations, the differential regulation of host metabolic genes suggests a unique specialized metabolism that differs from that of susceptible cells [73][74][75]. Future single-cell RNA-seq transcriptomic studies will provide high throughput identification of gene markers that are specific for resistant strains as well as new mechanistic insights into the molecular basis for resistance mechanisms.
Tracking host-virus dynamics at the single cell resolution provides a novel approach to characterize the continuum viral infection states and host responses which is commonly masked in whole population analysis [76]. By applying dual gene expression profiling during algal host-virus interactions, we uncovered an early host transcriptional responses. This newly defined phase can result in different scenarios including, resistant cells, cells infected by defective virions, cells exposed to chemical cues released during infection and cells at the very early stage of infection. The new ability to define distinct "infection states" on a pseudo-temporal manner can provide valuable information regarding the dynamics of active viral infection in "real time" in the natural environment. Clustering of individual cells based on their specific transcriptomic signatures will uncover the relationship between host metabolic states and specific phenotypes associated with differential levels of viral infection or modes of resistance in huxleyi cells that were isolated from the sensitive CCMP2090 and resistant CCMP2090 culture (CCMP2090-R). Doublet cells are visualized by slightly bigger dots.(C) Violin plots of selected host genes that highly contributed to the separation of cells along PC2. The SingleCellAssay R package [62] was used to test significant changes in gene expression and all presented genes had p-value < 0.05 of hurdle test (See S3 Table). natural populations. In situ quantification of the fraction of infected cells, their infection and metabolic states and the fraction of resistant cells will provide important insights into the infection dynamics and may provide fundamental understating of host-virus co-existence strategies in the ocean. Resolving host-virus interaction on a single cell will promote discovering of novel sensitive biomarkers to assess the ecological impact of marine viruses and their role in regulating the fate of algal blooms in the ocean.

Culture growth and viral infection dynamics
Cells of the non-calcified CCMP2090 E. huxleyi strain were cultured in K/2 medium [77] and incubated at 18˚C with a 16:8 h light-dark illumination cycle. A light intensity of 100 μM photons�m -2 �s -1 was provided by cool white LED lights. Experiments were performed with exponential phase (5�10 5 -1�10 6 cells�ml-1) or stationary phase (5�10 6 cells�ml-1) cultures. E. huxleyi virus EhV201 (lytic) used for this study was isolated originally in [12]. E. huxleyi was infected with a 1:50 volumetric ratio of viral lysate to culture. By using plaque assay we counted the infectious particles of EhV lysate commonly use in our lab and found that the concentration of infectious particles is around 2.5 � 10 7 -5 � 10 7 �ml -1 (which is around 20% of particles counted by flow-cytometry). Thus, at the time of infection, there is about one infectious particle per cell.
For virus deactivation, 15 ml of 0.45 μm filtered viral lysate was placed in a Petri plate and radiated on uvitec (Cambridge, UK) ultra-violate light table, with 312 nm light for 15 min. To evaluate the infectivity of the deactivated viral lysate plaque assay was performed indicating a reduction of 8 fold in the number of infective particle per mL in the deactivated (~20 infective particles per mL) in comparison to the active viral lysate (10 8 infective particles per mL). For single-cell analysis, E. huxleyi cells were concentrated to 2.5�10 6 cells�ml-1 by gentle centrifugation (3000 RPM, 3 min) prior to single-cell isolation. To compare between viral infection in exponential and stationary phases, stationary phase cells were diluted to similar concentration of exponential phases cells using stationary conditioned medium (5�10 5 -1�10 6 cells�ml-1) and then infected by EhV. The growth dynamics of E. huxleyi CCMP2090 were monitored in seawater-based K/2 medium in control conditions and in the presence of the lytic viral strain EhV201. Resistant single cells were isolated after infection by mouth-pippetting over multiple passages through fresh medium under an inverted microscope as described in [63]. Single isolates were maintained in K/2 medium.

Enumeration of cell and virus abundance
Cells were monitored and quantified using a Multisizer 4 Coulter counter (Beckman Coulter, Nyon, Switzerland). For extracellular viral production, samples were filtered using 0.45 μM PVDF filters (Millex-HV, Millipore). Filtrate was fixed with a final concentration of 0.5% glutaraldehyde for 30 min at 4˚C, then plunged into liquid nitrogen, and stored at -80˚C until analysis. After thawing, 2:75 ratio of fixed sample was stained with SYBER gold (Invitrogen) prepared in Tris-EDTA buffer as instructed by the manufacturer (5 μl SYBER gold in 50 mL Tris-EDTA), then incubated for 20 min at 80˚C and cooled down to room temperature. Flow cytometric analysis was performed with excitation at 488 nm and emission at 525 nm.
Calculation of infectious particles during infection (Fig C in S1 Text) was done by using the most probable number (MPN) method as described in [20]. Briefly, medium was of infected culture was subjected to a series of fivefold dilutions for each sample. Each dilution (10 μl) was then added, in six technical replicates, to 100 μl of exponentially growing E. huxleyi cultures in multiwell plates over four or five days. MPN was calculated using the MPNcalc software.

Single-cell Quantitative RT-PCR
Single cells were captured on a C1 STA microfluidic array (5-10 μm cells) using the Fluidigm C1 and imaged on IX71S1F-3-5 motorized inverted Olympus microscope (Tokyo, Japan) to examine chlorophyll autofluorescence (ex:500/20 nm, em:650 nm LP). Only wells that exhibited chlorophyll autofluorescence signal emitted from single cells were further analyzed. External RNA Controls Consortium (ERCC) spikes were added to each well in a final dilution of 1:40,000. Cells were lysed and pre-amplified cDNA was generated from each cell using the Single Cells-to-CT Kit (Life Technologies). Pooled qPCR primers and Fluidigm STA reagents were added according to manufacturer's recommendations. Preamplified cDNA was then used for high-throughput qPCR measurement of each amplicon using a Bio-Mark HD system. Briefly, a 2.7 μl aliquot of each amplified cDNA was mixed with 3 μl of 2X SsoFast EvaGreen Supermix with Low ROX (Bio-Rad) and 0.3 μl of 20X DNA Binding Dye Sample Loading Reagent (Fluidigm), and 5 μl of each sample mix was then pipetted into one sample inlet in a 96.96 Dynamic Array IFC chip (Fluidigm). Individual qPCR primer pairs (50 μM, S1 Table) in a 1.08 μl volume were mixed with 3 μl Assay Loading Reagent (Fluidigm) and 1.92 μl Low TE, and 5 μl of each mix was pipetted into one assay inlet in the same Dynamic Array IFC chip. Subsequent sample/assay loading was performed with an IFC Controller HX (Fluidigm) and qPCR was performed on the BioMark HD real-time PCR reader (Fluidigm) following manufacturer's instructions using standard fast cycling conditions and melt-curve analysis, generating an amplification curve for each gene of interest in each sample (cell). Data was analyzed using Real-time PCR Analysis software (Fluidugm) with the following settings: 0.65 curve quality threshold, linear derivative baseline correction, automatic thresholding by assay (gene), and manual melt curve exclusion. Cycle threshold (Ct) values for each reaction were exported. As seen in other applications of this technology [65], the data had a bimodal distribution with some cells ranging from 2.5 Ct to 30 Ct, and another set of cells with Ct >40. Similar bimodal distribution was also observed for the ERCC spikes. Accordingly, we set the minimal threshold level of detection to 30 Ct and calculated expression threshold values (Et) by linear transformation of the data so that minimal Et was zero (30 Ct). For heat map visualization, expression data was normalized by subtracting the mean of each gene and dividing it with its standard deviation across cells. Single-cell PCR data was analyzed and displayed using MATLAB (MathWorks). Additional statistical analyses were performed using The SingleCel-lAssay R package [65]. Calculation of number of spike molecule per Fluidigm C1 well was performed according to [64].
Supporting information S1