Single-cell transcriptional dynamics of flavivirus infection

Dengue and Zika viral infections affect millions of people annually and can be complicated by hemorrhage and shock or neurological manifestations, respectively. However, a thorough understanding of the host response to these viruses is lacking, partly because conventional approaches ignore heterogeneity in virus abundance across cells. We present viscRNA-Seq (virus-inclusive single cell RNA-Seq), an approach to probe the host transcriptome together with intracellular viral RNA at the single cell level. We applied viscRNA-Seq to monitor dengue and Zika virus infection in cultured cells and discovered extreme heterogeneity in virus abundance. We exploited this variation to identify host factors that show complex dynamics and a high degree of specificity for either virus, including proteins involved in the endoplasmic reticulum translocon, signal peptide processing, and membrane trafficking. We validated the viscRNA-Seq hits and discovered novel proviral and antiviral factors. viscRNA-Seq is a powerful approach to assess the genome-wide virus-host dynamics at single cell level.


INTRODUCTION
Flaviviruses , which include dengue (DENV) and Zika (ZIKV) viruses, infect several hundred million people annually and are associated with severe morbidity and mortality (Bhatt et al. 2013;Rasmussen et al. 2016;Guzman and Kouri 2003) . Attempts to develop antiviral drugs that target viral proteins have been hampered in part by the high genetic diversity of these viruses. Since viruses usurp the cellular machinery at every stage of their life cycle, an alternative therapeutic strategy is to target host factors essential for viral replication (Bekerman and Einav 2015) . To this end it is paramount to understand the interaction dynamics between viruses and the host, to identify pro-and antiviral host factors and to monitor their dynamics in the course of viral infection. The current model of flavivirus infection suggests that the virus enters its target cells via clathrin-mediated endocytosis, followed by RNA genome uncoating in the early endosomes and trafficking to ER-derived membranes for translation and viral RNA replication. Following assembly, viral particles are thought to bud into the ER lumen and are then released from the cell via the secretory pathway (Screaton et al. 2015) . This pattern notwithstanding, an exhaustive list of host genes that interact, either directly or indirectly, with DENV or ZIKV is still lacking. Several high-throughput approaches have been applied to screen all 20,000 human genes for interactions with flaviviruses, including knockdown screens based on RNA interference (Sessions et al. 2009;Kwon et al. 2014;Le Sommer et al. 2012) , knockout screens via haploid cell lines or CRISPR libraries (Marceau et al. 2016; , and bulk transcriptomics via microarrays or RNA-Seq (Sessions et al. 2013;Moreno-Altamirano et al. 2004;Fink et al. 2007;Conceição et al. 2010;Becerra et al. 2009;Liew and Chow 2006) . While these approaches have provided important insights, our understanding of infection-triggered cellular responses is far from complete.
Knockdown, knockout, and population-level transcriptomics screens are extremely valuable tools but also share some limitations. First, because they are bulk assays, the heterogeneity of virus infection in single cells is obscured in the averaging process; differences in timing of virus entry and cell state across the culture and the fraction of uninfected cells are not accounted for. Second, because each population is a single data point and experiments cannot be repeated more than a handful of times, reproducibility and batch effects represent a challenge. Third, in knockout and knockdown screens the temporal aspect of infection is largely ignored, because successful knockdown can take days and recovery of the culture after infection in knockout screens lasts even longer. Fourth, because both knockdown and knockout can impair cellular viability and cannot probe essential genes, only a subset of genes can be probed by these techniques.
Here we report the development of viscRNA-Seq, an approach to sequence and quantify the whole transcriptome of single cells together with the viral RNA (vRNA) from the same cell. We applied this platform to DENV and ZIKV infections and investigated virus-host interactions in an unbiased, high-throughput manner, keeping information on cell-to-cell variability (i.e. cell state) and creating statistical power by the large number of single cell replicates while avoiding essential gene restrictions. By correlating gene expression with virus level in the same cell, we identified several cellular functions involved in flavivirus replication, including ER translocation, N-linked glycosylation and intracellular membrane trafficking. By comparing transcriptional dynamics in DENV versus ZIKV infected cells, we observed great variability in the specificity of these cellular factors for either virus, with a few genes including ID2 and HSPA5 playing opposite roles in the two infections. Using loss-of-function and gain-of-function screens we identified novel proviral (such as RPL31, TRAM1, and TMED2) and antiviral (ID2, CTNNB1) factors that are involved in mediating DENV infection. In summary, viscRNA-Seq sheds light on the temporal dynamics of virus-host interactions at the single cell level and represents an attractive platform for discovery of novel candidate targets for host-targeted antiviral strategies.

RESULTS
viscRNA-Seq recovers mRNA and viral RNA from single cells To study the dynamical behaviour of human cells upon flavivirus infection, we developed an approach named viscRNA-Seq that captures and sequences the transcriptome of single cells (mRNA) together with the virus RNA (vRNA) from the same cell. This approach is modified from the commonly used Smart-seq2 for single cell RNA-Seq (Picelli et al. 2014) . Briefly, single human cells are sorted into 384-well plates pre-filled with lysis buffer (Fig. 1C). In addition to ERCC (External RNA Controls Consortium) spike-in RNAs and the standard poly-T oligonucleotide (oligo-dT) that captures the host mRNA, the lysis buffer contains a DNA oligo that is reverse complementary to the positive-strand virus RNA (Fig. 1D). The addition of a virus-specific oligo overcomes limitations of other approaches and enables studying of viruses that are not polyadenylated (Russell, Trapnell, and Bloom 2017) . Reverse transcription and template switching is then performed as in Smart-seq2, but with a 5'-blocked template-switching oligo (TSO) that greatly reduces the formation of artifact products (TSO concatemers). The cDNA is then amplified, quantified, and pre-screened for virus presence via a qPCR assay (Fig.  1E). Since many cells are not infected, this enables us to choose wells that contain both low and high vRNA levels and then to sequence their cDNA on an illumina NextSeq at a depth of ∼ 400,000 reads per cell (Fig. 1F). This provides high coverage of transcriptome and allows high-quality quantitative detection of gene expression and intracellular viral load together with relatively large numbers of cells.
We applied viscRNA-Seq to an infection time course in cultured cells. We infected human hepatoma (Huh7) cells with DENV (serotype 2, strain 16681) at multiplicity of infection (MOI) of 1 and 10. In a separate experiment, Huh7 cells were infected with ZIKV (Puerto Rico strain, PRVABC59) at an MOI of 1. In both cases, uninfected cells from the same culture were used as controls (Fig. 1A). At four different times after infection -4, 12, 24, and 48 hours -cells were harvested, sorted, and processed with viscRNA-Seq (Fig. 1B). Recovery of the ERCC spike-ins and number of expressed genes per cell confirmed that the libraries had high quality (Supplementary Fig. S1A-B). From each experimental condition, 380 cells were screened for virus and~100 of those were sequenced. In total~7500 single cells were screened and~2100 were sequenced (see Supplementary Table S1). Intracellular viral load and gene expression are heterogeneous across cells First, we focused on infection by DENV. As expected, qPCR measured an increase over time in the fraction of infected cells with both MOIs (Fig. 1G). Whereas most genes were rather homogeneously expressed, both intracellular viral load (number of vRNA reads per million transcripts) and expression of a subset of genes varied widely both across experimental conditions and cells from the same condition (Fig. 1H). Overall, between zero and a quarter of all reads from each cell is vRNA-derived, hence the dynamic range for intracellular viral load is extremely wide (5 decades). The distribution of both intracellular viral load and gene expression are rather symmetric in logarithmic space (Fig. 1H); as a consequence, mean expression as measured in a bulk assay is higher than the median and over-represents highly infected cells. The high coverage sequencing enables a quantitative measurement of the variation in the expression level of thousands of genes in each cell ( Supplementary Fig. S1B). As a next step, we aimed at identifying which elements of this variation are induced by the infection.

Correlation between intracellular viral load and gene expression within single cells tracks infection-triggered host response
In a bulk assay each of the experimental conditions would be an average of all cells, making it difficult to extract clear statistical patterns. Leveraging both single-cell resolution and high throughput, we directly computed Spearman's rank correlation coefficient between each gene expression and intracellular viral load across all cells. This metric does not require an explicit noise model for either expression or virus load and is therefore insensitive to outlier cells. As expected, most genes do not correlate with vRNA level and the distribution of their correlation coefficients decays rapidly away from zero ( Fig. 2A). In panels 2B-D examples of strong anticorrelation, strong correlation, and absence of correlation are shown. Both the level of vRNA at which each gene starts to correlate and the slope of the response vary across genes and may reflect different infection stages (see below). Genes with extreme correlation consistently represent specific cellular functions. Most of the top correlated genes ( Fig. 2A right inset) are involved in the ER unfolded protein response (UPR) (see e.g. DDIT3 in Fig. 3C), consistent with ER stress response triggered by flavivirus translation and RNA replication on ER-derived membranes (Medigeshi et al. 2007) . Numerous strongly anticorrelated genes ( Fig. 2A left inset) are components of actin and microtubules, indicating cytoskeleton breakdown (as an example, see ACTB in Fig. 2B). Molecular chaperones are found in both categories suggesting a more nuanced regulation.
To understand whether correlated genes may represent pathways that are important for virus infection, we focused on the 1% most correlated subset of the transcriptome (correlation in excess of 0.3 in absolute value) and performed Gene Ontology (GO) enrichment analysis using the online service PANTHER (Mi et al. 2017) . This statistical analysis confirms the qualitative picture emerging from the top correlates. At 4 hours post-infection upregulation of genes involved in translation and suppression of mRNA processing is demonstrated. At 48 hours post-infection there is upregulation of UPR, protein degradation via ERAD, and ER-to-Golgi anterograde transport via COPII-coated vesicles, and downregulation of cell cycle and cytoskeleton organization (supplementary tables ST1-ST4).
Several genes switch role during dengue infection Naturally, cells that are infected for longer tend to harbor more vRNA. To disentangle the effect of time since infection from the vRNA level within each cell, we computed the same correlation coefficient within single time points. We discovered that most correlated genes exhibit either positive or negative correlation, but not both. This behaviour is expected for generic stress response genes; the sign of the differential expression is a hardwired component of their physiological function. However, a group of 17 "time-switcher" genes show both an anticorrelation of less than −0.3 and a correlation in excess of +0.3 at different time points post-infection, suggesting a more specific interaction with DENV. Of these, 6 genes transition from anticorrelation to correlation (e.g. COPE, Fig. 2E-F), 10 show the opposite trend, and a single gene (PFN1) follows a nonmonotonic pattern (Fig. 2G). Since 4 and not 2 time points were sampled, a consistent increase (or decrease) in correlation likely stems from a biological change rather than a technical noise. Of the six genes which switch from anticorrelated to correlated, RPN1 and HM13 localize to the ER. RPN1 is a non-catalytic member of the oligosaccharide transfer (OST) complex, which is required for N-linked glycosylation of some ER proteins, whereas HM13 is a protease that cleaves the signal peptide after translocation into the ER. Both of these factors have been shown to be essential for DENV infection (Marceau et al. 2016) . SQSTM1 is a scaffold protein involved in selective autophagy of polyubiquitinated substrates and has been shown to have a bimodal behaviour in DENV infection (Metz et al. 2015) , whereas UBC is a major source of ubiquitin. Lastly, GORASP2 and COPE play a role in Golgi assembly and/or membrane trafficking. In particular COPE is a subunit of the coatomer complex (COPI) that mediates both intra-Golgi and Golgi-to-ER retrograde vesicle transport; another subunit of this complex, COPB1, has recently been shown to be essential for DENV (Iglesias et al. 2015) . Interestingly, COPE also appears to be downregulated during early infection in "bystander" cells; i.e. cells that originate from an infected culture but are themselves not infected ( Supplementary Fig. S4).

Host response differs for DENV and ZIKV
Next we sought to address the question of which elements of the host response are common between DENV and ZIKV, and therefore potentially common with other evolutionarily related viruses as well. To do so, we replicated the time course experiment with ZIKV at MOI 0 (control) and 1. Although Huh7 cells were also infected at an MOI of 10, cell death precluded sorting. Fig.  3A shows the correlations between gene expression (each dot is a human gene) and vRNA for both experiments and represents the two-dimensional equivalent of Fig. 2A. We discovered that the majority of genes are not correlated with either virus (contour lines indicate density of genes). Nevertheless, a clear pattern along the positive diagonal emerged with genes in this region, such as ATF3 (Fig. 3C) and ACTG1 (Fig. 3D), demonstrating a similar behaviour upon infection with either virus. A minority of genes are scattered away from the diagonal, indicating discordant behaviour between DENV or ZIKV infection. For instance, ID2 expression decreses at high DENV level but increases at high ZIKV RNA level (Fig. 3B), while the opposite trend is observed with the chaperone HSPA5 ( Fig. 3E and see below). A number of genes at the outskirts of the correlation plot are labeled and highlighted in red as they exhibit potentially interesting expression patterns upon infection: i.e. either an extremely strong correlation with both viruses or a high degree of virus specificity. These outliers include two subunits of the SEC61 complex (B, and G), several subunits of the TRAP complex and the OST, previously shown to be essential for DENV and/or WNV infection (Marceau et al. 2016; , and other genes that may be relevant to infection with either virus. To understand how these correlated genes share the heterogeneity of infected cells, we selected all genes with a correlation coefficient above 0.4 or below -0.4 and performed t-SNE dimensionality reduction (Maaten and Hinton 2008) , coloring each cell by its intracellular viral load (Fig. 4A, left) or by time post-infection (right). Although uninfected cells form a mixed, heterogeneous cloud, infection pushes cells into more stereotypic states that are distinct for DENV and ZIKV infection (black arrows indicate average positions for cells at increasing intracellular viral load). Plotting gene expression dynamics on top of these visualizations enables to connect single cellular pathways to the global changes in cell state defined by virus infection (Fig. 4B). Remarkably, the temporal behaviour of a few genes is inconsistent with the global transcriptomics shifts: for instance the expression of HSPA5 increases until 24 hours after ZIKV infection, but is then sharply decreased at 48 hours post-infection and with higher intracellular viral load. To compare the temporal dynamics of gene expression during DENV and ZIKV infection, we identified "time switchers" for Zika infection (Fig. 4C). Although the number of genes that show both correlation and anticorrelation is similar between the two viruses, the 11 Zika time switchers exhibit no correlation at 4 hours post-infection, followed by a non-monotonic behaviour as time passes. HSPA5 is included in this list, in agreement with its t-SNE visualization; this gene is therefore not only subject to opposite regulation in DENV versus ZIKV infection, but may play different roles at different times during the same infection. Among the other temporally regulated genes in ZIKV infection, is the circadian clock gene PER2 that resembles HSPA5 (Moni and Lio' 2017) .

Validation of proviral and antiviral host factors
To probe the functional relevance of genes demonstrating interesting correlations with DENV viral load, we first conducted loss-of-function screens. We measured the effects of siRNA-mediated depletion of 32 individual genes in Huh7 cells on DENV infection as well as on cellular viability ( Fig. 5A and Supplementary Fig. S5A). Using a cutoff of greater than 40% inhibition of viral infection as measured by luciferase assays normalized to cell viability in two independent screens, we identified multiple host factors essential for viral infection. These include a few components of the translocon previously shown to be essential for DENV: HM13 (Marceau et al. 2016) or WNV: SPCS2 ) as well as two novel components of the ER translocon: RPL31 and TRAM1 (Ng, Oresic, and Tortorella 2010) . Depletion of two proteins involved in membrane trafficking, TMED2 (secretory pathway) and COPE (retrograde, Golgi to ER) as well as the ER-resident chaperone and ERAD protein HSPA5 and the multifunctional transcription factor in ER stress, DDIT3 also reduced DENV infection.
In contrast, siRNA-mediated depletion of two genes that anticorrelate with intracellular virus load, ID2 and CTTNB1 (β-catenin), increased DENV infection, indicating that these proteins function as antiviral restriction factors, as previously reported in HIV (Kumar et al. 2008) . Notably, ID2 and CTTNB1 are known interacting partners (Rockman et al. 2001) , which may be acting via the interferon I pathway (Hillesheim et al. 2014) . Suppression of another subset of overexpressed or underexpressed genes demonstrated no effect on DENV infection, suggesting that they were either non-essential or not restricting (possibly due to redundancy in host factors requirement) or that their siRNA-mediated depletion was ineffective.
To determine whether host factors found to be proviral are also rate limiting for infection, next we conducted a gain-of-function screen. Huh7 cells ectopically expressing 30 of the 32 individual gene products were infected with DENV. Using a cutoff of greater than 30% increase in viral infection normalized to cell viability in two independent screens, we identified HSPA5, TMED2, SPCS2, and DDIT3 as factors whose overexpression increased DENV infection ( Fig.  5B and Supplementary Fig. S5B), indicating rate limitation associated with these important proviral factors. In contrast, ectopic expression of other proviral factors, such as COPE and TRAM1, decreased DENV infection, suggesting that DENV might be evolutionarily optimized for the natural expression level of these genes.

DISCUSSION
We have developed a new approach, designated viscRNA-Seq, to simultaneously quantify the whole transcriptome and intracellular viral load at the single cell level. This approach probes the natural gene expression dynamics of virus infections and is therefore complementary to knockout and knockdown genetic screens, which induce a controlled perturbation (Marceau et al. 2016;Sessions et al. 2009;Kwon et al. 2014;Le Sommer et al. 2012) . However, unlike those loss-of-function assays, viscRNA-Seq is able to fully discern cell-to-cell variation within a single experimental condition, is compatible with time-resolved sampling, and can be used to study essential genes. Although similar in spirit to a recent droplet-based approach that uses only a polyT capture oligonucleotide (Russell, Trapnell, and Bloom 2017) , our approach can be easily adapted to any RNA virus, whether polyadenylated or not, by swapping one single oligonucleotide. Moreover, since RNA capture is highly efficient compared to droplet-based methods, an accurate quantification of both gene expression and viral RNA (vRNA) can be obtained with as little as 400,000 sequencing reads per cell. Since full-length transcripts are recovered as in the original Smart-seq2 (Picelli et al. 2014) and unlike in droplet-based protocols, viscRNA-Seq can be combined with enrichment PCRs before sequencing to focus on specific host or viral factors at a fraction of the sequencing cost.
We have applied this high-throughput technique to study the temporal infection dynamics of DENV and ZIKV, two major global health threats (Bhatt et al. 2013) . Our first finding is that although the number of infected cells in the culture gradually increases with time as expected, there is a large heterogeneity across cells from the same Petri dish. Since flavivirus replication is not synchronized, such heterogeneity might reflect host-responses at different stages of viral life cycle. The single-cell distributions of both intracellular viral load and gene expression indicate that mean values measured via bulk assays tend to over-represent highly infected cells. Moreover, bulk transcriptomics studies cannot account for uninfected cells and are therefore limited to high MOI (Sessions et al. 2013) ; in contrast, we are able to study both high-MOI and low-MOI cultures equally well and to separate the effect of MOI from the actual infection state of each cell.
We then leveraged the statistical power of sequencing thousands of cells to correlate intracellular viral load with gene expression across the whole human transcriptome. The genes with the strongest positive correlation with both viruses are members of the unfolded protein response (UPR), particularly the PERK branch, including DDIT3, ATF3, and TRIB3. The strongest negative correlates with both viruses are components of the actin and microtubule networks (e.g. ACTB, ACTG1, TUBB1) as well as members of nucleotide biosynthesis, suggesting a disruption of both cytoskeleton and cellular metabolism. The URP response starts abruptly once 2,000 virus transcripts are present per million of total transcripts (i.e. when virus RNA comprises only 0.2% of the cellular mRNA); a threshold that is reached in most cells between 24 and 48 hours post-infection. Downregulation of cytoskeleton and metabolism, however, starts only at 10,000 virus transcripts per million of total transcripts; this higher threshold is reached in most cells at 48 hours post-infection. This delayed response may happen either because of direct cytopathic effects or as a consequence of the earlier UPR response. Interestingly, a recent transcriptomics study also found ER stress pathways to be differentially regulated during DENV infection (Sessions et al. 2013) . However, because thousands of host genes were classified as differentially expressed in that study, this overlap may be in part coincidental due to the sheer number of "hits" reported by those authors. Indeed, the more quantitative statistics resulting from the large number of single cell replicates as opposed to bulk transcriptomics was a key factor that enabled us to narrow down the list of potentially relevant genes to a small number that could be subsequently validated.
A number of host genes correlate strongly with one virus but correlate less or do not correlate with levels of the other virus. Examples include subunits of several complexes involved in ER translocation and N-linked glycosylation ( Supplementary Fig. S3): SEC61G, a subunit of the translocon; SSR3, a member of the TRAP complex; and OSTC, a subunit of the OST. Components of these three complexes were identified as essential host factors for DENV replication in a recent CRISPR-based knockout screen (Marceau et al. 2016) . SEC11C, a subunit of the signal peptidase complex (SPCS), also behaves in this way, in agreement with the prior finding that this complex is essential for flavivirus infection ) . Strikingly, we do not observe a dominant enrichment of interferon-related genes among the most strongly upregulated during flavivirus infection (Fink et al. 2007) . This results may be caused by virus-induced blocking of the interferon-induced signaling cascade (Muñoz-Jordán et al. 2003) ; moreover, Huh7 cells are known to react more mildly than other culture systems to interferon stimulation (Guo, Zhu, and Seeger 2003) .
The expression of some host genes shows discordant correlation with DENV and ZIKV infection. Among the genes that are overexpressed during DENV infection but underexpressed during ZIKV infection are the molecular chaperone HSPA5 which has been shown to interact directly with the dengue E protein in liver cells (Jindadamrongwech, Thepparit, and Smith 2004) , other members of the translocation machinery (SEC61B) or the TRAP complex (SSR1, SSR2), and ATF4, an ER-stress induced gene that interacts with DDIT3 and TRIB3. On the opposite end of the spectrum, genes that are underexpressed during DENV infection and overexpressed during ZIKV infection include the transcriptional regulator ID2. Both ID2 and cyclin D1 (CCND1), which is also strongly anticorrelated with DENV load, have been reported to be a target of β-catenin (Rockman et al. 2001;Shtutman et al. 1999) .
A few host genes (17 for DENV, 11 for ZIKV) show a complex dependence on time and intracellular viral load; at early time points, gene expression correlates positively (or negatively) with viral load, but this behaviour is reversed at later time points. Among these genes are HM13, COPE, and SQSTM1 for DENV and HSPA5 for ZIKV. We speculate that these genes may play multiple roles during the virus replication cycle, acting as antiviral factors during certain phases of infection (e.g. cell entry) and as proviral during others (e.g. virion release). These genes may also represent virus triggered host-responses that were counteracted by viral proteins. Of these interesting hits, HM13 or signal peptide peptidase is involved in processing of signalling peptides after ER membrane translocation, a pathway that has been reported to be critical for several flaviviruses including dengue  . Furthermore SQSTM1, which is involved in selective autophagy, has been reported to affect DENV infection in a time-dependent manner, in agreement with our results (Metz et al. 2015) , and to interact with the unrelated Chikunguya virus (Judith et al. 2013) .
From the viscRNA-Seq screen we selected 32 candidate genes to determine whether they may play proviral or antiviral roles during DENV infection. The three genes HSPA5, SPCS2, and TMED2 showed clear proviral effects, reducing DENV replication upon knockdown and increasing it when ectopically expressed. The first two are known essential factors of DENV infection (Jindadamrongwech, Thepparit, and Smith 2004; , whereas TMED2, which is involved in coatomer complex (COPI) vesicle-mediated retrograde trafficking and trafficking from the golgi to the plasma membrane (Fiedler et al. 1996;Goldberg 2000) , has not been reported before. Further in-depth studies are warranted to elucidate the role of this host factor in flavivirus infection. Unlike these proviral factors, the hits SSR3, COPE, and TRAM1 reduced viral replication under both knockdown and ectopic expression, although to different degrees depending on the direction of the perturbation. This suggests that DENV might be evolutionarily adapted to intermediate, wild type expression levels of these genes. Furthermore, it is striking that both TMED2 and COPE are involved in COPI-coated vesicle transport but produce opposite outcomes on DENV infection when ectopically expressed. Taken together with the time-switching correlation of COPE with intracellular DENV load, this result suggests a dual role for coatomer-coated vesicles during replication.
Overall, our study highlights the potential of single-cell level, high-throughput analyses to elucidate the interactions of human viruses with host cellular processes (Russell, Trapnell, and Bloom 2017) . Combining temporal information, cell-to-cell variability, cross-virus comparison and high-quality expression data has allowed us to identify pathways that react similarly to infection by dengue and Zika viruses, such as the unfolded protein response, and others that Open reading frames (ORFs) encoding 26 hits were selected from the Human ORFeome library of cDNA clones (Rual et al. 2004) (Open Biosystems), 3 from Addgene and one from DNASU (Seiler et al. 2014) and recombined into a pFLAG (for FLAG tagging) vector using Gateway technology (Invitrogen). A Renilla reporter DENV2 (NGC) plasmid was a gift from Pei-Yong Shi (University of Texas Medical Branch, Galveston, Texas, USA) (Zou et al. 2011) , and the DENV 16681 infectious clone (pD2IC-30P-NBX) used to produce virus for the single cell transcriptomic assays was a gift from Claire Huang (Centers for Disease Control and Prevention, Public Health Service, US Department of Health and Human Services, Fort Collins, Colorado, USA) (Huang et al. 2010) . Zika virus, PRVABC59 (Puerto Rico strain) was obtained from BEI Resources.
RNA interference siRNAs (100 nM) were transfected into cells using silMPORTER (Millipore) 72 hours prior to infection with luciferase reporter DENV at MOI of 0.05. Custom Cherry-Pick ON-TARGETplus siRNA library against 32 genes was purchased from Dharmacon (see Supplementary Table S4 for gene and siRNA sequence details).

Gain-of-function assays
Plasmids expressing ORFs encoding human genes or empty vector control were expressed ectopically in Huh7 cells by transfection with TransIT-LT1 (Mirus). Twenty-four hours after transfection, cells were infected with luciferase reporter DENV at MOI of 0.05 for 4 hours and incubated for 48 hours prior to viability assays and luciferase assays.

Viability assays
Viability was assessed using alamarBlue reagent (Invitrogen) according to the manufacturer's protocol. Fluorescence was detected at 560 nm on an InfiniteM1000 plate reader (Tecan).

Infection assays
Huh7 cells were infected with DENV or ZIKV for 4 hours at different MOIs (0, 1, 5, 10) and harvested at various time points post-infection. For the functional screens, Huh7 cells were infected with DENV in triplicates for 4 hours at MOI of 0.05. Overall infection was measured at 48 hours using standard luciferase assays.

Single cell sorting
At each time point, cells were trypsinized for 10 minutes, lifted them from the culture plate, pelleted and resuspended in 1 ml fresh media. After around 15 minutes, cells were pelleted again and resuspended in 2 ml 1X phosphate-buffered saline (PBS) buffer at a concentration of around 1 million cells per ml. Cells were filtered through a 40 um filter into a 5 ml FACS tube and sorted on a Sony SH800 sorter using forward and backscatter to distinguish living cells from dead cells and debris. Sorts were done into 384-well PCR plates containing 0.32-0.5 ul of lysis buffer (see below) using "Single cell" purity mode. A total of 12 384-well plates of single cells were sorted for the Dengue time course (4 uninfected, 4 MOI 1, and 4 MOI 10), and 8 plates for the Zika time course (4 uninfected and 4 MOI 1), yielding a total of about 7500 cells.

Lysis buffer, reverse transcription, and PCR
To capture and amplify both mRNA and viral RNA (vRNA) from the same cell, the Smart-seq2 protocol was adapted (Picelli et al. 2014) . All volumes were reduced by a factor 12 compared to the original protocol to enable high-throughput processing of 384-well plates. ERCC spike-in RNA was added at a concentration of 1:10 of the normal amount. The lysis buffer contained, in addition to the oligo-dT primer at 100 nM final concentration, a virus specific reverse primer to capture the positive-stranded virus RNA at a concentration of 1 nM. The capture primer sequences were the following:

Virus
Capture primer

Zika AAGCAGTGGTATCAACGCAGAGTACTCCRCTCCCYCTYTGGTCTTG
Different virus-specific primers and higher primer concentrations were tested but resulted in a large fraction of primer dimers. In order to reduce interference between the virus-specific primer and the Template Switching Oligo (TSO) used to extend the RT products, a 5'-blocked biotinylated TSO was used at the standard concentration. A large fraction of TSO concatemers was observed when testing reactions with a standard, non-biotinylated TSO. Reverse transcription (RT) and polymerase chain reaction (PCR) of the cDNA were performed in 1 ul and 2.5 ul, respectively: cells were amplified for 21 cycles. Lambda exonuclease was added to the PCR buffer at a final concentration of 0.0225 U/ul and the RT products were incubated at 37 C for 30 minutes before melting the RNA-DNA hybrid as it was observed that this reduced the amount of low-molecular weight bands from the PCR products. After PCR, the cDNA was diluted 1 to 7 in Tris buffer for a final volume of 17.5 ul. This dilution was used instead of the DNA purification by magnetic beads. In fact, we have tried to optimize purification by magnetic beads in 384-well plates but discovered that good libraries can be obtained without this step so we dropped it to maximize yield throughout the protocol, which allows fewer PCR cycles. All pipetting steps were performed using a TTPLabtech Mosquito HTS robotic platform.

cDNA quantification
To quantify the amount of cDNA in each well after PCR, a commercial fluorimetric assay was used (ThermoFisher QuantIt Picogreen). Briefly, 80-300 nl of cDNA and 25 ul of 1:200 dye-buffer mix were pipetted together into a flat-bottom 384-well plate (Corning 3540). Six wells were used for a blank and 5 standard concentrations (0.1 to 2 ng/ul) in the same amount as the sample. The plate was briefly mixed, centrifuged, incubated in the dark for 5 minutes, and measured on a plate reader at wavelength 550 nm. cDna concentrations were calculated via an affine fit to the standard wells.

Detection of infected cells by qPCR
Depending on the conditions (MOI and time since infection), the fraction of infected cells in each 384-well plate varies widely. In order to optimize sequencing on the widest possible dynamic range of virus amount per cell, we screen the amplified cDNA with a primer-probe based qPCR. Primer sequences are as follows: Virus Forward primer Reverse primer Probe The qPCR sequences for Dengue and Zika virus were adapted from (Gurukumar et al. 2009) and (Faye et al. 2013) . For Zika, a minor groove binder (MGB) probe was used instead of the LNA probe of the original publication. Notice that for both viruses, conserved regions in the virus genome are selected and degenerate bases are used to ensure that the qPCR assay works independently on the mutations happening in the virus population during the cell culture. In addition to the virus-specific primers-probe, a commercial primer-probe assay for ACTB with a VIC fluorophore is used in the same reactions as an additional checkpoint for bona fide cDNA quality. 250 nl of each cell's cDNA were pipetted into a 5 ul reaction. The cycling protocol is 45 cycles of 95 C for 5 seconds followed by 60 C for 30 seconds. Synthetic single-stranded DNA sequences matching the qPCR primers-probe combinations were used in 3 concentrations (10 pM, 1 pM, 0.1 pM) and together with an additional blank well to calibrate the quantification (total of 4 wells for standards/blank). Each standard well included 250 nl of virus synthetic ssDNA and 250 nl of ACTB synthetic ssDNA covering the commercial assay, both at the same concentration. Notice that although RT-qPCR can be used directly on cell lysates to obtain an accurate quantification of cellular RNAs, this assay is performed on preamplified cDNA instead, hence it is expected to be at best semi-quantitative. Nonetheless, we found it useful both as an early quality control step during the experiments and as a rough screening criterion to cherry pick cells for sequencing (see below). Because we obtained a great dynamic range of number of virus reads from single cells after sequencing, the qPCR results were not used in the downstream data analysis.
Cherry picking of cDNA Not all 7,500 sorted cells were sequenced; rather, to improve coverage at the same cost, around 2,000 cells were cherry picked for sequencing. With the results of the cDNA quantification and the virus and ACTB qPCR at hand, cells were selected such that they cover the largest possible set of conditions. For instance, in a plate with infected cells we ensured that both qPCR negative cells (ACTB but no virus), cells with little virus, and cells with a high amount of vRNA were all represented in the sequencing data. The selection was designed in a semi-automatic way via JavaScript and Python scripts and implemented on TTPLabtech Mosquito HTS and X1 HV robotic platforms. At the same time as cherry picking, the cDNA from each cell was also diluted to around 0.4 ng/ul for Tn5 endonuclease library prep. Although this concentration is slightly higher than usual or this type of libraries, the cDNA was not purified so that a certain fraction of the DNA is residual short oligos from previous reactions, which is most likely too short to end up on the sequencer.

Library prep and sequencing
Sequencing libraries were prepared using the illumina Nextera XT kit following manufacturer's instructions, with the following exceptions: (1) we used a smaller reaction volume (around 1 ul per cell); (2) we chose a slightly higher cDNA concentration (0.4 ng/ul) as input, to compensate for the lack of bead purification upstream; (3) we designed, tested, and used a custom set of Nextera-compatible barcodes to increase plexity to 1,536 cells per sequencing run, at an average depth of 250,000 reads per cell. The latter efforts allowed us to sequence each time course on a single illumina NextSeq sequencing run, reducing batch effects related to sequencing quality. We used the commercial 24 i7 barcodes and the 64 new i5 barcode sequences (see Supplementary Table S6). We noticed a low level of cross-talk between these barcodes, indicated by up to 5 virus reads found in a few uninfected cells. However, considering that a sizeable fraction of cells in the same sequencing run (late infected and high MOI) had tens or even hundreds of thousand of virus reads, the amount of cross-talk between barcodes appears to be of the order of 1 in 10,000 or less. In terms of sequencing lengths, we sequenced 8 bases from the standard i7 barcodes, 12 bases from the custom i5 barcodes, and 74 bases from each end of the insert (paired-end sequencing) using an illumina 150 cycles High Output kit for each of the two time courses.

Bioinformatics pipeline
After sequencing was completed, we converted BCL files into gzipped FastQs via illumina's bcl2fastq. Because this software struggles with very high plexity libraries, we wrote a custom demultiplexer that copes better with the ∼ 1, 000 cells per sequencing run of each time course. We then mapped the reads against the human GRCh38 genome with supplementary ERCC sequences using STAR Aligner (Dobin et al. 2013) and counted genes using htseq-count (Anders, Pyl, and Huber 2015) . Because the latter software was unmaintained at the time, one of us (FZ) took over the maintenance of the project, refactored the code, and added automated testing to check for software bugs. The reads that did not mapto the human genome were remapped to the Dengue/Zika genome with rather permissive criteria using Stampy (Lunter and Goodson 2011) , filtered via custom scripts to eliminate artifacts, and counted to determine the viral reads per million transcripts (see below). The stanford high-performance computing clusters Sherlock and Sherlock 2.0 were used for the computations. Once the gene/virus counts were available, the downstream analysis was performed on a laptop using both custom Python scripts and the library singlet (https://github.com/iosonofabio/singlet), which is a second from-scratch implementation of the same functionality to minimize software bugs. The scientific data libraries numpy and scipy (van der Walt, Colbert, andVaroquaux 2011) , pandas (McKinney 2011) , xarray (Hoyer and Hamman 2017) , SeqAn (Döring et al. 2008) and its derivative seqanpy ( https://github.com/iosonofabio/seqanpy ) were used for number crunching. Matplotlib (Hunter 2007) and seaborn (Waskom et al. 2014) were used for plotting. The virus particles and cell culture images in Fig.1 are used under a Creative Common license from user Nossedotti and Y tambe at https://commons.wikimedia.org.