Absolute quantitation of individual SARS-CoV-2 RNA molecules: a new paradigm for infection dynamics and variant differences

Despite an unprecedented global research effort on SARS-CoV-2, early replication events remain poorly understood. Given the clinical importance of emergent viral variants with increased transmission, there is an urgent need to understand the early stages of viral replication and transcription. We used single molecule fluorescence in situ hybridisation (smFISH) to quantify positive sense RNA genomes with 95% detection efficiency, while simultaneously visualising negative sense genomes, sub-genomic RNAs and viral proteins. Our absolute quantification of viral RNAs and replication factories revealed that SARS-CoV-2 genomic RNA is long-lived after entry, suggesting that it avoids degradation by cellular nucleases. Moreover, we observed that SARS-CoV-2 replication is highly variable between cells, with only a small cell population displaying high burden of viral RNA. Unexpectedly, the B.1.1.7 variant, first identified in the UK, exhibits significantly slower replication kinetics than the Victoria strain, suggesting a novel mechanism contributing to its higher transmissibility with important clinical implications.

In brief
By detecting nearly all individual SARS-CoV-2 RNA molecules, we quantified viral replication and defined cell susceptibility to infection. We discovered that a minority of cells show significantly elevated viral RNA levels and observed slower replication kinetics for the Alpha variant relative to the Victoria strain.

Highlights
Single molecule quantification of SARS-CoV-2 replication uncovers early infection kinetics
There is substantial heterogeneity between cells in rates of SARS-CoV-2 replication
Genomic RNA is stable and persistent during the initial stages of infection
B.1.1.7 variant replicates more slowly than the Victoria strain

We examined the vRNA replication dynamics and found the ratio of sgRNA/gRNA ranged from 0.5-8 over time (Figure 3G), consistent with a recent report in diagnostic samples (Alexandersen et al., 2020). Notably, the sgRNA/gRNA ratio increased between 2 and 10 hpi, followed by a decline at 24 hpi, indicating a shift in preference to produce gRNA over sgRNA in later stages of infection. A similar trend was observed in RDV-treated cells, with a reduced sgRNA/gRNA peak at 8-10 hpi. We estimated the sgRNA/gRNA ratio for individual cells and found that sgRNA synthesis is favoured in the 'partially resistant' and 'permissive' cells, whereas the 'super-permissive' cells had a reduced ratio of sgRNA/gRNA (Figure 3H). In summary, these results indicate that gRNA synthesis is favoured in the late phase of infection, that may reflect the requirement of gRNA to assemble new viral particles.

Positive-sense RNA viruses, including coronaviruses, utilise host membranes to generate viral factories, which are sites of active replication and/or virus assembly (Wolff et al., 2020). Our current knowledge on the genesis and dynamics of these factories in SARS-CoV-2 infection is limited. We exploited the spatial resolution of smFISH to study these structures, which we define as spatially extended foci containing at least 4 gRNA molecules. We observed 1-2 factories per cell at 2 hpi, which increased to ~30 factories/cell by 10 hpi (Figure 3I). In addition, the average number of gRNA molecules within these factories, although variable, increased [...]

Super-permissive cells are randomly distributed.
Our earlier kinetic analysis of infected Vero E6 cells identified a minor population of 'super-permissive' cells containing high gRNA copies at 8 hpi. A random selection of ~300 cells allowed us to further characterise the infected cell population (Figure 4A-B).
To extend these observations we examined the vRNAs in two human lung epithelial cell lines, A549-ACE2 and Calu-3, that are widely used to study SARS-CoV-2 infection (Chu et al., 2020; Hoffmann et al., 2020). In agreement with our earlier observations with Vero E6, 3-5% of A549-ACE2 and Calu-3 cells showed a 'super-permissive' phenotype (Figure 4C-D). An important question is how these 'super-permissive' cells are distributed in the population, as the pattern could highlight potential drivers for susceptibility (Healy et al., 2020). Infection can induce innate signalling that can lead to the expression and secretion of soluble factors such as interferons that induce an anti-viral state in the local cellular environment (Belkowski and Sen, 1987; Schoggins and Rice, 2011). Regulation can be widespread through paracrine signalling or affect only proximal cells. We considered three scenarios where 'super-permissive' cells are: randomly distributed, evenly separated spatially in the population or clustered together. We compared the average nearest neighbour distance between 'super-permissive' cells and simulated points that were distributed either randomly, evenly or in clusters (Figure S4A-C).

bioRxiv preprint, this version posted June 29, 2021; https://doi.org/10.1101/2021.06.29.450133. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

In summary, our results show conclusively that the 'super-permissive' infected Vero E6, A549-ACE2 and Calu-3 cells were randomly distributed (Figure 4E-F and S4D-E). We interpret these data as being consistent with an intrinsic property of the cell that defines susceptibility to virus infection. The data also argue against cell-to-cell signalling mechanisms that would either lead to clustering (if increasing susceptibility) or to an even distribution (if inhibiting) of infected cells.

Differential replication kinetics of the Alpha B.1.1.7 and Victoria strains.
The recent emergence of SARS-CoV-2 VOC, which display differential transmission, [...] (Figure 5H and S5A-B) and their distribution was random (Figure S5C). RDV treatment ablated the differences between the viral strains, demonstrating that the observed phenotype is replication-dependent (Figure 5B, E, G).

To confirm that our results with the B.1.1.7 variant are not limited to Vero E6 cells, we assessed replication fitness of both variants in A549-ACE2 cells that were recently reported to be immunocompetent (Li et al., 2021). Both Victoria and B.1.1.7 infection resulted in comparable numbers of infected cells and similar gRNA copies/cell at 2 hpi, demonstrating similar cell entry properties (Figure S6A). However, infection with the B.1.1.7 variant led to a reduced gRNA and sgRNA cellular burden at 8 and 24 hpi (Figure 6A-B and S6B-C).

Moreover, fewer 'super-permissive' cells were detected at these time points (Figure 6C). We interpret these results as showing that the Alpha B.1.1.7 variant has reduced replication kinetics in cells with an active antiviral response.

To explore the differences in RNA replication between B.1.1.7 and Victoria strains we assessed the abundance of the different vRNAs. As our RNA-seq analysis was performed with cDNA produced from ribosome-depleted total RNA, we were able to quantify the negative sense viral RNAs that lack poly(A) tail and will be depleted in oligo(dT)-based approaches. In agreement with our smFISH results (Figure 2G), negative sense viral RNAs represented a small fraction of the total vRNA present in the cell (Figure 6F). Small numbers of negative sense vRNA transcripts were already detectable at 2 hpi, supporting our earlier conclusion that viral replication occurs very early, at least in the ~60% of the cell population identified by smFISH (Figure 3C). The level of negative sense vRNAs increased through the time course for the Victoria strain, but in the case of B.1.1.7 we observed a modest reduction between 8 and 24 hpi (Figure 6F).

To assess whether B.1.1.7 showed a delayed expression of sgRNAs, we quantified the reads mapping to the junctions derived from RNA-dependent RNA polymerase discontinuous replication (Kim et al., 2020; V'Kovski et al., 2021). In agreement with smFISH data, sgRNAs were detected in low quantities at 2 hpi (Figure 6D and G) and we observed a rise and fall in sgRNA/gRNA count ratio in the Victoria samples between 8-24 hpi (Figure 6G), in agreement with our single cell smFISH measurements (Figure 3G). In B.1.1.7 infected samples the sgRNA/gRNA ratio was significantly reduced at 8 hpi, suggesting a delayed expression of sgRNAs. Interestingly, B.1.1.7 maintained the sgRNA/gRNA ratio of ~4.5 until 24 hpi (Figure 6G), suggesting an extended window of sgRNA expression. In summary, these transcriptomic data confirm our smFISH analyses and highlight the altered kinetics of sgRNA/gRNA between the Alpha B.1.1.7 and Victoria strains.

Our spatial quantitation of SARS-CoV-2 replication dynamics at the single molecule and single cell level provides important new insights into the early rate-limiting steps of infection.

Typically, analyses of viral replication are carried out using 'in-bulk' approaches such as qPCR and conventional RNA-seq. While very informative, these approaches lack spatial information and do not allow single cell analyses. Although single cell RNA-seq analyses can overcome some of these issues (Fiege et al., 2021; Ravindra et al., 2021), their low coverage and lack of information regarding the spatial location of cells remain significant limitations. In this study we show that smFISH is a sensitive approach that allows the absolute quantification of SARS-CoV-2 RNAs at single molecule resolution. Our experiments show the detection of individual gRNA molecules within the first 2 h of infection that most likely reflect incoming viral particles. However, we also observed small numbers of foci comprising several gRNAs that were sensitive to RDV treatment, demonstrating early replication events. We believe these foci represent 'replication factories' as they co-stain with FISH probes specific for negative sense viral RNA and sgRNA.

These data provide the first evidence that SARS-CoV-2 replication occurs within the first 2 h of infection and increases over time. This contrasts with our observations with the J2 anti-dsRNA antibody, where viral dependent signals were apparent at 6 hpi (Cortese et al., 2020; Eymieux et al., 2021). We noted that co-staining SARS-CoV-2 infected cells with J2 and the ORF1a FISH probe set showed minimal overlap, suggesting that infection may induce changes in cellular dsRNA. The observation that infection can perturb mitochondrial function provides a possible explanation for these observations (Appelberg et al., 2020; Mullen et al., 2021). Importantly, mitochondrial dsRNAs can engage MDA5-driven antiviral signalling (Dhir et al., 2018), a recently identified key sensor of SARS-CoV-2 (Thorne et al., 2021b; Yin et al., 2021). These findings highlight the utility of smFISH to uncover new aspects of SARS-CoV-2 replication that are worthy of further study.

We found that SARS-CoV-2 gRNA persisted in the presence of RDV, suggesting a long half-life that may reflect the high secondary structure of the RNA genome that could render it refractory to the action of nucleases (Simmonds et al., 2021). smFISH revealed complex dynamics of gRNA and sgRNA expression that resulted in a rapid expansion of sgRNA (peaking at 8 hpi), followed by a shift towards the production of gRNA (24 hpi), results that were confirmed by RNA-seq. Since a viral particle is composed of thousands of proteins and a single RNA molecule, we interpret the high synthesis of sgRNAs as aiming to fulfill the high [...]

Our study shows that cells vary in their susceptibility to SARS-CoV-2 infection, where the majority of cells had low vRNA levels (<10² copies/cell) but a minor population (4-10% for different tissue cell lines) had much higher vRNA (>10⁵ copies/cell) at 10 hpi. In contrast the cellular vRNA copies at 2 hpi were comparable, suggesting that this phenotype is not [...]

Given the current status of the pandemic, there has been a global effort to understand the biology of emergent VOC with high transmission rates and possible resistance to neutralizing antibodies. The majority of studies have focused on mutations mapping to the Spike glycoprotein as they can alter virus attachment, entry and sensitivity to vaccine induced or naturally acquired neutralizing antibodies. However, many of the mutations map to other viral proteins, including components of the RNA-dependent RNA polymerase complex that could impact RNA replication.
Our smFISH analysis revealed that the Alpha B.1.1.7 variant shows slower replication kinetics than the Victoria strain, resulting in lower gRNA and sgRNA copies per cell, fewer viral replication factories and a reduced frequency of 'super-permissive' cells. [...]

[...] RNA prior to probe hybridisation. Representative full z-projection (8 µm) confocal images are shown. Scale bar = 10 µm.
(B) Calu-3 (top panels) and Huh-7.5 (lower panels) cells were infected with SARS-CoV-2 (Victoria) and HCoV-229E, respectively, at an MOI of 1, fixed at 24 hpi and hybridised with the SARS-CoV-2-specific +ORF1a probe. In addition, cells were stained with anti-dsRNA (J2) to [...]

(E) Comparison of anti-dsRNA (J2) and gRNA smFISH. Full z-projected images of infected Vero E6 cells co-stained with J2 and smFISH are shown. Scale bar = 10 µm.
[...] Student's t-test.

(E) Bigfish quantification of +ORF1a and +ORF-N smFISH counts per cell. Due to bimodality of the data, statistical significance was determined using two-sample Kolmogorov-Smirnov test to compare cumulative distribution of +ORF1a counts between the two strains (n=3).

[...] supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/mL penicillin and 10 µg/mL streptomycin and non-essential amino acids. All cell lines were maintained at 37 °C and 5% CO2 in a standard culture incubator.

Single molecule fluorescence in situ hybridisation (smFISH).
smFISH was carried out as previously reported (Titlow et al., 2018; Yang et al., 2017) with minor modifications. Briefly, cells were grown on #1.5 round glass coverslips in 24-well plate or in µ-Slides 8 well glass bottom (IBIDI) and fixed in 4% paraformaldehyde (Thermo Fisher) for 30 min at room temperature. Cells were permeabilised in PBS/0.1% Triton X-100 for 10 min at room temperature followed by washes in PBS and 2x SSC. Cells were pre-hybridised in pre-warmed (37 °C) wash solution (2x SSC, 10% formamide) twice for 20 min each at 37 °C. Hybridisation [...]

In the experiment to detect viral negative strands, double-stranded RNA (dsRNA) was denatured using DMSO, formamide or NaOH.
After the permeabilisation step, cells were rinsed in distilled water and were treated with 50 mM NaOH for 30 s at room temperature, 70% formamide at 70 °C for 1 h or 90% DMSO at 70 °C for 1 h. Following the treatments, cells were quickly cooled on ice, washed in ice-cold PBS and subjected to standard smFISH protocol.

[...] were singly labelled with ATTO633, ATTO565, Cy3, or ATTO488 at 3' ends according to a published protocol (Gaspar et al., 2017) and were concentration normalised to 25 µM. All probe sets used in this study had degree of labelling > 0.94.

[...] processed in 2D due to memory constraints, results were indistinguishable from 3D-processed images). Threshold setting for smFISH spot detection was set specifically for each set of images collected in each session. Viral factories were resolved using `decompose_cluster()` function to find a reference single-molecule spot in a less signal-dense region of the image, which was used to simulate fitting of the Gaussian modelised reference spot into viral factories until the local signal intensities were matched. Decomposed spots were grouped into clusters with previously reported radii of double-membrane vesicles (DMV) measured by electron microscopy (150 nm pre-8 hpi and 200 nm post-8 hpi) (Cortese et al., 2020).

[...] ImageJ.
Anti-dsRNA (J2) stain was quantified by integrating fluorescence signal across the z-stacks of cellular region of interest divided by the cell volume to obtain signal density. Signal density was normalised to the average signal density of uninfected "Mock" condition cells.
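The signal-density calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' ImageJ workflow: the function names `signal_density` and `normalise_to_mock`, the boolean 3D cell mask and the `voxel_volume_um3` parameter are all assumptions introduced here for the example.

```python
import numpy as np


def signal_density(stack, cell_mask, voxel_volume_um3):
    """Integrated fluorescence inside a 3D cell mask divided by the cell volume.

    stack            : 3D array (z, y, x) of fluorescence intensities
    cell_mask        : boolean 3D array of the same shape marking the cell (assumed input)
    voxel_volume_um3 : volume of one voxel in cubic micrometres (assumed input)
    """
    stack = np.asarray(stack, dtype=float)
    cell_mask = np.asarray(cell_mask, dtype=bool)
    integrated_signal = stack[cell_mask].sum()          # sum over all z-slices in the cell
    cell_volume = cell_mask.sum() * voxel_volume_um3    # voxel count times voxel volume
    return integrated_signal / cell_volume


def normalise_to_mock(densities, mock_densities):
    """Express signal densities relative to the mean density of uninfected (Mock) cells."""
    return np.asarray(densities, dtype=float) / np.mean(mock_densities)
```

With this normalisation a value of 1 corresponds to the average uninfected cell, so values above 1 indicate signal in excess of the Mock background.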

Fluorescence intensity profiles were obtained using ImageJ "plot profile" tool across 3 µm region on 1 µm maximum intensity projected images. To assess colocalisation of N protein with SARS-CoV-2 RNA, ellipsoid masks centred around centroid xyz coordinates of smFISH spots were generated with the size of the point-spread function (xy radius=65 nm, z radius=150 nm) using ImageJ 3D suite. Integrated density of N-protein channel (background subtracted, radius=5px) fluorescence within the ellipsoid mask was measured and compared to the equivalent signal in the uninfected condition.

[...] if Covid19 superinfection follows a random distribution. The general strategy was to test the complete spatial randomness hypothesis by comparing the average nearest neighbour distance of superinfected cells to an equal number of randomly selected coordinates (Ripley, 1979). 2D spatial coordinates of superinfected cells were obtained from the 3D-object counter (ImageJ) as described above. Cell nuclei were segmented with the DAPI channel and placement of random coordinates was confined to pixels that fell within the DAPI segmentation mask. Nearest neighbour distances were calculated using the KDtree algorithm (Maneewongvatana and Mount, 1999) implemented in python (scipy.spatial.KDTree).
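The nearest-neighbour comparison described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the authors' code: only `scipy.spatial.KDTree` is named in the text, while the helper names, the number of simulations (`n_sim`) and the fixed random seed are hypothetical choices made for the example.

```python
import numpy as np
from scipy.spatial import KDTree


def mean_nn_distance(points):
    """Average distance from each point to its nearest neighbour."""
    points = np.asarray(points, dtype=float)
    tree = KDTree(points)
    # k=2 because the closest hit for every query point is the point itself (distance 0)
    distances, _ = tree.query(points, k=2)
    return distances[:, 1].mean()


def random_points_in_mask(mask, n, rng):
    """Draw n distinct pixel coordinates from inside a boolean 2D mask (e.g. DAPI segmentation)."""
    rows, cols = np.nonzero(mask)
    chosen = rng.choice(rows.size, size=n, replace=False)
    return np.column_stack([rows[chosen], cols[chosen]])


def csr_null_distribution(mask, n, n_sim=1000, seed=0):
    """Mean nearest-neighbour distances for n_sim random placements of n points.

    n_sim and seed are assumed values for illustration only.
    """
    rng = np.random.default_rng(seed)
    return np.array([
        mean_nn_distance(random_points_in_mask(mask, n, rng))
        for _ in range(n_sim)
    ])
```

An observed mean nearest neighbour distance that falls well inside the simulated range would be consistent with a random distribution; markedly smaller values would point to clustering and markedly larger values to even spacing.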

Pseudo-random distributions were simulated by randomly placing the first coordinate, then constraining the placement of subsequent coordinates to within a defined number of pixels.

[...] (2.7.3a) (Dobin et al., 2013). We also used STAR to assign uniquely mapping reads in strand-specific fashion to the ENSEMBL human gene annotation and the two SARS-CoV-2 strands.

First, we performed library size correction and variance stabilisation with regularized-logarithm transformation implemented in DESeq2 (1.28.1) (Love et al., 2014). This corrects for the fact that in RNA-seq data, variance grows with the mean and therefore, without suitable correction, only the most highly expressed genes drive the clustering. We then used the 500 genes showing the highest variance to perform PCA using the prcomp function implemented in the base R package stats (4.0.2) (R Core Team, 2020).

[...] was used in R, and "Numpy" and "Pandas" python packages were used in Jupyter notebook for data wrangling. Following R packages were used to create the presented visualisation: [...]

[Figure panel label: Host transcriptome]

[Figure panel labels: Immobilised virus on coverslip; CoV-2 +ORF1a; scale bar 5 µm]