Natural history of Ebola virus disease in rhesus monkeys shows viral variant emergence dynamics and tissue-specific host responses

Summary Ebola virus (EBOV) causes Ebola virus disease (EVD), marked by severe hemorrhagic fever; however, the mechanisms underlying the disease remain unclear. To assess the molecular basis of EVD across time, we performed RNA sequencing on 17 tissues from a natural history study of 21 rhesus monkeys, developing new methods to characterize host-pathogen dynamics. We identified alterations in host gene expression with previously unknown tissue-specific changes, including downregulation of genes related to tissue connectivity. EBOV was widely disseminated throughout the body; using a new, broadly applicable deconvolution method, we found that viral load correlated with increased monocyte presence. Patterns of viral variation between tissues differentiated primary infections from compartmentalized infections, and several variants impacted viral fitness in a EBOV/Kikwit minigenome system, suggesting that functionally significant variants can emerge during early infection. This comprehensive portrait of host-pathogen dynamics in EVD illuminates new features of pathogenesis and establishes resources to study other emerging pathogens.

Uganda and of other filovirus diseases, such as Marburg virus disease, underscore the importance of addressing filovirus threats.EVD is a prototypical viral hemorrhagic fever (VHF) with clinical manifestations including fever, severe gastrointestinal involvement, hemodynamic dysfunction, and multiorgan failure leading to death. 7Notably, the host-pathogen determinants of this severity remain relatively obscure, and we lack comprehensive insight into the molecular pathobiology underlying severe EVD.
Genomic technologies let us better understand the molecular basis of infection, but their application has been centered on a few well-studied pathogens.][10][11] Comparative analyses of these signals between pathogens and populations can identify pathogen-agnostic and pathogen-specific responses, thereby indicating pathways of potential evolutionary and therapeutic significance. 12Despite the important roles genomics and transcriptomics have played in our understanding of diseases, including coronavirus disease 2019 (COVID-19), [8][9][10][11] many severe viral threats have not been studied as extensively, in particular high-containment pathogens.Thus, there is a need for improved datasets and analytical methods integrating transcriptomics data to build a comprehensive understanding of molecular factors involved in diverse pathologies.
Previous studies of EBOV infection in non-human primate (NHP) models have largely focused on immune-related organs, with limited temporal or spatial resolution and overlooking pathogen dynamics.4][15] An extended time course further identified early and conserved blood transcriptional responses, 16 with tissue-specific and temporal-specific gene expression changes observed in some solid tissues. 17Single-cell RNA sequencing (scRNA-seq) and protein quantification by mass cytometry (CyTOF) of peripheral immune cells revealed emergency myelopoiesis and suppression of anti-viral responses in infected cells. 18RNA viruses, including EBOV, have a high mutation rate, allowing better resolution of inter-tissue viral spread and evolution.Emerging variations may allow the virus to better infect and replicate in a host; 19 biologically meaningful EBOV variants have emerged during animal studies 20 and recent outbreaks, 21,22 and varying levels of evolutionary constraint and adaptive potential have been described across the viral genome. 235][26][27] Determining the shared and specific host dynamics across tissues and associating them with the corresponding viral dynamics promises to yield a more holistic view of disease progression.
Here, we present the first comprehensive spatiotemporal characterization of host and viral dynamics in a key NHP model of severe EVD.This dataset-the largest of its kind for any maximum-containment pathogen-provides novel insights into the establishment and progression of EVD and a rich resource for understanding host-pathogen interactions.To explore this dataset, we developed and applied ternaDecov, a computational tool to infer cell type proportions from bulk RNA-seq datasets with continuous covariates, and demonstrated its broader applicability.This study elucidates global and tissue-specific changes that may contribute to pathogenesis and illuminates potential routes of viral adaptation, circulation, and compartmentalization in peripheral tissues.
We obtained high-quality sequencing data from over 300 samples despite variable RNA quality, likely arising from challenges intrinsic to biosafety level 4 (BSL-4) containment conditions.We employed rigorous filtering and quality control methods to ensure the accuracy of this large dataset (Table S2).Briefly, we removed 13 samples that had insufficient total reads (<0.5 million reads), and eight additional samples that did not match the expected animal or tissue from NHP genotype fingerprinting, chromosome X:Y read ratios, or dimensionality reduction clustering (Figure S2).Host gene expression patterns across the sample set were driven primarily by the tissue identity (Figure 1C), and within each tissue group, host expression clustering patterns were driven by DPI (Figures S3 and S4).We assembled complete EBOV genomes from many tissues and identified variants in samples with high coverage depth (Figure 1E).

Host-virus analysis, using time-regularized deconvolution, reveals the contribution of direct infection and monocyte infiltration to tissue-specific viral loads and host responses
The host and virus data from this study provide a spatiotemporal picture of how EBOV establishes infection and spreads to multiple organ systems.Viral loads increased over time across all tissues, but the rate of increase differed (Figure 2A).Spleen and liver had the sharpest rise in viral load; these tissues were likely the primary sites of infection and replication after intramuscular exposure, putatively seeding infections throughout the body. 14,29,30Lymph nodes, whole blood, and serum had high terminal viral loads ($10 5 copies/mL) but peaked later in infection (Figure 2A); these tissues likely accumulated infected cells.Other tissues (including brain, ovary/testis, skin, lung, kidney, and adrenal) had generally lower peak viral loads (<10 3 copies/mL) and slower rates of increase in viral RNA burden.In most tissues, we found that several host genes were correlated with viral RNA load.The top genes that correlated with viral load were interferon gamma and alpha ISGs (such as CXCL10/11, IF16, and IFI27) and those thought to be involved in viral defense (KCNH, OASL, and OAS2) (Figure 2B).The top genes anticorrelated with viral load included epigenetic and cell division-related genes, such as a H3K27 methyltransferase (EZH1) and a Yippee-like protein (YPEL) as well as a cell adhesion protein (NCAM1) involved in cell-matrix interactions and expansion of lymphocytes. 31e sought to further determine the factors driving differences in viral load across tissues.The viral load of a given tissue is determined by the efficiency with which EBOV infects and spreads within that tissue, the propensity of infected monocytes-the main infected immune cell population in vivo 18 -to infiltrate the tissue during infection, and/or the virus load present in circulating blood.We noted that the expression of canonical monocyte genes demonstrated a trend toward positive correlation with viral load in most tissues (Figure 2C) but not in tissues in which monocytes/monocyte-derived-macrophages are either normally abundant (blood and spleen) or a low viral load is detected (brain).We observed no consistent correlation (correlation < 0.45) between non-monocyte blood cell marker genes and viral load (Figure S5), suggesting that recruitment of infected monocytes is a significant driver of the viral load.This finding led us to investigate the role that intra-tissue changes in cell type proportion may play during pathogenesis.
Despite the availability of several deconvolution methods, which allow inference of cell type composition in bulk RNA-seq samples based on an scRNA-seq reference set, [32][33][34][35] most approaches are computationally inefficient.Furthermore, existing approaches provide only single-point estimates and do not use continuous covariates (such as time, age, developmental stage, or location) that are common features of large sequencing datasets.To address these limitations, we developed and applied a novel computational method to characterize tissue-specific changes in cell type proportions over the course of disease.We reasoned that continuous processes result in smooth trajectories that can simultaneously improve deconvolution (by sharing information between samples in close temporal proximity) and provide more information about the underlying biological process by inferring a specific parametric form of the cellular change trajectory.In our generalizable model for trajectorybased deconvolution, ternaDecov (temporal RNA deconvolution), the cellular proportions at each data point for every sample are drawn from a continuous function (Figure 2D).The form of the continuous function is not fixed and can be derived from alternative parametric and non-parametric trajectory models (STAR Methods).(legend continued on next page) Cell Genomics 3, 100440, December 13, 2023 5

Resource
We confirmed the accuracy and biological relevance of terna-Decov's cellular proportion estimates and showed that trajectory models have advantages over individual point estimates made by existing methods.We benchmarked ternaDecov using a published bulk RNA-seq dataset from human pancreatic islets 36 and an scRNA-seq reference dataset. 37We used expression of HbA1C as the covariate for trajectory regularization because levels of this gene are known to be related to changes in cell proportions. 32Estimated cell proportions from ternaDecov showed a high correlation with results from an established deconvolution method, MuSiC, 32 including a negative correlation of b cell abundance with HbA1C levels (Figure S6).To further assess the biological relevance of ternaDecov's outputs, we used the wholeblood samples in our study.Deconvolution of bulk whole-blood RNA sequencing with ternaDecov identified an increase in the proportion of neutrophils that peaked at 4 DPI (Figure 2E).This peak mirrored the observed increase in neutrophils as measured by fluorescence flow cytometry 28 (Figure 2F), scRNA-seq (0.2%-65.1% of cells between baseline and late EVD), 18 and CyTOF (9.3%-49.8%). 18Results were again consistent between terna-Decov and MuSiC (Figure S6), but ternaDecov showed faster runtimes.In addition, the trajectory models used by ternaDecov allow inference of unmeasured time points and reduce L1 error of estimates for measured time points (STAR Methods).
We next applied ternaDecov to estimate monocyte infiltration across tissues.For each tissue, we created a joint atlas of tissue-specific cell types and blood cell types (STAR Methods), and deconvolved their blood monocyte, blood non-monocyte, and tissue-specific cell type fractions.The proportion of monocytes/monocyte-derived macrophages varied across tissues, with the highest peak occurring in the lymph nodes following infection.Several tissues-most notably the lymph node, lung, kidney and liver-showed a sharp increase in the proportion of monocytes beginning around 4 DPI (Figure 2G).In contrast, the proportions of other blood cell types remained stable, and this change was not observed in tissues that are large reservoirs of monocytes at baseline (Figures 2E and S6), indicating a specific increase in monocytes in certain tissues and not an increase in circulating blood.This finding suggests that infiltrating monocytes influence the transcriptional signatures observed at this stage of infection.Deconvolution further illuminated changes in tissue-specific cell types during infection (Figure S6), such as the decrease of chromaffin cells in the adrenal gland (Figure 2H), a cell type that is infected during EVD. 38Chromaffin cells produce epinephrine, an essential hormone for the host response to infection, whose depletion could be associated with severe disease.

A tissue atlas illuminates the spatiotemporal dynamics of interferon and cytokines during EVD
To further discover molecular signatures of infection, we identified genes whose expression changed upon infection in at least one tissue or fluid.We identified differentially expressed genes (DEGs) between infected and non-infected samples (DPI % 0) independently for every tissue (false discovery rate [FDR] < 0.05 and log2 fold change (FC) > 2), resulting in the identification of between 35 and 974 DEGs per tissue (Figure 3A; Table S3).To avoid tissue sampling effects, we excluded tissue marker genes when interpreting genes across tissues (Figure S7; STAR Methods).Principal component analysis (PCA) using the log2 FCs of DEGs showed separation of tissues, indicating tissuespecific differences in response to infection (Figure 3B).Interestingly, the primary axis of variation (PC1; 12.3% variance explained) across tissues is driven by several genes related to the interferon response (Figure 3B).
We confirmed the key role of interferons and cytokines in the host response during EVD across tissues.0][41] Similarly, we found that interferon and related genes were upregulated in EVD and demonstrate that this trend is recapitulated in our extensive set of 15 distinct tissues (Figures 3C and S4).We observe a similar increase in some cytokine genes, especially in the whole blood, spleen, and skin (Figure 3C).These responses are common to viral infections in general, and their increased expression across multiple tissues is present in the well-established clinical manifestation of ''cytokine storm/cytokine release syndrome,'' which occurs during EVD. 42,43hile these genes were upregulated across distinct tissues, the degree and temporal dynamics of this upregulation differed.Indeed, although many of these genes were globally upregulated across tissues, they were also represented as the top genes driving the separation of tissues, underscoring the distinct dynamic profiles (Figure 3B).To further explore differences in the interferon and cytokine response across tissues, we examined DEGs changing over time in each tissue.Among these genes globally upregulated in response to infection, ISGs and cytokines had different dynamics between tissues across time, with an early increase in spleen, lymph nodes, liver, and whole blood and a delayed increase in secondary organs such as the brain (Figures 3D and S8).This indicates a broadly conserved interferon and cytokine response across tissues, albeit with distinct dynamics likely associated with the circulation of the virus and recruited immune cells during pathogenesis.

Tissue-specific transcription profiles reveal novel genes and pathways dysregulated in EVD
We uncovered novel transcriptional signatures of disease, identifying differences in the host responses across tissues and intertissue heterogeneity (Figures 3D, 3E, and S8).Among the DEGs with the greatest fold change in each tissue, several genes were differentially expressed in only a subset of tissues.For example, we observed changes in apoptosis-and inflammation-related genes particularly in the whole blood and kidneys.We also  Apoptosis Resource noted increased expression of PARP-family genes (PARP12, ZC3HAV1, PARP15, PARP6, and PARP11) in kidney and skin (Figure 3C).Members of the PARP family are responsible for functions including DNA repair and chaperoning 44,45 and can have pro-viral effects.For instance, PARP11 acts as a pro-viral factor in vesicular stomatitis virus infection by inhibiting the strength of interferon (IFN)-I-activated signaling. 46It is possible, therefore, that the PARP family may contribute to pathogenesis during EVD.
To nominate underlying pathogenic processes of EVD that might be indicated by DEGs, we used Gene Ontology enrichment analysis to interpret tissue-conserved and tissue-specific signals.
We identified common pathways enriched across tissues during infection, including ''negative regulation of viral genome replication'' and ''defense response to virus'' (Figure 3F).These pathways likely represent an enrichment of general antiviral defense genes common to all tissues, including genes related to the conserved IFN and cytokine responses we identified previously.Additionally, we identified enriched tissue-specific pathways, including cell migration, matrix formation, and organization (Figure 3F).These pathways suggest differential remodeling of tissues as a driver or consequence of EVD progression.
We observed significant changes in expression of genes encoding tissue connectivity-and extracellular matrix (ECM)-related proteins.Specifically, we saw a significant decrease in expression over time for tissue connectivity-related genes such as laminin, cartilage, and collagen (CILP, LAMA3, and COL17A1) in lymph nodes and spleen (Figures 3E and S9).These genes have not been reported as molecular signatures of disease but are consistent with the histological changes in vascular structure and function observed during EVD. 42We observed similar changes in ECMrelated genes in other organs, specifically in skin/muscle samples, as well as an increase in the expression of genes encoding metallopeptidases proteins (MMP2, MMP3, and MMP8) in the skin, brain, and whole blood (Figure S9).These results suggest that onset of multiorgan failure, increase in vascular permeability, and internal bleeding associated with EVD may be related to weakening of tissue connectivity associated with a downregulation of ECM genes, in addition to the known increase of tissue factor (F3) in the blood 30 (Figure S9).

Viral variants reveal patterns of compartmentalization and circulation among tissues
Given the high viral loads in several tissues in this study and the promiscuous tropism of EBOV, 47 we sought to elucidate how the virus spreads in vivo using viral variants that emerge over infection.We attempted viral genome assembly on all sequenced samples and obtained complete (>95% unambiguous nucleotides) viral genomes from 95 samples for further comparisons.Among all complete genomes, there was a single consensuslevel (>50% variant frequency) mutation.The variant, which fell at position 10,343 (in the viral protein 24 [VP24] 5 0 UTR), was detected in the sex organ of an animal sacrificed 6 DPI.The lack of consensus-level variants was expected, given the short duration of infection and absence of specific selective pressure.We also profiled minor variants in 45 samples that had sufficient viral coverage (>400x mean depth) (Figure S10; Table S4).Across the sample set, minor variants ranged from 2%-22% frequency and fell at a total of 111 unique nucleotide positions.Of these 111 variants, 5 variants were present in the infecting stock at more than 2% frequency, and an additional 3 variants were present at a more conservative threshold of 0.5% frequency (Figure 1E).To focus our analysis only on variants that arose within animals, we filtered out these 8 variants, leaving variants at 103 nucleotide positions for further study.
We first assessed global patterns in the number and frequency of variants in different tissues.We analyzed all samples available but specifically focused on whole blood, spleen, and the three distinct lymph nodes because high-coverage viral genomes were available for many animals in each of these tissues.The lymph nodes had a large number of variants that emerged within animals with high frequency; 37% of variants in the inguinal lymph node and 43% of variants in the axial lymph node had more than 5% frequency (Figure 4A).The number of variants was also consistently high in the lymph node samples across animals but with variable DPI (Figure 4B).Conversely, spleen and whole blood consistently had the fewest variants detected across animals (Figure 4B).We observe that, compared with spleen and whole blood, lymph nodes harbor more variants, and these variants also tend to be observed at higher frequencies.We find an apparent skew in the ratio of nonsynonymous to synonymous mutations in high-frequency (>5%) vs. low-frequency (<5%) variants in the inguinal lymph nodes by permutation test (5 vs. 0.11 in inguinal, p = 0.006; 1 vs. 1.36 in mesenteric, p = 0.58; 1.3 vs. 1.7 in axial, p = 0.43), suggesting that selective pressure may contribute to differences in variant frequencies between tissues.
We probed further to investigate the cause of the higher viral population diversity observed in the lymph nodes compared with that of the whole blood and spleen.For the 6 animals (2 animals from each of the 6-, 7-, and 8-DPI cohorts), we assessed the overlap of all variants observed across tissues.Globally, we found that samples from each of the three lymph nodes had several variants that were unique to that tissue, while spleen and whole blood variants were almost always shared with at least one other tissue (Figure 4C).In fact, many of the variants identified in the whole blood and spleen samples were identified in every other tissue profiled (Figure 4D).Generally, we observed a high degree of similarity between variant profiles in the whole blood and spleen and more similarity between these two tissues and each lymph node than among the lymph nodes (Figure 4D).
To investigate the source of viral diversity in the lymph nodes, we considered all tissues, noting that the sex organ samples have variant profiles that are most distinct from other tissues.For example, in the animal with a consensus-level (>50% frequency) variant, we found that there were multiple high-frequency variants in the sex organ and ovary samples, which were at an elevated frequency in the mesenteric lymph node sample, but were not detected or at low frequency (<5%) in any other sample from that individual.Previous studies have suggested that infection can be compartmentalized to the sex organs and ovaries. 48,49Our data more directly confirm the occurrence of compartmentalized infections in these tissues.The variants rising to high frequency in these sites were likely spread to the more proximal mesenteric lymph node (Figure 4E).This hypothesis may be generalized to explain why lymph nodes harbor many high-frequency, unshared variants; they likely traffic between a subset of peripheral tissues with high-frequency variants that have emerged in compartmentalized infections.

Viral variants and functional analysis suggest adaptation during EBOV infection
The viral variants that emerged over the course of infection can also help us understand viral evolution and dynamics.Emergent variants may positively or negatively impact virus biology, including altering tropism, infectivity, and escape potential. 20,50 examined the distribution and types of emerging mutations across the viral genome.UTRs showed a higher number of variants per 1,000 bp than coding regions (8.1 versus 5.9), consistent with findings of intra-host diversity in human cases. 23mong genes, we observed the highest number of mutations per 1,000 bp in VP40 (14.3), which is involved in virion assembly and immune evasion, 51 and glycoprotein (GP) (6.9), which is immunogenic and critical for infectivity 52 (Figure 5A).VP40 and GP also had the highest proportions of nonsynonymous variants.We observed narrower regions of other genes that, with high  Cell Genomics 3, 100440, December 13, 2023 9 Resource proportions of nonsynonymous variants, including the C-terminal end of the nucleoprotein (NP) and N-terminal end of the viral polymerase (L), which are each part of the ribonucleoprotein (RNP) complex that performs viral replication and transcription (Figure 5A).We find evidence of negative selection in the L gene by binomial test (p = 2.6 3 10 À5 ) but no evidence of ratio skew in VP40, GP, or NP (respective p values of 0.24, 0.53, and 0.13).Across the genome, A-to-G and T-to-C mutations were more frequent than G-to-A or C-to-T mutations, with a particularly high proportion of these mutations in two specific animals (Figure S11).We did not observe clear tissue-specific trends in variant location or type (Figure S12).We adapted a well-established transcription-and replicationcompetent virus-like particle (trVLP) minigenome system 53 to assess the functional effects of eight coding mutations (in GP, L, and VP30) and four non coding mutation (in the UTR of VP24) across the complete viral life cycle (Figure 5B).This system allows the study of EBOV genes outside of BSL-4 laboratories by separating the RNP complex into three separate plasmids (L, NP-P2A-VP35, and VP30) that drive replication of a T7-driven minigenome composed of reporter genes and the remaining three EBOV genes (VP40, GP, and VP24) (Figure 5C).We recloned the entire system to encode the EBOV/ Kikwit backbone, as established previously trVLP systems encoded EBOV genes from variants that diverge in sequence from Kikwit by hundreds of nucleotides.Co-transfection of all four plasmids into mammalian host cells results in transcription and replication of the multicistronic minigenome, including a fluorescent marker, which we detected by flow cytometry.These cells also produce GP-coated trVLPs, which can infect any target cell that expresses the viral RNP complex.For testing, we prioritized variants that emerged in multiple animals or rose to high frequency or changed in frequency relative to the infecting stock and were in genes or regions likely to be important for viral fitness (Figure 5B).
Because mutations in viral glycoproteins are often under selection, we prioritized these variants for functional effects.Of the five GP variants we tested, four had a significant effect on viral fitness (Figures 5D and S13).Consistent with the role GP plays during viral entry, additional testing with a GP-pseudotyping assay that specifically models this step suggests that this fitness difference is likely due to a difference in productive host-receptor interactions (Figure S13).The convergent mutations at amino acid position 65 (S65A and S65P) resulted in an increase in infectivity.Notably, a mutation at this position was present in viral sequences from a human case (GenBank: MH121168.1)and has been shown previously to be important for establishing mouse-adapted variants of EBOV/Mayinga and EBOV/Makona, [54][55][56] further supporting a key role played by this position.On the other hand, the variants H139R and N506T resulted in a significant loss of infectivity.Interestingly, a published crystal structure of GP bound to the human receptor NPC1 showed that H139R is proximal to this interaction, 57 and the region surrounding N506T is the binding site of the neutralizing antibody KZ52, derived from a human survivor of the 1995 Kikwit outbreak. 58ext, we leveraged our ability to simulate the full viral life cycle with the trVLP minigenome system to study mutations in genes that impact transcription and replication.Functionally relevant mutations have emerged during human outbreaks of EBOV in genes involved in viral replication and transcription as well as in regulatory regions. 22,23,59,60Of the four VP24 UTR variants we tested, only G10243A showed a slight impact on viral fitness, potentially because of the more subtle ways in which UTR variants could affect viral fitness, which are outside the limit of detection for this system.Among the three variants we tested in the RNP complex, we found that mutations in VP30 showed no significant effect on viral fitness; however, a mutation (N1649T) on the viral polymerase (L) has a significant effect on viral fitness (Figures 5D and S13).N1649T is located in the predicted MTase domain of the viral RNA dependent RNA polymerase (RdRp) 61 and decreased viral fitness.Despite recent elucidation of the complete RdRp structure, 61 the MTase domain has yet to be experimentally resolved.Our results suggest that it might play a role in maintaining viral fitness, warranting further studies of its structure and function.

DISCUSSION
Here, we apply high-depth, unbiased sequencing, complemented by newly established experimental and computational approaches, to a large natural history study in rhesus monkeys to provide insights into the molecular basis of disease.3][64][65][66] By following these dynamics over time, we can further observe how infection drives disease progression and virus adaptation.Together, these perspectives show widespread, systemic changes during acute disease.
Emerging variants at over 100 positions across the viral genome illuminated potential sites of adaptation and compartmentalization during acute infection.Shared patterns of minor variants suggest a model where the spleen and blood spread virus systemically, likely mediated by recruitment of infected monocytes, while the lymph nodes traffic virus among locally compartmentalized infections.Compartmentalized infections in EVD, particularly in immune-privileged sites like the reproductive tract, could promote persistent infection and sustained evolution and pose a risk for reactivation and onward transmission. 67sing genomic data, we show that, after viral dissemination in EBOV-exposed NHPs, 48,68,69 viral populations are actively maintained and compartmentalized in these tissues, distinct from infection in other organs.Several features of this emerging viral variation, including a higher frequency of T-to-C mutations, have been observed in human outbreaks 23,70,71 and in response to therapeutic agents. 20The higher frequency of T-to-C and A-to-G mutations relative to G-to-A mutations may suggest host RNA editing activity, and past studies indicate that T-to-C mutations are clustered in specific regions. 70In contrast, VP40, which here had the highest frequency of nonsynonymous mutations (Figure 5A), has been suggested previously to be strongly conserved in human outbreaks. 23The differences in the distribution of mutations across some viral genes may reflect rapid initial adaptation of the virus, similar to that seen immediately after zoonotic spillover.The number of unique viral variants we detect in tissues highlights the importance of animal models for providing insights into selective pressures in different compartments.
Of the 12 variants we tested in our minigenome system, six were found to significantly alter viral fitness, with the majority of these (4 of 6) falling in the GP gene, indicating viral entry as a mechanism.Half of the variants we tested did not have any observed impact on viral fitness.This is unsurprising because variants could have increased in frequency by chance because of genetic drift, further highlighting the importance of experimental assays that can rapidly and easily screen for functional effects of mutations.3][74][75][76] Although further mechanistic and structural studies are needed to determine the impact the emerging mutations detected in this study have on viral fitness, our results support the potential of trVLPs to uncover novel mutations that affect viral entry, replication, and infection, which could guide future rational design approaches in drug discovery.
Our analysis of host transcriptional responses across tissues adds further dynamic and tissue-specific context to known features of pathogenesis and identifies intriguing novel responses related to tissue connectivity.Beyond expected changes in ISG and cytokine expression, 14 the comprehensive nature of our dataset enabled us to identify differential dynamics across tissues.This study also revealed previously unknown features of disease.We observed changes in ECM genes in most tissues, with widespread dysregulation of collagen-, laminin-, and cartilage-related gene families in several tissues as well as an increase in collagen cleaving enzymes such as metallopeptidase (MMP8, MMP3, and MMP2) in the blood, skin, and brain.These findings provide new molecular insight into the etiology of vascular endothelial and connective tissue disruption (i.e., vascular leak syndromes, characteristic of severe EVD) and may suggest molecular pathobiology common to other hemorrhagic fevers; for example, similar dysregulations in ECM have been reported in other hemorrhagic fevers, such as dengue virus infection, 77 and ECM cleaving enzymes play a key role in venominduced hemorrhage. 78Interestingly, these enzymes have also been reported to play a role in cell-to-cell viral transmission in West Nile virus 79 and influenza virus, 80 warranting further investigation into the roles of these genes in EVD.
Characterizing host and pathogen dynamics in this large serial sacrifice study required establishing new computational and experimental tools that we believe will be of broad use in future studies.ternaDecov fills a key gap among available deconvolution tools [32][33][34][35] when time-series bulk RNA-seq data are available.By incorporating time as a variable in its deconvolution model of bulk data from a single-cell reference, ternaDecov better models gene expression dynamics.While studying changes over the course of infection was our primary motivation in developing ternaDecov, any continuous covariates can be used, demonstrating the broader applicability of this method.Similarly, existing trVLP minigenome systems were not adapted to the EBOV variant used in this and many other animal studies of EVD.TrVLP minigenomes are powerful systems because they allow the full viral life cycle to be modeled at lower levels of biosafety containment and have been used previously to functionally characterize mutations in other EBOV variants. 22,53Because the EBOV Kikwit variant is recognized as the standard challenge virus for testing clinical countermeasures in animal studies, we believe that the EBOV/Kikwit trVLP system we adapted will be a valuable community resource for future assessment of emerging mutations.
Through this study, we add further spatial and temporal granularity to known signatures of EVD while also suggesting new molecular drivers of pathogenesis.We illustrate relationships between host and viral signatures during EVD and propose potential mechanisms that may generate these signatures.Finally, we provide computational and experimental tools to not only facilitate further investigations of EBOV infections but also provide a model for future studies seeking to nominate and validate molecular bases of disease progression.

Limitations of the study
The major limitations of this study arise from the constraints inherent to working in maximum containment, and there are several areas where the study could be expanded to increase the breadth and depth of characterization.In particular, many liver samples had low RNA quality, restricting the insights we could obtain for this tissue.The liver harbors many enzymes that degrade RNA, and degradation was likely exacerbated by the constraints of working in maximum containment.Improved preservation methods as well as even broader sampling of clinically relevant tissues, such as the gastrointestinal tract, 81,82 would be of interest for future investigations.Additionally, the timing of host transcriptional changes suggests that the recruitment of infected circulating monocytes is a major contributing factor to the spread of the virus to secondary organs.Future studies using scRNA-seq on tissue samples would allow changes in cell type proportions and the impact of infection on specific cell types to be measured more directly, as shown previously in peripheral blood mononuclear cells from this study. 18inally, uniformly lethal animal models like the one used here restrict the study of persistence, acute recovery, and long-term effects of the infection.New experimental challenge models with different routes of inoculation and heterogeneity in outcomes could enable a better understanding of these features in surviving NHPs.TernaDecov offers a modular model structure in which the cell type proportions of each sample are obtained from one of several alternative trajectory modules.The trajectory modules take as input the sampling time covariate and return a draw of sample-specific cell proportions (p nc ) as a result in different ways depending on their internal structure.Trajectory modules currently implemented in ternaDeCov include: (1) simple polynomial trajectories, (2) Legendre polynomial trajectories, (3) Gaussian process with different kernel functions, and (4) a ''trivial'' trajectory model that does not take into account sample collection time, effectively producing independent deconvolution of samples similar to traditional deconvolution algorithms.

STAR+METHODS
The cell-type proportions (p nc ) are multiplied with the summarized single-cell reference (.) after scaling by learnable gene specific capture rate coefficients (b g ) to produce location parameter for a Negative Binomial distribution from which the observed count matrix is sampled from using gene specific dispersion parameters (4 g ).
The full model is specified as follows: Resource

Figure 1 .
Figure 1.Study overview (A) Description of the animal study and dataset, including the number of animals, time points, and samples collected.(B) Schematization of study design and experimental and analytical workflow.(C) t-distributed stochastic neighbor embedding (tSNE) plot of transcriptional signatures, demonstrating that unique tissues cluster together and with commercial controls of the same type.(D) Viral load across time in whole blood (top) and across tissues and other fluids at necropsy (bottom) for each animal, ordered by time between infection and necropsy.Colors represent viral RNA as log10(copies/mL), as assessed by qRT-PCR; gray represents no data.(E) Viral variants across the EBOV genome identified in infecting viral stock and infected animals.Variants, designated by lines, are colored by their presence in stock (top) and frequency in infected animals (bottom).Images were created with BioRender.

Figure 2 .
Figure 2. Correlating viral dynamics and host response to infection (A) Viral loads, as determined by qRT-PCR, plotted versus time.The trajectories for different tissues were separated into three distinct patterns using K-means longitudinal data clustering, yielding groups of tissues with similar viral load dynamics.(B) Gene expression across tissues (separated by the clusters in A) for the top 8 correlated and anti-correlated DEGs and 3 representative viral genes.Samples are ordered along the x axis by tissue and DPI.On the y axis, DEGs are clustered and labeled by direction.(C) Correlation between viral load and canonical monocyte marker expression across each tissue.(D) Overview of modular deconvolution framework used in ternaDecov.The output proportions from the models are then used to draw observed sample counts from a negative binomial distribution based on the provided single-cell profiles.

Figure 3 .
Figure 3. Host transcriptomics across tissues and time (A) Number of DEGs between non-infected and infected samples; tissues with more than 5 DEGs are shown in the plot.(B) PCA of log2 fold changes of significantly DEGs between infected and uninfected samples.Top contributing genes for PC1 and PC2 are highlighted.(C) Heatmap of fold-changes of top DEGs across tissues, stratified by meaningful gene categories; stars marks significant differential expression (FDR < 0.05).(D)Left: heatmap of genes changing significantly across time for brain.Right: gene expression changes across time for selected genes.Colors atop plots designate gray (light red) and white matter (dark red).(E) Same as (D) but for lymph nodes (shades of orange) and spleen (purple); colors atop plots designate tissues.(F) Gene Ontology (GO) term analysis of genes differentially expressed (top 100 FDR < 0.01) across time as determined by ImpulseDE2.Enriched terms were determined per tissue, and the top 3 GO terms, as determined by Kolmogorov-Smirnov (KS) test, per tissue were selected for display.Colors of circles correspond to Àlog10(KS pval) of the enriched term within tissue, and sizes of circles correspond to odds ratio.

Figure 4 .
Figure 4. Minor viral variants show compartmentalization and circulation (A) Frequencies of all nonsynonymous (red) and synonymous/noncoding (gray) variants that emerged during infection, plotted and separated by tissue; the percentage of variants above 5% frequency (dotted line) is given above each tissue.(B) For each animal (ordered by DPI), the number of variants that emerged in every tissue (samples with >4003 mean viral coverage).(C) Violin plot showing the proportion of shared viral variants, separated by tissue; each point represents a unique animal, and symbols demonstrate DPI.(D) Schematic representing variants that are shared (numbers displayed in overlapping circles) and not shared (numbers displayed in non-overlapping circles) in all tissues available for 6 animals (2 of each the D6, D7, and D8 cohorts).(E) Left: schematic of viral circulation among tissues, based on the variant profiles (image created with BioRender).Right: a Spearman correlation of different tissues' variant profiles, concatenated across animals.

Figure 5 .
Figure 5. Viral adaptation and fitness effects (A) Top: number of emergent variants per 1,000 bp (gray) were quantified for each gene-coding region as well as proportion of nonsynonymous variants (red).Bottom: accumulation of total (gray) and nonsynonymous (red) variants in specific gene regions was quantified using a sliding window of 200 bp.(B) Genomic locations of variants selected for further functional testing (red) among all variants identified across the EBOV genome (black).(C) Schematic of the EBOV/Kikwit transcriptionand replication-competent virus-like particle (trVLP) minigenome system that recapitulates the wild-type and variant viral life cycle in a host cell (image created with BioRender).(D) Flow cytometry analysis of the percentage of GFP+ cells 48 h post minigenome transfection as a percentage of infected host cells by seed stock (wild type [WT]) or viral variants in GP, RNP, and VP24.Error bars represent standard deviation.