Cell-intrinsic and -extrinsic effects of SARS-CoV-2 RNA on pathogenesis: single-cell meta-analysis

ABSTRACT Single-cell RNA-seq has been used to characterize human COVID-19. To determine whether preclinical models successfully mimic the cell-intrinsic and -extrinsic effects of severe disease, we conducted a meta-analysis of single-cell data across five model species. To assess whether dissemination of viral RNA in lung cells tracks pathology and results in cell-intrinsic and -extrinsic transcriptomic changes in COVID-19, we analyzed six publicly available scRNA-seq data sets. We used dual mapping (host and virus) and differential gene expression analyses to compare viral+ and viral− cell populations, and we conducted a principal component analysis to identify successful models of human COVID-19. We found expression of viral RNA in many non-epithelial cell types. Fibroblasts, macrophages, and endothelial cells exhibited clear evidence of viral-intrinsic and -extrinsic effects on host gene expression. Using viral RNA expression, we found that K18-hACE2 mice most closely modeled severe human COVID-19, followed by hamsters. Ferrets and macaques are poor models of human disease because of the low presence of viral RNA. Moreover, we found that increased transcripts of certain key inflammatory genes, such as IL1B, IL18, and CXCL10, are not restricted to virally infected cells, suggesting that these genes are regulated in a paracrine or autocrine fashion. These data affirm widespread dissemination of viral RNA in the lung, which may be key in the pathogenesis of severe COVID-19, and demonstrate that ferrets and Rhesus macaques are poor models of human COVID-19.
IMPORTANCE We conducted a high-resolution meta-analysis of scRNA-seq data from humans and five animal models of COVID-19. This study reports viral RNA dissemination in several cell types in human data as well as in some of the preclinical models. Using this metric, the K18-hACE2 mouse model, followed by the hamster model, most closely resembled human COVID-19. We observed clear evidence of viral-intrinsic effects within cells (e.g., IRF5 expression) as well as viral-extrinsic cytokine modulation (e.g., IL1B, IL18, CXCL10). We observed proinflammatory chemokine expression in cells devoid of viral RNA expression, suggesting autocrine/paracrine interferon regulation. This report serves as a resource, synthesizing data from humans with COVID-19 and animal models and suggesting improvements to relevant preclinical models that may aid future diagnostic and therapeutic development projects.

Bonferroni-adjusted p value of <0.05. To perform DEG analysis, the Seurat 'subset' function was used to extract specific populations/cell types from the clusters, and differential expression was tested with the Wilcoxon rank-sum test.
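The testing step described above can be illustrated with a minimal Python sketch. It is not the authors' Seurat workflow: the gene names and expression values are hypothetical, the p-value uses the normal approximation without a tie correction of the variance, and the Bonferroni factor is assumed to be the number of genes tested.

```python
import math

def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation).

    Ties receive average ranks; the variance is not tie-corrected.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return {gene: min(1.0, p * m) for gene, p in pvals.items()}

# Hypothetical expression values for viral+ and viral- cells of one cell type.
viral_pos = {"GENE_X": list(range(10, 20)), "GENE_Y": [1, 2, 3, 4]}
viral_neg = {"GENE_X": list(range(0, 10)), "GENE_Y": [1, 2, 3, 4]}

raw = {g: rank_sum_p(viral_pos[g], viral_neg[g]) for g in viral_pos}
adjusted = bonferroni(raw)
significant = [g for g, p in adjusted.items() if p < 0.05]
print(significant)
```

In this toy input only the gene whose values are fully separated between the two groups passes the adjusted threshold.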

Venn Diagram
'Venn Diagram Maker' (https://goodcalculators.com/venn-diagram-maker/) was used to generate symmetric Venn diagrams and to compute the epithelial cell type sDEGs common to human and three animal models. The Venn diagram consists of multiple overlapping closed curves, each representing a species. The curves overlap in every possible way, showing all possible overlaps of sDEGs between the species.
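The overlap counts that such a diagram displays can be reproduced with plain set operations. The sketch below is only an illustration in Python, not the online tool used here; the gene identifiers are hypothetical placeholders and the helper name `venn_regions` is introduced for this example.

```python
from itertools import combinations

def venn_regions(gene_sets):
    """Map each combination of species to the genes found in exactly those species."""
    names = sorted(gene_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            shared = set.intersection(*(gene_sets[n] for n in combo))
            others = [gene_sets[n] for n in names if n not in combo]
            elsewhere = set.union(*others) if others else set()
            exclusive = shared - elsewhere  # drop genes also seen in other species
            if exclusive:
                regions[combo] = exclusive
    return regions

# Hypothetical sDEG lists (human gene nomenclature assumed for all species).
sdegs = {
    "human": {"GENE_A", "GENE_B", "GENE_C"},
    "mouse": {"GENE_A", "GENE_B", "GENE_D"},
    "hamster": {"GENE_A", "GENE_E"},
    "AGM": {"GENE_A", "GENE_C"},
}

for species, genes in venn_regions(sdegs).items():
    print(" & ".join(species), len(genes), sorted(genes))
```

Each printed row corresponds to one region of the diagram, so the counts sum to the number of distinct genes across all species.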

PCA analysis
PCA is an unsupervised machine learning technique. Besides using PCA as a data preparation technique, we also used it to visualize data. PCA was performed with the R package 'Plotting PCA' with default parameters, based on the dissemination of viral transcripts in each model.
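As an illustration of what the PCA step computes, the following Python sketch projects a small matrix of per-model viral transcript values onto its first two principal components and reports the fraction of variance each explains. It is a sketch under stated assumptions, not the R workflow used in this study: the numbers are hypothetical, the function names are introduced for this example, and power iteration stands in for a library eigendecomposition.

```python
import math

def center(rows):
    """Subtract each column mean so components describe variation around the average."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def dominant_component(cov, iters=500):
    """Power iteration: (eigenvalue, unit eigenvector) of the largest component."""
    d = len(cov)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    eig = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return eig, v

def pca(rows, k=2):
    """Return per-row scores on the first k components and explained-variance ratios."""
    residual = center(rows)
    first_cov = covariance(residual)
    total = sum(first_cov[i][i] for i in range(len(first_cov)))
    scores = [[] for _ in rows]
    ratios = []
    for _ in range(k):
        eig, v = dominant_component(covariance(residual))
        comp = [sum(a * b for a, b in zip(row, v)) for row in residual]
        ratios.append(eig / total)
        for s, c in zip(scores, comp):
            s.append(c)
        # Remove the component just found before searching for the next one.
        residual = [[a - c * b for a, b in zip(row, v)]
                    for row, c in zip(residual, comp)]
    return scores, ratios

# Hypothetical values: one row per dataset, one column per viral transcript.
models = ["human", "mouse", "hamster", "AGM", "ferret", "macaque"]
values = [[9.0, 7.5, 8.0], [8.5, 7.0, 6.0], [6.0, 5.5, 5.0],
          [3.0, 2.0, 4.0], [0.5, 0.2, 0.1], [0.4, 0.3, 0.2]]

scores, ratios = pca(values)
for name, (pc1, pc2) in zip(models, scores):
    print(f"{name}: PC1={pc1:.2f} PC2={pc2:.2f}")
print("explained variance:", [round(r, 3) for r in ratios])
```

Plotting the two score columns against each other, with the explained-variance percentages on the axes, gives the kind of PC1 versus PC2 view described for this analysis.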

The autoplot function (a generic function for exploring genomic data) of the 'Plotting PCA' package was used to plot the model object in R. PC1 and PC2 were evaluated for each model vector and plotted. The percentages of variation accounted for by PC1 and PC2 were displayed on the axes.

Line chart
The R package 'Plotly' (https://plotly.com/r/3d-line-plots/) was used to generate the 3-D line chart from the number of the twelve viral transcripts in the patient and animal datasets. The vertical (value) axis indicates the column height, and the horizontal (category) axis indicates the number of the twelve viral transcripts. The depth axis indicates the six patient and animal models.

IGV analysis
Integrative Genomics Viewer (IGV) v2.12.3 was used to analyze the coverage profiles of human and animal single-cell RNA sequences over the sgRNA-N gene. The Cell Ranger output (.bam) file, mapped against a customized genome reference file, was loaded for each human BALF and animal model dataset.

Supplementary Tables
Table S1: Clinical data information of the enrolled COVID-19 severe/critical patients

Supplementary Figures
Figure S1. Differential expression analysis comparing viral+ and viral− cells in various cell types from infected mouse lungs. (A-G) UMAP showing viral+ and viral− cells (Orf10) in the (A) T cell, (B) C1qa+ macrophage, (C) erythroid cell, (D) B cell, (E) monocyte, (F) alveolar macrophage, and (G) ciliated cell clusters. The absence of gene tables for alveolar macrophages and ciliated cells indicates that we did not find significant genes following DEG analysis of these clusters. The table shows significant genes from DEG analysis within the viral+ and viral− cells. The percentage indicated is the numeric fraction of viral+/− cells within the same cell type cluster.

Cd8+ T cell clusters. DEG analysis did not reveal any differences within the Cd8+ T cell cluster. The tables enumerate significant genes following DEG analysis within the viral+ and viral− cells. The percentage indicated is the numeric fraction of viral+/− cells within the same cell type cluster.

Figure S3. Differential expression analysis comparing viral+ and viral− cells of different types derived from human patients with severe COVID-19 and from infected AGM BALFs. (A-C) UMAP showing the viral+ and viral− cells (Human: N, AGM: Orf10) in (A) human neutrophil, (B) human APOC1+ macrophage, and (C) human T cell clusters. (D) UMAP plot displaying the major cell types in eight clusters for AGM BALF samples. (E-K) UMAP showing the viral+ and viral− cells in the (E) AGM APOC1+ macrophages, (F) AGM goblet cells II, (G) AGM B cells, (H) AGM endothelial cells, (I) AGM alveolar macrophages, (J) AGM T cells, and (K) AGM fibroblasts. The tables show significant genes from DEG analysis within the viral+ and viral− cells. The percentage indicated is the numeric fraction of viral+/− cells within the same cell type cluster.

Figure S4. DEG analysis of ferret and macaque BALF cells. (A) UMAP plot of identified cell populations in ferret integrated (COVID & control) BALF samples. Colors represent individual cell types and are described in the legend. (B) UMAP showing the viral+ and viral− cells in the ferret CD68+ & S100A12+ macrophage cluster, with no significant DEGs. (C) UMAP plot of identified cell clusters in macaque integrated (COVID & control) BALF samples. Ten individual cell types were profiled in this dataset. (D) UMAP showing the viral+ and viral− cells in the macaque MARCO+ macrophage cluster. Tables show that the viral gene ORF10 is DE; however, no host genes are DE between these groups.

Figure S5. Venn diagram comparing epithelial cell type sDEGs for human and three animal models. Numbers represent the sDEGs between viral+ and viral− epithelial cells for each species. Human gene nomenclature was used for generating the diagram. Ferret and Rhesus macaque data were excluded due to minimal viral loads.

Figure S6. Coverage profiles of human and animal single-cell RNA sequences over the sgRNA-N gene (203 base pairs (bp)). The annotation shown is for the 3'-UTR, elongated by 20 bp.

Figure S7. Analysis of sgRNA-N expression in the infected animals' lung and BALF. (A-E) Violin plots of expression in (A) mouse lung, (B) hamster lung, (C) AGM BALF, (D) ferret, and (E) macaque BALF. The plots represent the expression of viral sgRNA-N across the different cell types found within the infected data. (F-J) UMAPs showing differentially expressed genes between sgRNA-N+ and sgRNA-N− cells in the (F) mouse fibroblasts, (G) hamster Marco+ alveolar macrophages, (H) mouse C1qa+ macrophages, (I) mouse erythroid cells, and (J) hamster myocyte cells. The tables provide the significant genes upregulated in the sgRNA-N+ fraction. A gene is considered significant if it exhibits an adjusted P < 0.05 (P-value adjusted for multiple testing in the Wilcoxon rank-sum test). sgRNA-N = subgenomic RNA.

Figure S8. Examining the absence of sgRNA-N across different cell clusters described within infected animal lung and BALF samples. (A-H) UMAPs showing the expression of viral+ and sgRNA-N+ cells in the animal data: (A) mouse ciliated cells, (B) hamster C1qb+ macrophages, (C) hamster Cd8+ T cells, (D) hamster ciliated cells, (E) AGM alveolar macrophages, (F) AGM goblet cells, (G) AGM ciliated cells, and (H) AGM fibroblasts. A gene is considered significant if it has an adjusted P < 0.05 (P-value adjusted for multiple testing in the Wilcoxon rank-sum test).

Figure S9. Analysis of proinflammatory cytokines and the most abundant viral RNA, Orf10, across animal models. (A-C) UMAP plots displaying the major cell types found in the integrated (infected and control) datasets from (A) mouse, (B) hamster, and (C) AGM samples. Colors represent individual cell types and are described in the legend. (D-I) UMAPs showing the percentage of Il1b+ and Il18+ cells in the COVID-infected and control data. Tables show the upregulated genes in the Il1b+ and Il18+ COVID-infected fraction for each animal model: (D and E) mouse, (F and G) hamster, and (H and I) AGM. A gene is considered significant if it achieves an adjusted P < 0.05 (P-value adjusted for multiple testing in the Wilcoxon rank-sum test).

Figure S10. Expression of Il1b, Il18, and Cxcl10 and the most abundant viral RNA (Orf10) in the animal datasets. (A-C) UMAPs showing the percentage of Il1b+ and Il18+ cells in the COVID-infected and control data. Tables enumerate upregulated genes in the Il1b+ or Il18+ COVID-infected fraction for each model system: (A) ferret, (B and C) macaque. (D-H) UMAPs showing the expression of Cxcl10 in the (D) mouse, (E) hamster, (F) AGM, (G) ferret, and (H) macaque derived samples. The different panels indicate Cxcl10 expression in the individual sample for the infected and control data. (I-L) UMAPs showing the expression of the most abundant viral RNA, Orf10, in the (I) mouse, (J) hamster, (K) AGM, and (L) macaque data. Different panels are provided for infected and control data.

Figure S11. Expression of Cxcl10 and of the most abundant viral RNA (Orf10) is discordant. (A-J) Violin plots showing expression levels of Cxcl10 and Orf10 in different cell types (shown as different clusters) in the infected (A and B) mouse whole lung, (C and D) hamster whole lung, (E and F) AGM BALF, (G and H) ferret BALF, and (I and J) macaque BALF. Colors representing individual cell types are described in the legend.