Multimodal profiling of lung granulomas in macaques reveals cellular correlates of tuberculosis control

Summary Mycobacterium tuberculosis lung infection results in a complex multicellular structure: the granuloma. In some granulomas, immune activity promotes bacterial clearance, but in others, bacteria persist and grow. We identified correlates of bacterial control in cynomolgus macaque lung granulomas by co-registering longitudinal positron emission tomography and computed tomography imaging, single-cell RNA sequencing, and measures of bacterial clearance. Bacterial persistence occurred in granulomas enriched for mast, endothelial, fibroblast, and plasma cells, signaling amongst themselves via type 2 immunity and wound-healing pathways. Granulomas that drove bacterial control were characterized by cellular ecosystems enriched for type 1-type 17, stem-like, and cytotoxic T cells engaged in pro-inflammatory signaling networks involving diverse cell populations. Granulomas that arose later in infection displayed functional characteristics of restrictive granulomas and were more capable of killing Mtb. Our results define the complex multicellular ecosystems underlying (lack of) granuloma resolution and highlight host immune targets that can be leveraged to develop new vaccine and therapeutic strategies for TB.


In brief
Tuberculosis granulomas exhibit independent trajectories even within a single individual: some granulomas permit high bacterial growth, whereas others successfully resolve. Gideon et al. leveraged longitudinal PET-CT, bacterial burden assays, and scRNA-seq in cynomolgus macaques to define dynamic cellular phenotypes and circuits underlying granuloma persistence and immune control. Uncovered cell states and interactions may represent targets for host-directed therapies and vaccines.

INTRODUCTION
Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), remains a major global health threat (WHO, 2019). Mtb infection is characterized by the formation of granulomas predominantly in the lungs and lymph nodes (Flynn and Klein, 2011;Lin et al., 2014b;Russell et al., 2010;Ulrichs and Kaufmann, 2006). These spatially organized structures, composed of a mixture of immune and non-immune cells (Ehlers and Schaible, 2013;Flynn and Klein, 2011;Gideon et al., 2019;Lin et al., 2006;Mattila et al., 2013;Pagan and Ramakrishnan, 2014;Phuah et al., 2012;Reece and Kaufmann, 2012;Ulrichs and Kaufmann, 2006), are key sites of host-pathogen interactions that can either restrict or facilitate bacterial survival. Delineating protective responses in humans has been challenging given the limited accessibility of affected lung tissue and difficulty determining the true extent of bacterial control. The cynomolgus macaque model of Mtb infection recapitulates the diversity of human outcomes and granuloma pathologies and enables detailed studies of the features of immunologic success and failure in Mtb granulomas (Canetti, 1955;Flynn and Klein, 2011;Lin et al., 2006).
A spectrum of granuloma types, organization, and cellular composition has been described in both humans and non-human primates (NHPs) (Canetti, 1955;Flynn and Klein, 2011;Hunter, 2011Hunter, , 2016Lin et al., 2006). The bacterial burden in individual granulomas is highest early in infection and then decreases due to increased bacterial killing as the immune response matures, even in macaques that ultimately develop active TB (Cadena et al., 2016;Lin et al., 2014b;Maiello et al., 2018). Strikingly, however, individual granulomas within a single host follow independent trajectories with respect to inflammation, cellular composition, reactivation risk, and ability to kill Mtb Gideon et al., 2015;Lenaerts et al., 2015;Lin et al., 2013Lin et al., , 2014bMalherbe et al., 2016;Martin et al., 2017). We and others have profiled immune responses among individual cell types in macaque lung granulomas, including those of T cells (Diedrich et al., 2020;Foreman et al., 2016;Gideon et al., 2015;Lin et al., 2012;Mattila et al., 2011;Wong et al., 2018), macrophages (Mattila et al., 2013), B cells (Phuah et al., 2012(Phuah et al., , 2016, and neutrophils (Gideon et al., 2019;Mattila et al., 2015) and have also examined the instructive roles of cytokines, including interferon (IFN)-g, interleukin (IL)-2, tumor necrosis factor (TNF), IL-17, and IL-10 (Gideon et al., 2015;Lin et al., 2010;Wong et al., 2020). Although these analyses have led to insights into how specific canonical cell types and effector molecules relate to bacterial burden, they have not yet revealed how the integrated actions of diverse cell types within individual granulomas influence control.
High-throughput single-cell genomic profiling methods afford new opportunities to define the cell types, phenotypic states, and intercellular circuits that comprise granulomas and inform their dynamics (Prakadan et al., 2017). Here, we developed and applied a multifactorial profiling pipeline-integrating longitudinal positron emission tomography and computed tomography (PET-CT) imaging, single-cell RNA sequencing (scRNA-seq), and molecular measures of bacterial killing with immunohistochemistry and flow cytometry-to identify features of TB lung granulomas that correlate with bacterial clearance in cynomolgus macaques. We defined the cellular compositions and cell-cell signaling networks associated with bacterial persistence or control. Collectively, our data define the cellular ecosystems within TB lung granulomas in which Mtb is controlled or alternatively survives and multiplies, uncovering therapeutic and prophylactic targets for future investigation.

RESULTS
Profiling longitudinal TB granuloma dynamics, bacterial burden, and bacterial killing We sought to define the complex cellular ecosystems of granulomas that manifest different degrees of bacterial control in NHPs. Four cynomolgus macaques were infected with a low dose of Mtb (<10 CFU; Erdman strain) and followed for 10 weeks ( Figure 1A). Ten weeks post-infection (p.i.) was chosen as a pivotal time point at which bacterial killing could be identified in some but not all granulomas during the course of immune activation and mobilization, even in macaques that would eventually progress to active TB (Figures S1A-S1C). Progression of Mtb infection and individual granuloma dynamics were monitored at 4, 8, and 10 weeks p.i. by using PET-CT imaging of FDG avidity as a proxy for inflammation (Figures S1D and S1E; Table S1) White et al., 2017). At necropsy, individual PET-CT identified lung granulomas were excised and dissociated to obtain a single-cell suspension; viable bacterial burden (CFU, colony forming units-i.e., culturable live bacterial burden) and cumulative (live + dead) bacterial load (chromosomal equivalents, CEQ) were measured to define the extent of bacterial growth and killing in each granuloma Munoz-Elias et al., 2005).
Twenty-six granulomas from four animals were randomly selected at the time of necropsy 10 weeks p.i. for scRNA-seq analysis. Among them, there was a range of granuloma-level bacterial burdens, from sterile (0 CFU/granuloma) to high (4.6 log 10 CFU/granuloma) (Figures 1B and 1C; Table S1). We binned the granulomas by bacterial burden (low, n = 13; high, n = 13). There was a significant difference in CFU between low and high CFU granulomas (median 2.2 [low] vs 3.6 [high] log 10 CFU/granuloma, p < 0.0001, Mann-Whitney U [MWU] test) (Figure 1C). To determine whether low CFU reflected reduced bacterial growth or increased bacterial killing, we assessed the total number of bacterial genomes (CEQ), because we have previously shown that the genomes of dead bacteria are not readily cleared and that CEQ provides a measure of cumulative bacterial load (Munoz-Elias et al., 2005). There was not a significant difference in CEQ values between low-and high-burden granulomas, although there was a trend toward higher CEQ in highburden lesions ( Figure 1D). However, the extent of bacterial killing, calculated as the ratio of CFU to CEQ, was significantly higher in the low-bacterial-burden granulomas (p = 0.03, MWU test) ( Figure 1E), indicating that the lower CFU largely reflected greater killing rather than more limited bacterial growth.
We then sought to identify granuloma features correlated with the degree of bacterial control. Post hoc analysis of serial PET-CT imaging data revealed a strong association between the apparent timing of lesion formation and the extent of bacterial control. All high-bacterial-burden granulomas were detected at the four-week scan, whereas most (11/13) low-bacterial-burden granulomas were first detected at the final pre-necropsy scan (10 weeks) ( Figures 1F, 1G, and S1E). Consistent with these   data, we further evaluated bacterial burden between early-and late-appearing granulomas in 10 additional animals at 10 weeks p.i (Figures S1F and S1G) and again found that the median CFU/ granuloma per animal was significantly lower in late granulomas than in early ones (p < 0.0001, Student's t test). We considered the model that late lesions have lower CFU because the bacterial population had simply not had sufficient time to expand. However, the CFU/CEQ analysis was most consistent with greater bacterial killing in late-appearing granulomas (À2.1 log 10 CFU/CEQ per granuloma) as compared to that in early-appearing ones (À1.2 log 10 CFU/CEQ per granuloma, p = 0.01, MWU test) ( Figure 1I). Late-appearing granulomas could be due to differences in the timing of lesion formation, most likely due to a dissemination event from an early granuloma, such that granuloma development occurs in the context of an activated immune response, which we have previously shown to confer significant protection against reinfection . Alternatively, we considered the possibility that differences in inflammatory-response characteristics, and specifically the extent of necrosis, might make some granulomas both detectable by PET-CT before others and associated with higher bacterial burdens. Therefore, we reviewed the histopathology from 87 historical granuloma samples from 16 cynomolgus macaques at 10-12 weeks p.i. but found no association between necrosis and time of granuloma detection (p = 0.72, Fisher exact test; Figure 1J), suggesting that bacterial control in early and late granulomas is a result of more complex factors than necrosis alone.

Cellular composition of TB lung granulomas
We next sought to identify cellular and molecular factors associated with increased Mtb killing in an unbiased fashion through scRNA-seq (STAR Methods) Macosko et al., 2015;Young and Behjati, 2018;Lun et al., 2019;McGinnis et al., 2019;Wolf et al., 2019). Among the 10-week granulomas, we analyzed 109,584 cells, resolving 13 general cell types (Figures 2A, 2B, and S2A-S2G; Table S2; STAR Methods) A C B Figure 2. Analysis of scRNA-seq of tuberculosis lung granulomas (A) Uniform manifold approximation and projection (UMAP) plot of 109,584 cells from 26 granulomas colored by identities of 13 generic cell types. (B) Expression levels of cluster-defining genes. Color intensity corresponds to the level of gene expression, whereas the size of dots represents the percent of cells with non-zero expression in each cluster. (C) Significant correlations between proportion of canonical cell types with bacterial burden of individual granulomas (log 10 CFU per granuloma) using nonparametric Spearman's rho correlation test with Benjamini-Hochberg multiple testing correction. Color indicates binned granuloma bacterial burden. See also Figures S2, S3, and S5; Table S2. (Tabula Muris Consortium et al., 2018;Han et al., 2018;Liberzon et al., 2011;Lopez et al., 2017;Varemo et al., 2013;Guo et al., 2018;Zilionis et al., 2019). These encompass groups of lymphocytes, including B cells, T and NK cells (T/NK), and plasma cells; myeloid cells, including conventional dendritic cells (cDCs), plasmacytoid dendritic cells (pDCs), and macrophages; mast cells; neutrophils; erythroid cells; stromal cells, including endothelial cells and fibroblasts; type 1 pneumocytes; and type 2 pneumocytes (Figures 2A, 2B, and S2G; Table S2). For each of these 13 cell types, we also performed further within cell-type subclustering; in these analyses, we only detected substructure among the T/NK and macrophage clusters (detailed below, STAR Methods).
Cell types associated with timing of granuloma formation and control To investigate the relationship between cell type composition and bacterial burden, we quantified the correlation between cellular frequency and CFU across all granulomas. Our data revealed multiple cell types that were significantly enriched in high-burden (early-appearing) granulomas, including plasma cells (relative cell abundance vs CFU, q = 0.00021, non-parametric Spearman's rho correlation test with Benjamini-Hochberg multiple testing correction), mast cells (q = 0.016), endothelial cells (q = 0.0087), and fibroblasts (q = 0.036) ( Figure 2C; Table S3). By contrast, T/NK cells were more abundant in low-burden (late-appearing) granulomas (q = 0.023) ( Figure 2C; Table S3). Cynomolgus macaques are variable in their infection outcomes ( Figure 1B), so to control for inter-subject variability, each of the cellular associations between granuloma dynamics and bacterial control was examined both (1) across all animals and lesions and (2) through a directed analysis of the granulomas from a single NHP host (4017) ( Figure S2H). We found similar trends in bulk RNA-sequencing data of 12 additional granulomas (six high-CFU [early] and six low-CFU [late] granulomas) from separate macaques ( Figure S3A) (Newman et al., 2015). To account for compositional dependencies between the cell types comprising each granuloma, we also conducted a multivariate Dirichlet regression analysis, which explicitly considers how shifts in the abundance of one cell type affect the relative proportions of the others present (Smillie et al., 2019). In this framework, T/NK cells were also significantly associated with low burden (Dirichlet p = 3.3 3 10 À5 ), and mast cells and plasma cells significantly associated with high burden (Dirichlet p = 0.025 & p = 0.021, respectively). We chose to prioritize cell types for further investigation based on concordance across statistical testing frameworks.
High-bacterial-burden granulomas are characterized by fibrosis and type 2 immune features To validate our mast cell observations, we performed immunohistochemistry on NHP and human granuloma sections by using tryptase and C-kit/CD117 markers ( Figures S3B-S3E) . This revealed the presence of mast cells within both NHP and human granulomas and that they primarily localize to the outer regions of NHP granulomas, including the lymphocyte cuff ( Figure S3D) and could be found within and around human granulomas ( Figure S3E) (Garcia-Rodriguez et al., 2017). In our data, mast cells were distinguished by their expression of IL4 and IL13 ( Figure S3B), which we also recently observed in a study of human nasal polyposis, a type 2 inflammatory disease associated with far-reaching epithelial remodeling (Ordovas-Montanes et al., 2018). This association between mast cells and fibrosis is further supported by a study on the spatial structure of human TB granulomas, which found a class of local signaling neighborhoods characterized by elevated proportions of mast and endothelial cells and speculated about an association with tissue repair (McCaffrey et al., 2022). Mast cells were also marked by expression of ALOX5A and ALOX5AP, which encode the system to synthesize the anti-inflammatory lipoxin LXA 4 ; the balance between LXA 4 and the pro-inflammatory lipoxin LTB 4 has been strongly implicated in the progression of TB disease in humans (Tobin et al., 2010(Tobin et al., , 2012. Plasma cells were also abundant in high-burden lesions, consistent with previous findings (Jacobs et al., 2016;Phuah et al., 2012). Recruitment of mast cells can be characteristic of allergic type 2 immune responses mediated by IgE (Kanagaratham et al., 2020), but mast cell function is also regulated by IgG, which is much more abundant in the circulation and tissues. Among the plasma cells in our scRNA-seq dataset, the vast majority expressed either IGHG or IGHA (Collins and Jackson, 2013) constant chains ( Figures S3B and S3C), suggesting that IgG and IgA were the dominant antibody classes. Taken together, these data suggested that granulomas with failed bacterial clearance are characterized by a type 2 immune environment, but the antibody features were not consistent with a canonical allergic response.
T and NK functional subclusters as mediators of protection Of the 13 broad cell types, only the T/NK cell subcluster was associated with more robust bacterial control in granulomas (q = 0.023; Dirichlet p = 3.3 3 10 À5 ; Figure 2C). Previously, we showed that $90% of T cells in granulomas are tissue localized, with only $10% immigrating to the tissue from blood over a 24-h period; $95% of the tissue-localized cells exhibit a tissue-resident memory phenotype (Potter et al., 2021). To further assess functional diversity within the T and NK cell cluster and association with bacterial burden, we performed additional sub-clustering analyses. This revealed 13 T/NK cell subclusters which we annotated based upon expression of the following: lineagedefining markers; known cytotoxic, regulatory, and proliferation genes (Figures 3A-3C and S4; Table S4); and TCR constant gene (TRAC, TRBC, and TRDC) expression ( Figure 3B). The process of annotation revealed that most subclusters did not correspond neatly to canonical T and NK cell subsets, consistent with recent studies in other systems (Rath et al., 2020). Accordingly, we annotated each subset based on distinguishing functional patterns of gene expression by using known T cell markers and literature-derived genes of interest where possible, as opposed to ontological classification based on pre-structured developmental relationships. These genes were parts of broader transcriptional signatures that appeared to reflect dominant cellular response states superimposed on cell-lineage-associated gene-expression programs. Among the 13 T/NK cell subclusters, two were significantly negatively associated with bacterial burden (with another four trending toward significance with q<0.1) ( Figure 3D; Table S3).
A prominent role for type 1-type 17 T cells in bacterial control One T/NK cell subcluster represented the most abundant cell type identified across all granulomas (8.8%) (Table S4), and the strongest correlate with bacterial control (q = 0.016; Dirichlet p = 3.3 3 10 À9 ) ( Figure 3D; Table S3). This subcluster, which we designated type 1-type 17 (T1-T17) ( Figure 3C), was enriched for expression of classical Th1-associated genes, including IFNG and TNF (Raphael et al., 2015), as well as transcription factors associated with Th17 differentiation    (C) Expression levels of T/NK cell cluster-defining genes. Color intensity corresponds to the level of gene expression and the size of dots represents the percent of cells with non-zero expression in each cluster. (D) Significant correlations between proportion of T/NK subclusters with bacterial burden of individual granulomas (log 10 CFU per granuloma) using nonparametric Spearman's rho correlation test with Benjamini-Hochberg multiple testing correction. See also Figure S4; Tables S2, S3, and S4.   Lin et al., 2014aLin et al., , 2016. Although we also detected additional features of T17 cells, including CCR6 (Hirota et al., 2007) and IL23R (Kobayashi et al., 2008), we did not observe expression of either IL17A or IL17F ( Figure 4A; Table S4), which was consistent with our published flow-cytometry data demonstrating minimal IL-17 production from granuloma T cells Wong et al., 2020). T1-T17 cells in our dataset were double positive for CXCR3 and CCR6 ( Figure 4A), consistent with markers for Th1* or ex-Th17 cells, which are believed to be precursors to tissue-resident memory cells. Multiple prior studies have reported a CXCR3+CCR6+ Th1/Th17 subset that contributes to Mtb-specific T cell responses (Becattini et al., 2015;Nikitina et al., 2018) and is capable of producing IFNg, IL-17, and IL-22 after stimulation, but not Th2-biased IL-4, IL-5, or IL-13 (Acosta-Rodriguez et al., 2007;Becattini et al., 2015;Lindestam Arlehamn et al., 2013;Mahnke et al., 2013). These CXCR3+CCR6+ Th1/Th17 T cells have also been demonstrated to express Th1-associated TBX21 (encoding T-bet) and Th17-associated RORC (encoding RORgt), but not Th2-associated GATA3 (Acosta- Rodriguez et al., 2007;Becattini et al., 2015), supporting the existence of the T1-T17 cell type. Further supporting a hybrid cell state, we independently confirmed the presence of a subset of granuloma T cells expressing both T-bet and RORa by flow cytometry (Grant et al., 2022). Notably, although Th1* and ex-Th17 subsets are described primarily as CD4 T cells (Darrah et al., 2020;Gideon et al., 2015;Lyadova and Panteleev, 2015;Mpande et al., 2018), our T1-T17 sub-cluster was characterized by the expression of both CD4 and CD8A/ B transcripts (Figures 3C, 4A-4C, S4D, and S4E), suggesting that this phenotype is not an identity program but context dependent, consistent with findings in other systems (Lee et al., 2021).

Article
To better resolve the identities of the cells in this cluster, we further sub-clustered the T1-T17 cells. This revealed four distinct subpopulations, each of which expressed T1-T17 cluster markers (RORA, RORC, IL23R, and BHLHE40) but were further distinguished by markers of cell type and state ( Figure 4B; Table S4): T1-T17 subpopulation 1 was distinguished by expression of CD4 and markers of activation and motility, including IL7R, CD6, TXNIP, PDE4D, ZFP36L2, ITGB1, CCR6, and CXCR3 (Figures 4B-4D; Table S4), making it most akin to ex-Th17 cells (Amezcua Vesely et al., 2019;Nikitina et al., 2018); T1-T17 subpopulation 2 was characterized by increased relative expression of both CD8A and CD8B and cytotoxic effector molecules; T1-T17 subpopulation 3, which includes cells expressing either CD8A/B or CD4, was characterized by cytokine gene expression (IFNG, TNF, LTA, and LTB) and markers of an inhibitory cell state (CTLA4, GADD45B, and SLA); and T1-T17 subpopulation 4 was very low in abundance and characterized by heat shock and DNA damage associated transcripts (DNAJB1 and HSPH1). In a univariate analysis, there was a trend toward negative association between bacterial burden and higher abundance of T1-T17 subpopulation 1 (q = 0.074) and a significant negative association between bacterial burden and abundance of T1-T17 subpopulation 2 (q = 0.043); both of these were significantly associated with low burden in the multivariate Dirichlet regression analysis (Dirichlet p = 0.001 & p = 0.018, respectively). T1-T17 subpopulation 3, however, was not correlated with bacterial burden, despite elevated expression of IFNG and TNF ( Figure 4E; Table S4)-cytokines generally considered as critical mediators of control in Mtb infection (O'Garra et al., 2013;Scriba et al., 2017).
CD4 and CD8 subclusters associated with low bacterial burden Among the remaining 12 T/NK cell subclusters, six were enriched for both CD4 and CD8 expression (Figures 3A-D, S4D, and S4E; Table S4). The most abundant subcluster (8.3% of granuloma cells, q = 0.074, Dirichlet p = 0.00049; Figure 3D; Tables S3 and S4) exhibited elevated expression of markers of naive and memory T cells (TCF7, CCR7, IL7R, and TXNIP) and activation or memory state (CD69 and ITGB1) ( Figure 3C; Table S4). When we conducted further subclustering of this population to evaluate the potential presence of separate naive or memory T cell states, we instead found that these markers were expressed homogenously and overlapped throughout the subcluster ( Figure S4F). As such, these cells could represent a ''stem-like'' population of T cells, which has been described as an early differentiating memory phenotype, distinct from naive T cells, that are long lived and possess a distinguishing ability to proliferate and self-renew (Ahmed et al., 2016;Caccamo et al., 2018;Gattinoni et al., 2011). Further targeted experimentation will be required to establish true stem capacity for these cells in tissue.
The remaining two CD4/CD8 subclusters were not associated with bacterial control by either statistical framework. One was (E) Significant correlations between proportion of T1-T17 subcluster and subpopulations with bacterial burden of individual granulomas (log 10 CFU per granuloma) using non-parametric Spearman's rho correlation test with Benjamini-Hochberg multiple testing correction. See also Figure S4; Tables S3 and S4. regulatory T cells (1.2%), defined by elevated expression of canonical Treg markers (FOXP3, CTLA4, TIGIT, and IL1RL1) and GATA3, a Th2 lineage-defining transcription factor that has been observed in a subset of tissue-resident Tregs (Figures 3C  and 3D; Table S4) (Wohlfert et al., 2011). Of note, although CTLA-4 was highly expressed by regulatory T cells, the inhibitory receptor PD-1 (PDCD1) was only sparsely detected in our data set, concordant with recent work (McCaffrey et al., 2022;Wong et al., 2018). The final subcluster was interferon-responsive T cells (0.4%), which were enriched for type-1-interferon-inducible molecules (Szabo et al., 2019) (Figures 3C and 3D, Table S4).
Bacterial control is associated with a specific cytotoxic T cell population The remaining six T/NK subclusters were broadly defined by expression of CD8A and/or CD8B and cytotoxic genes, including granzymes, granulysin, and/or perforin (designated cytotoxic 1-6, Figure 3C; Table S4). We confirmed expression of multiple granzymes among CD8 ab T cells in Mtb granulomas by flow cytometry ( Figure S8).
The remaining five cytotoxic subclusters did not associate with bacterial control by either statistical framework. Cytotoxic subclusters C1-3 were enriched for the expression of CD8A but not CD8B and elevated TCRD, implying that these cells possessed innate cytotoxic function (Figures 3B and 3C). C5, which expressed CD8A and CD8B, was distinguished by elevated expression of GZMK ( Figure 3C), which has been recently described as a hallmark of immune dysfunction in inflammation (Mogilenko et al., 2021).
The functional complexity of these six subclusters, along with the common and distinct responses they represent, suggests a significant and underappreciated role for cytotoxic cells in TB granulomas.

Macrophage heterogeneity in Mtb granulomas
Although macrophages are responsible for much of the bacterial killing within granulomas, we did not observe any association between overall macrophage abundance and bacterial burden (Figures 2 and S5). Yet, like the T/NK cell cluster, the macrophage cluster had discernable substructure based on unbiased gene-expression analyses. Among the 27,670 macrophages, we identified nine subclusters (Table S4). The only cluster independently associated with bacterial control was Mac4, a subpopulation of macrophages enriched in high-burden lesions (q = 1.6 3 10 À5 , Dirichlet p = 0.12; Figure S5E; Table S4). Upregulated genes in Mac4 included known interferon-response genes (NFKBIA, IFI27, IFI30), as well as more general pro-inflammatory processes (IL1B, CXCL8, LYZ) and complement activation (C1QA, C1QB, C1QC) (Figures S6A and S6B), consistent with the ''macrophage IFN'' phenotype described by Esaulova et al. as associated with poor bacterial control . Mac5 and Mac3, meanwhile, were the populations that most strongly expressed genes that have been described as characteristic of epithelioid macrophages in zebrafish granulomas (q = 1.67 3 10 À7 and q = 9.17 3 10 À6 , respectively; Figures S6C-S6E) (Cronan et al., 2021). Mac5 was statistically significantly associated with high burden via multivariate Dirichlet regression analysis (Dirichlet p = 0.034), but not via univariate correlations with CFU (q = 0.31; Figure S9E).

Defining trajectories of bacterial burden and granuloma phenotype
To further understand the temporal emergence of variations in bacterial burden and granuloma states, we evaluated how cellular identities and compositions track with time. Here, we leveraged a scRNA-seq discovery dataset from six granulomas isolated at four weeks p.i. from two separate macaques ( Figures 5A-5C). Four weeks is the earliest timepoint at which we can reliably identify granulomas by imaging; these lesions are by definition early appearing and thus likely to be high burden at 10 weeks p.i. However, they were captured at an earlier point in their development and therefore might be considered more analogous to late-appearing lesions at the 10-week timepoint (i.e., those first detected four weeks prior). We defined cell-type-specific ''burden-associated gene sets'' based on differentially expressed genes between 10-week p.i. high-vs. low-burden granulomas. Scoring four-week p.i. granuloma cells for these gene sets demonstrated that the T cell and macrophage phenotypes were more concordant with the early, high-burden lesions at 10 weeks than the later-appearing, more restrictive lesions ( Figures 5D and 5E).
These data suggest a measure of stability in the cellular microenvironment between four and 10 weeks in early-appearing granulomas. They further indicate that the differences between high-and low-burden granulomas at 10 weeks do not simply reflect lesions at different stages in the same maturation continuum. Instead, they suggest that late-appearing, lowburden granulomas reflect a different path. We propose that late-appearing granulomas develop in the context of an emerging adaptive immune response, can recruit adaptive T cells quickly, and are better able to kill Mtb. This model is consistent with our published work showing robust clearance of Mtb in a reinfection model . However, we also acknowledge the potential for bacterial burden to shape granuloma phenotype, with burden and multicellular microenvironment each having the capacity to influence one another, potentially in a self-reinforcing manner.
Cellular ecology of pulmonary TB granulomas Given demonstrable differences in cellular composition across the bacterial burden spectrum, we wondered whether specific cell types co-occur in TB lung granulomas to collectively influence control. By using hierarchical clustering of pairwise correlations between cell type frequencies, we defined five groups of cell types whose collective abundances were associated across granulomas ( Figure 6A; Table S5). Of these, group 2 (shown in red), which included mast cells, plasma cells, Mac4, and certain (D) Expression levels of macrophage burden-associated gene set, defined by using genes differentially expressed between macrophages in 10-week-p.i. highburden and 10-week-p.i. low-burden granulomas; boxplot with median, interquartile range, and whiskers extending a maximum of 1.5*IQR; MWU test with Benjamini-Hochberg correction for multiple hypothesis testing.
Distinct cellular ecosystems associate with granulomalevel bacterial burden To further explore how specific cellular compositions might underpin differential bacterial control, we examined putative cell-cell interactions within each granuloma (STAR Methods).
High-bacterial-burden lesions were dominated by signals sent by group 2 cell types (i.e., mast, fibroblast, endothelial, plasma, type 1 pneumocyte, and Mac4); these cell types displayed the highest counts of high-burden-linked interactions as well as those most strengthened in high-burden granulomas (p < 2.2 3 10 À16 , binomial test) ( Figures 6C and 6D). In contrast, interactions in low-burden granulomas more evenly involved groups 1, 3, 4, and 5, with group 3 showing the strongest enrichment for signaling activity strengthened in low-burden granulomas (p < 2.2 3 10 À16 ) ( Figures 6E and 6F). We further examined shifts in intercellular interaction network topology by quantifying the sender and receiver activity associated with different degrees of bacterial burden. In high-burden granulomas, group 2 cell types were the key source of intercellular signals ( Figure 6G), with strong intra-group 2 signaling. This suggests that highburden lesions are driven by self-reinforcing interactions amongst group 2 cell types (e.g., between mast cells, plasma cells, fibroblasts, and endothelial cells). In contrast, in lowburden granulomas, we found only sparse contributions from group 2 cell types ( Figure 6H); instead, low-burden granulomas were characterized by a more even distribution of signals stemming and terminating in groups 1, 3, 4, and 5 cell types, suggestive of a coordinated immune response involving multiple cellular subsets. We next examined which specific axes of intercellular communication and, among whom, which were associated with varying bacterial control. Among the ligands whose interactions are most strengthened in high-burden granulomas, we identified genes implicated in fibrosis (e.g, FGF1, PDGFB, CTGF, FGF7, IL34), vascular remodeling (VEGFB, VEGFC, ANGPTL4) and TGFb signaling (TGFB2, TGFB3, BMP6), suggestive of a wound-healing response ( Figure 6I) (Joshi et al., 2020;Padela et al., 2008). In addition, we observed evidence of intercellular communication via genes implicated in type 2 immunity (CCL11, CCL13, CD5L, IL4, IL5, IL13, IL24) and allergy-linked inflammation (CCL19) (Nakano et al., 2019). These specific ligands were largely produced and received by group 2 cell types (with only sparse contributions from groups 3-5). Collectively, this supports a model where intra-group 2 signaling drives a self-reinforcing high-burden microenvironment via wound-healing-like responses and associated type 2 immune activity (Figure 6J). This interpretation was further supported by enrichment of pathways such as TGFb, WNT, and FGF signaling, as well as organogenesis, epithelial/endothelial proliferation, and tissueremodeling processes ( Figure S6F); is consistent with prior descriptions of angiogenesis and fibrosis in human granulomas (McCaffrey et al., 2022;Polena et al., 2016); and supports the targeting of vasculature as a therapeutic direction for tuberculosis (Datta et al., 2015;Oehlers et al., 2015).
Our cell-cell interaction analyses also indicated contextdependent roles for certain cell types and ligands. For instance, the macrophage-dominated group 1 was not statistically correlated with granuloma control in our compositional analyses (Figure 6A) but participated in the second-most interactions in both high-and low-burden granulomas ( Figures 6B, 6C, 6E, 6G, and 6H). The idea of dual roles for group 1 cells was borne out by examination of the ligands produced by group 1 cell types in high-(PDGFB, CD5L, TNFSF13) and low-burden (CXCL9/10/11, CD86, IL18, CCL20) microenvironments ( Figures 6I and 6J). Similarly, some individual ligands participated in interactions in both high-and low-burden granulomas, suggesting pleiotropic  Figure S6; Table S5. effects. As one specific example, IL-1's effects on Mtb vary based on disease stage and model (Juffermans et al., 2000;Law et al., 1996;Mayer-Barber et al., 2014;Mishra et al., 2013;Zhang et al., 2014). Based on our analyses, IL1A and IL1B each mediated interactions associated with both high and low bacterial burden but were derived from different sender cell populations in the two instances. Thus, our intercellular interaction analyses uncover axes of cellular plasticity and ligand pleiotropy across granuloma microenvironments, important for improved understanding and therapeutic modulation of Mtb (Keshavjee and Farmer, 2012).

DISCUSSION
Within an individual with Mtb infection, distinct granulomas can achieve sterilizing immunity, immune standoff, or frank immune failure (Flynn, 2006;Flynn and Klein, 2011;Lin et al., 2009Lin et al., , 2014b. In NHPs, which most closely recapitulate human Mtb infection and disease (Coleman et al., 2014a), this heterogeneity provides an opportunity to define the cellular and molecular factors that correlate with bacterial control to identify potential hostdirected prevention and cure strategies for TB. Here, our data substantiate a model where the state of the surrounding host cellular ecosystem helps inform a granuloma's infection trajectory, leading to long-term, stable states which either permit or restrict bacterial survival.
To exemplify the links between cellular composition, gene expression, intercellular interaction patterns, and bacterial burden, we highlight mast cells: most abundant in high-burden lesions, mast cells were major producers of type 2 cytokines, especially IL4, IL5, and IL13, which are down-modulators of lymphocyte and macrophage antimicrobial activity, including inhibiting the cytolytic functions of CD8 + T cells (Kienzle et al., 2005;Wijesundara et al., 2013). However, IL-4 and IL-13 have broader functions in the context of wound healing. Indeed, the cellular interactions in high-burden granulomas revealed both specific signaling molecules (e.g., FGF1 from type 1 pneumocytes, PDGFB from endothelial cells, ANGPTL4 from plasma and mast cells, among others) and broad pathways that reflected fibrosis, metabolic remodeling, and angiogenesis. Collectively, these data suggest a cascade of interactions in early-appearing granulomas with failed control, whereby an initially permissive environment is reinforced by a tissue-remodeling response that seeks to limit and wall off pathologic activity. Although more detailed studies on the roles of wound-healing responses and tissue remodeling in TB are indicated, these features could represent critical targets for host-directed therapies that need to not only enhance restrictive adaptive immune responses but also address the maladaptive features of microenvironments permissive to granuloma persistence (Ahidjo et al., 2016).
The strongest cellular correlate of bacterial control was a subcluster of cells with transcriptional features of both type 1 and type 17 T cells that was expanded in granulomas with bacterial control. Previous studies have revealed a prominent role for  Garra et al., 2013;Scriba et al., 2017), and studies in NHP granulomas suggest an association between T1 and T17 cytokine expression and bacterial burden . In addition, in murine models, BHLHE40 is required for control of Mtb infection, as a repressor of IL-10 production (Huynh et al., 2018). Aspects of these data are consistent with recent observations that granulomas established in immune-primed environments-e.g., existing Mtb infection  or intravenous or intrabronchial BCG vaccination-are characterized by Th1/17 expression patterns that are associated with protection (Darrah et al., 2020;Dijkman et al., 2019); however, we extend these findings, defining appreciable substructure among the T1-T17 subcluster of relevance to control. The CD4 T1-T17 subpopulation (subpopulation 1) is most consistent with published descriptions of Th1/17 cells (e.g., Th1* or ex-Th17)  Mahnke et al., 2013;Nikitina et al., 2018). These cells could represent precursors to long-lived tissue memory, which has been shown to play a crucial protective role in autoimmunity, bacterial control, and memory immune responses to pathogens (Amezcua Vesely et al., 2019;Liang et al., 2015;van Hamburg and Tas, 2018;Wacleche et al., 2016), including Mtb infection. A recent study using flow cytometry and immunohistochemistry in Mtb-infected rhesus macaques supports an association of Th1 (IFNg+) and Th17 (IL-17+) cells in lung tissue with latent infection (Shanmugasundaram et al., 2020); in contrast, another study using scRNA-seq reported activated CD4 and CD8 T cells including Th1 and Th17 in the lung tissue of macaques with pulmonary TB . The CD8 subsets within the T1-T17 subcluster (subpopulations 2 and 3), meanwhile, have not been described previously. The former of these was strongly associated with bacterial control and could represent an immunologic paradigm that can be exploited for vaccine development.
Our data also revealed a CD4-and CD8-expressing T cell subcluster associated with low-burden granulomas that resembles stem-like T cells (Ahmed et al., 2016;Caccamo et al., 2018;Cartwright et al., 2016;Fuertes Marraco et al., 2015;Gattinoni et al., 2011;Mateus et al., 2015;Todryk, 2018). We hypothesize that these cells could be a source of T cell renewal in granulomas and could differentiate into the various functional subsets we observe within them. It is possible, however, that these represent memory T cells that are not specific for Mtb antigens but migrate to the granuloma in response to inflammation and/or chemokine gradients. Indeed, flow-cytometry-based studies support that a majority of T cells in granulomas do not respond to Mtb antigens by making cytokines and do not display hallmarks of exhaustion Sakai et al., 2016;Wong et al., 2018).
Although both CD4 and CD8 T cells have been implicated in control of Mtb infection, the cytotoxic function of lymphocytes in Mtb infection has been relatively understudied. However, we also found previously unappreciated complexity among granuloma cytotoxic cells of relevance to bacterial control. Of these, cytotoxic subcluster 4, which was enriched for CD8 ab T cells and defined by expression of several granzymes and perforin, likely represents cytotoxic effector T cells that target infected cells and is associated with low-burden granulomas. Our findings contrast with those in model systems like mice, which notably do not have the capacity to sterilize sites of infection and whose CD8 T cells also do not express granulysin (Hojo-Souza et al., 2020). However, our findings are consistent with a recent study on lung tissue from Mtb-infected macaques which also found evidence of cytotoxic molecule expression associated with controlled infection .
Our analyses not only revealed sets of biological pathways utilized in the host cells of high-vs. low-burden granulomas but also assigned roles to the specific cell types that drive these signaling patterns. In particular, the strong internal signaling among group 2 cell types and comparatively weaker cross-talk to other groups in early lesions could drive establishment of a cellular ecosystem dominated by type 2 immune and wound-healing responses that preclude effective T cell engagement and conversion to a more restrictive state. By comparison, in late-appearing lesions, primed T cell populations, in concert with different innate populations, could use a variety of pro-inflammatory and pro-activation interactions to control Mtb growth or dissemination; a similar phenomenon might explain how infection with Mtb can protect against subsequent reinfection  even in the presence of ongoing original infection, by locally recruiting adaptive responses that can act before self-reinforcing group 2 responses work to limit pathology.
We note that the contrasting microenvironments revealed through our analyses can occur within the same individual. Knowledge of intercellular networks underlying granuloma stability will spur future research efforts to identify and manipulate linchpins that serve as key nodes in limiting or enhancing the efficacy of therapeutic and prophylactic measures. For instance, there might be a potential therapeutic role for IL-15 super-agonists in clinical development that can drive expansion of cytotoxic populations (Fujii et al., 2018;Knudson et al., 2019). We also found strong enrichment for the expression of distinct neuro-hormonal modulators by group 2 (NRG1, RLN3, NTS) and group 3 cells (UCN3), as well as associations with transcriptional targets of sex hormones. Ligands and receptors implicated in lowburden interactions were enriched for targets of several neuromodulatory agents, including buprenorphine and fluoxetine, where serotonin reuptake inhibitors have already been identified in screens for host-acting compounds that improve macrophage control of Mtb, supporting potential for their further investigation (Heemskerk et al., 2021;Stanley et al., 2014).
In summary, our scRNA-seq investigation revealed cellular and molecular features that dynamically associate with natural control of Mtb in pulmonary granulomas. Interactive visualizations of all scRNA-seq data and associated metadata are hosted through the Broad Single-Cell Portal for further exploration and re-analyses (see data and code availability). Beyond recapitulating canonical correlates, our analysis defined nuanced, actionable, innate as well as adaptive functional cell states and shed light on essential dynamics among host-pathogen interactions (Iwasaki and Medzhitov, 2015). Collectively, our data substantiate a model where high Mtb burden within granulomas is dictated locally by type 2 immunity and tissue-protective (wound-healing) responses that seek to maintain essential tissue functionality at the expense of creating a niche for bacterial persistence. In granulomas that form later in infection, and, therefore, in the context of an adaptive immune response, this balance is tipped toward bacterial control by the emergence of adaptive T1-T17 and cytotoxic responses, with interactions involving innate immune cell types enabling sufficient infiltration and activation of these T cell subsets. As a result, successful immune coordination across cell types in late-forming granulomas could obviate the self-reinforcing type 2 immune/wound-healing responses that would otherwise exclude immune effector functions needed for Mtb control. We also identified cell types and ligands that participate in both high-and low-burden granulomas potentially indictive of phenotypic plasticity and pleiotropic effects that might both be molded by and (in turn) reinforce distinct, pathology-associated granuloma microenvironments. Such a framework is consistent with previous observations of natural  or induced (Darrah et al., 2020) control and supports the need to look to combinatorial host-directed paradigms for the development of efficacious therapeutic and prophylactic measures.
Moving beyond the perspective of individual molecular targets, our work highlights the importance of the complexities of divergent host cellular ecosystems in driving Mtb persistence or control. By defining and nominating several putative axes of intra-and intercellular signaling associated with contrasting Mtb outcomes, our work provides a foundation for enabling effective manipulation of the properties and states of complex cellular ecosystems, therapeutically relevant destabilization of pathologic molecular environments to enable adaptive immune access and fundamental connections to other inflammatory and infectious diseases that affect epithelial barrier tissues (Hughes et al., 2020;Ordovas-Montanes et al., 2018).

Limitations of the study
Granulomas are inherently heterogeneous and include necrotic debris, requiring robust technical correction and quality control; this results in an analysis of only high-quality cells. Because only a fraction of cells from each granuloma were analyzed, proportions might not have reflected the true composition of cells within a granuloma and could be skewed toward lymphocytes, highlighting the importance of orthogonal validations. In bulk RNAsequencing analysis of a separate set of dissociated early and late granulomas, we observe generally similar trends in cell-type composition, supporting our conclusions; similar studies will need to be performed in undigested granulomas to account for dissociation artifacts. In the absence of prior comparable studies on macaque granulomas, we could not predict a priori the granuloma diversity uncovered by scRNA-seq profiling, even before considering potential genetic differences in both host and pathogen. Even with these considerations, the sample size of this study was sufficient to reveal features of host responses linked to Mtb persistence or control that could inform future efforts across the TB community. Furthermore, knowledge of T cell antigen specificity could serve to prioritize T cell subsets for their relevance to bacterial control but would require the development of new methodologies that allow analysis of very small numbers of primary cells and a very large antigenic repertoire against the major histocompatibility complex diversity of outbred macaques. Relatedly, the transcriptomic granuloma landscape investigated here is from a pair of (albeit pivotal) time points, including granulomas at the earliest timepoint of reliable, non-invasive detection and granulomas across a spectrum of growth trajectories when bacterial killing can be identified in some but not all granulomas.

Article
It is likely that expression of certain genes that arise early in infection and then are downregulated as infection progress will be missed, as will some populations critical to guiding overall granuloma outcome. More generally, matched profiling of additional timepoints p.i., along with analysis of lung tissue and granulomas from vaccinated or reinfected and protected animals, will provide a more complete picture of the temporal control of Mtb in granulomas and is the subject of future work.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

DECLARATION OF INTERESTS
A.K.S. reports compensation for consulting and/or SAB membership from Merck, Honeycomb Biotechnologies, Cellarity, Repertoire Immune Medicines, Third Rock Ventures, Hovione, Relation Therapeutics, FL82, Empress Therapeutics, Ochre Bio, and Dahlia Biosciences.C.L. is a shareholder and consultant for Honeycomb Biotechnologies. T.K.H. is a shareholder and consultant for nference, inc.

INCLUSION AND DIVERSITY
One or more of the authors of this paper self-identifies as an underrepresented ethnic minority in science. One or more of the authors of this paper self-identifies as a member of the LGBTQ+ community. While citing references scientifically relevant for this work, we also actively worked to promote gender balance in our reference list.

RESOURCE AVAILABILITY
Lead contact Further information and requests for resources, analytical code, and reagents should be directed to and will be fulfilled by the lead contact, Alex K. Shalek (shalek@mit.edu).

Materials availability
The study did not generate new unique reagents.
Data and code availability d scRNA-seq data generated for this study is available at Gene Expression Omnibus. Accession numbers are listed in the key resources table. Processed data from granulomas sampled at 10 weeks p.i. can be accessed and visualized at https:// singlecell.broadinstitute.org/single_cell/study/SCP257/cellular-ecology-of-m-tuberculosis-granulomas-10-week-dataset#/. Data from granulomas sampled at 4 weeks p.i. can be accessed and visualized at https://singlecell.broadinstitute.org/ single_cell/study/SCP1749/cellular-ecology-of-m-tuberculosis-granulomas-4-week-dataset. d All original code has been deposited to Zenodo at https://doi.org/10.5281/zenodo.6419143. d Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Research animals
Cynomolgus macaques (Macaca fascicularis), >4 years of age, (Valley Biosystems, Sacramento, CA) were housed within a Biosafety Level 3 (BSL-3) primate facility. Further information (including biological sex, number of granulomas, etc.) for each macaque involved in this study can be found in Table S1. Animals were infected with low dose ($10 colony-forming units (CFUs)) M tuberculosis (Erdman strain) via bronchoscopic instillation. Infection was confirmed by PET-CT scan at 4 weeks and monitored with clinical and radiographic examinations until 10 weeks p.i.
All experimental manipulations, protocols, and care of the animals were approved by the University of Pittsburgh School of Medicine Institutional Animal Care and Use Committee (IACUC). The protocol assurance number for our IACUC is D16-00118. Our specific protocol approval numbers for this project are 18124275 and IM-18124275-1. The IACUC adheres to national guidelines established in the Animal Welfare Act (7 U.S.C. Sections 2131 -2159) and the Guide for the Care and Use of Laboratory Animals (8 th Edition) as mandated by the U.S. Public Health Service Policy.
All macaques used in this study were housed at the University of Pittsburgh in rooms with autonomously controlled temperature, humidity, and lighting. Animals were singly housed in caging at least 2 square meters apart that allowed visual and tactile contact with neighboring conspecifics. The macaques were fed twice daily with biscuits formulated for nonhuman primates, supplemented at least 4 days/week with large pieces of fresh fruits or vegetables. Animals had access to water ad libitum. Because our macaques were singly housed due to the infectious nature of these studies, an enhanced enrichment plan was designed and overseen by our nonhuman primate enrichment specialist. This plan has three components. First, species-specific behaviors are encouraged. All animals have access to toys and other manipulata, some of which will be filled with food treats (e.g., frozen fruit, peanut butter, etc.). These are rotated on a regular basis. Puzzle feeders foraging boards, and cardboard tubes containing small food items also are placed in the cage to stimulate foraging behaviors. Adjustable mirrors accessible to the animals stimulate interaction between animals. Second, routine interaction between humans and macaques are encouraged. These interactions occur daily and consist mainly of small food objects offered as enrichment and adhere to established safety protocols. Animal caretakers are encouraged to interact with the animals (by talking or with facial expressions) while performing tasks in the housing area. Routine procedures (e.g. feeding, cage cleaning, etc) are done on a strict schedule to allow the animals to acclimate to a routine daily schedule. Third, all macaques are provided with a variety of visual and auditory stimulation. Housing areas contain either radios or TV/video equipment that play cartoons or other formats designed for children for at least 3 h each day. The videos and radios are rotated between animal rooms so that the same enrichment is not played repetitively for the same group of animals.
All animals are checked at least twice daily to assess appetite, attitude, activity level, hydration status, etc. Following M. tuberculosis infection, the animals are monitored closely for evidence of disease (e.g., anorexia, weight loss, tachypnea, dyspnea, coughing). Physical exams, including weights, are performed on a regular basis. Animals are sedated prior to all veterinary procedures (e.g. blood draws, etc.) using ketamine or other approved drugs. Regular PET/CT imaging is conducted on most of our macaques following infection and has proved very useful for monitoring disease progression. Our veterinary technicians monitor animals especially closely for any signs of pain or distress. If any are noted, appropriate supportive care (e.g. dietary supplementation, rehydration) and clinical treatments (analgesics) are given. Any animal considered to have advanced disease or intractable pain or distress from any cause is sedated with ketamine and then humanely euthanatized using sodium pentobarbital.

METHOD DETAILS
Serial PET-CT Imaging Animals underwent PET-CT scans after Mtb infection at 4 weeks, 8 weeks and pre necropsy (i.e. 10 weeks post-infection) as previously described (White et al., 2017). Briefly, animals were sedated, intubated and imaged by 2-deoxy-2-18 F-D-deoxyglucose (FDG) PET imaging (microPET Focus 220 preclinical PET scanner, Seimens Medical Solutions, USA, Malvern, PA) and Cretom CT scanner (Neurologica Corp, Danvers, MA, USA) within biosafety level 3 facility. The total lung FDG avidity was analyzed using Osirix viewer, an open-source PACS workstation and DICOM viewer (Pixmeo, Bernex, Switzerland). The whole lung was segmented on CT by using the growing region algorithm on the Osirix viewer to create a ROI of normal lung (Hounsfield units <200). The closing tool was used to include individual nodules and other pulmonary disease. The ROI was transferred to the co-registered PET scan and manually edited to ensure all pulmonary disease was included. Voxels outside the ROI were set to zero and voxels with an SUV greater than or equal to normal lung (SUV >2.3) were isolated. Finally, the ''Export ROIs'' plug-in was then used to export the data from these isolated ROIs to a spreadsheet where the total SUV per voxel were summed to represent the total lung FDG activity. Total FDG activity in lungs was used to estimate thoracic bacterial burden prior to reinfection ( Figure 1C), as previously published White et al., 2017). Granulomas were individually characterized by their date of establishment (scan date), size (mm), and relative metabolic activity as a proxy for inflammation ([ 18 F]-FDG standard uptake normalized to muscle [SUVR]) White et al., 2017). Granulomas greater than 1mm are detected by CT scan.

Necropsy
Necropsy was performed as previously described Lin et al., 2009Lin et al., , 2013Maiello et al., 2018). Briefly, an 18 F-FDG PET-CT scan was performed on every animal 1-3 days prior to necropsy to measure disease progression and identify individual granulomas. At necropsy, monkeys were maximally bled and humanely sacrificed using pentobarbital and phenytoin (Beuthanasia; Schering-Plough, Kenilworth, NJ). Individual granulomas previously identified by PET-CT and those that were not seen on imaging from lung and mediastinal lymph nodes were excised for histological analysis, bacterial burden, and other immunological studies. TB specific gross pathologic lesions and overall gross pathologic disease burden were quantified using a previously published method . The size of each granuloma was measured by pre-necropsy scans and at necropsy. Granulomas were enzymatically dissociated using the gentleMACS dissociator system (Miltenyi Biotec Inc) to obtain a single suspension for enumerating bacterial burden and for single cell RNA-sequencing (scRNA-seq) on the Seq-Well platform.
Bacterial burden 200 mL of each granuloma homogenate were plated in serial dilutions onto 7H11 medium, and the CFU of M. tuberculosis growth were enumerated 21 days later to determine the number of bacilli in each granuloma . As a quantitative measure of overall bacterial burden, a CFU score was derived from the summation of the log-transformed CFU/gram of each sample at the time of necropsy.
Chromosomal equivalents, CEQ DNA extraction and qPCR were performed with modifications as described previously . Briefly, frozen aliquots of homogenates were thawed and volumes recorded throughout the extraction process. Samples were transferred to tubes containing 150 mL of 0.1mm zirconia-silica beads (Biospec Products) before adding 600mL of Tris-EDTA buffer, pH 8.0. Three hundred microliters of phenol/chloroform/isoamyl alcohol (25:24:1, Sigma-Aldrich) at 70 C were subsequently added and the samples incubated at room temperature for 10 min. The samples were then vortexed, the aqueous layer separated and supplemented with 50 mL 5M NaCl and a second phenol chloroform extraction performed on the extracted aqueous layer. DNA was precipitated with the addition of one volume of 100% isopropanol and one-tenth volume of 3M sodium acetate and incubating at À20 C overnight. The DNA pellet was washed with 70% ethanol, dried and resuspended in nuclease-free water. Mtb genomes were then quantified using Taqman Universal Master Mix II (Life Technologies) and previously published sigF primer-probe combination . Each sample was amplified in triplicate using an ABI Systems 7900HT machine. Chromosomal equivalents (CEQ) were quantified by comparing the samples with a standard curve derived from serial dilution of Mtb genomes prepared from liquid culture. Our detection limit for the standard curve was 10 copies per reaction. When we calculated the number of genomes for the whole granuloma, our detection limit was 1,000 copies per granuloma. Of the 26 granulomas analyzed, 2 granulomas failed at the CEQ quantification and they were eliminated from CEQ and CFU/CEQ analysis.

Immunohistochemistry analysis
Granulomas from macaques were harvested at 10 or 11 weeks post Mtb infection from other published (Phuah et al., 2016) and unpublished studies at the University of Pittsburgh. Following formalin fixation and paraffin embedding, 5 mm sections were placed on slides for staining. Slides were deparaffinized in xylenes, hydrated in a series of graded ethanol dips, and then antigen retrieval was performed by boiling the slides in a pressure cooker containing antigen retrieval citrate buffer for slides stained with c-kit and tryptase or Tris-EDTA buffer (Mattila et al., 2013) for slides stained with CD11c, CD20, and CD3. Sections were cooled to room temperature and washed with 13 PBS then stained overnight at 4 C in a humidified chamber using anti-human c-kit, anti-mast cell tryptase antibodies, or rabbit-anti-CD3 and mouse anti-CD11c antibodies as previously described (Phuah et al., 2016). For the c-kit and tryptase buffer. RNA was then extracted from whole lysates using RNeasy kits (Qiagen, Inc.) and combined with mRNA capture beads. Reverse transcription, whole transcriptome amplification, tagmentation and sequencing were performed as described above. Within each bulk RNA sequencing sample, expression values were summarized across bead barcodes to obtain an aggregate expression profile for each population. specifically added to avoid exclusion of bona fide cell clusters with high-background contamination (presumably due to low endogenous RNA content) and low biological signal (e.g., naive T cells). Here, cell populations that scored high for markers of a single lineage yet had higher soup-expression scores presented with lower rates of co-expression of these soup and lineage defining transcripts relative to clusters which did not, likely representing ambient RNA and debris.
Using these three values, cluster-specific background ''contamination'' scores were calculated for each array in 2 ways: These two ''contamination'' scores quantify both the (1) absolute and (2) relative soup-profile contamination in subsequent cluster classification.
Next, for each array, clustering was performed to identify clusters with array-specific ambient contamination and debris. Specifically, hierarchical clustering was performed using a total of 7 variables to identify clusters defined by ambient contamination: the 2 contamination scores (shown above), three scaled soup scores (soup gene thresholds: 0.01, 0.05 and 0.001), the average log-fold change for the top 5 cluster genes, and soup/lineage gene co-expression. For each array, the hierarchical clustering tree was cut at the first branch point to identify clusters with a signature of ambient contamination. In total, 41 array-specific clusters, comprising 56,590 barcodes from 21 out of 32 total arrays, were identified as characterized by ambient RNA contamination and cellular debris and removed them in all subsequent analyses.

Correction for residual background contamination
After removal of cell barcodes that were derived from background contamination and extracellular debris, additional correction for ambient RNA contamination was performed among remaining cell barcodes on an array-by-array basis. Among filtered cell barcodes, array-specific, ambient RNA contamination was observed to be marked by ectopic expression of cell-type defining genes (e.g., widespread expression of JCHAIN, HBB, and CPA3, etc.). Specifically, this contamination was observed to vary in relation to the overall distribution of cell types recovered from each array. To correct for residual ambient contamination within each array, SoupX (Young and Behjati, 2018) was used to: (1) generate array-specific profiles of background contamination, (2) estimate per-cell contamination fractions, and (3) generate corrected background-corrected UMI counts matrices. To generate background expression profiles, counts matrices containing up to 50,000 barcodes were generated to assemble a collection of low-UMI cell barcodes that presumably represent extracellular mRNA. For each array, a UMI threshold for background expression was determined using EmptyDrops (Lun et al., 2019) to estimate the likelihood distribution that low-UMI barcodes represent cells rather than ambient contamination. Using an array-specific UMI-threshold (Range: 20-100 UMIs), a composite background profile was created for each array. To estimate the per-cell contamination fraction, a set of lineage-defining genes was first identified with bimodal expression patterns across cells (i.e., lineage defining genes with leaky expression). For each array, this set of soup-defining, lineage genes was used to estimate contamination fraction for cell types with known endogenous expression. Finally, the composite soup profile was subtracted from each the transcriptional profile of each cell based on the estimated contamination fraction. For each array, individual transcripts most likely to be contamination were removed from each single-cell based on the estimated contamination fraction. Specifically, individual transcripts were sequentially removed from each single-cell transcriptome until the probability of subsequent transcripts being soup-derived was less than 0.5 to generate a background-corrected counts matrix for each array.

Separation of doublets
Within each array, doublet identification and separation were performed using DoubletFinder. To account for differences in cell loading densities and expected cell doublet frequencies, array-specific estimates of the expected number of doublets were generated (Table S1). For example, for a total of 20,000 cells applied to a Seq-Well device containing 85,000 wells (lambda = 20,000), an expected doublet rate of >2.37% (since not all of the array's surface area contains wells) was calculated. For each array, pseudodoublets were generated using DoubletFinder (McGinnis et al., 2019). Here, the pK parameter estimate was separately optimized for each array by performing a parameter sweep in which we selected the pK value with the maximum bimodality coefficient, while a pN = 0.25 was maintained across all arrays based on published recommendations (McGinnis et al., 2019). Cells were identified as doublets based on their rank order in the distribution of the proportion of artificial nearest neighbors (pANN). Specifically, the pANN value for the cell at the expected doublet percentile was identified and the corresponding pANN value was used as a threshold to remove additional cells in the event of ties. In total, we excluded 3,656 cells as doublets.

Integrated cell type classification
Following the aforementioned quality filtering, a combined dataset of 109,584 cells was used in downstream analysis (Table S1). An initial dimensionality reduction was performed on these cells by selecting 1580 variable genes, performing principal component analysis (PCA), UMAP dimensionality reduction and Louvain clustering using Scanpy (Wolf et al., 2018). To identify broad cell types, we examined cluster assignments at multiple clustering resolutions (Resolutions: 0.5 to 2.25). We selected a cluster resolution of 1.00 because this was the resolution beyond which branching did not result in discovery of clusters that represent distinct cell lineages (e.g., division of Type 1 and Type 2 pneumocytes) (Table S2). To define major cell populations, extensive comparisons to existing signatures of lung parenchyma and immune cell populations were performed using data from the Tabula Muris (Tabula Muris Consortium et al., 2018) and Mouse Cell Atlas  studies. Specifically, lung scRNA-seq data from both studies were collected and used to calculate enriched gene expression signatures for each lung cell type cluster using a Wilcox rank-sum test. For each cluster, the top 20 genes (Table S2) were selected as a cluster-specific expression signature and then used them to score all cells in the granuloma dataset. The average signature score within each cluster was calculated and the distribution of signature score was examined within each granuloma cell type, and significance was determined via permutation testing.
Cell type assignment of proliferating cells Among our top-level clusters was one defined by markers of cellular proliferation (MKI67, TOP2A, and CDK1). To identify the underlying cell type identity for these cells, a separate dimensionality reduction and clustering was performed among 3,123 cells defined by this proliferation signature. UMAP dimensionality reduction and Louvain clustering was running at multiple clustering resolutions (0.4-0.8), and a resolution of 0.70 was selected as the value beyond which no additional major cell type clusters were observed (Figure S3E). For each of the major cell types identified in the global clustering analysis, we generated a gene signature using the top 20 enriched genes and scored the proliferating cells clusters using the AddModuleScore function in Seurat. We then examined the distribution of cell-type signature scores across each of the sub-clusters of proliferating cells and re-assigned clusters based on enrichment of lineage-specific gene expression. Here, we assessed the significance of the cluster scores using a permutation test. More specifically, 1,000 permutations were performed in which the proliferating clusters were down-sampled to have the same number of cells. Cluster assignments of the cells were randomized and the average generic cell type signature score was calculated for each randomized cluster. The significance of a cell type score for each proliferating cluster was determined by comparing the observed average signature score to the random null distribution. Through this approach, distinct clusters of proliferating B cells, macrophages, neutrophils, plasma cells, and T cells were identified and re-assigned to their respective cell types.

Filtering of soup-defining transcripts
To avoid artifacts from ambient RNA contamination and cellular debris in sub-clustering of T cells and macrophages, genes that were observed to be soup-defining for any array were excluded. Specifically, a set of 210 soup-defining genes was identified that comprised 0.001 of total soup expression in any array. The threshold of 0.001 was selected to maximize the cumulative fraction of soup expression with the least number of genes to avoid removing underlying biology. Here, this threshold value represents cumulative fraction of soup expression accounted for by a given gene for each array. In a further effort to avoid removing cell type specific biology, any genes with average log-fold changes greater than 1.00 in T cells and macrophages compared to all other generic cell types were retained. In total, 204 and 180 genes were removed prior to sub-clustering analysis of T cells and macrophages, respectively.
Sub-clustering of granuloma unified T and NK cells Across the complete set of 44,766 T and NK cells, Louvain clustering was initially performed at a range of resolution of values (0.30-0.75) to examine the relationships between cluster membership. In this analysis, a cluster was observed to be defined by persistent expression of contaminating transcripts derived from macrophage and mast cells (Cluster 4 -Louvain Resolution 0.60). To confirm that these cells did not represent persistent doublets, all T cells were scored by expression of the top 20 cluster defining T cells and similar signature scores between the contaminated cell population were observed. Additional sub-clustering within the ''contaminated'' T cell cluster was performed to understand whether residual contamination obscured additional T cell biology; this failed to reveal additional T cell clusters not identified among the remaining non-contaminated populations. Since this contamination cluster was not observed to obscure a novel T cell phenotype, this population was excluded from downstream analysis (including compositional analyses associating cell type/group abundances with bacterial burden). Following removal of the cluster of T cells defined by residual contamination, dimensionality reduction and clustering at multiple clustering resolutions (Louvain resolution: 0.25-0.75) were performed. In this final analysis, a total of 12 T cell populations were identified at a clustering resolution of 0.75. Finally, additional subclustering was performed within the population of 2,377 gd and cytotoxic T cells, including dimensionality reduction and clustering at multiple resolutions (0.30-0.75). Here, 2 primary populations of cells were identified: sub-cluster 2, a population of cytotoxic cells enriched for expression of TRDC and sub-cluster 3, a population of XCL1+ NK cells. Differential expression analysis was performed to determine differences in gene expression between these clusters upon which the classification of these cells was based.
Additional sub-clustering analysis was performed within the T1-T17 population through repeated variable gene identification, dimensionality reduction and Louvain clustering (Resolution = 0.55), and 4 distinct sub-populations were discovered. Differential expression analysis was performed within the 9,234 T1-T17 cells using a Wilcoxon test in Seurat to identify sub-cluster defining gene signatures.
Annotation of T /NK subclusters T cell populations were classified using a combination of manual curation and comparison to literature-derived sequences. Granuloma T cell populations were compared to publicly available T cell population and scRNA-seq signatures. Specifically, comparisons were performed in the following ways: We then performed hierarchical clustering to identify clusters of correlated cell-types across granulomas, calculating the proportional composition of correlated cell-type clusters within each lesion. For each of the 5 clusters identified through hierarchical clustering, we calculated permutation p values to examine average correlation values. To understand the relationship between identified cell-type clusters and granuloma-level bacterial burden, we examined the abundance of correlated cell types by grouping lesions by timing of granuloma formation.

Cell-communication analysis
To examine cell-cell interactions, we first generated a curated list of receptor-ligand pairs through a combination of publicly-available databases and literature review. Within each granuloma, we generated edge weights between cell types for a given receptor ligand pair by multiplying the average receptor expression in Cell Type 1 by the average ligand expression in Cell Type 2. Edge weights were constructed for all receptor-ligand pairs and pairwise-cell type combinations within granulomas individually. Within each granuloma, we performed a total of 1,000 permutations for each receptor-ligand pair in which cell-type identifiers were randomly resorted and the resulting edge weight was recorded. For each receptor-ligand pair, the significance of the observed value was calculated from a z-score comparison of the observed value relative the permuted values.
We further performed adjustment of receptor-ligand edge weights at multiple levels. (1) To account for differences in the relative abundance of 'sender' cell types, we multiplied receptor-ligand edge weights by the proportion of all 'sender' cells within a granuloma. In effect, this generates a pool of 'sender' cell derived ligand that is available to act upon cell types bearing appropriate receptors. (2) To identify the most likely receiver cells, we weighted receptor-ligand edge-weights by the proportion of total receptor expression within the receiving cell subset cluster relative to the average receptor expression across all cells in the granuloma. In this scheme, receptors with more uniform expression across the entire granuloma will be down-weighted to reflect non-autonomous sinks of extracellular ligands, while receptors predominantly expressed by a single cell subset will be up-weighted. (3) Finally, we adjusted receptor-ligand edge weights to account for the percent of cells within the receiver cell subset expressing a given receptor by multiplying our receptor-ligand edge weights by the proportion of all 'receiver' cells expressing the receptor within the receiver cell subset.
To identify axes of intercellular communication with differential weights across granulomas, we performed student's t-tests of receptor-ligand edge weights between (A) high-burden and low-burden lesions, and (B) original and late-blooming lesions. We filtered results based on the following criteria: (1) the average permutation p values for the receptor-ligand pair within high or low-burden lesions <0.05, (2) p value from Student's t test in (A) or (B) above <0.05. The ''dplyr'' package in R was used to filter the resulting cell-cell interaction database to count significant interactions across cell type groups and granuloma burdens, identify cell type groups contributing to the top 10% of ligands most strengthened in either high or low burden granulomas, identify ligands most associated with high or low burden granulomas, and identify cell type specificity of these ligands. The ''circlize'' package in R was used to generate circus plots of the topology of signaling networks across high and low burden granulomas.

Statistical methods
Non-parametric Spearman's rho with Benjamini-Hochberg multiple testing correction was calculated for correlation analysis for evaluating the degree of relationship between cellular abundance and bacterial burden. As a complementary analysis approach that accounts for the inter-dependent nature of compositional data (i.e., where changes in counts of one cell type necessarily affect proportions of all other cell type, Dirichlet regression analysis was conducted to evaluate relationships between cell type abundances with bacterial burden; cell types were prioritized based on concordance between these two statistical testing frameworks. Non parametric t test or Mann-Whitney U test was used when comparing two groups. p values, or where appropriate adjusted or permutation p values, % 0.05 were considered significant. Statistical analysis was performed using GraphPad Prism v8 (GraphPad software, San Diego, CA), JMP Pro v12 and R base statistics.