B cells in perivascular and peribronchiolar granuloma-associated lymphoid tissue and B-cell signatures identify asymptomatic Mycobacterium tuberculosis lung infection in Diversity Outbred mice

ABSTRACT Because most humans resist Mycobacterium tuberculosis infection, there is a paucity of lung samples to study. To address this gap, we infected Diversity Outbred mice with M. tuberculosis and studied the lungs of mice in different disease states. After a low-dose aerosol infection, progressors succumbed to acute, inflammatory lung disease within 60 days, while controllers maintained asymptomatic infection for at least 60 days, and then developed chronic pulmonary tuberculosis (TB) lasting months to more than 1 year. Here, we identified features of asymptomatic M. tuberculosis infection by applying computational and statistical approaches to multimodal data sets. Cytokines and anti-M. tuberculosis cell wall antibodies discriminated progressors vs controllers with chronic pulmonary TB but could not classify mice with asymptomatic infection. However, a novel deep-learning neural network trained on lung granuloma images was able to accurately classify asymptomatically infected lungs vs acute pulmonary TB in progressors vs chronic pulmonary TB in controllers, and discrimination was based on perivascular and peribronchiolar lymphocytes. Because the discriminatory lesion was rich in lymphocytes and CD4 T cell-mediated immunity is required for resistance, we expected CD4 T-cell genes would be elevated in asymptomatic infection. However, the significantly different, highly expressed genes were from B-cell pathways (e.g., Bank1, Cd19, Cd79, Fcmr, Ms4a1, Pax5, and H2-Ob), and CD20+ B cells were enriched in the perivascular and peribronchiolar regions of mice with asymptomatic M. tuberculosis infection. Together, these results indicate that genetically controlled B-cell responses are important for establishing asymptomatic M. tuberculosis lung infection.


Clinical monitoring and survival
The mice were observed daily for routine health monitoring. The mice were weighed before M. tuberculosis aerosol infection and at least once per week afterward until consecutive weight loss was noted, after which they were weighed up to daily. Criteria requiring euthanasia were any one of the following: severe weakness/lethargy, respiratory difficulty, or body condition score of 2 (25). We confirmed morbidity was due to pulmonary TB by finding (i) large nodular or severe diffuse lung lesions; (ii) histopathological confirmation of neutrophilic, lymphoplasmacytic, histiocytic, or granulomatous lung infiltrates; (iii) cultivable M. tuberculosis bacilli from lung tissue; and (iv) absence of other disease based on necropsy findings. Since the Institutional Animal Care and Use Committee disallowed natural death as an endpoint, the day a mouse was euthanized due to morbidity was used as a proxy for survival. Mice with morbidity not attributable to pulmonary TB were excluded from subsequent analyses. Progressors succumbed to pulmonary TB and were euthanized due to morbidity prior to 60 days post-infection. Controllers succumbed to pulmonary TB and were euthanized due to morbidity after 60 days post-infection. Asymptomatically infected Diversity Outbred mice were euthanized at a predetermined timepoint (on or before 60 days), and their classification state was determined by normal behaviors, postures, movements, eating/drinking, respiration, and weight gain.

Quantification of M. tuberculosis lung burden
Immediately after euthanasia, two or three lung lobes were removed from each mouse and homogenized in 1 mL of sterile phosphate-buffered saline per lobe, serially diluted, plated onto OADC-supplemented 7H11 agar, and incubated at 37°C. After 3–4 weeks, M. tuberculosis colonies were counted, and M. tuberculosis burden in the lungs was calculated (18,22,24).
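The back-calculation from a plate count to lung burden can be sketched as follows. This is a minimal illustration, not the authors' protocol: the function name, the plated volume, and the example dilution are assumptions; only the 1 mL of saline per lobe comes from the text.

```python
def cfu_per_lobe(colonies, dilution_factor, plated_volume_ml, homogenate_volume_ml=1.0):
    """Estimate colony-forming units (CFU) in one lung lobe from one countable plate.

    colonies             -- colonies counted on the plate
    dilution_factor      -- total fold-dilution of the plated sample (1000 for 10^-3)
    plated_volume_ml     -- volume spread on the plate (assumed; not given in the text)
    homogenate_volume_ml -- saline volume per lobe (1 mL per the Methods)
    """
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution_factor and plated_volume_ml must be positive")
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml


# Hypothetical plate: 42 colonies at a 1:1,000 dilution with 0.1 mL plated.
example_burden = cfu_per_lobe(colonies=42, dilution_factor=1000, plated_volume_ml=0.1)
print(f"{example_burden:.2e} CFU per lobe")
```

Summing the per-lobe estimates over the lobes that were homogenized gives a per-mouse value; how the cited studies scaled to whole-lung burden is not restated here.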

Histology
One or two lung lobes from each Diversity Outbred mouse were inflated and fixed in 10% neutral buffered formalin, processed, embedded in paraffin, sectioned at 5 microns, and stained with carbol fuchsin for acid-fast bacteria followed by counterstaining with hematoxylin and eosin (H&E) at the Cummings School of Veterinary Medicine's Comparative Genomics and Pathology Shared Resource (North Grafton, MA, USA). Stained tissue sections on glass slides were digitally scanned by Aperio ScanScope or AT2 scanners at 0.23 microns/pixel at Vanderbilt University Medical Center's Digital Histology Shared Resource (Nashville, TN, USA). A separate set of tissue sections from Diversity Outbred mice was stained using immunohistochemistry to detect CD20 on B cells. Slides were digitally scanned by an Olympus VS2000 scanner at 0.138 microns/pixel at The Ohio State University's Comparative Pathology and Digital Imaging Shared Resource (Columbus, OH, USA).

Deep-learning and image analysis on lung tissue dual-stained by carbol fuchsin and H&E
Lung tissue section images from 129 M. tuberculosis-infected Diversity Outbred mice were used for training, and 98 different images were used as a hold-out test set. Of the 129 training images, 66 were from asymptomatic mice and 63 were from controllers. Of the 98 test set images, 10 were from asymptomatic mice and 88 were from controllers. Following image preprocessing (see Supplemental Methods), we used an attention-based multiple instance learning method (26) to identify regions of the images that contributed to classification and yielded an interpretable model (23). Briefly, we labeled lung tissue images by host class (controller or asymptomatic) and trained an end-to-end deep-learning model to predict the image-level class. The model consisted of two parts: a feature extractor followed by the attention mechanism, depicted in Fig. S5. Optimization used the Adam optimizer with β1 = 0.9, β2 = 0.999, a learning rate of 0.0001, and a weight decay of 0.0005, over 100 epochs. For each fold, 30 Diversity Outbred mice were randomly sampled for the validation set, and the rest of the cases were used to train the model, a procedure known as Monte Carlo cross-validation (27). To account for imbalanced training sets, the training procedure was modified so that, at every training iteration, a category (controller or asymptomatic) was first selected with equal likelihood and a mouse was then randomly selected from that category. Negative log-likelihood was used as the cost function.
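The two sampling steps described above (a random 30-mouse validation split per fold, and class-balanced selection of one mouse per training iteration) can be sketched with the standard library alone. The function names and the mouse identifiers are hypothetical; the sketch does not reproduce the authors' training code or the attention model.

```python
import random


def monte_carlo_split(mouse_ids, n_validation=30, rng=None):
    """Randomly hold out n_validation mice; the rest form the training set."""
    if rng is None:
        rng = random.Random()
    shuffled = list(mouse_ids)
    rng.shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]  # (train, validation)


def sample_balanced(mice_by_class, rng=None):
    """Pick a class with equal probability, then a mouse uniformly within that class."""
    if rng is None:
        rng = random.Random()
    label = rng.choice(sorted(mice_by_class))
    return label, rng.choice(mice_by_class[label])


# Hypothetical identifiers matching the training-set sizes reported in the text.
demo_rng = random.Random(0)
training_pool = {
    "asymptomatic": [f"AS{i}" for i in range(66)],
    "controller": [f"CR{i}" for i in range(63)],
}
all_mice = training_pool["asymptomatic"] + training_pool["controller"]
train_ids, validation_ids = monte_carlo_split(all_mice, n_validation=30, rng=demo_rng)
print(len(train_ids), len(validation_ids), sample_balanced(training_pool, demo_rng)[0])
```

Sampling the class first means each class contributes about half of the training iterations regardless of how many mice it contains, which is the stated purpose of the modification.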

Deep-learning and image analysis on lung sections stained by immunohistochemistry for CD20
Peribronchiolar and perivascular regions were segmented based on a board-certified veterinary pathologist's (G.B.) manual training annotations using Aiforia Create (v.6.0) with default parameter settings (Aiforia Technologies, Helsinki, Finland). To quantify CD20+ B cells within the peribronchiolar and perivascular regions, we first identified the nuclei within the segmented regions at ×40 magnification using CellViT-SAM-H-x40 (28). Next, to identify CD20+ (brown colored) pixels, we used an entropy-based cell quantification method (29). Entropy-based cell quantification makes use of the uniform and perceptually aligned color representation of the International Commission on Illumination (CIE) L*a*b* color space (30) to transform the complex quantification problem into an automatic entropy-based thresholding problem. Subsequently, we detected nuclei in contact with CD20+ (brown) pixels to identify the CD20+ cells. Finally, the total count of CD20+ B cells in the perivascular and peribronchiolar regions and the density of CD20+ B cells per square millimeter of peribronchiolar and perivascular segmented regions were calculated.
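The last two steps (calling a nucleus CD20+ when it touches a positive pixel, then converting counts to a density) can be illustrated with a small sketch. It assumes the nuclei and the CD20+ mask are already available as sets of (row, column) pixel coordinates, and it treats "contact" as 8-neighbour adjacency; both choices, and all function names, are assumptions rather than details taken from the cited methods.

```python
def is_in_contact(nucleus_pixels, positive_pixels):
    """True if any nucleus pixel overlaps or is 8-adjacent to a positive pixel."""
    for row, col in nucleus_pixels:
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                if (row + d_row, col + d_col) in positive_pixels:
                    return True
    return False


def count_positive_cells(nuclei, positive_pixels):
    """Count nuclei that are in contact with the CD20+ (brown) pixel mask."""
    return sum(1 for nucleus in nuclei if is_in_contact(nucleus, positive_pixels))


def density_per_mm2(cell_count, region_pixel_count, microns_per_pixel):
    """Cells per square millimeter of segmented region (1 mm^2 = 1e6 square microns)."""
    area_mm2 = region_pixel_count * microns_per_pixel ** 2 / 1e6
    if area_mm2 <= 0:
        raise ValueError("segmented region must have positive area")
    return cell_count / area_mm2


# Toy example: two nuclei, one of which touches the positive mask.
toy_nuclei = [{(0, 0), (0, 1)}, {(10, 10)}]
toy_positive = {(1, 2)}
toy_count = count_positive_cells(toy_nuclei, toy_positive)
print(toy_count)
```

For real slides, `microns_per_pixel` would be the resolution at which the region was measured; the 0.138 microns/pixel scanner resolution reported in the Histology section is one candidate value, but whether the analysis ran at that resolution is not stated.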

Gene expression in lung tissue by microarray analysis
One lung lobe from Diversity Outbred mice (n = 117) was homogenized in TRIzol and stored at −80°C, and RNA was extracted using PureLink RNA Mini Kits (Life Technologies, Carlsbad, CA, USA). The Boston University Microarray and Sequencing Resource Core Facility (Boston, MA, USA) confirmed RNA quality and quantity and prepared and hybridized material to Mouse Gene (v.2.0) ST microarrays. Raw CEL files were normalized to produce gene-level expression values using the implementation of the robust multiarray average in the affy R package (v.1.62.0) and an Entrez Gene-specific probeset mapping (v.17.0.0) from the Molecular and Behavioral Neuroscience Institute (Brainarray) at the University of Michigan (31). All microarray data processing was performed using the R environment for statistical computing (v.3.6.0).

Data
The linear classifier we used for cytokine-, chemokine-, and IgG antibody-based classification can only be applied to tabular data sets that do not have missing entries. Therefore, we filtered for the mice with complete measurements for the 12 cytokines and antibodies: CXCL5, CXCL2, CXCL1, TNF, IFN-γ, IL-12, IL-10, MMP8, S100A8, VEGF, anti-M. tuberculosis cell wall (CW), and anti-M. tuberculosis CFP. This filtering yielded asymptomatic (n = 30), controller (n = 48), and progressor (n = 38) mice from two independent experimental infections.
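The complete-case filter described above can be sketched as follows. The record layout (one dictionary per mouse), the column labels, and the function name are assumptions made for illustration; the analyte list follows the text.

```python
import math

# Column labels are illustrative; the 12 analytes themselves are those listed in the text.
ANALYTES = [
    "CXCL5", "CXCL2", "CXCL1", "TNF", "IFN-γ", "IL-12", "IL-10",
    "MMP8", "S100A8", "VEGF", "anti-CW IgG", "anti-CFP IgG",
]


def is_present(value):
    """A measurement counts as present if it is neither None nor NaN."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))


def complete_cases(records, columns=ANALYTES):
    """Keep only the mice that have a measurement for every listed analyte."""
    return [r for r in records if all(is_present(r.get(c)) for c in columns)]


# Toy example with two analytes: only mouse "m1" has both measurements.
toy_records = [
    {"id": "m1", "TNF": 12.0, "IL-10": 3.5},
    {"id": "m2", "TNF": float("nan"), "IL-10": 1.0},
    {"id": "m3", "TNF": 8.1},
]
kept = complete_cases(toy_records, columns=["TNF", "IL-10"])
print([r["id"] for r in kept])
```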

Classification methods
To discriminate between acute pulmonary TB in progressors, chronic pulmonary TB in controllers, and asymptomatic lung infection, we first used a linear classifier, L1-regularized logistic regression. The regularization term promotes sparse coefficients (33), and the regularization strength $\lambda$ is selected through grid search among $\{0, 10^{-2}, 10^{-1.95}, \ldots, 10^{1.95}, 10^{2}\}$. We used the scikit-learn implementation (34) of the logistic regression model. To mitigate the unbalanced classes, sample re-weighting with the "balanced" option of the scikit-learn library was used. We defined the feature importance of a biomarker as the ratio of its (absolute) effect size (defined below) to the sum of all (absolute) effect sizes. More formally, let $\beta_j$ denote the effect size of the $j$th biomarker; its feature importance is given by $|\beta_j| / \sum_{k=1}^{p} |\beta_k|$, where $p$ is the total number of biomarkers in that panel (i.e., 10 or 12). As an alternative method, we also tested a non-linear classifier, XGBoost (35). We used the Python interface of XGBoost and grid searched two parameters: "learning_rate" ([1e-3, 1e-2]) and "n_estimators" ([3, 5, 100]). We fixed the parameter "max_depth" to 3 and used the default values for the remaining parameters. During training, we again used sample re-weighting. We reported the performance corresponding to the classifier's best hyper-parameter using 30-fold cross-validation. Further details are discussed in the Supplemental Methods (36). The code used in the analysis will be made publicly available upon publication.
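The regularization grid and the feature-importance ratio described above can be written out directly. This sketch assumes the grid runs in exponent steps of 0.05 from −2 to 2 (as the listed values suggest) and uses hypothetical coefficient values; it does not fit a model.

```python
def lambda_grid():
    """0 followed by 10^-2, 10^-1.95, ..., 10^1.95, 10^2 (81 powers of ten)."""
    return [0.0] + [10 ** ((i - 40) / 20) for i in range(81)]


def feature_importance(effect_sizes):
    """Map each biomarker to |beta_j| divided by the sum of all |beta_k|."""
    total = sum(abs(beta) for beta in effect_sizes.values())
    if total == 0:
        return {name: 0.0 for name in effect_sizes}
    return {name: abs(beta) / total for name, beta in effect_sizes.items()}


# Hypothetical effect sizes for a three-biomarker panel.
toy_importance = feature_importance({"CXCL1": 2.0, "TNF": -1.0, "IL-10": 1.0})
print(len(lambda_grid()), toy_importance)
```

Note that scikit-learn's `LogisticRegression` is parameterized by `C`, the inverse of the regularization strength, so each non-zero grid value would typically be passed as its reciprocal; how the zero entry was handled is not stated in the text.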

Imaging biomarkers
Model performance was evaluated using overall sensitivity, specificity, and area under the receiver operating characteristics curve (AUC) of a 10-fold Monte Carlo cross-validation. Ninety-five percent confidence intervals for each statistic were computed using bootstrapped samples of predictions (equal to the number of observations) with replacement (n = 1,000). Percentiles (97.5th and 2.5th) were taken as bounds for confidence intervals.
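A percentile bootstrap of this kind can be sketched for the AUC as follows. The AUC is computed from pairwise comparisons (the Mann-Whitney formulation); skipping resamples that contain a single class and the nearest-rank percentile rule are assumptions of this sketch, not details from the text.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def bootstrap_ci(labels, scores, n_boot=1000, seed=0):
    """95% percentile interval from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # AUC is undefined without both classes (assumed handling)
        stats.append(auc(resampled_labels, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(0.025 * (n_boot - 1))]
    upper = stats[round(0.975 * (n_boot - 1))]
    return lower, upper


# Toy predictions: label 1 is the positive class.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
print(auc(toy_labels, toy_scores), bootstrap_ci(toy_labels, toy_scores))
```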
P values corresponding to the AUC were calculated using the one-sided Mann-Whitney U-statistic (38) with the Python package statsmodels (v.0.13.2) (39). In the Supplemental Methods, we compared the statistically significant genes resulting from the Mann-Whitney U-statistic with those from a parametric alternative, Welch's t-test. For each classification problem, the directionality of the test was selected such that the gene expression values were statistically higher in the class with longer survival under the alternative hypothesis. Benjamini-Hochberg correction was applied separately to each classification problem to control the false discovery rate at 0.05 after filtering out the genes without corresponding gene symbols. The clustermap was drawn using the Python package seaborn (v.0.11.2) (40) with the clustering metric "correlation." Enrichr (https://amp.pharm.mssm.edu/Enrichr) (41) was used to identify Gene Ontology (GO) biological processes (v.2023) that were significantly overrepresented (adjusted P < 0.05) within an input set of official mouse gene symbols. A subset of microarray data and analyses from progressor mice was published elsewhere (22,42), deposited in Gene Expression Omnibus (GEO), and assigned series ID GSE179417.
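The Benjamini-Hochberg step can be illustrated with a short standard-library implementation that returns adjusted P values (q values). The function name and the toy P values are illustrative; the authors' own implementation is not shown in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy P values for four hypothetical genes.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
print(toy_q, significant)
```

Genes whose adjusted value falls below 0.05 are the ones reported as significant at FDR q < 0.05.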

Survival, M. tuberculosis lung burden, and inflammatory biomarkers in infected mice
A low dose of aerosolized M. tuberculosis (20 ± 12 bacilli) results in early morbidity and mortality in approximately one-third of Diversity Outbred mice that succumb to acute necrosuppurative pulmonary TB with high bacterial burden within 60 days. This phenotype is reproducible across sexes, institutions, aerosol infection methods, and strains of M. tuberculosis and is not due to immune deficiency (16,18,19,22). M. tuberculosis infection significantly reduced survival of Diversity Outbred mice compared to identically housed, age- and sex-matched non-infected Diversity Outbred controls, and survival was significantly different from M. tuberculosis-infected C57BL/6J inbred mice, with approximately 25% of Diversity Outbred mice surviving longer than the median survival of C57BL/6J (Fig. 1A). The ~70% of Diversity Outbred mice that were more resistant to M. tuberculosis (i.e., controllers) survived longer than progressors (Fig. 1B). Table 1 summarizes forms of TB in humans (43–46) that may be comparable to relative susceptibility and resistance to M. tuberculosis in Diversity Outbred mice. We speculated that resistance to M. tuberculosis could have been due to larger body size. However, retrospective analysis of preinfection body weight data failed to identify significant differences (Fig. 2A). As expected from inbreeding, age- and gender-matched C57BL/6J mice had a narrow preinfection weight range and, on average, weighed significantly less than Diversity Outbred mice (Fig. 2A). Controllers achieved higher body weights (Fig. 2B) due to normal growth while infected and longer duration of weight gain (Fig. 2C) prior to disease onset. Once disease onset occurred, controllers lost weight over a longer duration (Fig. 2D) and had a slower rate of weight loss or no weight loss (Fig. 2F). By euthanasia, both progressors and controllers had lost a similar percentage of weight (Fig. 2E), and controllers had a wider range of weight loss. Asymptomatic mice were euthanized before the end of their growth phase and thus achieved lower body weight than non-infected controls (Fig. 2B), and had a truncated duration of weight gain (Fig. 2C) and weight fluctuations without losing significant weight (Fig. 2D and E).
We expected controllers with end-stage pulmonary TB to achieve the same level of M. tuberculosis lung burden as progressors with end-stage pulmonary TB. Contrary to expectations, controllers had significantly lower M. tuberculosis burden than progressors (Fig. 3A). Likewise, we expected controllers and progressors to have similar levels of inflammatory mediators. However, except for IL-10 (Fig. 3D), this also was not true. MMP8, CXCL1, TNF, and IFN-γ (22) were significantly lower in controllers compared to progressors (Fig. 3B, C, E, and F). These differences indicate that end-stage pulmonary TB in Diversity Outbred mice has two distinct pathogeneses: an "acute" form that is highly necrotizing, inflammatory, and promotes high M. tuberculosis bacillary growth, and a "chronic" form that is less inflammatory. Asymptomatic mice had significantly lower M. tuberculosis burden, CXCL1, and MMP8 and a trend toward lower levels of TNF, IL-10, and IFN-γ than progressors, and only lung M. tuberculosis burden and IL-10 were significantly lower in asymptomatic mice than in controllers (Fig. 3A and D). Next, we used machine learning and statistical methods to analyze data from four different modalities (protein measurements, histopathology, gene expression profiles, and immunohistochemistry staining for B-cell quantification) to find signatures that could distinguish acute pulmonary TB (progressors) vs chronic pulmonary TB (controllers) vs asymptomatic infection (see Fig. 4). We hypothesized that unique features identified by

A panel of lung cytokines, chemokines, and IgG antibodies classify acute TB and chronic pulmonary TB but not asymptomatic lung infection
To classify acute vs chronic end-stage pulmonary TB (i.e., progressors from controllers) vs asymptomatic infection, we analyzed a set of immune cytokines, chemokines, and growth factors by pairwise comparisons (Table 2). We first trained an L1-regularized logistic regression model using the following 10 cytokines, chemokines, and growth factors: CXCL5, CXCL2, CXCL1, TNF, IFN-γ, IL-12, IL-10, MMP8, S100A8, and VEGF. The classifier had high 30-fold cross-validation AUC (0.966) for progressor vs asymptomatic (Fig. S1) but performed relatively poorly (0.792 and 0.803) for comparisons against controllers (Table 2). When we added anti-M. tuberculosis CW IgG and anti-M. tuberculosis CFP IgG to the panel of lung proteins, the 30-fold cross-validation performance improved (Table 2). The improvement was highest for the progressor vs controller comparison, in which the AUC increased from 0.792 to 0.933 (Fig. 5A). That improvement was attributed to anti-M. tuberculosis CW specifically, which had the highest average percentage of importance, while the other antibody, anti-M. tuberculosis CFP, was not used by the model (Fig. 5B). The classification between controllers and asymptomatic mice was the most challenging (AUC 0.83) (Table 2; Fig. S2). When an additional non-linear classifier, gradient tree boosting, was tested, the AUC did not improve (Table S1). All but one of the six panels performed with >90% accuracy when tested with non-infected Diversity Outbred mice (n = 22) that were not used during training (Table S2). One panel, although successful for classifying between the two forms of end-stage pulmonary TB using antibodies, had low classification accuracy (32%) for the non-infected mice. That was because the non-infected Diversity Outbred mice had low levels of anti-M. tuberculosis CW (Fig. 3), which the classifier associates with lower survival (Fig. 5).

Qualitative evaluation of lung granulomas yields insight into asymptomatic lung infection
Anti-M. tuberculosis CW IgG improved the classification of progressors and controllers with acute and chronic end-stage pulmonary TB (Fig. 5) but not those with asymptomatic lung infection. To find features of asymptomatic lung infection, a board-certified veterinary pathologist (G.B.) examined lung sections of progressors, controllers, and mice with asymptomatic lung infection (Fig. 6). Qualitative differences in size, cellular infiltrates, and distribution of granulomas were noted, consistent with previous publications (18,22,23,42). The lungs of progressors contained coalescing fibrinous and necrosuppurative granulomas with abundant pyknotic nuclear debris in alveoli and obstructing bronchioles, and necrosis of alveolar septae (panels A through D), often with fibrin thrombosis of septal capillaries (not shown). In contrast, the lungs of asymptomatic mice typically contained small, discrete, non-necrotizing lesions with perivascular and peribronchiolar aggregates of lymphocytes and few neutrophils (panels E through H).
The lungs of controllers typically contained diffuse, macrophage-dominated lesions, with many foamy macrophages, dense foci of lymphocytes and plasma cells, and occasional multinucleated giant cells (panels I through L), resembling end-stage pulmonary TB in the commonly used inbred mouse strain C57BL/6J (47).Additional features in controllers with end-stage pulmonary TB included cholesterol clefts, small pyogranulomas, septal fibrosis, cavitation with peripheral fibrosis, and bronchiolar obstruction with epithelial degeneration (not shown).

A deep-learning neural network produced an accurate, human-interpretable imaging biomarker of asymptomatic lung infection
Qualitative histopathological evaluation of granulomas provided insight; however, quantification of unique histopathological granuloma features was not feasible. Therefore, we trained and validated a deep-learning neural network using multiple instance learning and attention-based pooling to (i) classify asymptomatic mice and controllers and (ii) identify regions within the granulomas where the model made classification decisions based on feature importance. Table 3 shows the model achieved close to 90% sensitivity and 70% specificity with AUC close to 90%, an improvement over the lung biomarker panel. When we mapped attention weights back to the original images, the granuloma regions used as the basis for classifying asymptomatic lung infection were interpreted by a board-certified veterinary pathologist (G.B.) as perivascular and peribronchiolar lymphoplasmacytic cuffs (Fig. 7A and B; white areas, annotated in yellow). In contrast, the neural network did not weight granuloma regions that contained abundant macrophages, cholesterol clefts, or small pyogranulomas (Fig. 7C and D; black areas, annotated by red circles), which were characteristic of chronic pulmonary TB in controllers. Thus, the model identified an imaging biomarker (perivascular and peribronchiolar lymphoplasmacytic cuffs, a form of GrALT) that is a diagnostically accurate granuloma feature of asymptomatic lung infection. We hypothesized this granuloma feature corresponded to unique functional responses capable of restricting M. tuberculosis.
a Including antibodies improves the AUC in all three comparisons. The sensitivity is calculated with respect to Progressors in the first two comparisons and Controllers in the last comparison. Please refer to Fig. S3 for the confusion matrices. AS, asymptomatic; CR, controller; PR, progressor.
b For each of the classification tasks and their corresponding metrics, the panel with the higher performance is highlighted in bold.

Gene expression analysis identifies functional correlates of asymptomatic infection in lung tissue
To identify functional correlates of asymptomatic lungs, we used transcriptional profiling and pathway analyses on two available data sets. One data set consisted of non-infected (n = 5), asymptomatic (n = 13), controller (n = 16), and progressor Diversity Outbred (n = 10) mice. The second data set consisted of non-infected (n = 9), asymptomatic (n = 36), and progressor Diversity Outbred (n = 28) mice. Within each data set, we performed a one-sided AUC analysis (see Materials and Methods) comparing progressor and asymptomatic lung samples, which identified sets of 2,569 and 6,891 genes expressed at significantly (false discovery rate [FDR] q < 0.05) higher levels in the lungs of asymptomatic mice within data sets 1 and 2, respectively. These two sets contained 2,264 genes in common, with the average AUC values of the two data sets ranging from 0.743 to 0.969 with a median of 0.844 (File S1). This set of genes was input using Enrichr for pathway analysis, which identified 21 statistically significant (adjusted P < 0.05) Gene Ontology terms (Table S3).
Magnifications are ×2, ×4, ×20, and ×40.
In the same manner, we compared controller and progressor lungs using data set 1 and identified a set of 303 genes expressed at significantly (FDR q < 0.05) higher levels in the lungs of controllers.The AUC of the selected genes ranged from 0.888 to 1.0 with a median of 0.919 (File S1).Enrichr analysis identified four pathways containing statistically significant, highly expressed genes in controllers, and the pathways represent adaptive immunity, i.e., T-cell activation and B-cell receptor signaling pathway and antigen receptor-mediated signaling pathways (Table 4).
None of the genes were significant at FDR q < 0.05 for asymptomatic vs controller lungs. Upon further inspection, however, we observed that genes with high diagnostic potential for the binary classification between asymptomatic and controller groups can include genes that are elevated in the non-infected lungs (Fig. S6), consistent with the histopathological finding that asymptomatically infected Diversity Outbred mice maintain a substantial fraction of normal lung tissue.
To focus on finding unique transcriptional signatures, we compared lungs from asymptomatic mice to lungs from all other groups (controller, progressor, and non-infected) using data set 1. This identified 105 genes that were expressed at significantly (FDR q < 0.05) higher levels in the lungs (Fig. 8). The AUC values of the identified genes ranged from 0.844 to 0.963 with a median of 0.864 (File S1). Pathway analysis using Enrichr identified eight statistically significant (adjusted P < 0.05) GO terms associated with the 105 genes (Table 5). The pathways with highly expressed genes indicate that B-cell differentiation, proliferation, activation, and effector functions may be important for establishing asymptomatic lung infection and early resistance to M. tuberculosis.
a Only the significant pathways (adjusted P < 0.05) are displayed.

Immunohistochemistry and quantitative image analysis show more B cells in perivascular and peribronchiolar regions of asymptomatic lung infection
Gene expression profiles suggested that B cells were functionally important in asymptomatic control of M. tuberculosis lung infection but could not spatially locate B cells to the perivascular and peribronchiolar regions that were identified in the lung sections stained by carbol fuchsin and hematoxylin and eosin. To confirm, localize, and quantify B cells in the perivascular and peribronchiolar regions, we stained lung tissue sections from Diversity Outbred mice using immunohistochemistry to detect CD20. CD20 is specifically expressed by immature and mature B cells and is encoded by the gene Ms4a1, which was highly expressed (Table 5).
Next, the peribronchiolar and perivascular regions were segmented, and areas were analyzed for CD20+ cells using entropy-based cell quantification combined with a deep learning-based nuclei detector, CellViT-SAM-H-x40. Pathologist evaluation confirmed positive and negative assay controls worked as expected, and confirmed the presence of perivascular and peribronchiolar CD20+ cells in all lung tissue sections. Representative images from non-infected, progressor, asymptomatic, and controller lungs at low and high magnification show the brown staining for CD20 is most evident around bronchioles and blood vessels of asymptomatic and controller mice (Fig. 9A through D). The total number of CD20+ cells in the peribronchiolar and perivascular regions showed statistically significant differences between the asymptomatic (n = 27), progressor (n = 9), controller (n = 18), and non-infected Diversity Outbred mice (n = 19) (Fig. 9E). Controllers and asymptomatic mice had higher total CD20+ cell counts, and both groups were statistically different compared to the progressor and the non-infected groups. The density of CD20+ cells per square millimeter of peribronchiolar and perivascular regions also varied significantly between the four groups (Fig. 9F), with progressor lungs having significantly lower CD20+ cell density than all other groups and a trend toward the highest CD20+ cell density in asymptomatic mice that was not statistically significant. The median density of CD20+ cells was lowest in progressors (48.66 cells/mm²), followed by non-infected mice (256.0 cells/mm²), controllers (324.9 cells/mm²), and asymptomatic mice, which had the highest median value (619.2 cells/mm²). Overall, these quantitative immunohistochemistry results validate the gene expression profiles of asymptomatic lung infection and spatially locate B cells to the discriminatory imaging biomarker (perivascular and peribronchiolar lymphocytic cuffs) of asymptomatic lung infection.

To discover signatures of resistance to M. tuberculosis infection, we performed long-term survival studies in the Diversity Outbred mouse population and showed survival was bimodal: less than 60 days or greater than 60 days. Interestingly, no infected mice reached the median survival of non-infected mice, and even the most resistant Diversity Outbred mouse eventually succumbed to chronic progressive pulmonary TB, as occurs in commonly used inbred strains of mice (47). We determined that the cytokines and chemokines that accurately classified progressors with acute pulmonary TB from non-progressors (22) could not distinguish progressors from controllers with chronic pulmonary TB but that the addition of anti-M. tuberculosis CW IgG significantly improved diagnostic accuracy. None of the cytokines, chemokines, growth factors, or antibodies in our data set produced a biomarker panel that could accurately classify asymptomatic lung infection. This interesting result is consistent with historical challenges to accurately diagnose latent TB infection in humans by using skin tests, interferon gamma responses, and antibody-based serological tests (53–59). However, recent studies are more promising as specific types of antibodies have more diagnostic power (60–62).
In our studies, histopathological analyses combined with gene expression were much more useful to find key features of asymptomatic lung infection that have diagnostic value and provide mechanistic insight. Our deep-learning model automatically identified a granuloma feature specific to asymptomatic M. tuberculosis lung infection, which was interpreted by a pathologist as perivascular and peribronchiolar lymphocytic cuffs. This histopathological granuloma feature aligns with a large body of prior work in inbred mice, non-human primates, and natural experiments in humans (e.g., humans with acquired immune deficiency, or with genetic immune deficiencies), indicating that CD4 T lymphocytes and their effector molecules are required for resistance to M. tuberculosis (63). We therefore expected that lung tissue from asymptomatically infected Diversity Outbred mice would contain highly expressed genes and gene expression pathways indicative of CD4 T-cell functions. However, we were surprised to find that the differentially expressed genes in lungs of Diversity Outbred mice with asymptomatic M. tuberculosis infection corresponded to B-cell functions and signaling, not to CD4 T-cell functions and signaling. Of the eight upregulated pathways in lungs of asymptomatically infected Diversity Outbred mice, seven involved B-cell differentiation, proliferation, activation, or effector functions. None of the significant pathways were specific for T lymphocytes or CD4 T cells of any subtype.
We compared our gene expression results with a 2020 study by Ahmed et al. (19), which used RNAseq to identify differences between Diversity Outbred mice with a high-risk disease score (n = 16), low-risk disease score (n = 13), and non-infected (n = 10). Of the 105 genes we identified in the lungs of asymptomatically infected Diversity Outbred mice, 14 were also reported by Ahmed et al. as significantly overexpressed in low-risk vs high-risk disease scores, and 52 were significantly overexpressed in the low-risk disease score vs non-infected (19). The different results may reflect differences in methods (microarray vs RNAseq) and sample size (hundreds vs tens). However, more importantly, among the matching sets of 14 and 52 highly expressed genes, 10 overlapped: Pax5, Zfp318, Thada, Ralgps2, Dclk2, Itpr2, Cyb561a3, Dock8, F8, and Bach2. Many of these genes transcriptionally regulate B-cell differentiation and immunoglobulin production. Esaulova et al. reported similar findings in a 2021 study: B-cell follicles were smaller in the lungs of macaques with pulmonary TB (n = 5) than in those with latent TB infection (n = 2), and a negative correlation between B-cell follicle size and lung M. tuberculosis burden (64) supported a protective role of inducible bronchiolar-associated lymphoid tissue. Their single-cell RNAseq analysis indicated the relative number of CD79A+ B cells was higher in pulmonary TB compared with latent TB infection (64). Our results suggest the opposite: lungs of asymptomatically infected Diversity Outbred mice had significantly higher expression of Cd79a as well as other B-cell genes, including Ms4a1, which encodes CD20. We also observed more CD20+ B cells in perivascular and peribronchiolar regions.
Our studies examining resistance to M. tuberculosis using protein biomarkers, lung histopathology, deep-learning neural networks, gene expression profiles, and immunohistochemistry provide insight, but there are limitations. One limitation is that the Diversity Outbred mouse population does not model all forms of TB in humans, likely because M. tuberculosis is a human-adapted bacillus and because humans and mice have different-sized lungs, leading to clinical disease at different-sized lesions. For example, the lungs of an asymptomatic human could readily tolerate a 1-cm³ granuloma, but that same-sized lesion would cause mortality in a mouse. Although we have nearly 900 mice available for survival and body weights, we have gaps in the data sets, and subsequent analyses used smaller subsets, often 100-200 samples. In part, these data gaps reflect smaller volumes, for example, in biomarker panel studies. The gene expression profiles had 117 samples available, and second sets of gene expression profiles remain in processing and will become available for future studies. Lastly, this work did not explore joint analysis of different modalities because using either gene expression profiles or histopathology slides alone enabled accurate classification between acute pulmonary TB in progressors, chronic pulmonary TB in controllers, and the lungs from asymptomatically infected mice. A future direction is to capture the intermodal relationships to identify more complex resistance signatures and boost diagnostic accuracy.
Overall, our results show two main findings that have important implications. First, there are two distinct forms of end-stage pulmonary TB in Diversity Outbred mice, which can inform the pathogenesis of pulmonary TB in humans and support research to discover host-directed therapies against these two forms of TB. Second, by applying novel computational approaches, image analysis, and lung transcriptional profiles, we found granuloma regions of perivascular and peribronchiolar lymphocytic cuffs specific to asymptomatic lung infection and lung functional responses, which show that B cells may be important to establish asymptomatic M. tuberculosis lung infection in genetically heterogeneous populations. Future studies using the Diversity Outbred mouse population can define the genetic control upstream of B-cell responses to M. tuberculosis to improve our understanding of how genotype-controlled responses restrict M. tuberculosis.

FIG 1 Survival of M. tuberculosis-infected mice. We infected 8- to 10-week-old, female Diversity Outbred (DO) mice and C57BL/6J mice with aerosolized M. tuberculosis bacilli and monitored them as described in Materials and Methods. Non-infected, identically housed, and age- and gender-matched Diversity Outbred mice served as controls. Mice were euthanized at a predetermined timepoint, or if any one of three morbidity criteria developed: body condition score of <2, severe lethargy, or increased respiratory rate/effort. (A) The percent alive over time (cumulative survival). The red vertical line marks 60 days post-infection, by which time approximately 30% of Diversity Outbred mice had succumbed to pulmonary TB due to early morbidity (progressors). Controllers survived at least 60 days without morbidity and succumbed later. Survival of non-infected Diversity Outbred mice (brown dashed line), infected Diversity Outbred mice (brown solid line), and infected inbred C57BL/6J mice (solid gray line) was significantly different by Mantel-Cox log-rank test. ****P < 0.0001. (B) A subset of 556 mice from panel A that were euthanized because of pulmonary TB-related morbidity (526 mice) or were non-infected controls euthanized at the end of the experiment (30 mice). Groups are shown on the X-axis. Box-and-whiskers plots in panel B show the interquartile range with whiskers at the minimum and maximum. Statistical analysis was performed using Brown-Forsythe and Welch's one-way analysis of variance followed by Dunnett's T3 post-test. ****P < 0.0001.
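The cumulative survival curve in panel A can be illustrated with a Kaplan-Meier product-limit estimate. The sketch below is a minimal example under stated assumptions, not the analysis code used for this figure: the function name `kaplan_meier`, the toy cohort, and the choice to treat mice euthanized at a predetermined timepoint as censored observations are all illustrative assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(day, fraction_alive)] at each day on which a death occurred.

    times  -- day post-infection at which each mouse left the study
    events -- True if the mouse met a morbidity criterion (counted as a death),
              False if it was euthanized at a predetermined timepoint
              (assumed here to be a censored observation)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # deaths plus censored mice per day
    at_risk = len(times)
    survival = 1.0
    curve = []
    for day in sorted(exits):
        d = deaths.get(day, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this day.
            survival *= 1 - d / at_risk
            curve.append((day, survival))
        at_risk -= exits[day]
    return curve


# Hypothetical toy cohort for illustration only; not data from this study.
toy_times = [25, 30, 30, 60, 120, 200, 200, 365]
toy_events = [True, True, True, False, True, True, False, True]

for day, fraction in kaplan_meier(toy_times, toy_events):
    print(f"day {day}: {100 * fraction:.1f}% alive")
```

Censored mice leave the at-risk pool without lowering the curve, which is why scheduled euthanasia does not appear as a drop in cumulative survival under this assumption.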

FIG 2 Preinfection body weights and weight-related indicators of pulmonary TB in M. tuberculosis-infected mice. We infected 8- to 10-week-old, female Diversity Outbred mice and C57BL/6J mice with aerosolized M. tuberculosis bacilli and monitored them as described in Materials and Methods. Non-infected, identically housed, and age- and gender-matched Diversity Outbred mice served as controls. Mice were euthanized at a predetermined timepoint, or if any one of three morbidity criteria developed: body condition score of <2, severe lethargy, or increased respiratory rate/effort. (A) Preinfection body weights, (B) peak body weight achieved during infection, (C) duration of weight gain, (D) duration of weight loss, (E) percentage of peak body weight lost, and (F) rate of weight loss are shown. Box-and-whiskers plots in all panels show the interquartile range with whiskers at the minimum and maximum. Group names and sample sizes are shown on the X-axis. Statistical analyses were performed using Brown-Forsythe and Welch's one-way analysis of variance followed by Dunnett's T3 post-test. Non-significant P values are not shown. ****P < 0.0001.

FIG 3 (Continued) three morbidity criteria developed: body condition score of <2, severe lethargy, or increased respiratory rate/effort. We quantified M. tuberculosis colony-forming units in the lungs (A) and measured eight lung proteins using sandwich ELISAs (B-I). Box-and-whiskers plots in all panels show the interquartile range with whiskers at the minimum and maximum. Sample sizes are shown on the X-axis. Statistical analyses were performed using Kruskal-Wallis one-way analysis of variance (ANOVA) with Dunn's multiple comparisons post-tests (A) or Brown-Forsythe and Welch's one-way ANOVA followed by Dunnett's T3 post-test (B-F). Non-significant P values are not shown. **P < 0.01, ***P < 0.001, ****P < 0.0001.

FIG 4 Four modalities, (i) protein biomarkers, (ii) H&E-stained lung tissue sections, (iii) gene expression profiles, and (iv) immunohistochemistry staining for B-cell quantification, are used to characterize asymptomatic M. tuberculosis lung infection in Diversity Outbred mice. We used statistical/machine learning approaches to quantify the feature importance of protein biomarkers, an interpretable deep-learning model to identify regions of H&E-stained slides, AUC analysis to filter genes for subsequent Enrichr pathway analysis, and entropy-based cell quantification to count the CD20+ cells in segmented perivascular and peribronchiolar regions of immunohistochemistry (IHC)-stained images. *During the gene expression analysis, we used two separate data sets. The images corresponding to modalities ii and iv are included as examples to illustrate the analysis process and are presented with further details (see Fig. 7A and 9C, respectively). AS, asymptomatic; CR, controller; PR, progressor.

FIG 5 Feature importance analysis for classification between progressors and controllers using cytokine, chemokine, and anti-M. tuberculosis antibody measurements. (A) Receiver operating characteristic (ROC) curve comparison of the 10-biomarker panel (blue) and 12-biomarker panel (orange) for the progressor vs controller comparison. (B) Percent importance of the different biomarkers in the two panels. Logistic regression is the classifier, and importance scores are averaged over 30 folds. Biomarkers shown in unhatched colors are associated with longer survival, and those in hatched colors with shorter survival. Biomarkers with less than 1% importance are omitted. The feature scores for other comparisons are shown in Fig. S1 and S2.
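One way to turn per-fold logistic regression coefficients into the percent importance scores shown in panel B is sketched below. The caption does not state how importance was computed, so the normalization used here (absolute coefficient as a share of the fold total, averaged over folds) is an assumption. The function name `percent_importance`, the placeholder biomarker names, and the toy coefficients are hypothetical.

```python
def percent_importance(fold_coefs, names, min_percent=1.0):
    """Average per-fold percent importance and the signed coefficient.

    fold_coefs -- one list of logistic regression coefficients per
                  cross-validation fold (same biomarker order as `names`)
    Returns rows of (name, mean_percent, mean_coefficient), sorted by
    importance, omitting biomarkers below `min_percent`.
    """
    n_folds = len(fold_coefs)
    percent_sum = [0.0] * len(names)
    coef_sum = [0.0] * len(names)
    for coefs in fold_coefs:
        scale = sum(abs(c) for c in coefs)   # assumed non-zero in every fold
        for i, c in enumerate(coefs):
            percent_sum[i] += 100 * abs(c) / scale
            coef_sum[i] += c
    rows = []
    for i, name in enumerate(names):
        mean_percent = percent_sum[i] / n_folds
        if mean_percent >= min_percent:
            rows.append((name, mean_percent, coef_sum[i] / n_folds))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical placeholders: two folds shown instead of 30 for brevity.
toy_names = ["marker_a", "marker_b", "marker_c"]
toy_folds = [[2.0, -1.0, 0.01], [1.8, -1.2, 0.0]]

for name, pct, coef in percent_importance(toy_folds, toy_names):
    direction = "positive" if coef > 0 else "negative"
    print(f"{name}: {pct:.1f}% ({direction} mean coefficient)")
```

In this sketch, the sign of the mean coefficient plays the role of the hatched vs unhatched distinction in the figure. Which sign corresponds to longer survival depends on how the classes are encoded.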

FIG 6 Representative histopathological lesions in the lungs of M. tuberculosis-infected Diversity Outbred mice. We infected 8- to 10-week-old, female Diversity Outbred mice with M. tuberculosis bacilli by inhalation. Lung lobes were fixed, stained, and sectioned for microscopic examination. (A-D) Representative necrosuppurative lung lesions with bronchiolar obstruction in progressors. (E-H) Non-necrotizing lymphohistiocytic lung lesions in asymptomatic mice. (I-L) Diffuse, non-necrotizing lesions with abundant macrophages, foamy macrophages, scattered lymphocytic foci, and a few cholesterol clefts in controllers.

FIG 7 Perivascular and peribronchiolar lymphocytic cuffs are imaging biomarkers of asymptomatic lung infection and resistance to M. tuberculosis. For each panel pair, the H&E-stained lung tissue section is displayed on the left, and the corresponding attention weights for the same lung region are displayed on the right. (A and B) Representative examples where the deep-learning neural network found an imaging biomarker of asymptomatic lung infection in Diversity Outbred mice. A board-certified veterinary pathologist determined that the regions receiving the highest attention weights (white) corresponded to perivascular and peribronchiolar lymphoplasmacytic cuffs, outlined in yellow. (C and D) In contrast, representative examples within the granulomas where regions received very low attention weights (black). A board-certified veterinary pathologist determined that the regions receiving little attention were the macrophage-rich regions, encircled in red. A scale bar corresponding to 50 microns is displayed in the top left corner of each panel.

FIG 8 Cluster heat map of the 105 genes selected for the asymptomatic vs rest classification. Columns correspond to the mice (n = 44) used in the asymptomatic vs rest classification. Rows correspond to the 105 identified genes. Gene expression values are first log2 transformed. Color indicates the z-score, which is calculated by subtracting the row-wise average from the log2-transformed expression value and then dividing by the row-wise standard deviation.
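The row-wise z-score described in this caption can be sketched as follows. This is a minimal illustration rather than the plotting code used for the figure. The function name `row_zscores` and the toy expression values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption because the caption does not specify it.

```python
import math
from statistics import mean, stdev


def row_zscores(expression):
    """Row-wise z-scores of log2-transformed expression values.

    expression -- dict mapping gene name to a list of raw expression
                  values, one per mouse (all values must be positive)
    """
    zscores = {}
    for gene, values in expression.items():
        logged = [math.log2(v) for v in values]   # log2 transform first
        row_mean = mean(logged)
        row_sd = stdev(logged)                    # sample SD: an assumption
        zscores[gene] = [(x - row_mean) / row_sd for x in logged]
    return zscores


# Gene names appear in the paper; the expression values are made up.
toy_expression = {
    "Pax5": [64.0, 128.0, 256.0],
    "Cd19": [32.0, 32.0, 128.0],
}

for gene, row in row_zscores(toy_expression).items():
    print(gene, [round(z, 2) for z in row])
```

Because each row is centered and scaled separately, colors in the heat map compare a gene's expression across mice, not the expression of one gene against another.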

FIG 9 Distribution of CD20+ cells in the perivascular and peribronchiolar regions. To quantify the B cells within perivascular and peribronchiolar lymphocytic cuffs, we IHC-stained the lung tissue sections of n = 73 Diversity Outbred mice. The perivascular and peribronchiolar regions within the IHC images were segmented using Aiforia Create. In each segmented region, the CD20+ cells were detected by entropy-based cell quantification combined with a deep learning-based nuclei detector, CellViT-SAM-H-x40. (A-D) Representative perivascular and peribronchiolar regions of mice with CD20+ cell density close to their class medians. Each panel corresponds to a tissue section from a different susceptibility class: (A) non-infected, (B) progressor, (C) asymptomatic controller, and (D) controller. Magnification is ×80, and the high-magnification inserts are magnified ×400. The images were not altered in any way (i.e., not zoomed in or out) after extraction. (E and F) Total number of CD20+ cells (E) and density of CD20+ cells (cells per mm²) (F) within the perivascular and peribronchiolar regions. Box-and-whiskers plots in all panels show the interquartile range with whiskers at the minimum and maximum. Each dot is one Diversity Outbred mouse; n = 27 asymptomatic, n = 9 progressor, n = 18 controller, and n = 19 non-infected mice are shown. Statistical analyses were performed using Kruskal-Wallis one-way ANOVA with Dunn's multiple comparisons post-tests (E and F). Non-significant P values are not shown. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
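The density measure in panel F (cells per mm²) and the class medians used to choose representative images in panels A-D can be sketched as below. This is an illustrative example only: the function names, the assumption that segmented region areas are reported in µm², and the toy counts and areas are hypothetical and are not taken from the study's data or from the Aiforia or CellViT outputs.

```python
from statistics import median

UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 1e6 square micrometers


def cd20_density(cell_count, region_area_um2):
    """CD20+ cells per mm^2 for one mouse (area assumed to be in um^2)."""
    return cell_count * UM2_PER_MM2 / region_area_um2


def class_medians(records):
    """Median density per class.

    records -- iterable of (class_label, cell_count, region_area_um2),
               one entry per mouse
    """
    by_class = {}
    for label, count, area in records:
        by_class.setdefault(label, []).append(cd20_density(count, area))
    return {label: median(values) for label, values in by_class.items()}


# Hypothetical toy records for illustration; not measurements from this study.
toy_records = [
    ("asymptomatic", 900, 500_000),
    ("asymptomatic", 1200, 400_000),
    ("asymptomatic", 600, 600_000),
    ("progressor", 100, 500_000),
    ("progressor", 50, 250_000),
]

for label, value in class_medians(toy_records).items():
    print(f"{label}: median {value:.0f} CD20+ cells per mm^2")
```

Normalizing by region area separates a true enrichment of B cells from the effect of a larger segmented region, which is why panels E and F report both the total count and the density.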

TABLE 1
Susceptibility to M. tuberculosis and forms of TB in humans and Diversity Outbred mice a
a The upper and lower halves of the table present the characteristics of TB in humans and Diversity Outbred mice, respectively. Columns correspond to different forms of TB, and rows correspond to different properties. We present pairs of TB forms that share similar characteristics in the same column, i.e., fulminant and progressors (acute end stage), post-primary and controllers (chronic end stage), and latent infection and asymptomatic. Such pairwise comparisons can have limitations (see Discussion) or may be unavailable, as two forms, primary and resisters, do not have a corresponding class in the Diversity Outbred population. BAL, bronchoalveolar lavage; NA, not available; ND, not routinely done; neg., negative; pos., positive.

TABLE 2
Thirty-fold cross-validation performance of the two panels (10 and 10 + 2 antibodies) in three classification tasks (progressor vs controller, progressor vs asymptomatic, and controller vs asymptomatic) a

TABLE 3
Results of attention-based multiple instance learning model classification of asymptomatic mice and controllers

TABLE 4
Microarray Enrichr analysis resulting from the significantly different, highly expressed genes in the lungs of controllers with end-stage chronic pulmonary TB vs progressors with end-stage acute pulmonary TB a

TABLE 5
Microarray Enrichr analysis resulting from the genes selected for the asymptomatic lung classification vs other groups (progressor, controller, and non-infected) a

DISCUSSION
Although TB remains a global health problem with high morbidity and mortality, most humans develop asymptomatic infection of various forms (e.g., primary TB, latent TB infection, and resisters). Only a minority of infected humans develop any of the symptomatic disease forms (e.g., progressive primary TB, miliary TB, fulminant TB, or active pulmonary TB) (1, 6, 48-51). Because infection is often silent and lung tissues from naturally resistant humans are not readily available, mechanisms of lung resistance are difficult to model and identify. A growing body of evidence supports the use of the Diversity Outbred mouse population to model disease states of pulmonary TB in biomarker discovery, gene expression signatures, and pathogenesis (14-19, 22, 32, 52). However, fewer studies focus on resistance to M. tuberculosis, which occurs in asymptomatic infection. The novelty of this work is finding unique features of lung resistance to M. tuberculosis by using the Diversity Outbred mouse population and multimodal computational approaches.