Whole blood transcriptional profiling offers great diagnostic and prognostic potential. Although studies identified signatures for pulmonary tuberculosis (TB) and transcripts that predict the risk for developing active TB in humans, the early transcriptional changes immediately following Mycobacterium tuberculosis infection have not been evaluated. We evaluated the gene expression changes in the cynomolgus macaque model of TB, which recapitulates all clinical aspects of human M. tuberculosis infection, using a human microarray and analytics platform. We performed genome-wide blood transcriptional analysis on 38 macaques at 11 postinfection time points during the first 6 mo of M. tuberculosis infection. Of 6371 differentially expressed transcripts between preinfection and postinfection, the greatest change in transcriptional activity occurred 20–56 d postinfection, during which fluctuation of innate and adaptive immune response–related transcripts was observed. Modest transcriptional differences between active TB and latent infection were observed over the time course with substantial overlap. The pattern of module activity previously published for human active TB was similar in macaques with active disease. Blood transcript activity was highly correlated with lung inflammation (lung [18F]fluorodeoxyglucose [FDG] avidity) measured by positron emission tomography and computed tomography at early time points postinfection. The differential signatures between animals with high and low lung FDG were stronger than between clinical outcomes. Analysis of preinfection signatures of macaques revealed that IFN signatures could influence eventual clinical outcomes and lung FDG avidity, even before infection. Our data support that transcriptional changes in the macaque model are translatable to human M. tuberculosis infection and offer important insights into early events of M. tuberculosis infection.

Despite major global health efforts, 9.6 million new cases of tuberculosis (TB) occurred in 2014 (1), with increasing rates of multidrug-resistant TB annually. Those infected with Mycobacterium tuberculosis can develop symptomatic (active) TB disease (5% of people) or asymptomatic infection, called latent infection (LTBI). There is a growing body of evidence (2–4) recognizing that M. tuberculosis infection outcome is not binary but rather is a spectrum of clinical states with varying severity of active TB, as well as LTBI that ranges from low-grade or subclinical infection to bacterial clearance. Those with LTBI serve as a reservoir for reactivation TB, which can occur years to decades after initial infection. The spectrum of latent infection likely contributes to the risk for reactivation for an individual (2, 5). We (6) recently showed that high lung [18F]fluorodeoxyglucose (FDG) avidity is associated with risk for reactivation in nonhuman primates (NHPs) with latent infection. There are a variety of factors that could influence the outcome of infection, including M. tuberculosis strain type, exposure dose and duration, host genetics, and immune response. In human M. tuberculosis infection, very little is known about the early events postinfection (p.i.), because current diagnostics are not sensitive enough to determine time of infection, and the early course of infection is relatively clinically silent. However, the early innate and adaptive responses are likely to be crucial in the ultimate outcome of infection (4, 7). In humans and in animal models, various types of immune compromise exacerbate primary infection (reviewed in Refs. 8–11). A better understanding of early host responses to the infection is necessary to improve vaccine development and determine those who are at greatest risk for disease.

Blood transcriptional profiling provides a global perspective on the dynamics of complex molecular and cellular events in specific human diseases and has improved the diagnosis and biological understanding of diseases (12). Over the past decade, several groups published PBMC or whole blood transcriptional signatures associated with human active TB or LTBI. Whole-genome blood transcriptional profiles from a large-scale study of subjects recruited from high and moderate TB-endemic countries showed gene expression differences that could discriminate individuals with active TB from those with LTBI and healthy controls (reviewed in Ref. 12). Transcriptional profiles could also distinguish the response to anti-TB drug treatment as early as 2 mo into treatment (13). Finally, a recent study reported transcriptional signatures that correlated with risk for active TB in adolescents who had latent infection at baseline (14), suggesting that a pre–M. tuberculosis infection signature might play an important role in determining the outcome of infection. These studies underscored the potential of transcriptional profiling as a diagnostic and predictive tool in TB. Such studies provided key insights into immune factors that distinguish host immune responses between diseases with similar clinical manifestations.

In this study, we used an established NHP model to interrogate early transcriptional changes after M. tuberculosis infection. Cynomolgus macaques infected with low-dose M. tuberculosis develop the full clinical spectrum of pathology and outcomes seen in humans, including latent infection (4, 15). As reported by our group previously (4, 15, 16), active TB disease is defined by clinical signs of active TB (e.g., cough, weight loss, anorexia), the presence of M. tuberculosis growth in the bronchoalveolar lavage or gastric aspirate fluid, and/or elevated systemic inflammatory markers (by erythrocyte sedimentation rate). Animals with LTBI are asymptomatic, without M. tuberculosis growth in bronchoalveolar lavage or gastric aspirate and no evidence of systemic inflammation. Latent infection is not declared until 6 mo p.i., whereas active disease can be declared before 6 mo. In this infection model, ∼40–50% of infected animals progress to active disease, with the remainder maintaining latent infection. We showed that the outcome of infection (active TB or latent infection) can be distinguished as early as 6 wk p.i. based on serial positron emission tomography (PET)–computed tomography (CT) imaging (16) and, to a lesser extent, by immunological assays (4). Given the close genetic similarities between humans and NHPs, this model provides an exceptional platform for in-depth investigations of host–pathogen interactions. Unlike humans, the use of this model allows us to control the inoculation dose, timing, and strain while conducting serial, high-frequency sampling to examine p.i. responses.

In this model, we evaluated serial whole-genome blood transcriptional signatures in NHPs from preinfection to 6 mo post–M. tuberculosis infection using a human microarray platform. We used linear mixed model analysis, k-means clustering, and a pre-existing human modular gene expression framework to interrogate the data. The greatest dynamic change occurs between days 20 and 56 p.i., regardless of the clinically defined outcome. We extended our evaluation to identify signatures that were associated with outcome using clinical definitions of active TB and LTBI, as well as total lung inflammation, as measured by the FDG avidity in PET-CT, a surrogate marker for disease severity (17). Finally, we show that the signature associated with active TB in NHPs is comparable to that observed in human TB, suggesting that the NHP signature is translational to humans.

Adult (>4 y of age) cynomolgus macaques (Macaca fascicularis) (Valley Biosystems, Sacramento, CA) were housed within a Biosafety Level 3 primate facility, as previously described (4, 18, 19). Monkeys were infected with low-dose M. tuberculosis (Erdman strain) via bronchoscopic instillation of ∼25 CFU per monkey to the lower lung lobe. Infection was confirmed by tuberculin skin test conversion and/or lymphocyte proliferation assay 6 wk p.i. (19). Serial clinical, microbiologic, and immunologic examinations were performed, as previously described. Based on defined clinical criteria and radiographic and microbiologic assessments during the course of infection, monkeys were classified as having latent infection or active disease as late as 6 mo p.i., as described previously (4, 15, 18, 19). A total of 38 animals were included in this study over a period of 2 y, from 2011 to 2013. Because of logistical constraints, the study was divided into two sets composed of 19 animals each. Sixteen animals were declared to have active disease, whereas 22 remained with latent infection. After infection outcome was declared, these animals were dedicated to other on-going studies.

All experimental manipulations, protocols, and care of the animals were approved by the University of Pittsburgh School of Medicine Institutional Animal Care and Use Committee (IACUC). The protocol assurance number for our IACUC is A3187-01. Our specific protocol approval numbers for this project are 11090030 and 1105870. The IACUC adheres to national guidelines established in the Animal Welfare Act (7 U.S.C. Sections 2131–2159) and the Guide for the Care and Use of Laboratory Animals (8th edition), as mandated by the U.S. Public Health Service Policy.

Longitudinal blood sampling included two pre–M. tuberculosis infection time points, as well as days 3, 7, 10, 20, 30, 42 (6 wk), 56 (8 wk), 90 (12 wk), 120 (16 wk), 150 (20 wk), and 180 (24 wk) post–M. tuberculosis infection. For animals that developed active disease prior to 180 d p.i. (Supplemental Fig. 1), the last time point was the time at which the active disease was declared.

One milliliter of macaque whole blood was drawn into heparin blood collection tubes (BD Biosciences) and mixed well. Then, 500 μl of this blood was immediately aliquoted into a cryovial containing 1.5 ml of Tempus reagent. All samples were thoroughly vortexed for 15 s to facilitate RNA stabilization. Samples were stored at −80°C until RNA extraction.

Cellular composition of the blood was monitored by complete blood counts, including differential counts (analyzed by the clinical laboratory at the University of Pittsburgh Medical Center), and by flow cytometry at the same time points. PBMCs were isolated via Percoll gradient centrifugation as previously described (20). PBMCs were stained for cell surface markers along with the respective isotypes for flow cytometry. T cell markers included CD3 allophycocyanin-Cy7 (clone SP34-2; BD Horizon), CD4 FITC (clone L200; BD Pharmingen), and CD8 Pacific Blue (clone DK25; Dako). For B cells, CD20 PE Cy7 (clone 2H7; eBioscience) and for NK cells CD16-PE (clone 3G8; BD Pharmingen) were included. Data acquisition was performed using an LSR II (BD), and data were analyzed using FlowJo Software v.9.7 (TreeStar, Ashland, OR).

At the time that infection outcome was declared, animals underwent a PET-CT scan. Animals were sedated, intubated, and imaged by FDG PET imaging (microPET Focus 220 preclinical PET scanner; Siemens Molecular Solutions) and by a CT scanner (NeuroLogica) within our Biosafety Level 3 facility, as previously described (16, 18). The total lung FDG avidity was analyzed using OsiriX viewer, an open-source PACS workstation and DICOM viewer. The whole lung was segmented on CT using the Growing Region algorithm on the OsiriX viewer to create a region of interest (ROI) of normal lung (≤200 Hounsfield units). The closing tool was used to include individual nodules and other pulmonary disease. The ROI was transferred to the coregistered PET scan and manually edited to ensure that all pulmonary disease was included. All extrapulmonary structures and disease, including mediastinal lymph nodes, were excluded. Voxels outside the ROI were set to zero, and voxels with a standardized uptake value (SUV) greater than or equal to normal lung (SUV ≥ 2.3) were isolated. Finally, the Export ROIs plug-in was used to export the data from these isolated ROIs to a spreadsheet, where the total SUV per voxel was summed to represent the total lung FDG avidity. PET-CT imaging data (lung FDG avidity) were only available for 37 animals.

Total RNA was isolated from whole blood lysate using the MagMAX-96 Blood RNA Isolation Kit (Applied Biosystems), according to the manufacturer’s instructions. All postextraction RNA yields were measured using a NanoDrop 8000 (NanoDrop Technologies). RNA integrity number (RIN) values were assessed on a 2100 Bioanalyzer (Agilent). RIN and yield data were managed using a laboratory information management system for quality control and sample tracking. Samples with RIN values > 5.5 were retained for further processing. Globin mRNA was depleted from a portion of each total RNA sample using a GLOBINclear Kit, Human (Ambion, Austin, TX). Following globin reduction, RIN was assessed a second time (RIN cutoff = 5.5, average RIN value = 7.5, SD = 0.83). Globin-reduced RNA was amplified and labeled using an Illumina TotalPrep RNA Amplification Kit (Ambion). The RNA input for this reaction was 250 ng, and 750 ng of amplified labeled RNA was hybridized overnight to HT-12 v4 BeadChips (Illumina). Following hybridization, each chip was washed, blocked, stained, and scanned on an Illumina BeadStation 500, following the manufacturer’s protocols. Microarray data have been submitted to the Gene Expression Omnibus under accession number GSE84152 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84152).

Normalization and batch correction.

Illumina’s GenomeStudio software, version HT12, was used to generate signal-intensity values from the scans. After background subtraction, the average normalization recommended by the GenomeStudio software was used to rescale the difference in overall intensity to the median average intensity for all samples across multiple arrays and chips. All data <10 were set to 10 and then the dataset was log2 transformed. In total, 489 microarray samples passed stringent quality-control standards following hybridization and microarray scanning. Principal variance component analysis (PVCA) was performed using JMP Genomics 6.0 (SAS Institute) analysis software to identify major sources of variability within a combined dataset. PVCA revealed dataset membership (training or test) and hybridization chamber (chamber in which batches of arrays were hybridized) as major sources of technical variation. To correct for this technical variation and increase sensitivity to sources of biologic variability, batch correction (Supplemental Fig. 1B) was applied using the ComBat R package (21). After batch correction for technical variables, the overall sensitivity to biologic variables was increased, which accounted for 35,082 probes expressed in at least one sample. Next, a filter selecting the 60% most variable transcripts across all time points for all animals was used to further increase the signal/noise ratio (n = 21,049 probes).
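The intensity floor, log2 transformation, and variability filter described above can be illustrated with a minimal sketch. It is not the GenomeStudio or JMP Genomics procedure used in the study. The probe names and intensities are invented, and ranking probes by variance is an assumption, because the text does not state which variability measure was applied.

```python
import math
import statistics


def floor_and_log2(values, floor=10.0):
    """Set intensities below `floor` to `floor`, then log2-transform."""
    return [math.log2(max(v, floor)) for v in values]


def most_variable_probes(expr, fraction=0.6):
    """Return the IDs of the top `fraction` of probes ranked by variance.

    `expr` maps a probe ID to its log2 values across samples.
    Variance as the ranking measure is an assumption of this sketch.
    """
    ranked = sorted(expr, key=lambda p: statistics.pvariance(expr[p]), reverse=True)
    keep = round(fraction * len(ranked))
    return ranked[:keep]


# Hypothetical raw intensities for five probes across three samples.
raw = {
    "probe_a": [4.0, 160.0, 640.0],
    "probe_b": [100.0, 100.0, 100.0],
    "probe_c": [20.0, 40.0, 80.0],
    "probe_d": [8.0, 10.0, 12.0],
    "probe_e": [50.0, 55.0, 60.0],
}
log_expr = {probe: floor_and_log2(vals) for probe, vals in raw.items()}
selected = most_variable_probes(log_expr, fraction=0.6)
```

With the probe counts reported above, a 60% filter on 35,082 probes gives 21,049.2, which rounds to the 21,049 probes retained.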

Unsupervised analysis.

PVCA was used as an unsupervised analysis tool to explore the contribution of biologic variables and outcome groups to the variance attributed to the first three principal components. In addition to PVCA, unsupervised hierarchical clustering and classic principal component analysis were conducted at each time point to assess sample clustering and association with biologic variables (data not shown).

Linear mixed-model analysis of longitudinal data.

For each macaque, two preinfection baseline samples were used and combined to create a shared baseline with which all postchallenge time points were compared. Statistical analyses of log2-transformed microarray data were performed using SAS 9.3 Software and JMP Genomics 6.0 (SAS Institute). A linear mixed-model analysis (LMMA), accounting for repeated measures (repeated within each macaque) and unequal time point spacing, was applied to the top 60% of variable transcripts to test for significant differences in gene expression between each postchallenge time point and the shared baseline. Benjamini–Hochberg multiple testing correction was applied (false-discovery rate [FDR] = 0.05) for all hypothesis testing, resulting in the selection of 6371 differentially expressed probes. These probes were hierarchically clustered using the Ward clustering method. The cubic clustering criterion was used to identify k = 20 for k-means clustering. Ingenuity Pathway Analysis (Ingenuity Systems) and MetaCore Pathway Analysis (Thomson Reuters) network analysis tools were used to annotate the k = 20 clusters. Furthermore, the Novartis Gene Atlas was queried for modules containing genes with expression limited to particular cell types. Additionally, we used cross-matching of probes within the k = 20 clusters with the predefined human disease modules (22) to further inform the annotation of clusters. Annotated modules contained multiple genes with similar function, cellular localization, or membership in known biologic pathways.
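For readers unfamiliar with the Benjamini–Hochberg step-up procedure named above, the following is a minimal sketch of the standard algorithm. It is not the SAS/JMP Genomics implementation used in the study, and the p values are invented for illustration.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return one boolean per p value: True where the null is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p values for eight probes at one time point.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(pvals, fdr=0.05)

# Step-up behavior: only the largest p value meets its own threshold,
# yet every hypothesis ranked below it is rejected as well.
step_up = benjamini_hochberg([0.02, 0.03, 0.04, 0.045], fdr=0.05)
```

In the first example only the two smallest p values are declared significant. The second example shows why the procedure searches for the largest qualifying rank instead of stopping at the first failure.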

Percentage cluster activity and cluster activity scores.

To assess the longitudinal activity of each of the k-means (k = 20) clusters relative to prechallenge baseline, a percentage activity score was determined for each cluster at all time points postchallenge. Briefly, this activity score represents the percentage of all genes within each cluster that are differentially expressed at each time point using LMMA. The sign associated with the mean estimate of the LMMA for the significant genes within each cluster, at each time point, was used to assign increased or decreased activity for each cluster.
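The percentage activity score can be sketched as follows, under stated assumptions. The function name and inputs are hypothetical. Encoding the direction as the sign of the returned percentage is a convenience of this sketch; the text says only that the sign of the mean LMMA estimate was used to assign increased or decreased activity.

```python
def percent_cluster_activity(estimates, significant):
    """Signed percentage of a cluster's probes that are differentially expressed.

    estimates: per-probe LMMA estimates (time point versus baseline).
    significant: per-probe booleans after multiple testing correction.
    """
    sig_estimates = [e for e, s in zip(estimates, significant) if s]
    if not sig_estimates:
        return 0.0
    percent = 100.0 * len(sig_estimates) / len(estimates)
    # Direction comes from the mean estimate of the significant probes only.
    mean_estimate = sum(sig_estimates) / len(sig_estimates)
    return percent if mean_estimate >= 0 else -percent


# Hypothetical cluster of four probes, three of them significant and mostly up.
up = percent_cluster_activity([0.8, 1.1, -0.2, 0.05], [True, True, True, False])

# Hypothetical cluster of five probes, two of them significant and down.
down = percent_cluster_activity([-0.5, -0.7, 0.1, 0.0, 0.2],
                                [True, True, False, False, False])
```

Here `up` is 75.0 (three of four probes, positive mean estimate) and `down` is -40.0 (two of five probes, negative mean estimate).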

A cluster activity score was calculated for each animal at each time point p.i. and was used to correlate complete blood count, flow cytometry, and outcome. This score was calculated for each animal by summing the difference in probe-level p.i. expression values and the average preinfection expression value for all probes within a cluster. This sum of differences was divided by the number of probes within each cluster plus the number of samples within each time group.

Molecular distance to health analysis.

Molecular distance to health (MDTH) was performed as previously described using a Microsoft Excel 2010 visual basic add-in (23). Briefly, in this approach, a score is computed to represent the molecular distance of a given sample relative to a baseline (i.e., preinfection time points). This is performed by determining whether the expression of a given sample lies inside or outside two SDs from the mean of the baseline. The MDTH (Fig. 1C) was calculated only for the 6371 probes identified using LMMA.
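A minimal sketch of the idea behind MDTH follows. It is not the Excel add-in cited above. The text specifies only the test for whether a value falls outside two SDs of the baseline mean; summing the absolute standardized deviations of the qualifying probes and dividing by the number of probes is an assumption made here for illustration, and the gene names and values are invented.

```python
import statistics


def mdth(sample, baseline, n_sd=2.0):
    """Distance of one sample from baseline.

    sample: probe ID -> expression value in the sample.
    baseline: probe ID -> list of baseline expression values.
    """
    total = 0.0
    for probe, value in sample.items():
        mean = statistics.mean(baseline[probe])
        sd = statistics.stdev(baseline[probe])
        if sd == 0:
            continue  # no baseline spread; skipped in this sketch
        z = (value - mean) / sd
        # Only values outside n_sd SDs of the baseline mean contribute.
        if abs(z) > n_sd:
            total += abs(z)
    return total / len(sample)


# Hypothetical baseline values; each probe has mean m and SD 1.0.
baseline = {
    "g1": [4.0, 5.0, 6.0],
    "g2": [9.0, 10.0, 11.0],
    "g3": [1.0, 2.0, 3.0],
}
# g1 sits 3 SDs above baseline, g2 is within range, g3 sits 3 SDs below.
sample = {"g1": 8.0, "g2": 10.5, "g3": -1.0}
score = mdth(sample, baseline)
```

Probes within two SDs of baseline contribute nothing, so the score grows only with the number and size of genuine departures from the preinfection state.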

FIGURE 1.

Supervised LMMA reveals three distinct transcriptional phases following M. tuberculosis infection. (A) Two-way hierarchical clustering of LMMA-selected transcripts (columns) (FDR = 0.05), followed by k-means clustering (kn = 20), was used to compare each p.i. time point (rows) with the preinfection baseline. Those clusters (12/20) with significant ontology and network associations, as determined by Ingenuity Pathway Analysis, are annotated. (B) The total number of differentially expressed genes determined by LMMA at each p.i. time point. (C) The amplitude of p.i. transcript perturbations, as measured by MDTH. (D) Cluster activity at each time point p.i. is represented as the percentage of all genes within each cluster that are differentially expressed using LMMA. (E) Transcripts from NHPs were also evaluated using a pre-existing human expression modular framework. Module activity scores represent the percentage of transcripts that are overexpressed (red) or underexpressed (blue) in 62 predefined human coexpression modules. Module activity was hierarchically clustered to group modules with similar activity patterns over time. Those clusters with >15% representation are shown. Red represents overexpression relative to the preinfection time point, whereas blue represents underexpression. Color intensity represents the percentage of each module that is differentially expressed from preinfection baseline. k-means clusters from (D) were mapped to those framework modules that share >15% similar transcript constituents.

Module maps.

A previously defined and annotated framework of 260 gene-expression modules with coordinate expression across one to nine human disease datasets was used to assess p.i. changes (22). Gene expression levels of the top 60% variable transcripts were compared between a shared preinfection baseline across all animals and healthy controls on a module-by-module basis. The percentage of transcripts showing significant differences (unpaired t test, p < 0.05) in expression was used as an indicator of module activity.

Module activity scores.

For correlation analysis with complete blood count, flow cytometry, and outcome, a module activity score was calculated for each animal at each p.i. time point. This score was calculated for each animal by summing the difference in probe-level p.i. expression values and the average preinfection expression value for all probes within a module. This sum of differences was divided by the number of probes within each module plus the number of samples within each time group.

Cross-correlation analyses.

Cluster activity scores (k = 20) and module activity scores were cross-correlated with absolute cell counts, flow cytometry counts, or lung FDG avidity (as continuous variable) using JMP Genomics (SAS Institute). For all cross-correlations, nonparametric Spearman correlation was used. Significant correlations were identified using Benjamini–Hochberg multiple testing correction (FDR = 0.05).
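The Spearman correlation used for these cross-correlations is a Pearson correlation computed on ranks. The sketch below shows that calculation using only the standard library; it is not the JMP Genomics routine, and the activity scores and FDG values are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation; undefined (division by zero) if x or y is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Hypothetical cluster activity scores and total lung FDG avidity for five animals.
activity = [0.2, 1.5, 0.9, 3.1, 2.0]
fdg = [10.0, 300.0, 45.0, 5000.0, 800.0]
rho = spearman_rho(activity, fdg)
```

Because only ranks enter the calculation, the skewed FDG values have no more influence than any other monotonic rescaling, which is one reason a nonparametric correlation suits this kind of data.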

Statistical analysis and data visualization.

MDTH and modular framework analysis calculations were performed using Microsoft Excel 2010 and associated visual basic add-in. For statistical analysis, JMP Genomics 6.0, GraphPad Prism 6, and GeneSpring 12.6.1 (for fold change analysis) were used. Cross-correlation heat maps were created by exporting statistical results from JMP Genomics and visualized using Plotly.

To investigate blood transcriptome dynamics over the course of M. tuberculosis infection, whole blood transcripts were compared between pre–M. tuberculosis infection and 11 serial time points following M. tuberculosis infection among 38 NHPs (Supplemental Fig. 1A). We performed LMMA, an adjusted supervised analysis, to assess transcripts with differential expression between preinfection and each p.i. time point. Overall, 6371 transcripts were differentially expressed at one or more p.i. time points, regardless of infection outcome. The hierarchical clustering of time points based on differentially expressed transcripts (Fig. 1A), number of differentially expressed transcripts at each time point (Fig. 1B), and MDTH (Fig. 1C) in comparison with the preinfection baseline revealed evidence of three phases of transcriptional activity post–M. tuberculosis infection: early (3–10 d p.i.), middle (20–56 d p.i.), and late (90–180 d p.i.) (Fig. 1A–C). Modest transcript perturbations were observed between days 3 and 10 and between days 90 and 180 p.i. (Fig. 1A, 1B). The greatest differential expression from preinfection time points was observed at 20–56 d p.i., and the highest number of differentially expressed transcripts was observed at 20 d p.i. (Fig. 1B). Furthermore, MDTH, a metric conveying the magnitude of deviation from baseline for each list of significant genes by time point, confirmed that the greatest gene expression perturbations occur at 20 d p.i. (Fig. 1C).

To identify functional components of the transcriptional host response, we investigated clusters of differentially expressed transcripts with similar expression patterns across time points using k-means clustering (kn = 20) (Fig. 1A, 1D). Pathway analysis (MetaCore) was used to determine functional annotations that identified ontologies for 12 of the 20 clusters (Supplemental Table I). A cluster activity score for each cluster was determined to assess the dynamic changes over time (Fig. 1A, 1D). Clusters exhibited monophasic and biphasic activity, with peak cluster activity observed between 20 and 56 d p.i. Each cluster reached maximum change from baseline at 20 or 30 d p.i. Compared with baseline, increased cluster activity was observed with IFN response (C1), Inflammation (C2), Innate/Platelet Response (C3), Hematopoiesis (C4), and Complement/Lymphocyte Regulation (C6), whereas Ribosomal Translation/Lymphocyte (C14), lymphocyte-related clusters (C15, C16, and C17), and clusters related to Metabolism/Transport (C18), Cell Cycle (C19), and Cytoskeletal Remodeling (C20) (Fig. 1D) had decreased activity. By 90–120 d p.i., the whole blood signature returned to near-baseline expression levels.

In a complementary approach aimed at gaining increased biologic resolution and greater downstream interpretability, we performed an independent analysis using a pre-existing human modular gene expression framework (22, 24, 25) to assess changes in transcript abundance following M. tuberculosis challenge (Fig. 1E). This analysis facilitates the direct comparison of this macaque M. tuberculosis signature and previously described human signatures (3) using the same array platform and a similar module-based analysis approach. In this analysis, the differential expression of 14,424 gene probes constituting the module framework was assessed comparing preinfection and p.i. time points for each module of the coexpressed transcript (percentage up or down). One-way hierarchical clustering was used to cluster modules with similar modular-activity patterns over time (Fig. 1E). Consistent with the k-means cluster analysis, the greatest number and intensity of module activity (overexpressed or underexpressed) peaked between 20 and 56 d p.i. Similarly, the modules that represent innate immune responses were significantly overexpressed from baseline, and those that represent adaptive immune responses were underexpressed at p.i. time points. In contrast to the cluster analysis, modular framework analysis revealed modest modular activity (over- and underexpression) in the late phase (150–180 d p.i.), especially in modules representing erythropoiesis/hematopoiesis, protein synthesis, and adaptive immune responses.

Published blood transcriptional studies suggested that transcript abundance often reflects changes in the cellular composition of peripheral blood that occur p.i. (3, 10, 12, 26–30). In blood, a significant increase in monocyte numbers was observed at 30 d p.i. and a significant increase in neutrophil numbers was observed at 20 and 30 d p.i. compared with baseline (Supplemental Fig. 2A, 2B); these correlated positively with increases in transcriptional activity of clusters (and modules) related to IFN, inflammation, and innate immune response (monocytes, neutrophils, myeloid lineage) (Supplemental Fig. 2A, 2B, Supplemental Table II). A reduction in lymphocyte numbers correlated positively with reduced transcriptional activity of modules and clusters relating to lymphoid lineage, cytotoxic/NKT cells, T cells, and B cells (Supplemental Fig. 2C, Supplemental Table II). Thus, dynamic changes in the cellular composition of blood contribute to the total transcriptional abundance. In summary, the early blood signature of M. tuberculosis infection in macaques, irrespective of final disease outcome, identifies three distinct phases of transcriptional activity that are composed of dynamic immune-related pathways driven by changes in gene expression and the underlying cellular composition of the blood.

To determine whether there was a differential signature that distinguished active TB disease and LTBI outcomes, we analyzed whole-blood transcriptional signatures of animals at the time of clinical diagnosis. Animals that developed active TB (n = 16) were diagnosed between 90 and 180 d p.i., whereas LTBI (n = 22) was declared at 180 d p.i. (4, 15, 19). Significant fold-change differences (>1.5) between active TB and LTBI were observed in 109 transcripts, with 84 transcripts upregulated in active TB (Mann–Whitney unpaired t test with Benjamini–Hochberg multiple testing correction) (Fig. 2A, 2B, Supplemental Table III). Canonical pathway analysis (Ingenuity Pathway Analysis, MetaCore) identified that IFN signaling and dendritic cell maturation were significantly represented in this signature (Supplemental Fig. 3A, 3B). When two-way hierarchical clustering of these 109 transcripts was performed (Fig. 2B), 10 of the 16 animals with active disease clustered together, whereas only 8 of the 22 animals with latent infection clustered together; the remaining 6 animals with active TB and 14 animals with LTBI were interspersed. This interspersed clustering likely reflects the spectrum of M. tuberculosis infection that is observed in humans and NHPs (2–4).

FIGURE 2.

Differential gene expression at the time of clinical diagnosis. At the time of clinical diagnosis 109 transcripts were differentially expressed between animals with active disease and latent infection (fold change > 1.5). (A) The mean differential signature was hierarchically clustered based on the genes with similar differential expression between active disease and latent infection as clinical groups. (B) Two-way hierarchical clustering was used to cluster animals and transcripts with similar patterns of expression. NHPs with latent infection are marked in green, whereas those that developed active disease are in maroon. (C) Transcripts from NHPs at the time of clinical diagnosis were evaluated using a pre-existing human gene expression module framework for animals that developed active disease (maroon) or remained latently infected (green). (D) Transcripts from NHPs were evaluated using a pre-existing human gene expression module framework for animals that developed active disease or remained latently infected over the course of M. tuberculosis infection. Module activity scores representing the percentage of genes that are overexpressed (red) or underexpressed (blue) relative to the preinfection baseline are shown. The intensity of the color represents the degree of differential expression from baseline. Module activity >15% is shown. Green boxes indicate significant differential expression (module activity) between animals with active disease and latent infection.

A complementary module-based analysis of the transcriptional changes between preinfection and at the p.i. time of clinical diagnosis (Fig. 2C) was performed for animals as clinical groups. The pattern of module activity overlapped between active disease and latent infection. Nevertheless, when evaluated as clinical groups, differences in module activity were observed in macaques with different clinical outcomes. Overexpression of IFN response, inflammation, myeloid lineage, and coagulation/platelets modules and underexpression of T cells, lymphoid lineage, cytotoxic T cells/NK cells, and cell cycle modules were seen in macaques with active disease compared with those with LTBI.
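
The module activity score described in the Fig. 2 legend (the percentage of genes in a module that are overexpressed or underexpressed relative to the preinfection baseline, displayed only when it exceeds 15%) can be illustrated with a minimal sketch. The function names, the +1/0/-1 encoding of per-gene calls, and the example values are assumptions made for illustration; this is not the analytics platform used in the study, and the rule for calling an individual gene over- or underexpressed is not modeled here.

```python
from typing import Dict, List, Tuple

DISPLAY_CUTOFF = 15.0  # percent; the figure legend shows only module activity >15%


def module_activity(calls: List[int]) -> Tuple[float, float]:
    """Return (% overexpressed, % underexpressed) for one module.

    `calls` holds one entry per gene in the module: +1 if the gene is
    overexpressed relative to the preinfection baseline, -1 if it is
    underexpressed, and 0 if it is unchanged. How each call is made is
    outside this sketch (assumption: calls are already available).
    """
    if not calls:
        raise ValueError("module has no genes")
    n = len(calls)
    over = 100.0 * sum(1 for c in calls if c > 0) / n
    under = 100.0 * sum(1 for c in calls if c < 0) / n
    return over, under


def displayed_modules(modules: Dict[str, List[int]]) -> Dict[str, Tuple[float, float]]:
    """Keep only modules whose over- or underexpression exceeds the cutoff."""
    result = {}
    for name, calls in modules.items():
        over, under = module_activity(calls)
        if max(over, under) > DISPLAY_CUTOFF:
            result[name] = (over, under)
    return result


# Hypothetical calls for two 10-gene modules (illustrative values only).
example = {
    "IFN response": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # 4 of 10 genes overexpressed
    "T cells": [-1, 0, 0, 0, 0, 0, 0, 0, 0, 0],      # 1 of 10 genes underexpressed
}
shown = displayed_modules(example)
```

In this toy input, the hypothetical IFN response module passes the display cutoff and the T cell module does not. This mirrors how a statistically significant difference between groups can still fall below the visualization threshold.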

To accurately assess the relationship between macaque signatures at the time of clinical diagnosis and those reported previously in humans with active disease and latent infection, we compared the modular signatures in this study with those derived from Berry et al. (3). Their assessment of human TB signatures used much of the same sample collection, processing, and analytic infrastructure as did this study, facilitating this cross-species signature comparison.

Using module-based analysis (Supplemental Fig. 4A, 4B, 4D, 4E), we observed high concordance between active TB signatures in humans and macaques (r = +0.72, p < 0.001, Supplemental Fig. 4A, 4B). We further compared and validated the gene-level signature of human pulmonary TB (3) (393-gene set) with the 2048 transcripts differentially expressed in macaques with active disease compared with preinfection baseline (Supplemental Fig. 4D, 4E). This comparative analysis identified 102 of the 393 differentially expressed genes observed in human active TB that are also differentially expressed in the macaque when preinfection signatures are compared with those at the time of clinical diagnosis (closest approximation of sample comparison in human healthy controls versus active TB diagnosis) (Supplemental Fig. 4D). Together, these comparative analyses suggest that the macaque signature of active TB infection presented in this article is homologous to previously validated human signatures of active pulmonary TB. The signatures for LTBI in humans and macaques also exhibited some similarities; however, the lower intensity of transcriptional perturbations from preinfected and healthy baselines, coupled with heterogeneity within the groups, yielded lower agreement than that observed when comparing active signatures (Supplemental Fig. 4C).

To determine whether there were transcriptional differences observed at early p.i. time points (i.e., prior to clinical diagnosis) that could distinguish outcome, we evaluated module activity of animals that would develop active TB disease or LTBI over each of the time points p.i. (Fig. 2D). Interestingly, we observed a significant decrease in the activity of modules associated with lymphoid lineage and increased activity of IFN responses at 3 and 7 d p.i., respectively (Supplemental Table IV), in animals that would develop latent infection compared with active disease. However, by 30 d p.i., the IFN response became significantly higher in animals that would develop active disease and remained higher at later time points. Similarly, modules such as lymphoid lineage, T cell, and B cell were significantly higher by 56 d p.i. in animals that developed latent infection (Supplemental Table IV). This suggests that the timing of the innate or adaptive response influences the outcome of M. tuberculosis infection. Overall, although these differences in module activity were statistically significant, the magnitude of activity often fell below the 15% minimum cutoff for visualization in Fig. 2D, suggesting that heterogeneity among animals within the LTBI and active TB groups dampens the differential signature. Only modest transcript signatures discriminate infection outcome early in the course of M. tuberculosis infection, despite the fact that early differences in disease outcome can be measured by other parameters (e.g., PET-CT) (16).

Excessive inflammation is associated with poor clinical outcomes in TB (reviewed in Ref. 31). We stratified NHPs in this study using a quantitative measure of lung inflammation determined by PET-CT (using FDG probe). FDG is taken up and retained by metabolically active host cells, including inflammatory cells, in the lungs. We (16, 18) and other investigators (32) showed that total FDG avidity in the lungs correlates with M. tuberculosis disease severity and bacterial burden and is reduced upon successful drug treatment. We theorized that total lung FDG avidity (measured as SUV) could more accurately reflect the severity of infection, revealing the spectrum of infection. Rather than relying on the binary clinical definitions (active disease or LTBI), this measurement could provide us with a continuous variable with which we could correlate gene expression and module activity. To determine the association and difference in gene expression in animals with high and low lung FDG avidity, we initially classified macaques into two groups: high or low lung FDG avidity (described below). However, to determine the degree of gene-expression change that correlates with the extent of lung disease, we also used lung FDG avidity as a continuous variable in all animals and also within those with high or low FDG avidity.

The range of total lung FDG avidity for macaques in this study at the time of clinical diagnosis was 0–10⁶ SUV (Fig. 3A); the median was 1820 SUV, and macaques were classified as low FDG (≤1820 SUV) or high FDG (>1820 SUV) (Fig. 3A). Most of the animals with active disease (12 of 16) had high lung FDG avidity (>1820 SUV), and most animals with latent infection (15 of 22) had low lung FDG avidity (≤1820 SUV). However, there is overlap of lung inflammation between the clinical classifications of these macaques (Fig. 3A), reflecting the spectrum of M. tuberculosis infection.
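
The median split described here (animals above the median are classified as high FDG, those at or below it as low FDG) can be sketched as follows. The animal identifiers and SUV values are hypothetical and were chosen only so that the median equals the reported cutoff of 1820; they are not data from this study.

```python
from statistics import median
from typing import Dict, Tuple


def split_by_median(fdg: Dict[str, float]) -> Tuple[float, Dict[str, str]]:
    """Classify animals as 'high' (> median) or 'low' (<= median) FDG avidity."""
    if not fdg:
        raise ValueError("no animals supplied")
    cutoff = median(fdg.values())
    groups = {
        animal: ("high" if value > cutoff else "low")
        for animal, value in fdg.items()
    }
    return cutoff, groups


# Hypothetical total lung FDG avidity (SUV) for five animals.
example = {"A": 0.0, "B": 350.0, "C": 1820.0, "D": 9400.0, "E": 250000.0}
cutoff, groups = split_by_median(example)
```

An animal exactly at the median (here, animal C) falls into the low group, matching the classification rule stated in the Fig. 3 legend.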

FIGURE 3.

Differential gene expression based on lung inflammation measured by PET-CT. (A) Lung FDG avidity of 37 animals after diagnosis represented in a log scale. Each symbol represents an animal, and the color indicates the clinical status of the animal (maroon = active TB, green = latent infection). Median FDG avidity was 1820 (indicated by a line). We classified animals above the median FDG value as the high FDG avidity group and those at and below the median value as the low FDG avidity group for further evaluation. (B and C) At the time of clinical diagnosis, 91 transcripts (fold change > 1.5) were differentially expressed between animals with high and low lung FDG avidity. (B) The mean differential signature at the time of clinical diagnosis was hierarchically clustered based on the genes with similar differential expression pattern between animals that had high or low lung FDG avidity. The intensity of the heat map signifies the range of differential expression from blue (underexpression) to red (overexpression). (C) Differential transcripts were hierarchically clustered based on individual animals and the pattern of expression. NHPs with low FDG avidity are marked in green, whereas those with high FDG avidity are in maroon. (D) Transcripts from NHPs were evaluated using a pre-existing human gene expression modular framework for animals with high or low lung FDG avidity over the course of M. tuberculosis infection. Module activity scores representing the percentage of genes that are overexpressed (red) or underexpressed (blue) relative to the preinfection baseline are shown. The intensity of the color represents the degree of differential expression from baseline. Module activities that are significantly different between animals with high and low FDG avidity are in green boxes. 
Modules that are overexpressed and positively correlated with lung FDG avidity are represented as red filled circles in red squares, whereas those that are overexpressed and negatively correlated are represented as red filled circles in blue squares. Similarly, modules that are underexpressed and positively correlated are represented by blue filled circles in red squares, and those that are underexpressed and negatively correlated are represented by blue filled circles in blue squares.

First, we determined whether there was a differential signature between animals that had high and low lung FDG avidity at the time of clinical diagnosis. Ninety-one transcripts were differentially expressed (fold change > 1.5) between animals with high versus low lung FDG avidity (Fig. 3B, 3C, Supplemental Table III). In contrast to differential signatures based on clinical status (active versus LTBI), only three animals with low lung FDG avidity clustered with animals with high FDG avidity (Fig. 3C) in two-way hierarchical clustering. The Ingenuity canonical pathway that was most dominant based on lung inflammation in these 91 differential transcripts was TREM1 signaling (Supplemental Fig. 3C), which was reported previously in humans (27).

Second, we investigated whether any differential signatures were observed earlier in the course of M. tuberculosis infection that reflect lung FDG avidity measured at the time of diagnosis. Animals classified as high or low FDG avidity in lungs at clinical diagnosis (Fig. 3D, Supplemental Fig. 5A, 5B, Supplemental Table IV) had many differences in transcriptional activity over the course of M. tuberculosis infection using module-based functional analysis. At 10 d p.i., there was higher B cell (M4.10) activity in animals with low FDG avidity (Supplemental Fig. 5B). After 20 d p.i., animals with high lung FDG avidity had significantly higher IFN Response, Inflammation, Myeloid Lineage, and Neutrophils modules, whereas animals with low lung FDG avidity had significantly higher T Cell, B Cell, Cytotoxicity/NK Cells, Lymphoid Lineage, and Protein Synthesis modules at various p.i. time points (Fig. 3D, Supplemental Fig. 5A, 5B, Supplemental Table IV). Of note, IFN Response modules were the only dominant differential signatures observed at 42 and 150 d p.i. in animals with high lung FDG avidity. Overall, we observed a higher number of modules that are differentially activated between high and low lung FDG avidity (71 versus 27 modules) compared with the classification based on clinical status, suggesting that evaluating M. tuberculosis infection based on total lung inflammation may more closely reflect the changes in whole blood transcript abundance.

To evaluate the association between the degree of gene expression and the extent of lung disease by using the lung FDG avidity as a continuous variable, we determined the correlation between module activity and lung FDG avidity within the high and low FDG groups, as well as the overall correlation with all animals. Multiple modules exhibited significant correlation with lung FDG avidity, as highlighted in Fig. 3D (Supplemental Table V). For example, in the low lung FDG avidity animals, IFN modules (M3.4 and M5.12) are overexpressed at 30 d p.i. and positively correlated with lung FDG avidity at the time of clinical diagnosis. In the animals with high lung FDG avidity, positive correlation with IFN modules is seen later in infection (i.e., 42–150 d p.i.). At 56–120 d p.i., T (M4.1) and B (M4.10) cell modules had decreased activity and correlated negatively with lung inflammation in high FDG avidity macaques, whereas no changes in these modules were observed in animals with low lung FDG. Overall, the timing of the correlations and their respective modules varied between high and low lung FDG avidity groups (Supplemental Fig. 5). We also investigated the overall association between module activity and lung FDG avidity with all animals as one group. Overall, a significant negative correlation was observed in modules associated with B cells, T cells, cytotoxic/NK cells, lymphoid lineage, and erythropoiesis from days 10 to 180 p.i. Positive correlation was observed in modules associated with IFN, inflammation, monocytes, and myeloid lineage after day 20 p.i. (Supplemental Fig. 5C, Supplemental Table V). Some of the correlations observed when analyzed as high or low groups were no longer apparent in the combined evaluation. This could be due to the spectrum of module activity among individual macaques (Fig. 3D, Supplemental Fig. 5A, 5B).
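
Treating lung FDG avidity as a continuous variable amounts to computing, for each module and time point, a correlation between per-animal module activity and per-animal FDG avidity. The sketch below uses a Spearman rank correlation, which is an assumption: this section does not state which correlation coefficient was used. A rank-based measure is chosen here only because FDG avidity spans several orders of magnitude. All names and values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-animal values: module activity (%) and lung FDG avidity (SUV).
activity = [5.0, 12.0, 20.0, 31.0, 44.0]
fdg_suv = [300.0, 900.0, 1820.0, 15000.0, 240000.0]
rho = spearman(activity, fdg_suv)
```

Running the same function separately on the high group, the low group, and all animals combined would reproduce the three levels of comparison described above. A correlation present in one subgroup can vanish in the pooled data, which is the pattern reported in this paragraph.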

Our study also provides an opportunity to examine whether a preinfection immune status (signature) in blood is associated with outcome. We first evaluated differences in transcripts at the preinfection time point between animals that would later develop active disease or latent infection. Thirty-four preinfection transcripts were differentially regulated (defined as >1.5-fold change, Mann–Whitney unpaired t test with Benjamini–Hochberg correction) between those animals that would develop active disease or LTBI (Supplemental Table VI). Of these, 12 transcripts were upregulated and 22 were downregulated in animals that would eventually present with active disease compared with those animals that would develop LTBI (Fig. 4A, 4B). The majority of these upregulated transcripts are associated with IFN, cell cycle, and inflammation functions. Stratifying outcome based on lung FDG avidity identified 30 preinfection transcripts (14 downregulated, 16 upregulated) that were different between high and low FDG avidity animals (Fig. 4C, 4D, Supplemental Table VI). Only 5% overlap was found with the specific genes that are differentially regulated between the groups of either clinical status or lung FDG avidity, but the pathways associated with the genes are similar. In the clinical outcome and FDG avidity analyses, the majority of the differential transcripts belong to pathways associated with IFN and inflammation, suggesting that the host’s inherent upregulation of these pathways may predispose to poor infection outcome.
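
The selection rule stated above (a fold change greater than 1.5 combined with Benjamini–Hochberg correction) can be sketched in a few lines. Several details are assumptions, not taken from the article: the p values are supplied as inputs rather than computed, the 0.05 threshold on the adjusted p value is a placeholder, expression values are taken to be positive and on a linear (not log) scale, and the fold change is applied symmetrically to up- and downregulated transcripts. Transcript names and numbers are hypothetical.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differential_transcripts(
    mean_a: Dict[str, float],
    mean_b: Dict[str, float],
    pvalues: Dict[str, float],
    min_fold: float = 1.5,   # threshold stated in the text
    alpha: float = 0.05,     # assumed threshold, not stated in this section
) -> Dict[str, str]:
    """Map each selected transcript to 'up' or 'down' in group A versus B."""
    names = list(pvalues)
    adjusted = benjamini_hochberg([pvalues[n] for n in names])
    selected = {}
    for name, q in zip(names, adjusted):
        ratio = mean_a[name] / mean_b[name]  # assumes positive linear-scale means
        fold = ratio if ratio >= 1.0 else 1.0 / ratio
        if fold > min_fold and q < alpha:
            selected[name] = "up" if ratio >= 1.0 else "down"
    return selected


# Hypothetical preinfection group means and raw p values for four transcripts.
mean_active = {"T1": 300.0, "T2": 100.0, "T3": 110.0, "T4": 50.0}
mean_latent = {"T1": 100.0, "T2": 200.0, "T3": 100.0, "T4": 100.0}
raw_p = {"T1": 0.005, "T2": 0.01, "T3": 0.03, "T4": 0.04}
hits = differential_transcripts(mean_active, mean_latent, raw_p)
```

In this toy input, T3 has a small adjusted p value but is excluded because its fold change (1.1) is below 1.5. The two criteria therefore filter independently.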

FIGURE 4.

Differential gene expression among active and latent outcomes observed before M. tuberculosis infection. (A and B) At baseline, 34 transcripts were differentially expressed between animals that developed active disease or latent infection (fold change > 1.5). (A) The mean differential signature before M. tuberculosis infection was hierarchically clustered based on the genes with similar differential expression between animals that later developed active disease and latent infection as clinical groups. (B) Two-way hierarchical clustering was used to cluster animals and transcripts with similar patterns of expression. Green represents NHPs that remained with latent infection, whereas maroon represents animals that developed active TB. (C and D) Similarly, 30 transcripts were differentially expressed at baseline between animals that had high or low lung FDG avidity. (C) The mean differential signature before M. tuberculosis infection was hierarchically clustered based on the genes with similar differential expression pattern between animals that later had high or low FDG avidity. The intensity of the heat map shows the range of underexpression (blue) and overexpression (red). (D) Differential transcripts were hierarchically clustered based on individual animal and the pattern of expression. NHPs with low FDG avidity are shown in green, whereas those with high FDG avidity are shown in maroon.

Using a unique animal model (NHP) that mimics human M. tuberculosis infection in pathology and variability of outcome, we are able to detect time-specific, transcriptional changes in the blood before and during the course of low-dose M. tuberculosis infection. We show that the greatest fluctuation at the transcript level occurs early during infection (days 20–56), long before M. tuberculosis–infected animals develop overt clinical symptoms of TB. The greatest change is observed among innate and adaptive pathways and this correlates with the time of initial detection of a systemic inflammatory response (measured by erythrocyte sedimentation rate) (Supplemental Fig. 1C), initiation of adaptive immunity, and initial lung granuloma formation seen by PET-CT (16). Early innate transcriptional changes are temporally associated with increased monocytes and neutrophils circulating in the blood. Similarly, adaptive response transcripts are downregulated when lymphocytes are reduced in the blood at the early time points, likely due to lymphocytes trafficking to the lung as infection is established. The early signatures, within 6 mo p.i., seen in this study are consistent with the human signatures in adults seen during diagnosis of clinical disease, which occurs at much later time points (33). In fact, for the IFN and inflammation pathways, signatures prior to infection were associated with a poor outcome, suggesting that these pathways could contribute to increased susceptibility to TB disease. These data also suggest that the signatures in the human literature that distinguish active TB may not necessarily reflect the direct host response to M. tuberculosis infection but rather the predisposition of outcome.

Multiple studies reported upregulated IFN-inducible genes (type I and II) as the most prominent signature associated with active TB. Signatures, including myeloid and inflammatory transcripts, Fcγ receptor signaling, JAK–STAT pathway, complement, pattern recognition receptors, Ag presentation, B cell markers, and CD64, were also shown to be associated with active TB (3, 29, 33–37). An independent integration and meta-analysis of eight independently obtained human TB microarray datasets revealed that the genes associated with pattern recognition receptor signaling, Fc receptors, fibrosis myeloid cell inflammation, and TREM1 signaling were strongly associated with active TB, in addition to T cells, B cells, and IFN signaling (27). Our dataset confirms that IFN signaling was the most significant pathway associated with active disease. However, when the macaques were evaluated with regard to the extent of lung inflammation (lung FDG avidity), TREM1 represented a top canonical pathway with the highest log p value, in concordance with the meta-analysis of human datasets (27).

At the time of clinical diagnosis in macaques, 109 transcripts were differentially expressed between animals with active disease and LTBI that overlapped with 24 transcripts reported in the meta-analysis of human TB studies (27). Although only 20% of the differentially expressed individual molecules in NHPs overlapped with human signatures, the pathways associated with these molecules are similar to those observed in human studies (27, 34). A major component of the active TB disease signature in NHPs is IFN signaling. Similarly, module-based analysis, using the same platform as a previously published human study (3), demonstrated similar trends in modular activity between animals with active TB and latent infection as in humans (Supplemental Fig. 4). Only four modules were discordant between NHPs and humans with active disease, and this could be due to differences in infection duration. Active TB in humans can be more severe (requiring medical attention in a resource-limited area) and likely more prolonged than in NHPs that are regularly monitored. Further, in the NHP model, animals are declared to have latent infection at 6 mo p.i., whereas humans were likely infected for years and possibly decades. Thus, the differences in transcripts between macaques and humans likely reflect more extreme differences in human clinical presentation. Recently, a 16-gene signature was reported to determine the risk for TB disease in a prospective cohort of human adolescents (14). We had 43% overlap of this signature in our dataset, and four of these were differentially regulated between active disease and latent infection.

Two-way hierarchical clustering of individual NHPs based on the differential transcripts revealed interspersed coclustering among animals with active disease and latent infection (Fig. 2B, 2G). Similar findings were reported in transcriptional signatures in humans with M. tuberculosis infection (3, 34) and likely could reflect the spectrum of M. tuberculosis infection in humans and macaques by pathology (4). Similarly, a combined evaluation as a clinical outcome group revealed only modest differences between the signatures, with a number of modules with <15% activity, suggesting that the larger heterogeneity within the clinical outcome groups might hinder the detection of differential expression. Given the overlap observed in clinically defined outcomes, we took advantage of a PET-CT–determined measurement of inflammation, total lung FDG avidity, which is positively correlated with bacterial burden (16) and indicates disease severity in this model. Unsupervised cluster analysis of transcripts based on fold change provided some separation between groups classified by clinical data (active disease versus LTBI), whereas many animals were interspersed. In contrast, a similar analysis using lung FDG avidity to classify the outcome resulted in fewer misclassifications. Further, the intensity of the differential signature was much greater when analyzed by lung FDG avidity than by the clinical outcome. This suggests that lung inflammation quantified by PET-CT FDG avidity may be a more definitive method of estimating disease severity along a spectrum compared with the subjective, variable, and binary clinical assays that are used to designate active and latent outcomes.

Limitations of this study include the use of a human array for hybridization of our NHP samples. Although this platform has been used in other macaque studies (38–40), there are sequence homology differences between NHPs and humans that reduce total probe hybridization and, therefore, reduce the strength of the transcriptional signature to only modest levels compared with humans. Even with a relatively large cohort of 38 macaques, the variability within the stratified responses limited our ability to perform classical predictive analysis using independent training and test sets. Therefore, we performed combined analyses to increase the statistical power for interpreting results instead of having small cohorts as test and validation sets. We present only an overview of the transcriptional profiles during the course of M. tuberculosis infection and not an in-depth analysis of specific transcripts associated with infection outcome because we were not able to validate these findings using independent test and validation sets or by RT-PCR.

Although we did not observe a definitive signature for clinical outcome early p.i. using our clinical classifications of infection outcome, several key findings that may warrant further evaluation were made at the early time points using lung FDG avidity as an outcome measure of disease. In addition, the role of pre-existing IFN activity and inflammation in the evolution of M. tuberculosis infection warrants further study. To our knowledge, this is the only longitudinal study of M. tuberculosis infection with multiple early time points. We also used a pre-existing human expression module framework for better interpretation of the results in comparison with the existing human TB data. We used lung FDG avidity (PET-CT) for inferring inflammation in the lung and disease severity, which is more sensitive than using chest x-rays. Finally, we show that our NHP model recapitulates human TB with regard to clinical and radiological presentation, as well as the transcriptional signature of active disease.

We thank veterinary technicians Melanie O’Malley and Paul Johnston for sample collection and veterinary care; Jaime Tomko, Dan Fillmore, and L. James Frye for performing PET-CT scans; Pauline Maiello and M. Teresa Coleman for PET-CT analysis; and the former and current members of the Flynn and Lin Laboratories for their intellectual contributions. We also acknowledge the efforts of the Baylor Institute for Immunology Research Genomics Core, including Parvathy Vinod and Phuong Nguyen for assistance with sample preparation and Rahul Maurya for microarray processing.

This work was supported by National Institutes of Health Grants HL106804 and HL110811 (to J.L.F.) and AI 111871 (to P.L.L.) and by the Bill & Melinda Gates Foundation (to P.L.L. and J.L.F.).

The microarray data presented in this article have been submitted to the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84152) under accession number GSE84152.

The online version of this article contains supplemental material.

Abbreviations used in this article: CT, computed tomography; FDG, [18F]fluorodeoxyglucose; FDR, false-discovery rate; IACUC, Institutional Animal Care and Use Committee; LMMA, linear mixed-model analysis; LTBI, latent infection; MDTH, molecular distance to health; NHP, nonhuman primate; PET, positron emission tomography; p.i., postinfection; PVCA, principal variance component analysis; RIN, RNA integrity number; ROI, region of interest; SUV, standardized uptake value; TB, tuberculosis.

1. World Health Organization. 2016. Global Tuberculosis Report 2015. World Health Organization, Geneva, Switzerland.
2. Barry C. E. III, Boshoff H. I., Dartois V., Dick T., Ehrt S., Flynn J., Schnappinger D., Wilkinson R. J., Young D. 2009. The spectrum of latent tuberculosis: rethinking the biology and intervention strategies. Nat. Rev. Microbiol. 7: 845–855.
3. Berry M. P., Graham C. M., McNab F. W., Xu Z., Bloch S. A., Oni T., Wilkinson K. A., Banchereau R., Skinner J., Wilkinson R. J., et al. 2010. An interferon-inducible neutrophil-driven blood transcriptional signature in human tuberculosis. Nature 466: 973–977.
4. Lin P. L., Rodgers M., Smith L., Bigbee M., Myers A., Bigbee C., Chiosea I., Capuano S. V., Fuhrman C., Klein E., Flynn J. L. 2009. Quantitative comparison of active and latent tuberculosis in the cynomolgus macaque model. Infect. Immun. 77: 4631–4642.
5. Lin P. L., Flynn J. L. 2010. Understanding latent tuberculosis: a moving target. J. Immunol. 185: 15–22.
6. Lin P. L., Maiello P., Gideon H. P., Coleman M. T., Cadena A. M., Rodgers M. A., Gregg R., O’Malley M., Tomko J., Fillmore D., et al. 2016. PET CT identifies reactivation risk in cynomolgus macaques with latent M. tuberculosis. PLoS Pathog. 12: e1005739.
7. Cadena A. M., Flynn J. L., Fortune S. M. 2016. The importance of first impressions: early events in Mycobacterium tuberculosis infection influence outcome. MBio. 7: e00342-16.
8. Boisson-Dupuis S., Bustamante J., El-Baghdadi J., Camcioglu Y., Parvaneh N., El Azbaoui S., Agader A., Hassani A., El Hafidi N., Mrani N. A., et al. 2015. Inherited and acquired immunodeficiencies underlying tuberculosis in childhood. Immunol. Rev. 264: 103–120.
9. Flynn J. L., Gideon H. P., Mattila J. T., Lin P. L. 2015. Immunology studies in non-human primate models of tuberculosis. Immunol. Rev. 264: 60–73.
10. O’Garra A. 2013. Systems approach to understand the immune response in tuberculosis: an iterative process between mouse models and human disease. Cold Spring Harb. Symp. Quant. Biol. 78: 173–177.
11. O’Garra A., Redford P. S., McNab F. W., Bloom C. I., Wilkinson R. J., Berry M. P. 2013. The immune response in tuberculosis. Annu. Rev. Immunol. 31: 475–527.
12. Blankley S., Berry M. P., Graham C. M., Bloom C. I., Lipman M., O’Garra A. 2014. The application of transcriptional blood signatures to enhance our understanding of the host response to infection: the example of tuberculosis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369: 20130427.
13. Bloom C. I., Graham C. M., Berry M. P., Wilkinson K. A., Oni T., Rozakeas F., Xu Z., Rossello-Urgell J., Chaussabel D., Banchereau J., et al. 2012. Detectable changes in the blood transcriptome are present after two weeks of antituberculosis therapy. PLoS One 7: e46191.
14. Zak D. E., Penn-Nicholson A., Scriba T. J., Thompson E., Suliman S., Amon L. M., Mahomed H., Erasmus M., Whatney W., Hussey G. D., et al.; ACS and GC6-74 cohort study groups. 2016. A blood RNA signature for tuberculosis disease risk: a prospective cohort study. Lancet 387: 2312–2322.
15. Capuano S. V. III, Croix D. A., Pawar S., Zinovik A., Myers A., Lin P. L., Bissel S., Fuhrman C., Klein E., Flynn J. L. 2003. Experimental Mycobacterium tuberculosis infection of cynomolgus macaques closely resembles the various manifestations of human M. tuberculosis infection. Infect. Immun. 71: 5831–5844.
16. Coleman M. T., Maiello P., Tomko J., Frye L. J., Fillmore D., Janssen C., Klein E., Lin P. L. 2014. Early changes by (18)fluorodeoxyglucose positron emission tomography coregistered with computed tomography predict outcome after Mycobacterium tuberculosis infection in cynomolgus macaques. Infect. Immun. 82: 2400–2404.
17. Coleman M. T., Chen R. Y., Lee M., Lin P. L., Dodd L. E., Maiello P., Via L. E., Kim Y., Marriner G., Dartois V., et al. 2014. PET/CT imaging reveals a therapeutic response to oxazolidinones in macaques and humans with tuberculosis. Sci. Transl. Med. 6: 265ra167.
18. Lin P. L., Coleman T., Carney J. P., Lopresti B. J., Tomko J., Fillmore D., Dartois V., Scanga C., Frye L. J., Janssen C., et al. 2013. Radiologic responses in cynomolgous macaques for assessing tuberculosis chemotherapy regimens. Antimicrob. Agents Chemother. 57: 4237–4244.
19. Lin P. L., Pawar S., Myers A., Pegu A., Fuhrman C., Reinhart T. A., Capuano S. V., Klein E., Flynn J. L. 2006. Early events in Mycobacterium tuberculosis infection in cynomolgus macaques. Infect. Immun. 74: 3790–3803.
20. Pawar S. N., Mattila J. T., Sturgeon T. J., Lin P. L., Narayan O., Montelaro R. C., Flynn J. L. 2008. Comparison of the effects of pathogenic simian human immunodeficiency virus strains SHIV-89.6P and SHIV-KU2 in cynomolgus macaques. AIDS Res. Hum. Retroviruses 24: 643–654.
21. Johnson W. E., Li C., Rabinovic A. 2007. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8: 118–127.
22. Banchereau R., Jordan-Villegas A., Ardura M., Mejias A., Baldwin N., Xu H., Saye E., Rossello-Urgell J., Nguyen P., Blankenship D., et al. 2012. Host immune transcriptional profiles reflect the variability in clinical disease manifestations in patients with Staphylococcus aureus infections. PLoS One 7: e34390.
23. Pankla R., Buddhisa S., Berry M., Blankenship D. M., Bancroft G. J., Banchereau J., Lertmemongkolchai G., Chaussabel D. 2009. Genomic transcriptional profiling identifies a candidate blood biomarker signature for the diagnosis of septicemic melioidosis. Genome Biol. 10: R127.
24. Chaussabel D., Quinn C., Shen J., Patel P., Glaser C., Baldwin N., Stichweh D., Blankenship D., Li L., Munagala I., et al. 2008. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 29: 150–164.
25. Banchereau R., Hong S., Cantarel B., Baldwin N., Baisch J., Edens M., Cepika A. M., Acs P., Turner J., Anguiano E., et al. 2016. Personalized immunomonitoring uncovers molecular networks that stratify lupus patients [Published erratum appears in 2016 Cell 165: 1546–1551]. Cell 165: 551–565.
26. Berry M. P., Blankley S., Graham C. M., Bloom C. I., O’Garra A. 2013. Systems approaches to studying the immune response in tuberculosis. Curr. Opin. Immunol. 25: 579–587.
27. Joosten S. A., Fletcher H. A., Ottenhoff T. H. 2013. A helicopter perspective on TB biomarkers: pathway and process based analysis of gene expression data provides new insight into TB pathogenesis. PLoS One 8: e73230.
28. Joosten S. A., Goeman J. J., Sutherland J. S., Opmeer L., de Boer K. G., Jacobsen M., Kaufmann S. H., Finos L., Magis-Escurra C., Ota M. O., et al. 2012. Identification of biomarkers for tuberculosis disease using a novel dual-color RT-MLPA assay. Genes Immun. 13: 71–82.
29. Maertzdorf J., Ota M., Repsilber D., Mollenkopf H. J., Weiner J., Hill P. C., Kaufmann S. H. 2011. Functional correlations of pathogenesis-driven gene expression signatures in tuberculosis. PLoS One 6: e26938.
30. Maertzdorf J., Repsilber D., Parida S. K., Stanley K., Roberts T., Black G., Walzl G., Kaufmann S. H. 2011. Human gene expression profiles of susceptibility and resistance in tuberculosis. Genes Immun. 12: 15–22.
31. Kaufmann S. H., Dorhoi A. 2013. Inflammation in tuberculosis: interactions, imbalances and interventions. Curr. Opin. Immunol. 25: 441–449.
32. Chen R. Y., Dodd L. E., Lee M., Paripati P., Hammoud D. A., Mountz J. M., Jeon D., Zia N., Zahiri H., Coleman M. T., et al. 2014. PET/CT imaging correlates with treatment outcome in patients with multidrug-resistant tuberculosis. Sci. Transl. Med. 6: 265ra166.
33. Bloom C. I., Graham C. M., Berry M. P., Rozakeas F., Redford P. S., Wang Y., Xu Z., Wilkinson K. A., Wilkinson R. J., Kendrick Y., et al. 2013. Transcriptional blood signatures distinguish pulmonary tuberculosis, pulmonary sarcoidosis, pneumonias and lung cancers [Published erratum appears in 2013 PLoS One 8(8);doi:10.1371/annotation/7d9ec449-aee0-48fe-8111-0c110850c0c1]. PLoS One 8: e70630.
34. Cliff J. M., Kaufmann S. H., McShane H., van Helden P., O’Garra A. 2015. The human immune response to tuberculosis and its treatment: a view from the blood. Immunol. Rev. 264: 88–102.
35. Jacobsen M., Repsilber D., Gutschmidt A., Neher A., Feldmann K., Mollenkopf H. J., Ziegler A., Kaufmann S. H. 2007. Candidate biomarkers for discrimination between infection and disease caused by Mycobacterium tuberculosis. J. Mol. Med. 85: 613–621.
36. Kaforou M., Wright V. J., Oni T., French N., Anderson S. T., Bangani N., Banwell C. M., Brent A. J., Crampin A. C., Dockrell H. M., et al. 2013. Detection of tuberculosis in HIV-infected and -uninfected African adults using whole blood RNA expression signatures: a case-control study. PLoS Med. 10: e1001538.
37. Mistry R., Cliff J. M., Clayton C. L., Beyers N., Mohamed Y. S., Wilson P. A., Dockrell H. M., Wallace D. M., van Helden P. D., Duncan K., Lukey P. T. 2007. Gene-expression patterns in whole blood identify subjects at risk for recurrent tuberculosis. J. Infect. Dis. 195: 357–365.
38. Lu Y. R., Wang L. N., Jin X., Chen Y. N., Cong C., Yuan Y., Li Y. C., Tang W. D., Li H. X., Wu X. T., et al. 2008. A preliminary study on the feasibility of gene expression profile of rhesus monkey detected with human microarray. Transplant. Proc. 40: 598–602.
39. Nieto-Díaz M., Pita-Thomas W., Nieto-Sampedro M. 2007. Cross-species analysis of gene expression in non-model mammals: reproducibility of hybridization on high density oligonucleotide microarrays. BMC Genomics 8: 89.
40. Skinner J. A., Zurawski S. M., Sugimoto C., Vinet-Oliphant H., Vinod P., Xue Y., Russell-Lodrigue K., Albrecht R. A., García-Sastre A., Salazar A. M., et al. 2014. Immunologic characterization of a rhesus macaque H1N1 challenge model for candidate influenza virus vaccine assessment. Clin. Vaccine Immunol. 21: 1668–1680.

The authors have no financial conflicts of interest.