Identification of Antibody Targets for Tuberculosis Serology using High-Density Nucleic Acid Programmable Protein Arrays

Better and more diverse biomarkers for the development of simple point-of-care tests for active tuberculosis (TB), a clinically heterogeneous disease, are urgently needed. We generated a proteomic Mycobacterium tuberculosis (Mtb) High-Density Nucleic Acid Programmable Protein Array (HD-NAPPA) that used a novel multiplexed strategy for expedited high-throughput screening for antibody responses to the Mtb proteome. We screened sera from HIV uninfected and coinfected TB patients and controls (n = 120) from the US and South Africa (SA) using the multiplex HD-NAPPA for discovery, followed by deconvolution and validation through single protein HD-NAPPA with biologically independent samples (n = 124). We verified the top proteins with enzyme-linked immunosorbent assays (ELISA) using the original screening and validation samples (n = 244) and heretofore untested samples (n = 41). We identified 8 proteins with TB biomarker value; four (Rv0054, Rv0831c, Rv2031c and Rv0222) of these were previously identified in serology studies, and four (Rv0948c, Rv2853, Rv3405c, Rv3544c) were not known to elicit antibody responses. Using ELISA data, we created classifiers that could discriminate patients' TB status according to geography (US or SA) and HIV (HIV- or HIV+) status. With ROC curve analysis under cross validation, the classifiers performed with an AUC for US/HIV- at 0.807; US/HIV+ at 0.782; SA/HIV- at 0.868; and SA/HIV+ at 0.723. With this study we demonstrate a new platform for biomarker/antibody screening and delineate its utility to identify previously unknown immunoreactive proteins.

Active tuberculosis (TB) is a disease caused by uncontrolled infection with Mycobacterium tuberculosis (Mtb). It predominantly affects the respiratory tract and is typically transmitted through infectious droplets generated by coughing. The disease remains a major global public health problem, ranking alongside HIV infection as the leading cause of death worldwide (1). In 2015, an estimated 10.4 million new cases occurred globally with around 1.4 million TB-associated deaths, numbers that for the first time in decades reflect an increase in incident cases compared with the preceding year (1). Rapid TB diagnosis and treatment are cornerstones of TB control and essential for reduction of morbidity, mortality and transmission.
Diagnosis of TB can be challenging because the clinical presentations are manifold and depend on the immune status of the host. Furthermore, the differential diagnosis can be broad, so diagnostic confirmation is desired. The gold standard tests for detecting Mtb, usually in a respiratory sample, are culture and nucleic acid amplification (NAA). Both require a degree of laboratory infrastructure and/or equipment that is often unavailable in endemic settings, which are typically resource-limited. Thus, there is an urgent need for simple point-of-care (POC) TB tests that are based on the use of easily accessible, nonsputum based body fluids, such as blood, and that can detect the different forms of TB, pulmonary and extrapulmonary, in various hosts (2,3). In the absence of such POC tests, a simple triage method to identify those symptomatic TB suspects who need further confirmatory testing would be desirable but remains a further unmet need among the current TB diagnostic armamentarium (2,3).
Antibody (Ab) detection assays can be adapted for the development of rapid, inexpensive, easy-to-use tests that require neither laboratory infrastructure nor specific training. Prior serological tests for the diagnosis of TB have been insufficiently sensitive and specific (reviewed in (4,5)) for several reasons. Importantly, the Ab profiles of TB patients are heterogeneous (6,7), and tests that are based on a limited number of antigens, often only one or two (2,3), are insufficient to capture the diversity of TB cases. For example, a strong Ab response to the 38 kDa protein is elicited almost exclusively in the subgroup of advanced, HIV negative pulmonary TB patients, so assays based on this antigen are limited in diagnostic scope (reviewed in (4,6,8)). Furthermore, several antigens appear to lack specificity for TB (reviewed in (5,8)). Because of the potential to turn Ab detection assays into simple dipstick formats, TB serology, despite its known limitations, remains a field worth pursuing, and new biomarker targets need to be identified. The simultaneous use of multiple more recently identified Mtb proteins in the form of multiplex microbead immunoassays has already shown promising improvements in accuracy for TB serodiagnosis in regional case-control studies (9,10). Although the World Health Organization recognizes the limitations of currently available serologic tests and in fact cautions against using them (11), it vigorously encourages further research to meet the need for reliable, simple tests for TB in endemic regions. Because Ab detection is amenable to use in dipstick format incorporating a diversity of antigens, the pursuit of Ab targets that are valid biomarkers of TB is worthwhile.
Discovery of potential biomarkers requires high-throughput methods for proteome-wide screening of antibody reactivity. In situ protein arrays have improved access to high-throughput protein microarrays and their use in translational studies (12). Instead of requiring purified protein for printing, the in situ protein microarray relies on printing expression plasmids encoding libraries of genes. After in situ transcription and translation the proteins "self-assemble" on the array surface with the aid of ribosomes and chaperones, thereby enhancing natural protein folding and post-translational modification. Among the in situ protein microarray methods, the Nucleic Acid Programmable Protein Array (NAPPA) (13) has served as a platform for biomarker discovery in cancer (14), autoimmune diseases (15) and infectious disease (13,16). Membrane proteins express and display well with NAPPA, with an efficiency that exceeds 90% (13). Because membrane proteins comprise a large portion of antigens eliciting a human humoral immune response to TB, this method could identify novel valuable Ab targets for TB serodiagnosis that might not be discovered with the conventional protein array platform that is based on printing prefabricated proteins, typically generated in E. coli, on glass slides (6).
We here describe the generation of a novel Mtb protein microarray based on the NAPPA platform (17,18), the High Density-NAPPA (HD-NAPPA), which we used in a multiplex version (M-HD-NAPPA) for high-throughput screening and in a single protein version for deconvolution and validation. This platform entails printing plasmids containing cDNAs encoding the Mtb proteome comprising ~4000 proteins into silicon nano-wells. The objectives of this study were i) to screen sera from HIV uninfected and coinfected TB patients and controls from different geographic regions (US and South Africa (SA)) to identify proteins with potential and previously unknown value for TB serodiagnosis; and ii) to delineate panels of candidate proteins that hold promise in the different patient groups and should undergo further validation studies.
Experimental Procedures, Subjects and Study Design-Mtb Plasmid Construction and DNA Preparation-We obtained 3295 Mtb H37Rv and 437 CDC 1551 genes in entry vectors from the Pathogen Functional Genomics Center. We designed and obtained primers for the missing ~800 H37Rv genes (Integrated DNA Technologies, Coralville, IA) and performed PCR amplification from genomic Mtb H37Rv DNA to create entry clones for these missing genes as described (19). After two rounds of PCR amplification and transfer of clones to the pANT7-cGST expression vector, which encodes Glutathione-S-Transferase (GST) as a C-terminal fusion partner for the target gene, we obtained a final sequence-verified gene set comprising 3646 H37Rv and 399 CDC 1551 clones (4045 total) for array construction. The reduction in clone numbers resulted from failure either to produce a PCR product or to create a verified expression clone. Purified plasmid DNA was prepared with a high throughput alkaline lysis miniprep protocol as described (17,20,21). For positive controls, we used several genes encoding antigens of the Epstein-Barr virus (EBV), a virus that infects over 95% of individuals by adulthood (22), specifically the Epstein-Barr Nuclear Antigen (EBNA), EBV_Small capsomere-interacting protein (BFRF3), EBV_EBNA2, and other viral genes, specifically H1N1_Nucleoprotein, H3N2_Nucleoprotein, HCMV2_Viral transcription factor IE2 (UL122) (16). For negative gene controls we used a plasmid encoding GST without any fusion partner.
HD-NAPPA Array Fabrication-The HD-NAPPA array fabrication included three main processes as reported (15): nano-well slide fabrication, plasmid plate and printing mixture preparation, and piezoelectric printing. The nano-well slide fabrication was performed as reported (16). The plasmid plate was constructed as reported for the HD-NAPPA (16), with modifications to the multiplex version to allow for a more high-throughput evaluation. We admixed three unique genes into one well, resulting in three unique proteins displayed in each spot. Although this added the need to deconvolute reactive spots by reassessing the same screening samples with a new microarray containing only individual proteins per spot, it overall allowed us to process the screening faster (15,16). The printing master mix (MM) was composed of polyclonal anti-GST Ab (GE Healthcare), bovine serum albumin (BSA, Sigma-Aldrich), BS3 cross linker (Pierce) and DEPC treated water (15,16). To control for secondary Ab reactivity, we also printed purified mouse IgG, human IgG and human IgA in MM at concentrations from 40 to 200 ng/µl in each subarray. Negative controls consisted of MM spots without any plasmid and the plasmid encoding only the fusion partner GST. The HD-NAPPA print was performed on an AU302 piezoelectric dispensing system (Engineering Arts LLC, Tempe, AZ, USA) by depositing MM (1200 pL/well) and plasmid(s) (100 ng/µl, 300 pL/well) sequentially utilizing 16 individual noncontact dispensing heads. The HD-NAPPA slides were stored in an argon gas filled container at room temperature until the day of use, when proteins were expressed.
Protein Expression on M-HD-NAPPA-Arrays were blocked with SuperBlock (Thermo Fisher Scientific, Rockford, IL) prior to expression to reduce nonspecific binding, rinsed with DI water and centrifuged dry. The nano-wells were filled with human cell-free expression system (In Vitro Transcription and Translation coupled system; IVTT; Thermo Fisher Scientific) and a custom micro-reactor device was used for the protein expression (23). After sealing the wells with a polystyrene membrane under 200 PSI pressure, we incubated the reactor for 2 h at 30°C for expression and for 0.5 h at 15°C for protein capture, followed by blocking with 5% skim milk in phosphate buffered saline with 0.2% Tween 20 (PBST) for 30 min. Anti-GST murine monoclonal Ab (mAb; Cell Signaling Technology, Danvers, MA) was used to assess protein display followed by detection with Alexa 647-labeled Goat anti-mouse IgG (H+L) secondary Ab (A-21235, Thermo Fisher Scientific) (16).
Subjects and samples-Serum samples were obtained in cross-sectional studies from patients with Mtb culture-proven TB before or within the first 7 days of antituberculous treatment initiation and from asymptomatic controls (Table I). Subjects were enrolled in two different settings, in public hospitals in New York City, United States, and at Edendale Hospital in KwaZulu-Natal, South Africa (SA). Subjects provided informed written consent prior to enrollment and blood draw. Serum was obtained by collecting peripheral venous blood into BD Vacutainer® Serum Separation Tubes (SST™; Becton, Dickinson and Company, New Jersey) that do not contain any additives. Within 1-3 h after blood draw the samples were centrifuged at room temperature for 10 min at 3000 rpm and serum was aliquoted and stored at -80°C until further use. The studies were approved by the Institutional Review Boards of Arizona State University; the Albert Einstein College of Medicine, New York; and the University of KwaZulu-Natal, SA. The samples were divided into four subgroups according to the region (US, SA) and HIV status (HIV+/HIV-; Table II). Prior to performing assays, the samples in each subgroup were randomized into two even sets: one set for performing the screening/deconvolution array and one independent set for performing the validation array (Table II).
M-HD-NAPPA -Concept Validation-In order to evaluate the M-HD-NAPPA array screening workflow, 96 Mtb genes were selected from initial individual gene glass-slide NAPPA results (data not shown) and the scientific literature to create a gene set to validate immunodetection of individual proteins within a triple protein mix. In addition, we randomly selected 288 Mtb clones and printed those as individual genes as well as triple gene mixes on the HD-NAPPA slides. Ab binding was performed with a pooled sample set from 3 HIV-, TB+ subjects that had documented Ab reactivity to various proteins from prior studies (24) as well as mAb anti-GST for protein …
M-HD-NAPPA arrays were expressed for probing against 120 subject samples (Table II) to identify Mtb antibody binding proteins (16) (Fig. 1, Panel A). We utilized a four-well gasket to create a chamber around each subarray (ProPlate Multi-well chamber, GraceBio-Labs, Bend, OR) and placed 650 µl of individual serum sample, diluted 1:150 in 5% skim milk in PBST, in each chamber, which was sealed with an opposing glass slide. This was then incubated overnight (14-16 h) at 4°C with gentle shaking to ensure even exposure of the array surface to the sample. The arrays were then rinsed with 5% skim milk/PBST and Ab binding was detected with Alexa 647-labeled Goat anti-human IgG (H+L) and 1:200 diluted Cy3-labeled Goat anti-human IgA (Jackson ImmunoResearch Labs, West Grove, PA). The slides were rinsed again to remove unbound secondary Ab, dried by centrifugation and scanned at 635 nm and 535 nm with a Tecan PowerScanner. The resulting images were quantified with the ArrayPro Analyzer Software (Media Cybernetics, Inc.). Data were extracted and median normalized within each subarray. To assure a sufficient margin between positive and negative Ab reactivity we used a signal cutoff of 1.4 to identify spots for further deconvolution with the individual gene HD-NAPPA.
In addition, we calculated the sensitivity and specificity within subgroups and all sample combinations. Those protein targets showing higher than 10% sensitivity in any of the four subgroups or higher than 5% sensitivity in all combined groups were selected for deconvolution.
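The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes that "median normalized" means dividing each spot by the median of its subarray, that a spot is called positive when its normalized signal is strictly above 1.4, and that sensitivity is computed over TB+ samples only. All function and variable names are hypothetical.

```python
from statistics import median

CUTOFF = 1.4  # signal cutoff stated in the text


def median_normalize(raw):
    """Divide each spot intensity in one subarray by that subarray's median (assumed definition)."""
    m = median(raw.values())
    return {spot: value / m for spot, value in raw.items()}


def sensitivity(tb_pos_samples, spot, cutoff=CUTOFF):
    """Fraction of TB+ samples whose normalized signal at `spot` exceeds the cutoff."""
    hits = sum(1 for sample in tb_pos_samples if sample[spot] > cutoff)
    return hits / len(tb_pos_samples)


def select_for_deconvolution(tb_pos_by_subgroup, spots):
    """Keep spots with >10% sensitivity in any subgroup or >5% across all subgroups combined."""
    combined = [s for group in tb_pos_by_subgroup.values() for s in group]
    selected = []
    for spot in spots:
        in_any = any(sensitivity(g, spot) > 0.10 for g in tb_pos_by_subgroup.values())
        if in_any or sensitivity(combined, spot) > 0.05:
            selected.append(spot)
    return selected


# Toy data: two TB+ samples in one hypothetical subgroup, three multiplex spots.
sample_a = median_normalize({"s1": 300.0, "s2": 100.0, "s3": 100.0})
sample_b = median_normalize({"s1": 100.0, "s2": 100.0, "s3": 90.0})
picked = select_for_deconvolution({"US/HIV-": [sample_a, sample_b]}, ["s1", "s2", "s3"])
```

In the toy data only spot `s1` is reactive in one of the two samples, so it alone is carried forward; specificity would be computed the same way on the TB- samples.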

HD-NAPPA -Deconvolution-We identified 272 multiplex spots (792 single genes) showing differential responses between TB positive and negative subgroups. These 792 genes were printed as single genes on the HD-NAPPA. In addition, we included those of the initial 96 individual Mtb genes (from unpublished prior studies and the literature) that were not among the 792 genes (62 genes overlapped), together with the controls, resulting in a final 870 single gene HD-NAPPA, of which 8 subarrays fit on each slide. To identify the specific protein targets, we tested the same subject samples used for the M-HD-NAPPA screen and processed the slides as described earlier (Fig. 1, Panel B). Those genes with greater than 10% sensitivity in any of the four subgroups were selected as candidates for further analysis.
HD-NAPPA -Validation-Individual HD-NAPPA arrays as described above were created for deconvolution as well as validation with biologically independent sample sets (n = 124; Fig. 1, Panel C). Each sample was tested on an independent array. These samples were randomly assigned to processing days to minimize any day-to-day processing bias. Array processing was performed as described above.
RAPID ELISA-Rapid antigenic protein in situ display (RAPID) ELISA was used as described (25) to verify the selected candidate proteins according to the three criteria described above (Fig. 1, Panel D). We used 285 individual subject samples (1:500 dilution) from both the discovery and validation array experiments (Table II) to assess binding to selected proteins. As a negative control, we used the Mtb protein Rv1553, which showed a normalized response close to one (0.972 ± 0.079) on prior array analyses. Binding was detected using SuperSignal West Femto Chemiluminescent Substrate (Thermo Fisher Scientific). The chemiluminescent signal was measured using the EnVision 2104 Multilabel Reader (PerkinElmer, San Jose, CA) at 460 nm with 0.1 s per well, for a total of 10 repeat reads for each plate. The average from the last five reads was used for analysis. The Ab reactivity data were normalized by subtracting the reactivity to the negative control protein tested on the same run. Normalized values were log10 transformed and responses below background were set to zero. Normalized and transformed reactivity from replicate protein runs were then averaged for each sample.
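The ELISA read-out processing described above can be illustrated with a short Python sketch. It is a hedged reading of the text rather than the authors' code: it assumes that "below background" means a sample signal at or below the negative control signal on the same run, and that any negative log10 value is also floored at zero. Function names and the toy read values are hypothetical.

```python
import math
from statistics import mean


def well_signal(reads):
    """Average the last five of the repeat reads recorded for one well."""
    return mean(reads[-5:])


def normalized_reactivity(sample_reads, control_reads):
    """Subtract the same-run negative control, then log10 transform; below background maps to 0."""
    diff = well_signal(sample_reads) - well_signal(control_reads)
    if diff <= 0:  # at or below background (assumed interpretation)
        return 0.0
    return max(0.0, math.log10(diff))


def sample_reactivity(runs):
    """Average the normalized, transformed reactivity over replicate runs.

    `runs` is a list of (sample_reads, control_reads) pairs, one pair per run.
    """
    return mean(normalized_reactivity(s, c) for s, c in runs)


# Toy example: 10 repeat reads per well, two replicate runs for one sample.
run1 = ([0] * 5 + [1100] * 5, [0] * 5 + [100] * 5)  # 1100 - 100 = 1000, log10 gives 3
run2 = ([0] * 5 + [50] * 5, [0] * 5 + [100] * 5)    # below the control, so 0
value = sample_reactivity([run1, run2])
```

Here the first run contributes 3.0 and the second 0.0, so the averaged reactivity for the sample is 1.5.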
Table footnotes: a Because the deconvolution of positive reactions was the prime goal of this experiment, we focused these analyses predominantly on TB+ samples from the multiplex HD-NAPPA screening. b Consisting of biologically independent samples. c Consisting of the original screening and validation samples (n = 244) and heretofore untested samples (n = 41).
Data Analysis-HD-NAPPA -Validation-We conducted a visual inspection of each array image, spot by spot, to avoid artifacts. The data were median normalized, and the sensitivity and specificity were calculated at a cutoff of 1.4. We also calculated the odds ratio of a positive response using Firth's penalized likelihood logistic regression (26). Finally, we calculated the area under the receiver operating characteristic (ROC) curve (AUC), which is a measure of marker performance across a range of cutoff values. The AUC threshold was set at 0.55, which identified the antigens likely to be positive in the TB groups. Only those genes that passed deconvolution and validation with the second set of samples were taken as possible biomarker candidates.
Because of the high level of heterogeneity of responses within the subject subcategories, we performed an analysis of the candidate biomarkers with the deconvolution and validation array data combined. Briefly, the normalized data of the deconvolution and validation within each subgroup were combined as 4 paired subgroups and processed with the same criteria as the validation array analysis. Those genes with a sensitivity higher than 20%, an odds ratio > 1.5 and an AUC value > 0.55 in the combined analysis were selected as the biomarkers for ELISA verification testing.
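To make the three-part selection rule concrete, the sketch below computes an empirical AUC and applies the stated thresholds. It is illustrative only. The AUC is computed with the rank-based (Mann-Whitney) formulation, and the odds ratio is taken as an input because Firth's penalized likelihood regression is not reimplemented here. Function names and the toy values are hypothetical.

```python
def empirical_auc(tb_pos, tb_neg):
    """Probability that a random TB+ value exceeds a random TB- value; ties count one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in tb_pos for n in tb_neg)
    return wins / (len(tb_pos) * len(tb_neg))


def passes_selection(sensitivity, odds_ratio, auc):
    """Criteria from the text: sensitivity > 20%, odds ratio > 1.5 and AUC > 0.55."""
    return sensitivity > 0.20 and odds_ratio > 1.5 and auc > 0.55


# Toy normalized signals for one candidate protein.
auc_value = empirical_auc([2.0, 3.0], [1.0, 2.0])  # 3.5 of 4 pairs, i.e. 0.875
keep = passes_selection(sensitivity=0.25, odds_ratio=2.0, auc=auc_value)
```

An AUC of 0.5 corresponds to no discrimination, which is why a threshold slightly above it (0.55) serves as a permissive first filter.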
ELISA-We used ROC curve analysis to assess the performance of each protein tested via ELISA for discriminating TB positive from TB negative patients in each of the four patient subgroups. We used the pROC R package (27) to conduct the analysis. For each protein, we measured several ROC statistics including AUC, the sensitivity at 80% specificity and the specificity at 80% sensitivity. We also calculated the p value for the Wilcoxon rank sum test of no difference between the TB positive and TB negative patients. p values were not adjusted for multiple testing and should not be interpreted as strict statistical p values because of the protein selection process and sample re-use.
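The statistic "sensitivity at 80% specificity" can be illustrated with a small Python sketch. This is a simplified empirical version that scans only observed values as thresholds and does not interpolate along the ROC curve, so it may differ slightly from what the pROC package reports. The function name and toy values are hypothetical.

```python
def sensitivity_at_specificity(tb_pos, tb_neg, target_spec=0.80):
    """Best sensitivity among thresholds whose specificity is at least `target_spec`.

    A sample is called positive when its value is strictly greater than the threshold.
    """
    best = 0.0
    for threshold in sorted(set(tb_pos) | set(tb_neg)):
        spec = sum(v <= threshold for v in tb_neg) / len(tb_neg)
        sens = sum(v > threshold for v in tb_pos) / len(tb_pos)
        if spec >= target_spec:
            best = max(best, sens)
    return best


# Toy ELISA reactivities (log10 scale) for one protein in one subgroup.
neg = [0.1, 0.2, 0.3, 0.4, 0.5]
pos = [0.45, 0.6, 0.35, 0.7, 0.2]
sens80 = sensitivity_at_specificity(pos, neg)
```

With these toy values a threshold of 0.4 keeps four of five controls negative (80% specificity) and detects three of five TB+ samples. The specificity at 80% sensitivity follows by swapping the roles of the two conditions.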
We developed multiprotein panels to classify TB positive and TB negative patients in each subgroup. The classifier for each subgroup was a logistic regression model. We evaluated all possible logistic regression models using the Bayesian Information Criterion (BIC) (28) to identify the best set of proteins for each subgroup. This analysis was conducted using the bestglm R package and Morgan-Tatar search (29). For each sample we calculated the fitted (noncalibrated) probability of TB positivity. We also calculated this probability using leave-one-out cross validation. We generated ROC curves using both the fitted and cross-validated probabilities, and calculated ROC statistics including the AUC, the specificity at 80% sensitivity and the sensitivity at 80% specificity.
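For orientation, the panel selection can be written out in standard notation. The symbols are introduced here for illustration and are not taken from the source: S is a candidate subset of the tested proteins, x_j is the normalized ELISA reactivity to protein j, n is the number of samples in the subgroup, and L-hat is the maximized likelihood of the model fitted on S. The number of free parameters is taken as |S| + 1 to count the intercept.

```latex
\[
\Pr(\mathrm{TB}^{+} \mid x) =
  \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{j \in S} \beta_j x_j\right)\right]},
\qquad
\mathrm{BIC}(S) = -2 \ln \hat{L}_S + \left(|S| + 1\right) \ln n .
\]
\[
\hat{p}_i^{\,\mathrm{cv}} =
  \frac{1}{1 + \exp\!\left[-\left(\hat{\beta}_0^{(-i)} + \sum_{j \in S} \hat{\beta}_j^{(-i)} x_{ij}\right)\right]}
\]
```

The subset with the smallest BIC is retained. In the second expression the superscript (-i) marks coefficients estimated with sample i left out, which gives the leave-one-out cross-validated probability used for the cross-validated ROC curves.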

M-HD-NAPPA -Concept Validation-
We first generated a test set of 96 Mtb proteins, identified by preliminary studies using a serum pool from TB+ patients, as well as proteins reported in TB serology literature. We used these test proteins to create multiplex mixes of 3 proteins per spot to validate the concept of our M-HD-NAPPA. To confirm that we could detect a positive responder even when it was mixed with nonresponders (i.e. that the nonresponders did not dilute the responder signal too much), we ensured that we created all possible combinations of positive responders and nonresponders as determined by reactivity to the serum pool. To ensure that our selection did not create any bias, we also selected 3 random plates from the Mtb collection set to add an additional 288 Mtb genes. The gene mixtures, as well as the single genes, were printed on the HD-NAPPA array. After expression, we probed the arrays with the TB+ pooled serum, sera from each of the individuals that comprise the pool, and anti-GST with signals normalized as described earlier (supplemental Fig. S1A and S1B). The quantified signal intensities were used to establish the Signal/Background (S/B) values for evaluating spot reactivity. Using the 96 individual protein array, we created a tool to evaluate signal intensities of the three-target mix as well as the individual component responses (supplemental Fig. S1C). The S/B values of the mixed protein responses revealed that all of the three-protein mixes and their individual components were detected for the mid- and high-level reactive protein mixtures (30/30; 100.0%) and were mostly detected if including the low reactive protein mixtures (32/35; 91.5%) using this multiplex testing strategy, demonstrating that responses to positive proteins could be observed reliably when mixed with two nonreactive proteins. We concluded that this level of reactivity of a three-target mix was satisfactory for screening purposes.
M-HD-NAPPA Screen-Following the successful demonstration of protein reactivity with M-HD-NAPPA, we performed a protein display quality control (QC) test with the 4045 Mtb M-HD-NAPPA array to assess the intra- and intersubarray correlation (protein display repeatability). The anti-GST reactivity showed that almost all of the spots exhibited a yellow to red color, revealing an overall fluorescence intensity higher than 1 × 10^6 arbitrary intensity units (a.u., the cutoff for successful protein display; supplemental Fig. S2A-S2B). The successful protein display rate on the M-HD-NAPPA was as high as 99.45% ± 0.14% across the subarrays and across two slides. The intra- and interslide correlations revealed great reproducibility (r = 0.903 and r = 0.829, respectively; supplemental Fig. S2C). Reproducibility testing using binding of our positive control TB+ pool revealed excellent interslide correlations (0.977; supplemental Fig. S2D).
After assuring the quality and reproducibility of reactivity using the M-HD-NAPPA, we randomized samples to assure we had a mix of subject category samples during each run and day to ensure minimal run-to-run bias. All of the samples were analyzed as 4 paired subgroups according to the region (US/SA), HIV (HIV±) and TB status (TB±) for determining sensitivity and specificity. Using these groupings, we selected spots for deconvolution analysis. A representative image is shown in Fig. 2A.
HD-NAPPA -Deconvolution-In addition to the 792 candidate proteins identified in the M-HD-NAPPA screen, we included the 96 Mtb proteins (from unpublished data and published studies) to create 870 individual arrays for printing onto HD-NAPPA. Quality control assessments of these arrays are presented in supplemental Fig. S3. These arrays were probed initially against the same samples tested in the M-HD-NAPPA discovery analysis to allow for the identification of the specific reactive proteins that were responsible for the signal at the multispots. A representative image of a subject's serum binding to the deconvolution array is presented in Fig. 2C. Three hundred and sixteen single proteins that showed higher than 10% sensitivity and 70% specificity in the subgroup analysis were selected for further validation.
HD-NAPPA-Validation-Arrays with the same genes generated for the deconvolution were also used for validation experiments with 124 biologically independent samples (Table II). For biomarker candidate analysis, we used sensitivity and specificity as the first criteria. In addition, we required an odds ratio higher than 1.5 and an AUC value higher than 0.55. From the combined criteria, we identified 34 IgG hits and 8 IgA hits (Fig. 3A-3B, respectively) as potential biomarker targets with a sensitivity higher than 20%, an odds ratio > 1.5 and an AUC value > 0.55. The responses of these targets are shown in supplemental Table S1 by geographic and HIV status. We performed a combined analysis of these hits with a data set that combined all of the deconvolution and validation data. Eight IgG hits (Table III) with a sensitivity higher than 20%, an odds ratio > 1.5 and an AUC value > 0.55 in the combined analysis were selected as the biomarkers for ELISA verification testing.
ELISA Verification-In order to verify the Mtb protein performance of the M-HD-NAPPA workflow, we used the rapid antigenic protein in situ display ELISA (25). We tested all available 244 sera from discovery and validation with an additional 41 samples (Table II). We performed an anti-GST quality control expression assessment of all targets prior to performing ELISA (supplemental Fig. S4A-S4B). The sensitivity and specificity of all 8 target proteins were calculated according to the four subgroups (Fig. 4). We also conducted ROC analysis and calculated the AUC values (Table IV). There were 1-4 proteins in each subgroup that showed an AUC value of 0.7 or higher (supplemental Fig. S5). Because the specificity at a fixed cutoff varied owing to the heterogeneity of responses between subgroups, we also evaluated the sensitivity at 80% specificity and the specificity at 80% sensitivity (Table IV). The proteins with the highest sensitivity at 80% specificity for the four subgroups were Rv0831c …
FIG. 2. … HD-NAPPA (C, D). A, Representative sample being probed against the 1431 Mtb 3-protein spots (4,045 individual Mtb proteins) with the M-HD-NAPPA array. The array image shows color intensity from blue (no binding) to red (very high antibody binding). Four selected binding spots on this image are highlighted, named 1-4, respectively. The inset below the array image shows a zoomed view of the four selected spots. B, The median normalized serological response of the multiplex protein spots, 96 Mtb genes, the 7 viral control genes, human IgG and master mix are shown as a dot plot. The red dots in the multiplex column represent the normalized reactivity of the 4 selected spots in A. C, Image of the Ab binding to the 870 single Mtb protein HD-NAPPA array. From the 4 spots in Fig. 2A, we now resolve the response to 12 individual proteins (zoomed view highlighted in inset under the HD-NAPPA image). Within each 3-protein set, 1 out of 3 proteins showed responses to serum probing. D, The median normalized serological response of the single Mtb protein spots, 7 viral proteins, human IgG and master mix are shown in the dot plot. The 4 responsive proteins are highlighted in red.

Molecular & Cellular Proteomics 16 Supplement 4 S283
… had an AUC of 0.723 under cross validation. The ROC curves for each classifier are shown in Fig. 5 and the statistics are presented in Table IV.

DISCUSSION
Here we present the generation and validation of a new whole-proteome Mtb HD-NAPPA and show its value for detecting novel biomarkers for TB serodiagnosis. We further demonstrate the feasibility, efficiency, and accuracy of multiplexing proteins into a single spot for expedited high-throughput screening for Ab responses to the Mtb proteome. Using these new methods, we discovered several novel proteins that elicit Ab responses in TB. We created multimarker panels to distinguish TB patients from noninfected or latently infected subjects, with and without HIV coinfection across two geographic regions. With this initial evaluation, we identified 8 proteins that show potential as TB diagnostic biomarkers.
We originally introduced HD-NAPPA for its capability of achieving up to 8000-10,000 high density spots, as compared with standard flat-glass NAPPA with ~2300 spots per slide (16). This increase in density reduces screening costs and processing time. In addition, HD-NAPPA provides a higher signal to noise ratio for Ab biomarker discovery as compared with flat-glass NAPPA (15,16). Using M-HD-NAPPA arrays, where the screen utilizes multiplexing of targets, can further accelerate Ab biomarker screening. To perform serum Ab profiling over the whole Mtb proteome (around 4000 genes), flat-glass based NAPPA requires two slides per sample. In contrast, HD-NAPPA requires only a half slide and the M-HD-NAPPA (using 3-target multiplex per spot) requires only a quarter of a slide. Thus, the capacity to process 8 times more samples per slide than flat-glass based NAPPA not only speeds Ab discovery but also yields significant reagent cost savings.
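The slide arithmetic behind the eightfold figure can be checked with a brief Python sketch. The proteome size and flat-glass spot capacity are the approximate values quoted in the text, and the half-slide and quarter-slide requirements are taken directly from the text rather than derived. The dictionary and function names are hypothetical.

```python
import math

GENES = 4000                 # approximate Mtb proteome size quoted in the text
FLAT_SPOTS_PER_SLIDE = 2300  # approximate flat-glass NAPPA capacity quoted in the text


def flat_glass_slides(genes=GENES, spots_per_slide=FLAT_SPOTS_PER_SLIDE):
    """Whole slides needed to display every gene once on flat-glass NAPPA."""
    return math.ceil(genes / spots_per_slide)


# Slides consumed per serum sample; the two fractional values are stated in the text.
SLIDES_PER_SAMPLE = {
    "flat-glass NAPPA": flat_glass_slides(),
    "HD-NAPPA": 0.5,
    "M-HD-NAPPA": 0.25,
}


def fold_gain(platform, baseline="flat-glass NAPPA"):
    """How many more samples fit on the same number of slides compared with the baseline."""
    return SLIDES_PER_SAMPLE[baseline] / SLIDES_PER_SAMPLE[platform]


gain_hd = fold_gain("HD-NAPPA")     # 2 / 0.5
gain_mhd = fold_gain("M-HD-NAPPA")  # 2 / 0.25
```

Two flat-glass slides per sample against a quarter slide for the multiplex array gives the factor of 8 cited above; the single-protein HD-NAPPA alone accounts for a factor of 4.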
The possibility that low protein expression levels from one of the three proteins in the mix could mask detection was a concern. The concept validation experiments (supplemental Fig. S1) alleviated this concern. The results showed that 100% of high and medium signal intensity responses and 91.5% of low-signal intensity responses were detected when proteins were mixed in all possible combinations. In addition, we had previously demonstrated adequate individual protein expression levels using glass-based NAPPA (13,15,16,30). Printing the 4,045 plasmids required two glass arrays, termed TB array 01 and TB array 02. With these arrays of individually printed Mtb plasmids, we performed expression and demonstrated expression of Mtb proteins by detection of the fusion partner with anti-GST staining. As shown in supplemental Fig. S6, TB array 01 and TB array 02 together contained 4,045 Mtb proteins; the display rates ranged from 91.3% to 93.7% for TB array 01 and were 99.8% for TB array 02, with an overall display rate of 96.1%.
Consistent with other studies, we observed a high level of heterogeneity in Ab responses to Mtb proteins, including those with biomarker value (6,7,9,10,31). Hence, our data are consistent with the notion that a single biomarker will not be sufficient for distinguishing TB status in various patient groups (6,32). Further consistent with prior studies by us and others, we observed qualitatively different host responses in HIV coinfected compared with HIV uninfected TB patients (32-34), as well as quantitatively different Ab levels in proportion to extent of disease (6,34). We also observed clear regional differences, with limited overlap in the proteins recognized by US versus SA subjects within each of the respective HIV-infected and uninfected subgroups. Of note, the expression of Mtb proteins as well as the human Ab response to Mtb is dynamic and dependent on the state of infection and disease (reviewed in (35)). It remains to be explored whether some or all of the observed regional differences in Ab responses were predominantly driven by regional differences in disease state, with TB patients from resource-limited TB endemic settings typically being diagnosed at more advanced stages than those living in the US (33), or whether the regional differences could be driven in part by infection with different Mtb strains. We therefore developed individual panels for the four subject subgroups, depending on the geographic region (US or SA) and HIV status (HIV-/+). This approach shows promise and warrants further validation in larger studies within the same and other geographic regions.
The eight candidate immunoreactive Mtb proteins that we identified have varied characteristics. Four of these proteins are secreted and have been identified in Mtb culture filtrates (CFPs; Rv0054, Rv0831c, Rv2031c and Rv0222) (36,37), with three of these (Rv0054, Rv0831c, Rv2031c) also identified in the cell membrane (38,39) and two (Rv0831c, Rv2031c) in the cell wall (39,40). One (Rv0948c (39)) has only been associated with the Mtb membrane fraction. The cellular location for two of the proteins (Rv3405c and Rv3544c) has not been identified. All four CFPs (Rv0054 (9,10,41), Rv0831c (9,42), Rv2031c (9,10,43,44) and Rv0222 (45,46)) have already been reported in TB serology studies. Rv2031c (hspX) was also previously identified using an Mtb proteome-level microarray printed with unpurified Mtb proteins generated in E. coli lysate (6). Interestingly, we did not identify candidate proteins shared with an independent recent microarray study that was based on printing proteins generated in a Saccharomyces cerevisiae system (47). Overall, we have validated the value of some previously known serological biomarkers and further identified some new TB biomarker candidates.
Advantages of TB serodiagnostic assays over currently available gold standard tests include their independence of a respiratory tract sample, their suitability for all forms of TB, and their ability to be scaled up into rapid, robust, low-cost dip-stick formats for use in remote and very resource-limited settings. Except for the SA/HIV- group, none of our identified biomarker panels provided sufficiently high sensitivity and specificity within each subgroup to replace any of the currently available simple rapid methods, such as sputum smear microscopy. Nevertheless, our identified protein biomarkers contribute further to the spectrum of already identified Ab targets, and provide the basis for further exploration with other candidate proteins as well as other nonprotein types of Mtb antigens such as lipids, polysaccharides or glycolipids, some of which have also shown diagnostic potential (reviewed in (8,34)). We thus anticipate that the NAPPA-based identified candidate antigens, if validated in larger studies in diverse TB endemic regions, could contribute to the generation of simple lateral-flow tests. Such tests already exist for the diagnosis of other complex and difficult-to-diagnose infectious diseases such as leprosy or cryptococcosis (48,49). They could further be evaluated as screening tools as well as in combination with other diagnostic methods, especially in those groups that are particularly challenging to diagnose, such as HIV coinfected and/or extrapulmonary TB patients, patients with incipient or culture-negative TB, or pediatric TB patients (50-54). We note that our work focused on the generation of the HD-NAPPA and M-HD-NAPPA with preliminary screening of samples and validation of candidate proteins.
Our study was limited by a small sample size in some groups such as the SA/HIV- TB+ group, the lack of evaluation of samples from patients with respiratory diseases other than TB, and the testing of patients from a broad range of geographic regions. However, the majority of TB patients diagnosed in the US had emigrated from various TB endemic regions, including Asia, South America and Africa, regions where they were likely infected originally. It was further beyond the scope of the present study to generate and express Mtb proteins in M. smegmatis or eukaryotic cells to assure preservation of posttranslational modifications of our identified candidate proteins. Further studies are thus needed to validate our findings and expand them to a larger scale including various patient populations enrolled in diverse TB endemic regions that include populations with high numbers of multidrug-resistant (MDR) TB cases.
In summary, we demonstrate that our newly developed Mtb proteome HD-NAPPA has the potential to identify proteins whose immunogenicity was previously unknown. We demonstrate that the multiplexing of protein genes into a single spot allows for expedited, reliable high-throughput screening for Ab responses and has value for detecting novel biomarkers for TB serodiagnosis. Our findings highlight the heterogeneity of Ab responses to Mtb and indicate that clinically useful serodiagnostic tests might have to be developed for specific target patient groups. Larger studies, especially in combination with other antigens, biomarkers, and/or diagnostic tests, are needed to further explore the value of our identified biomarker proteins.