Plasma Biomarkers to Detect Prevalent or Predict Progressive Tuberculosis Associated With Human Immunodeficiency Virus–1

Abstract
Background: The risk of individuals infected with human immunodeficiency virus (HIV)-1 developing tuberculosis (TB) is high, while both prognostic and diagnostic tools remain insensitive. The potential for plasma biomarkers to predict which HIV-1–infected individuals are likely to progress to active disease is unknown.
Methods: Thirteen analytes were measured from QuantiFERON Gold in-tube (QFT) plasma samples in 421 HIV-1–infected persons recruited within the screening and enrollment phases of a randomized, controlled trial of isoniazid preventive therapy. Blood for QFT was obtained pre-randomization. Individuals were classified into prevalent TB, incident TB, and control groups. Comparisons between groups, supervised learning methods, and weighted correlation network analyses were applied to the unstimulated and background-corrected plasma analyte concentrations.
Results: Unstimulated samples showed higher analyte concentrations in the prevalent and incident TB groups than in the control group. The largest differences were seen for C-X-C motif chemokine 10 (CXCL10), interleukin-2 (IL-2), IL-1α, and transforming growth factor-α (TGF-α). A predictive model analysis using unstimulated analytes discriminated best between the control and prevalent TB groups (area under the curve [AUC] = 0.9), reasonably well between the incident and prevalent TB groups (AUC > 0.8), and poorly between the control and incident TB groups. Unstimulated IL-2 and IFN-γ were ranked at or near the top for all comparisons except control vs incident TB. Models using background-adjusted values performed poorly.
Conclusions: Single plasma biomarkers are unlikely to distinguish between disease states in HIV-1 co-infected individuals, and combinations of biomarkers are required. The ability to detect prevalent TB is potentially important, as no blood test has hitherto been suggested as having the utility to detect prevalent TB amongst HIV-1 co-infected persons.


Sampling
From the 2173 individuals screened as part of the parent study, a subset consented to additional screening by interferon-gamma release assay (IGRA) and tuberculin skin test (TST). Samples in this analysis included all individuals from the parent study with an available IGRA sample who had either prevalent or incident TB, together with two controls for each prevalent or incident case. Controls were selected randomly based on IGRA availability, in order of recruitment, as the individuals recruited just before and just after the prevalent or incident case. The IGRA samples were taken as part of a sub-study nested within the parent population, as described in Rangaka et al, "Interferon release does not add discriminatory value to smear-negative HIV-tuberculosis algorithms", European Respiratory Journal 2012;39:163-171.

Standard curves
Analyte concentrations were calculated with reference to the standard curve for each analyte, ranging from 3.2 pg/ml to 10000 pg/ml in the Milliplex assay, according to the manufacturer's instructions. Manufacturer-supplied internal controls (QC1 and QC2) were used to validate the standard curves. The minimal detection limit was additionally guided by the observed/expected values of the standard concentrations being between 70% and 130%. Next, the corresponding (fluorescence minus background) results were checked to ensure they were >0, further ensuring that meaningful concentrations were detected. The mean lower limit of detection for each analyte was calculated using data from the 11 plates used to run all samples. These values are given in the methods section in pg/ml, and they were consistent at both the lower and top end of the standard curve, except for VEGF, which had a sensitivity of 205 pg/ml, calculated as above. Raw out-of-range (OOR) values were adjusted by replacing all OOR< values (lower than the lowest detectable value) with 0 (1027 instances), replacing all >OOR values (higher than the highest detectable value) with 10000 pg/ml, the top standard (13 instances), and adding the mean limit of detection (LOD) per analyte to all plates.
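The out-of-range adjustment and the 70-130% recovery check described above can be sketched as follows. This is an illustrative Python sketch, not the code used in the study. The function names, the marker spellings "OOR<" and ">OOR" as input strings, and the example LOD of 3.2 pg/ml are assumptions. The sketch also reads "adding the mean LOD per analyte" as adding that analyte's mean LOD to every value, which is only one possible interpretation of the text.

```python
TOP_STANDARD = 10000.0  # pg/ml, top of the standard curve as stated in the text


def adjust_oor(raw_values, mean_lod):
    """Replace out-of-range markers, then add the analyte's mean LOD.

    raw_values: numbers or the marker strings "OOR<" / ">OOR"
                (marker spellings are an assumption for this sketch).
    mean_lod:   mean lower limit of detection for this analyte, in pg/ml.
    """
    adjusted = []
    for value in raw_values:
        if value == "OOR<":
            numeric = 0.0           # below the lowest detectable value
        elif value == ">OOR":
            numeric = TOP_STANDARD  # above the highest detectable value
        else:
            numeric = float(value)
        # Assumed reading of the text: the mean LOD is added to every value.
        adjusted.append(numeric + mean_lod)
    return adjusted


def recovery_ok(observed, expected, low=70.0, high=130.0):
    """True when observed/expected recovery of a standard lies within 70-130%."""
    return low <= 100.0 * observed / expected <= high


# Hypothetical readings for one analyte with an assumed mean LOD of 3.2 pg/ml.
example = adjust_oor(["OOR<", 12.5, ">OOR"], mean_lod=3.2)
```

Adding the LOD after replacement keeps every adjusted value strictly positive, which matters if values are later log2 transformed.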

Statistical methods
Results are presented for unstimulated (Nil) and background-corrected stimulated (TB Ag-Nil) values. Analyte values are nearly always presented on a log2-transformed scale, as log2 pg/ml. Subgroups for analysis included incident TB, prevalent TB, controls and the combined incident and prevalent TB groups, hereafter referred to as TBcombined. Frequency (percent) or median (inter-quartile range) were calculated by group for discrete and continuous values respectively. Sensitivity analyses were undertaken using the subgroup of culture-confirmed incident TB, the subgroup randomised to placebo, and the subgroup of prevalent TB who were smear negative at baseline (Smear-negative). Demographic and clinical characteristics were also calculated for the entire screening population of the parent study, from whom the samples in this analysis were drawn. Time to onset of TB was calculated as the number of days between the date of screening and the first date of TB registration.
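The derived quantities in this paragraph can be illustrated with a short Python sketch. It is not the study's code. The function names and example dates are hypothetical, and the sketch assumes that background correction is a subtraction on the raw pg/ml scale, which is consistent with the negative Ag-Nil values mentioned elsewhere in the methods but is not stated explicitly.

```python
import math
from datetime import date


def log2_pg_ml(concentration):
    """log2 transform of a positive concentration given in pg/ml."""
    return math.log2(concentration)


def background_corrected(tb_ag, nil):
    """TB Ag-Nil: stimulated minus unstimulated value; can be negative.

    Assumption: the subtraction is done on the raw pg/ml scale.
    """
    return tb_ag - nil


def days_to_tb(screening_date, registration_date):
    """Days between screening and the first date of TB registration."""
    return (registration_date - screening_date).days


# Hypothetical values, for illustration only.
nil_log2 = log2_pg_ml(8.0)
ag_minus_nil = background_corrected(5.0, 7.5)
onset_days = days_to_tb(date(2008, 1, 1), date(2008, 1, 31))
```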
Statistical tests to compare groups were Fisher's exact test or the Wilcoxon rank sum test, as appropriate. Throughout, a nominal threshold for statistical significance was set at α = 0.05, and false discovery rate (FDR) correction by the Benjamini-Hochberg method [9] was applied. These values are reported as p-corrected. Data visualisation was used to clarify differences between and within groups.
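For readers unfamiliar with the Benjamini-Hochberg correction, a minimal Python sketch of the standard step-up adjustment is given here. It is illustrative only. The function name and the example p-values are hypothetical, and the study's own p-corrected values came from its R analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four analyte comparisons.
p_corrected = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [p <= 0.05 for p in p_corrected]
```

In this example only the first comparison remains below α = 0.05 after correction, although three of the four raw p-values were below it.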
Weighted correlation network analysis was carried out on the nil and background-corrected analyte levels, stratified by TB status, and presented with correlation diagrams using the package qgraph (Epskamp et al, 2012). Correlation matrices were first estimated using Pearson's correlation of the log2-transformed data in the case of the nil values and of z-score-transformed data in the case of the Ag-nil data (due to the presence of negative values). The partial correlation network was estimated using the glasso internal method with the tuning parameter gamma set to 0.15. All other settings remained at default. Thicker edges indicate stronger associations; positive associations are represented by solid lines and negative associations by dashed lines.
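The first step above, estimating a Pearson correlation matrix from transformed analyte values, can be sketched in plain Python. This sketch covers only the correlation input. It does not reproduce the regularised (glasso) partial correlation step performed by qgraph, and all names and data are hypothetical.

```python
import math


def z_scores(values):
    """Centre on the mean and divide by the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of pairwise Pearson correlations keyed by analyte name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical Ag-nil values for two analytes, including negative values.
ag_nil = {"analyte_a": [-1.0, 0.5, 2.0, 3.5], "analyte_b": [4.0, 1.0, -2.0, -5.0]}
scaled = {name: z_scores(values) for name, values in ag_nil.items()}
matrix = correlation_matrix(scaled)
```

Pearson's coefficient is unchanged by z-scoring, so the z-score step mainly gives the Ag-nil values a common scale that tolerates negative numbers where a log2 transform would not.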
Supervised learning models were applied to the data to predict class membership (e.g. incident vs prevalent TB) in 2-way classifications using the caret package (Kuhn et al, 2018). In all cases analyte values were centred and scaled prior to input, and all models were run with 10-fold cross-validation resampling to estimate classification accuracy. Cross-validation is a form of internal validation that generates independent 'training' and 'evaluation' samples from the same data source, whereby the learning models are applied multiple times and a pooled estimate of final classification accuracy is obtained. Sampling was stratified by down-sampling to ensure balanced class representation in the re-samples. Classification learners assessed included random forests (RF) to set a performance ceiling (Liaw and Wiener, 2002) and elastic-net regularisation (glmnet) (Friedman, Hastie, Tibshirani, 2010) for a potentially interpretable and applicable model. Elastic-net regularisation is a type of penalised regression that results in covariate selection by downweighting covariates that do not contribute to classification accuracy.
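The resampling scheme described above, 10-fold cross-validation with down-sampling to balance classes, can be sketched in plain Python. This is an illustration of the general idea, not a reproduction of caret's internals. The function names, the seed, and the choice to down-sample only the training portion of each fold are assumptions.

```python
import random


def stratified_folds(labels, k, rng):
    """Assign each sample index to one of k folds, class by class."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def down_sample(indices, labels, rng):
    """Randomly drop samples so every class matches the smallest class."""
    by_class = {}
    for idx in indices:
        by_class.setdefault(labels[idx], []).append(idx)
    smallest = min(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(rng.sample(members, smallest))
    return balanced


def cv_splits(labels, k=10, seed=1):
    """Yield (balanced training indices, held-out indices) for each fold."""
    rng = random.Random(seed)
    for held_out in stratified_folds(labels, k, rng):
        held = set(held_out)
        training = [i for i in range(len(labels)) if i not in held]
        yield down_sample(training, labels, rng), held_out


# Hypothetical 2-way classification with twice as many controls as cases.
labels = ["control"] * 40 + ["prevalent"] * 20
splits = list(cv_splits(labels, k=10))
```

Each model would be fitted on the balanced training indices and scored on the held-out indices, with the per-fold scores pooled afterwards.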
Models were optimised by evaluating the area under the curve (AUC) of the receiver operating characteristic (ROC), and this metric is reported along with its range. Models were selected on the basis of the largest minimum unbiased AUC estimate. Cross-validated AUC and ranges were estimated. ROC curves were generated using the predicted vs observed classifications for each independent model and were drawn for all models, as there were only minor differences in performance across different model parameters. Training of the prediction models utilised the default grid search approach for the parameters with a specified grid length. For the random forest analyses, the grid searched over the parameter mtry, which controls the number of variables randomly sampled as candidates at each split. Penalised regression searched for optimal values over the two parameters alpha and lambda, which adjust the elastic net penalisation weights.
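As an illustration of the metric, the AUC can be computed directly as the proportion of case-control pairs in which the case receives the higher predicted score. The Python sketch below shows this, together with one possible reading of "largest minimum AUC": pick the tuning setting whose worst fold AUC is highest. That reading, the function names, and all numbers are assumptions, not details taken from the study.

```python
def roc_auc(scores, labels, positive):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_by_largest_minimum(fold_aucs_by_model):
    """Assumed rule: choose the setting whose lowest fold AUC is largest."""
    return max(fold_aucs_by_model, key=lambda name: min(fold_aucs_by_model[name]))


# Hypothetical predicted scores for two TB cases and two controls.
auc = roc_auc([0.9, 0.2, 0.8, 0.1], ["tb", "tb", "control", "control"], "tb")

# Hypothetical per-fold AUCs for two tuning settings.
best = select_by_largest_minimum({"setting_a": [0.70, 0.90], "setting_b": [0.75, 0.80]})
```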
The variable importance (VI) score, calculated as a scaled beta coefficient, was used as the primary means of determining the impact of individual analytes on classification outcome. Lists of analytes and their associated VI scores are presented. All statistical analyses were carried out in R v.3.3 (R Core Team).
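One way a "scaled beta coefficient" can be turned into an importance ranking is sketched below in Python. The text does not state the scaling used, so the min-max rescaling of absolute coefficients to a 0-100 range is an assumption, and the function name and coefficient values are hypothetical.

```python
def variable_importance(coefficients):
    """Rank analytes by absolute coefficient, rescaled to 0-100.

    Assumption: min-max scaling of |beta|; the study's exact scaling
    is not specified in the text.
    """
    magnitudes = {name: abs(beta) for name, beta in coefficients.items()}
    lowest = min(magnitudes.values())
    span = max(magnitudes.values()) - lowest
    if span == 0:
        # All coefficients have equal magnitude, so no ranking is possible.
        return {name: 0.0 for name in magnitudes}
    scores = {name: 100.0 * (m - lowest) / span for name, m in magnitudes.items()}
    # Highest importance first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Hypothetical penalised-regression coefficients on centred, scaled analytes.
vi = variable_importance({"analyte_a": -0.4, "analyte_b": 0.8, "analyte_c": 0.0})
```

Because the inputs were centred and scaled before modelling, coefficient magnitudes are comparable across analytes, which is what makes a ranking of this kind meaningful.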

Table S1
Analytes assayed in this study and the reason for their selection.

Reason Reference
Interferon -

Figure S3
Weighted correlation networks in (left to right) controls, incident and prevalent TB groups respectively using background corrected analyte values. Solid lines are indicative of positive correlations, dashed lines indicate negative correlations and strength of correlation is indicated by thickness of line.
Panel titles (left to right): Control Ag-nil; Incident Ag-nil; Prevalent Ag-nil.

Figure S4
Receiver operating characteristic (ROC) curves for sensitivity comparisons using the nil data and penalised regression models. Each curve represents an independent cross-validated predictive model with different tuning parameters.