Systems Vaccinology for a Live Attenuated Tularemia Vaccine Reveals Unique Transcriptional Signatures That Predict Humoral and Cellular Immune Responses

Background: Tularemia is a potential biological weapon due to its high infectivity and ease of dissemination. This study aimed to characterize the innate and adaptive responses induced by two different lots of a live attenuated tularemia vaccine and compare them to other well-characterized viral vaccine immune responses. Methods: Microarray analyses were performed on human peripheral blood mononuclear cells (PBMCs) to determine changes in transcriptional activity that correlated with changes detected by cellular phenotyping, cytokine signaling, and serological assays. Transcriptional profiles after tularemia vaccination were compared with yellow fever [YF-17D], inactivated [TIV], and live attenuated [LAIV] influenza. Results: Tularemia vaccine lots produced strong innate immune responses by Day 2 after vaccination, with an increase in monocytes, NK cells, and cytokine signaling. T cell responses peaked at Day 14. Changes in gene expression, including upregulation of STAT1, GBP1, and IFIT2, predicted tularemia-specific antibody responses. Changes in CCL20 expression positively correlated with peak CD8+ T cell responses, but negatively correlated with peak CD4+ T cell activation. Tularemia vaccines elicited gene expression signatures similar to other replicating vaccines, inducing early upregulation of interferon-inducible genes. Conclusions: A systems vaccinology approach identified that tularemia vaccines induce a strong innate immune response early after vaccination, similar to the response seen after well-studied viral vaccines, and produce unique transcriptional signatures that are strongly correlated to the induction of T cell and antibody responses.


Introduction
This appendix provides supporting information for the manuscript entitled "'Systems vaccinology for a live attenuated tularemia vaccine reveals unique transcriptional signatures that predict humoral and cellular immune responses"'.

Study Design
This laboratory substudy was a systems biology study of the human response to two lots of tularemia

Microarray experiment
Affymetrix High-throughput (HT) Perfect Match (PM) Array GeneChips were run in a 96 array plate configuration (96 samples per HT array). Each individual array contained 54,715 probe sets and 536,460 perfect match (PM) probes (on average 9.8 probes per probe set). HT PM Array Plates only contain PM probes. Mismatch probes (MM) are not included. Three 96 HT array plates were run producing data for 205 samples (the last 96 HT array contained data for 13 samples). Plates were run on three different days (10/11/2011, 11/8/2011, and 11/15/2011). For five samples array data was not produced: subject T02HC051 was missing Day 1; T02HC115 was missing Days 2, 7, and 14; T02HC118 was missing Day 0. All of the other 39 subjects had array data for all 5 time points (Day 0, 1, 2, 7, 14). Hybridization and overall signal quality for each array was evaluated using 4 Affymetrix spike-in probe sets (BioB, BioC, BioD, CreX) that are added to the experiment with increasing RNA concentrations (as listed from low to high). Median intensity was used to summarize intensities per probe set and log 2 value of the ratio was reported. The slope (proxy for hybridization strength) was estimated for each sample by fitting a linear regression model. RNA-degradation was inspected by plotting average probe intensities in the 5' to 3' direction (the first 8 probes were included per probe set). The RNA degradation slope was estimated for each sample by fitting a linear regression model. The Robust

Cell mediated immunity experiments
Multichip Average (RMA) algorithm (Irizarry et al., 2003) was used to obtain background-corrected and quantile-normalized probe-level intensities. Probe and probe-level log 2 intensity distributions were investigated using boxplots, probability density function (PDF) plots, and log expression (RLE) boxplots (Brettschneider et al., 2008). The ComBat algorithm [2] using the parametric empirical Bayes option was applied to adjust the probe-level data for batch effects. Extreme outliers were removed from the probe level data and background correction, quantile normalization, and batch correction was rerun in the specified order. Next, probe sets with a sample coefficient of variation (ĈV = Ŝ x ) in the lower 25% quantile were filtered out.

Differential gene analysis
A two-sided paired t-test was applied to identify significantly up/down-regulated genes from baseline (Day 0 versus Day 1, 2, 7, 14). The paired test was carried out separately for each study group as well as for combined study groups. In addition, log 2 fold changes per study visit were compared between study groups using a two-sided Welch's t-test. To compensate for multiple testing, the false discovery rate (FDR) which controls the false positive rate among significantly differentially expressed genes was calculated using the qvalue R package. Genes with a q-value ≤ 0.05 and a fold change of ≥ 1.5-fold (up or down regulation) were deemed to be significantly differentially expressed.

Determination of robust gene clusters
log 2 fold changes of genes that were differentially expressed for a certain study group at any post baseline day were used as input for gene clustering. Uncentered Pearson correlation distance was used as distance measure. Pairwise distances between log 2 changes were calculated separately for each vaccine group and post baseline day (Day 1, 2, 7, 14) as well as for all study days (Day 1-14). Clusters were obtained using the hierarchical complete linkage clustering algorithm. To evaluate robustness of gene clusters, multiscale bootstrapping (Suzuki and Shimodaira, 2006) was carried out using varying dataset sizes (0.4*N, 0.5*N, 0.6*N, 0.7*N, 0.8*N, 0.9*N, 1*N, 1.1*N, 1.2*N, 1.3*N; where N stands for the respective dataset size). For each dataset size bin, 1,000 bootstrap samples were obtained, and bootstrap probabilities and p-values were calculated. The unbiased bootstrap probability cut-off was set to 0.05 and the maximum distance to form a significant cluster was set to 0.5 (equivalent to minimum uncentered Pearson correlation of 0.5). Clusters that were formed at larger distances were excluded.

Gene set enrichment analysis
Gene set enrichment analysis was performed using the Java implementation (Version 2.1.0, downloaded 01/23/2015 from http://www.broadinstitute.org/gsea) of the GSEA algorithm (GSEAPreranked method) to detect enriched gene sets [3]. Ensemble genes were grouped and analyzed based on 8,101 known gene sets obtained from the KEGG database (Version 70.0 (06/09/2014), [4]) and MSigDB (Version 4.0 (05/31/2013), [5] on decreasing t-statistic were used as input for Chemical/Genetic Perturbations, and Immunologic Signature gene. For all other gene set collections, the absolute t-statistic was used. In either case, before generating the lists, the largest absolute t-statistic was used for Ensembl gene IDs that mapped to multiple probe sets. Except for Chemical/Genetic Perturbations, and Immunologic Signature gene sets which contain directional information, the absolute t-statistic was used. Program specifications were as follows: random seed was specified as 149, collapse was set to false, the minimum and maximum gene set size was not restricted (set to 1 and 10000, respectively), and the number of permutations was set to 10,000. To adjust for testing multiple gene sets per category type, the Benjamini-Hochberg procedure was applied to each list. For Chemical/Genetic Perturbation and Immunologic Signature collections, with many more gene sets compared to the other gene set collections, a FDR < 0.01 was applied to determine significant enrichment. For all other collections, a cut off of FDR < 0.1 was used. Enrichment trends across post-vaccination days and comparison groups were visualized using heatmaps and binary Jaccard-distance based complete-linkage hierarchical clustering.

Regularized logistic regression analysis
As there is no a priori knowledge about the correlates of protection for these vaccines, responders, non-responders, and subjects with high baseline were arbitrarily defined. The positive cut off for immune response thresholds for responders based on the percentage of activated CD4+, CD8+ T-cells and microagglutination titer was defined as three standard deviations (SD) above the mean of all baseline samples. Concurrently, a minimum sample size of 10 subjects in either response group (responders and non-responders) was required. The minimum subject criterion was met for responders and non-responders when combining vaccine study arms based on CD8+ T-cell activation and microagglutination titer but not for percent activated CD4+ T-cells. As the SAP allowed for flexibility in the choice of the cut off, the numbers of responders and non-responders in the combined data set were plotted for different SD cut offs to see if an alternative cut off can be identified that meets the minimum subject requirement. Based on the CD4+ plot, a cut off of 2 SD was chosen to define a positive CD4+ response.
The classification for subjects with persistent T-cell responses (3 SD above the baseline mean at Day 7, 14, and 28) versus transient T-cell responses (3 SD above the baseline mean at Day 7 and 14 but below this cut off at Day 28) did not meet the minimum number of required subjects for CD4+ and CD8+ for any SD cut off. Thus, the comparison of transient versus persistent subjects was not carried out.
The following criteria were used to define assay-specific responders and non-responders: 1. classification based on Tularemia-specific microagglutination titer (a) responder: a subject in which tularemia vaccine induces an increase in Tularemia-specific log-transformed microagglutination titer that is 3 SD above the mean of all log-transformed baseline values, at any of the following time points: Days 14 or 28.
(b) non-responder: a subject in which tularemia vaccine does not induce an increase in tularemia specific log-transformed microagglutination titer that is 3 SD above the mean of all log-transformed baseline values, at any of the following time points: Days 14 or 28.
(c) subject with high baseline: a subject with a log-transformed baseline value that is 3 SD above the mean of all log-transformed baseline values.
For each post-vaccination Day (1,2,7,14) and assay-specific classification, a regularized logistic regression model was fit to identify gene responses that distinguish between responders and non responders using the glmnet R package (Version 2.0-2). Before fitting the models, the predictor variable set (probe sets) was processed as follows: 1. probe sets corresponding to the same gene (same gene name) were collapsed by retaining the probe set per gene that had the highest absolute mean fold change from baseline at the respective post vaccinationday.
2. collapsed probe sets with an absolute fold change ≥ 1.5 were retained and used as predictors for the respective model.
To avoid overfitting (n << p and collinearity among gene responses) and facilitate variable selection, an elastic net regularization step (combination of L1 Lasso and L2 ridge penalization, α = 0.5) was included as part of the fitting procedure. Five-fold cross validation was used to determine the optimum regularization parameter λ based on the minimum average model deviance across folds. In addition, the mean misclassification error across folds was determined for the optimal λ and provided as additional cross-validation statistic. The five random folds were stratified by positive and negative responders, i.e. the overall proportion of positive/negative responders was maintained for each random fold. The seed was set to 20151121.

Regularized canonical correlation analysis
To identify patterns that explain associations between changes in gene expression and peak changes in immunogenicity outcomes (IMO), multivariate regularized Canonical Correlation Analysis (CCA) was carried out separately for Days 1, 2, 7, and 14. CCA identifies linear combinations of the original variables (referred to as canonical variates) that maximize the inter-set correlation (here gene and immunogenicity variable sets). Planned immunogenicity variables included peak baseline fold change in %T-cell activation (CD8+ and CD4+ cells), T-cell proliferation at Day 180 (CD8+ and CD4+ cells), and peak baseline fold change in microagglutination titer. T-cell proliferation data was not included as part of the models as proliferation data for 11 subjects was missing (the data had not been collected). Before fitting the CCA models, the gene expression variable set was processed as follows: 1. probe sets corresponding to the same gene (same gene name) were collapsed by retaining the probe set per gene that had the highest absolute mean fold change from baseline for the respective post vaccinationday.
2. collapsed probe sets with an absolute fold change ≥ 1.5 were retained and used as gene variable set for the correlation analysis.
Prior to CCA, immunogenicity variables were transformed using the Box-Cox transformation to make their distributions more symmetric and unimodal (while normality is not required, it optimizes the distributional information used as part of the modeling process). To avoid over fitting (n« g and collinearity among genes) a regularization step was applied. Optimal regularization parameters were estimated using leave-one-out cross validation. Significant canonical variate pairs were identified by manually inspecting scree plots. Individual variables were correlated against their significant canonical variates to obtain canonical loadings. Canonical loadings were squared to obtain the proportion of explained variance. Canonical loadings for each variable set were plotted and variables with low explained variance were filtered out (correlation < 0.5 or explained variance < 25%).

Comparisons with viral vaccine microarray studies
RMA-normalized expression data obtained from yellow fever (YF-17D) vaccinees (Day 0, 3, 7) as well as trivalent influenza vaccine (TIV) (Day 0, 3, 7), and live attenuated influenza vaccine (LAIV) (Day 0, 3, 7) vaccinees were provided by Emory University. For each vaccine study, probe sets with a coefficient of variation (CV) higher than the lower quartile of all CVs were retained. Due to different Affymetrix array types and non-matching probe information (HG-U133 PM for 10-0019 Tularemia vaccine data and HG133 Plus 2 for YF-17D, TIV, and LAIV microarray data), the mean log 2 fold change across gene sets with the same gene symbol (based on Affymetrix release 34 probe set annotations) was used for the comparative analysis. For each vaccine study and post-vaccination day, a paired-t test was applied to identify genes that were differential expressed from baseline. A Welch t-test was used to identify differential fold changes between vaccine studies (Day 2 responses following DVC-LVS/USAMRIID-LVS vaccination were compared to Day 3 responses following YF-17D, TIV, or LAIV vaccination). The same fold change and FDR cut offs as described in Section 2.5 were used. For gene set enrichment analysis, gene symbols were mapped to Ensembl gene IDs using Affymetrix HG-U133 PM and Hugo (accessed 05/12/2015) mappings. Only gene symbols represented in the 10-0019 Tularemia expression data set were analyzed using GSEA. For Ensembl gene IDs that mapped to multiple gene symbols, the Confidential/Proprietary Information -14-Supp. Text "'Transcriptomics of tularemia vaccines"' Natrajan et al.
largest absolute t statistic was used. Genes were then ranked based on absolute t statistic and GSEA analysis was carried out as described in Section 2.7. The DVC/USAMRIID-LVS data was pooled when identifying enriched gene sets as well as DE genes. Tularemia vaccine group membership was retained and displayed in subject-level heatmaps when comparing differential gene lists.