Genome-based prediction of cross-protective, HLA-DR-presented epitopes as putative vaccine antigens for multiple Bordetella species

ABSTRACT Acellular pertussis vaccines protect against severe pertussis, but vaccine-induced immunity wanes over time. Prior animal studies showed that T-cell responses are integral to long-lasting immunity. Current pertussis vaccines also do not provide considerable protection against other species that cause pertussis-like illness, such as Bordetella parapertussis and Bordetella holmesii. We aimed to identify potential vaccine antigens from conserved orthologs that are predicted to engage CD4 T cells and provide cross-protective immunity against multiple Bordetella pathogens. Whole-genome sequence data were previously collected for Bordetella pertussis, B. parapertussis, and B. holmesii isolates. Immunoinformatics and comparative genomics were used to predict immunogenicity, cross-reactive proteins, and protein homology for a set of Bordetella isolates. Expression and production levels of homologous, immunogenic targets were screened using transcriptomic and proteomic data, and detectable genes were analyzed by reverse transcription quantitative PCR. Computational prediction methods identified putative human leukocyte antigen-DR-binding epitopes. Recognition of targets by T cells from individuals immunized with whole-cell pertussis vaccines was confirmed ex vivo. From the B. pertussis genome, 408 genes exhibited high sequence conservation with orthologs in B. parapertussis and B. holmesii, and a select group had high immunogenicity scores. A subset of detectable proteins were also Bordetella-specific and non-cross-reactive. Epitope mapping predicted 36 conserved, immunogenic, and naturally processed epitopes. Of these 36 targets, six epitopes upregulated markers of T-cell activation, and three elicited cytokine production. Our findings identified a list of peptides specific to Bordetella respiratory pathogens that may confer long-lasting, cross-protective T-cell immunity. 
IMPORTANCE Pertussis, caused by Bordetella pertussis, can cause debilitating respiratory symptoms, so whole-cell pertussis vaccines (wPVs) were introduced in the 1940s. However, reactogenicity of wPV necessitated the development of acellular pertussis vaccines (aPVs) that were introduced in the 1990s. Since then, until the COVID-19 pandemic began, reported pertussis incidence was increasing, suggesting that aPVs do not induce long-lasting immunity and may not effectively prevent transmission. Additionally, aPVs do not provide protection against other Bordetella species that are observed during outbreaks. The significance of this work is in determining potential new vaccine antigens for multiple Bordetella species that are predicted to elicit long-term immune responses. Genome-based approaches have aided the development of novel vaccines; here, these methods identified Bordetella vaccine candidates that may be cross-protective and predicted to induce strong memory responses. These targets can lead to an improved vaccine with a strong safety profile while also strengthening the longevity of the immune response.

KEYWORDS Bordetella pertussis, epitope vaccines, cross-protection, immunogenicity

Pertussis, or whooping cough, is caused by respiratory infection with the bacterium Bordetella pertussis. Although coverage with pertussis-containing vaccines remains high in the United States, reported pertussis cases increased steadily from the early 1990s until the start of the COVID-19 pandemic in 2020 (1). Current acellular pertussis vaccines (aPVs) confer protection against pertussis that wanes over several years and fails to prevent asymptomatic carriage of the bacterium (2-4), resulting in increased disease risk in populations with high vaccination coverage (5,6). Licensed aPVs in the United States contain three to five purified B. pertussis surface proteins (pertactin, pertussis toxin, fimbriae types 2 and 3, and filamentous hemagglutinin), and the sequences of all of these in circulating strains have diverged from the alleles present in vaccine reference strains (7,8). In recent years, there has also been a notable expansion of pertactin-deficient isolates (9), suggesting a need for new vaccine antigens that are less susceptible to vaccine pressure.
Current data suggest that aPV-induced immunity wanes over time. Furthermore, in contrast to cellular immunity from whole-cell pertussis vaccines (wPVs) and natural infection, animal studies have shown that the cellular immune responses to aPVs do not protect against colonization and transmission (10). Previous work on cellular immunity to pertussis infection has shown that recovery is linked with the induction of T helper type 1 (Th1) cells (11-13) that play a key role in bacterial clearance through induction of opsonizing antibodies, macrophages, and neutrophils (14-16). Immunization with aPVs primarily induces Th2-dominated responses (17-19), which promote the activation of eosinophils and production of neutralizing antibodies (15) but do not appear to be as effective as Th1 cells in initial clearance or memory responses to B. pertussis (20). Therefore, the failure of aPVs to induce memory T cells that are correlated with sustained protection may contribute to the increasing incidence of pertussis.
B. pertussis is the primary causative agent of pertussis, but the closely related species Bordetella parapertussis and Bordetella holmesii can also cause a pertussis-like illness that can be clinically indistinguishable from B. pertussis infection (21). Current aPVs do not appear to confer cross-protection against these related pathogens, which are often detected co-circulating during pertussis outbreaks (22-25). Although these agents are thought to cause less severe disease, B. parapertussis has been shown to cause severe symptoms of whooping cough (26), and B. holmesii has been associated with low susceptibility to antimicrobial treatment and is capable of causing disease in healthy individuals (27), suggesting that additional prevention measures against these infectious agents would be beneficial. New vaccine formulations containing T-cell antigens that can provide cross-protection against these additional Bordetella species may further reduce the burden of pertussis-like illness.
While pertussis resurgence is likely multifaceted, predicting the interaction between the host immune response and bacterial proteins may inform the design of revised vaccines to improve prevention and control measures. Pangenome-based approaches have successfully identified novel vaccine candidates against other bacterial pathogens (28) and recently identified new B. pertussis antigens that are recognized by T cells of people immunized with wPVs (29). Here, we used a unique collection of Bordetella genomic data derived from clinical isolates to predict immunogenicity of Bordetella proteins and select epitopes. We confirmed that these epitopes are recognized by wPV-primed human T cells, further suggesting that the addition of these targets to current aPVs may extend the longevity of vaccine-induced immunity and improve protection.

Immunogenic ortholog prediction in Bordetella genomes
Sequences for all encoded proteins predicted within the genomes of representative isolates of B. pertussis H762 (n = 3,631), B. parapertussis H889 (n = 4,429), and B. holmesii H785 (n = 3,558) were compared to identify shared orthologs. Of these, 408 predicted proteins shared ≥90% amino acid sequence identity among the three species and were considered candidates for cross-protective vaccine targets (Fig. 1A). A flow chart of the methods for the selection of predicted proteins identified in the genomic data is presented in Fig. 2.
These 408 Bordetella orthologs were next screened with the iVax platform (EpiVax, Inc., Providence, RI, USA) in a representative B. pertussis isolate (H762) to evaluate their overall immunogenic potential. The EpiMatrix Class II scores, which model the likelihood that these peptides would be presented by major histocompatibility complex class II (MHC-II) molecules and recognized by common T-cell receptors, ranged from −107.03 to 346.71, with most orthologs producing low (≤−20) or neutral (−20 to +20) immunogenicity scores (Table S1). The remaining 76 candidate orthologs were considered to have strong immunogenic potential, with EpiMatrix Class II protein scores ≥20 (Fig. 1B).
Homology to predicted human protein sequences was also investigated with the JanusMatrix (EpiVax, Inc., Providence, RI, USA) algorithm. JanusMatrix scores in the 76 strongly immunogenic proteins ranged from 0.3 to 6.72. Seventy-four of the 76 proteins had JanusMatrix scores <5 and are not expected to contain significant numbers of potentially cross-reactive peptides to human sequences (Fig. 1B).
Of these 74 candidates, eight were excluded because matched, orthologous proteins in B. parapertussis or B. holmesii exhibited EpiMatrix Class II scores lower or JanusMatrix scores higher than the thresholds above. Two others were predicted paralogs, duplicated within the genome. In total, 64 orthologous protein candidates were selected for further analyses based on immunoinformatic predictions of high immunogenicity and low cross-reactivity with human proteins (Fig. 1C; Table S2).
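The threshold logic described above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs (EpiMatrix Class II score ≥20 and JanusMatrix score <5 in every species), not the iVax toolkit itself; the class, function names, and example scores below are hypothetical.

```python
from dataclasses import dataclass

EPIMATRIX_MIN = 20.0    # Class II protein score cutoff stated in the text
JANUSMATRIX_MAX = 5.0   # scores >= 5 are treated as human cross-reactive


@dataclass
class SpeciesScores:
    """Scores for one ortholog in one species (hypothetical container)."""
    epimatrix: float
    janusmatrix: float


def passes_thresholds(scores: SpeciesScores) -> bool:
    """True if the protein is immunogenic and not human cross-reactive."""
    return scores.epimatrix >= EPIMATRIX_MIN and scores.janusmatrix < JANUSMATRIX_MAX


def select_candidates(orthologs: dict) -> list:
    """Keep orthologs whose matched proteins pass in all listed species.

    `orthologs` maps an ortholog name to {species: SpeciesScores}.
    """
    return [
        name
        for name, per_species in orthologs.items()
        if all(passes_thresholds(s) for s in per_species.values())
    ]


# Illustrative values only; these are not scores from Table S1 or S2.
example = {
    "orthologA": {
        "B. pertussis": SpeciesScores(45.2, 1.1),
        "B. parapertussis": SpeciesScores(41.0, 1.3),
        "B. holmesii": SpeciesScores(38.7, 0.9),
    },
    "orthologB": {  # fails: EpiMatrix score below 20 in one species
        "B. pertussis": SpeciesScores(30.0, 2.0),
        "B. parapertussis": SpeciesScores(12.5, 2.0),
        "B. holmesii": SpeciesScores(28.0, 2.0),
    },
    "orthologC": {  # fails: JanusMatrix score of 5 or more in one species
        "B. pertussis": SpeciesScores(60.0, 6.7),
        "B. parapertussis": SpeciesScores(55.0, 3.0),
        "B. holmesii": SpeciesScores(52.0, 3.0),
    },
}

selected = select_candidates(example)
```

Requiring every species to pass mirrors the exclusion of the eight candidates whose B. parapertussis or B. holmesii orthologs fell outside the thresholds. Paralog removal is a separate step and is not modeled here.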

Confirming candidate expression and production in B. pertussis by transcriptomics and proteomics
To confirm gene expression of the 64 predicted immunogenic orthologs, 15 isolates of B. pertussis were propagated under laboratory growth conditions and used for further investigation by RNA-sequencing (RNA-seq) and proteomic analyses. Twelve of these isolates were selected as representative types for the diversity seen in all 743 whole-genome sequences available for B. pertussis (30-32), including the U.S. and global isolates; the remaining three are a commonly used reference isolate (Tohama-I), a standard laboratory isolate (UT25), and a commonly used human clinical isolate (D420).
These data were analyzed by comparing reads per kilobase of transcript per million mapped reads (RPKM) calculations across the 15 B. pertussis isolates (Table S3). Gene transcripts with <30 RPKM in one or more isolates were excluded due to low expression levels, leaving 35 genes detectable in all 15 isolates at the mRNA level. Hierarchical clustering according to the expression of these 64 genes separated pertactin-deficient and pertactin-producing isolates, with the exception of isolate H762 (Fig. 3A). Protein production was determined by normalized total intensity per protein for 52 of the 64 predicted immunogenic orthologs across 14 B. pertussis isolates (Table S4). The production of 12 proteins was not detected by the proteomic analyses. Proteins undetectable in one or more isolates were excluded due to low production, leaving 32 of the 35 genes in all isolates also detectable at the protein level (Fig. 3B).
Overall, there was a significant correlation between the RNA-seq (median RPKM normalized to recA) and proteomic data (median intensity normalized to recA) for the 52 detectable immunogenic targets (r = 0.5365, ****P < 0.0001) (Fig. 3C). Seven orthologs exhibited lower-than-predicted median protein intensity given their mRNA expression in all isolates; however, these seven genes had already been ruled out by low RPKM levels (lrp, bp1709, sdhD, and ugpA had <30 RPKM in >1 isolate) or low protein intensity (CtaD, NuoK, and PssA had protein intensity = 0 in >1 isolate).
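As a rough illustration of the comparison above, the following Python sketch normalizes per-gene medians to recA, log-transforms them, and computes a Pearson correlation coefficient. The base-10 logarithm and the function names are assumptions made for illustration; the original analysis software and settings are not specified here, and no P value is computed.

```python
import math


def log_normalized(median_by_gene: dict, reference: str = "recA") -> dict:
    """Return log10(value / reference value) for each non-reference gene.

    Genes with non-positive values are skipped because their logarithm
    is undefined (an assumption about handling, not the authors' rule).
    """
    ref = median_by_gene[reference]
    return {
        gene: math.log10(value / ref)
        for gene, value in median_by_gene.items()
        if gene != reference and value > 0
    }


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rna_protein_correlation(rpkm_medians: dict, intensity_medians: dict) -> float:
    """Correlate log-normalized RNA and protein medians over shared genes."""
    rna = log_normalized(rpkm_medians)
    protein = log_normalized(intensity_medians)
    shared = sorted(set(rna) & set(protein))
    return pearson_r([rna[g] for g in shared], [protein[g] for g in shared])
```

Restricting the correlation to genes present in both data sets corresponds to using only targets detectable by both methods.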
The PSORTb-predicted subcellular localizations and functional categories of the 32/64 targets were stratified based on the number of proteins in each category (Fig. 3D). The proteins were found localized to the cytoplasmic membrane or cytoplasm, with a majority involved in cell surface expression, transport/binding, energy metabolism, macromolecule synthesis, or ribosomal activity.
FIG 2 (caption, continued) Bulleted numbers identify the number of proteins selected down at that level for additional analyses.

Limiting cross-reactivity with commensal bacteria
Because ribosome constituents made up 31% of the 32 identified targets and are known to be well conserved across all bacterial species (33), the next step in this step-down process was to determine the homology of these 32 targets to proteins of commensal bacteria, both to avoid cross-reactive immune responses to other species and to focus on Bordetella-specific immunogenic responses. Prior studies have suggested that immune reactivity to commensals is maintained through a balance between inflammatory and regulatory T cells (34); removing antigens that would interfere with this balance may improve the safety and reactogenicity profile and the specificity of responses to Bordetella pathogenic targets.
The amino acid sequences of the 32 proteins were aligned to 30 common bacterial species in the healthy human nasopharynx using BLASTp and to 82 common species in the gut microbiome using Pipeline Builder for Identification of Targets (PBIT) (pbit.bicnirrh.res.in). An identity value was calculated for each protein per species based on percent identity × query coverage (35,36) against the protein sequence from B. pertussis H762. The 50% identity cutoff was used in these analyses as a maximum threshold. Matches to 12/32 candidate proteins exhibited identity values of <50% in all species of the healthy nasopharynx and gut microbiota (Fig. 4), suggesting that these were unlikely to induce cross-reactivity to the healthy microbiome. The full table of identity values for each protein per species is provided in Table S5.

Gene expression validation by RT-qPCR
Normalized RPKM expression of genes identified from the RNA-seq data was validated by reverse transcription quantitative PCR (RT-qPCR). A strong correlation between the RPKM value and ΔCt was observed for B. pertussis (Pearson correlation r = 0.7852, **P < 0.0025), demonstrating the consistency of the results obtained (Fig. 5A). RT-qPCR was also performed on the same 12 genes in B. parapertussis and B. holmesii to confirm their expression. Expression of three genes fell below the limit of detection (<10% recA) in both B. pertussis and B. parapertussis (bp3051, bp2549, and nuoA), and one fell below the limit of detection in B. parapertussis (bp3346). The remaining 8/12 genes (bp3067, atpE, bp2424, atpB, lpxC, rpoP, mar, and bp2452) were expressed in vitro in all three species (Fig. 5B) and were further considered for epitope selection.
FIG 4 Select immunogenic predicted proteins do not display high sequence similarity to bacterial species in healthy microbiota. The amino acid sequences of the top 32 detectable immunogenic proteins were aligned by BLASTp to 30 common bacterial species in the healthy human nasopharynx and, using the PBIT (pbit.bicnirrh.res.in), to 82 species in the gut microbiome. Identity value was calculated for each protein per species based on percent identity and query coverage compared to the sequence in B. pertussis. Twelve proteins were identified with <50% identity to other bacterial species (green box).

Immunogenic epitope prediction from top targets
Immunogenic and naturally processed T-cell epitopes were predicted within the eight candidate protein sequences using the Immune Epitope Database (IEDB) (iedb.org). The top peptides (15-20 amino acids each) from each protein with high predicted immunogenicity (IEDB immunogenicity and 7-allele combined score <50), high likelihood of being naturally processed MHC-II ligands (cleavage probability score >0.2 and cleavage probability percentile rank <10%), and conserved peptide sequences among all available isolates from the three Bordetella species (100% nine-amino acid peptide core sequence identity (37) in 743 isolates of B. pertussis, 82 isolates of B. parapertussis, and 64 isolates of B. holmesii) are presented in Fig. 6. Higher cleavage probability scores and inverse CD4 T-cell immunogenicity prediction represent increasingly immunogenic epitopes, with a 17-amino acid peptide from LpxC yielding the strongest predicted epitope. In total, 37 epitopes were identified across the eight proteins, with larger proteins producing more immunogenic targets; 36 were 100% conserved across all 889 available isolate genome sequences. A detailed list of all epitopes of interest, their predicted immunogenicity and cleavage probabilities, and protein function are presented in Table 1.

T-cell activation by selected epitopes against pertussis-primed cells
Finally, the activation-induced marker (AIM) assay was used to screen the 36 selected peptides with peripheral blood mononuclear cells (PBMCs) from wPV-vaccinated subjects to identify epitopes that are recognized by T cells elicited by vaccination. PBMCs were stimulated overnight with individual peptides, and dual expression of activation markers OX40+CD25+ and PD-L1+CD25+ was quantified via flow cytometry (gating strategy in Fig. S1A). In a relatively small sample set of five wPV-primed individuals, CD4+ T cells in the PBMCs stimulated with 11 of the 36 peptides were OX40+CD25+ or PD-L1+CD25+ with increased levels of marker expression compared to unstimulated controls (Fig. S1C). Further testing identified six peptides (LB309, L187, L196, L203, LB307, and L177) that induced significant activation of OX40+CD25+ and PD-L1+CD25+ CD4+ T cells compared to unstimulated controls in a larger cohort of wPV-primed subjects (Fig. 7A). To determine the phenotype of the CD4+ T cells, intracellular flow cytometry was used to measure the cytokine responses upon peptide stimulation (Fig. S1B). Three of the selected six peptides (L203, LB307, and L177) significantly induced interferon gamma (IFNγ) production, while LB307 also induced interleukin (IL)-4 production compared to unstimulated controls (n = 10, *P < 0.05 by analysis of variance and Dunnett test) (Fig. 7B). Thus, at least six of the predicted immunogenic peptides were recognized by wPV-primed CD4+ T cells.

DISCUSSION
The increase in pertussis incidence in the aPV era represents a significant public health concern. Infection with B. pertussis can present many antigens that activate T cells, which are required for long-term immune responses, whereas aPV formulations only include three to five purified proteins and do not appear to induce as long-lasting T-cell immunity (38). We report here a survey of Bordetella protein and peptide sequences predicted from a large collection of whole-genome sequence data along with a strategy for prioritizing T-cell epitopes for vaccine development.
To identify appropriate candidates for vaccine design, filters were sequentially applied to genomic data from recent clinical isolates of B. pertussis, B. parapertussis, and B. holmesii. Although not nationally notifiable in the United States, these other species have been identified during pertussis outbreaks and are not well protected against by current vaccines (22,23), and similar to B. pertussis, certain recent clinical strains of B. parapertussis are also pertactin-deficient (30). Cross-protection has improved vaccine coverage for other bacterial vaccines, such as the Bacille Calmette-Guérin vaccine, which contains Mycobacterium bovis but protects against Mycobacterium tuberculosis and Mycobacterium leprae (39). The antigens identified here as conserved across available isolates from all three species can facilitate the inclusion of targets capable of cross-protection and inform the development of improved aPVs that can cover multiple causes of pertussis-like illness.
Immunoinformatic approaches, such as analysis of EpiMatrix Class II scores and determination of immunogenic epitopes, can contribute to vaccine design, with algorithms increasing the efficiency of epitope discovery for vaccine research (40). Designed epitopes are desirable to add to vaccines because they are cost-effective in contrast to whole pathogens and may confer a similar safety profile to acellular vaccines, with few adverse reactions (41). These epitopes show good population coverage as the EpiMatrix algorithm considers proteins that are predicted to reach 95% of human leukocyte antigens (HLAs) in the world's population (40). This analysis also ruled out proteins with similar sequences to human proteins and proteins from the healthy human microbiome, thus identifying T-cell-inducing epitopes that are unlikely to elicit cross-reactive responses to human protein sequences and are expected to elicit immune responses specific to Bordetella species.
Additional strategies to improve vaccine immunity through inducing T-cell responses include using novel adjuvants for aPVs (42-45), which may be added in tandem with new antigens, or using live-attenuated wPVs (46), but these strategies do not address the benefit of cross-protection for other Bordetella species. The proteins selected here are also predicted to be intracellular as only six surface targets were identified as highly immunogenic in B. pertussis (Table S1), but these six proteins were not conserved across species. Previous work has suggested that vaccination by aPVs seems to be more efficacious with a higher number of antigens included in the vaccine (47), so additional targets that can activate memory T-cell responses, such as those suggested here, may improve the levels of protection induced by aPVs. One limitation in the selection of these antigens is the use of RNA and protein samples generated from in vitro bacterial cultures to evaluate gene expression and protein production, as these may not represent the complexity of natural microbial environments. Although these complexities are not reflected, confirmation that the chosen target sequences are conserved across >700 whole-genome sequences of B. pertussis suggests these targets are maintained in the profiles of clinical isolates.
FIG 7 (B) IFNγ+, IL-4+, and IL-17+ CD4+ T cells were significantly upregulated after stimulation with L203 (BP2424), LB307 (BP2452), and L177 (BP2452). *P < 0.05, **P < 0.01, and ****P < 0.0001, by analysis of variance and Dunnett test for multiple comparisons, compared to non-stimulated controls.
Our results show that eight different proteins with 36 epitopes are predicted candidates to induce long-term T-cell immunity by peptide vaccines. Of these eight proteins, five have previously been identified as essential to Bordetella function (AtpB, AtpE, BP2452, LpxC, and RpoP) (48), further suggesting that these proteins may be suitable targets because the pathogen is less likely to repress their production. Peptides from two of those essential proteins (BP2452 and LpxC) were shown to induce T-cell activation ex vivo here and may induce more protective and long-term immunity to Bordetella vaccines. CD4+ T-cell recognition of these peptides derived from intracellular proteins suggests that these antigen-specific responses to intracellular targets are still important in Bordetella-specific immunogenic responses. Peptides may also induce a more direct immune response (49), with epitope vaccines providing a strong way forward in the evolution of vaccine formulations (50).
Pertussis resurgence is likely multifaceted, but understanding the interaction between the host immune response and pathogen proteins may inform the design of revised acellular vaccines that maintain a strong safety profile while inducing a durable memory immune response. The duration of immunogenicity of the predicted targets here must be confirmed through in vivo models to determine the longevity of the T-cell responses. These T-cell-inducing peptides may help lead to the creation of a safe and durable vaccine to supplement prevention strategies and reverse the recent increase in pertussis incidence.

Genomics and immunoinformatics
Complete whole-genome sequence assemblies from 743 B. pertussis, 82 B. parapertussis, and 64 B. holmesii isolates were retrieved from the Centers for Disease Control and Prevention (CDC) Pertussis and Diphtheria Laboratory (PDL) whole-genome sequence database (30-32). Among these, H762 (B. pertussis), H889 (B. parapertussis), and H785 (B. holmesii) were used as representative isolates for immunogenicity analyses. A flow chart of the methods for the selection of predicted proteins identified in the genomic data is presented in Fig. 2.
Orthologous protein-coding genes shared among the three representative Bordetella species isolates (H762, H889, and H785) were identified through reciprocal best-match alignment using BLASTp and a minimum amino acid sequence identity threshold of 90%. The EpiMatrix and JanusMatrix algorithms, part of the iVax immunoinformatic toolkit (EpiVax, Inc.), were used to conduct immunogenicity prediction for all predicted proteins derived from Bordetella whole-genome sequences of the three representative isolates (40,51). The EpiMatrix algorithm was applied to model interactions between peptides and the MHC-II (encoded by HLA in humans). An EpiMatrix Class II score was assigned to each protein. The EpiMatrix Class II scores model the likelihood that these peptides would be presented by MHC-II molecules to common T-cell receptors. Proteins with EpiMatrix Class II scores ≥20 are considered to have significant immunogenic potential.
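The reciprocal best-match step can be illustrated with a short Python sketch. It assumes that best-hit tables have already been extracted from BLASTp output and filtered at the identity threshold; the dictionaries, protein identifiers, and function names are hypothetical, and the sketch anchors both comparisons on the B. pertussis reference rather than also checking the B. parapertussis to B. holmesii pairing.

```python
def reciprocal_best_hits(a_to_b: dict, b_to_a: dict) -> dict:
    """Pairs (a, b) where b is a's best hit and a is b's best hit.

    Each input maps a query protein ID to its single best-scoring subject
    ID in the other genome (already filtered by the identity threshold).
    """
    return {a: b for a, b in a_to_b.items() if b_to_a.get(b) == a}


def shared_orthologs(ref_to_sp1: dict, sp1_to_ref: dict,
                     ref_to_sp2: dict, sp2_to_ref: dict) -> dict:
    """Reference proteins with a reciprocal best hit in both other species."""
    rbh1 = reciprocal_best_hits(ref_to_sp1, sp1_to_ref)
    rbh2 = reciprocal_best_hits(ref_to_sp2, sp2_to_ref)
    return {ref: (rbh1[ref], rbh2[ref]) for ref in rbh1 if ref in rbh2}


# Hypothetical identifiers for illustration only.
bp_to_bpp = {"bp_0001": "bpp_0101", "bp_0002": "bpp_0102", "bp_0003": "bpp_0103"}
bpp_to_bp = {"bpp_0101": "bp_0001", "bpp_0102": "bp_0009", "bpp_0103": "bp_0003"}
bp_to_bh = {"bp_0001": "bh_0201", "bp_0003": "bh_0203"}
bh_to_bp = {"bh_0201": "bp_0001", "bh_0203": "bp_0004"}

orthologs = shared_orthologs(bp_to_bpp, bpp_to_bp, bp_to_bh, bh_to_bp)
```

In this toy example only bp_0001 survives, because bp_0002 and bp_0003 each lack a reciprocal match in one of the two other species.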
To avoid cross-reaction with human peptides, the JanusMatrix tool was used to examine pathogen/host sequence similarity at the MHC:T-cell receptor interface, allowing candidate peptide sequences with potential host cross-conservation to be preferentially excluded from vaccine constructs. Proteins with JanusMatrix scores ≥5 were considered cross-reactive to human proteins and excluded as potential vaccine candidates. The presence of selected orthologs was confirmed in the full collection of B. pertussis, B. parapertussis, and B. holmesii isolate genome sequences using BLASTp. Subcellular localization and functional categories of proteins were predicted by PSORTb (https://www.psort.org/psortb/) and EggNOG v4.1 (52), respectively.
Protein candidates were further compared to the genome sequences from a selection of 30 bacterial species common in the healthy human nasopharynx using the BLASTp web server with default parameters (expected threshold = 0.05, word size = 6, max matches in a query range = 0, matrix = BLOSUM62, gap costs = Existence:11/Extension:1, and compositional adjustments = conditional compositional score matrix adjustment). Vaccine candidates were submitted to the target identification pipeline (PBIT, http://www.pbit.bicnirrh.res.in/, V1.0) to perform a non-homologous analysis against the proteomes of 82 species in the healthy intestinal microbiota (53). The PBIT uses a cutoff of 50% identity, a value suggested to predict cross-reactive immune responses (54-56). An identity value was calculated for each protein per species based on percent identity × query coverage (35,36) against the protein sequence from B. pertussis H762. The 50% identity cutoff was used in these analyses as a maximum threshold; all proteins with >50% identity to commensal species were excluded from further analyses.
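The identity value described above (percent identity × query coverage) and the 50% cutoff can be expressed as a small Python sketch. Both inputs are assumed to be percentages on a 0-100 scale, so the product is divided by 100 to stay on that scale; the function names and example numbers are hypothetical.

```python
IDENTITY_CUTOFF = 50.0  # maximum allowed identity value (percent), per the text


def identity_value(percent_identity: float, query_coverage: float) -> float:
    """Percent identity weighted by query coverage, both given as 0-100."""
    return percent_identity * query_coverage / 100.0


def is_bordetella_specific(hits_by_species: dict,
                           cutoff: float = IDENTITY_CUTOFF) -> bool:
    """True if no commensal species exceeds the cutoff.

    `hits_by_species` maps a species name to a (percent_identity,
    query_coverage) tuple for the best alignment of the candidate protein.
    Proteins with an identity value above the cutoff are excluded.
    """
    return all(
        identity_value(pid, qcov) <= cutoff
        for pid, qcov in hits_by_species.values()
    )


# Illustrative values only; these are not entries from Table S5.
ribosomal_like = {"species A": (78.0, 98.0), "species B": (40.0, 90.0)}
specific_like = {"species A": (80.0, 50.0), "species B": (35.0, 60.0)}
```

For example, an alignment with 80% identity over 50% of the query gives an identity value of 40%, which stays under the cutoff even though the aligned region itself is highly similar.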

Transcriptomics and proteomics
Fifteen representative isolates of B. pertussis were previously selected through phylogenetic analysis of single-nucleotide polymorphisms and chromosome rearrangements. Ten pertactin-negative isolates (H762, I127, I762, I977, J021, J083, J151, J172, J818, and J820) and five pertactin-positive isolates (D420, H787, I420, Tohama-I, and UT25) were chosen for RNA-seq and shotgun proteomic analysis. All isolates were cultured first on Bordet-Gengou agar for 48 h and then swabbed into Stainer-Scholte broth (SSB) at an OD600 of 0.1. B. pertussis cultures were then incubated for 24 h at 180 rpm at 36°C. RNA was isolated using Qiagen RNeasy Kits per the manufacturer's protocols, and proteins were harvested from cell pellets that were separated on SDS-PAGE gels as previously described (57).
For RNA-seq, libraries were constructed using Illumina TruSeq library preparation with ribosomal depletion. All RNA samples were DNase treated as previously described (57). Each library was sequenced on a HiSeq 4000 instrument at Admera Health; 50 million 2 × 150-bp reads were allocated to each biological sample of each isolate. RPKMs were determined for the 3,631 genes in each of the 15 B. pertussis transcriptomes. Genes with low abundance (<30 RPKM) (58) in any of the 15 transcriptomes were excluded from additional analysis. RPKM values for each gene were normalized to the housekeeping gene recA (59) for comparative analyses and heat map preparation.
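For readers unfamiliar with the metric, the standard RPKM definition and the exclusion rule above can be sketched as follows. The function names and the example counts are hypothetical, and the sketch does not reproduce the read-mapping pipeline used to obtain the counts.

```python
MIN_RPKM = 30.0  # genes below this value in any isolate are excluded


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def detectable_in_all(rpkm_by_isolate: dict, min_rpkm: float = MIN_RPKM) -> bool:
    """True if the gene reaches the minimum RPKM in every isolate."""
    return all(value >= min_rpkm for value in rpkm_by_isolate.values())


def normalize_to_reference(gene_rpkm: float, reference_rpkm: float) -> float:
    """Express a gene's RPKM relative to a housekeeping gene such as recA."""
    return gene_rpkm / reference_rpkm


# Worked example: 1,500 reads on a 1,000-bp gene in a library of
# 50 million mapped reads sits exactly at the 30 RPKM cutoff.
example_rpkm = rpkm(1500, 1000, 50_000_000)
```

Because the filter is applied per isolate, a gene that is highly expressed in 14 isolates but falls under 30 RPKM in one is still excluded.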
For proteomics, proteins were isolated from 14 representative B. pertussis isolates, and liquid chromatography with tandem mass spectrometry was performed by AVM Biomedical. Total protein and peptide intensity were determined for 2,100 proteins in each of the 14 B. pertussis proteomes. Proteins with low production (no detectable peaks in ≥1 isolate) were excluded as potential candidates. All intensity values were normalized to the housekeeping gene recA for comparative analyses and heat map preparation.

Gene expression confirmation by RT-qPCR
Expression levels of 12 candidate genes were confirmed using RT-qPCR. Liquid cultures of select B. pertussis (H762, H787, I420, and J818), B. parapertussis (H889), and B. holmesii (H785) isolates were prepared in triplicate as previously described (60). Approximately 10^9 cells were collected from each culture by centrifugation (5,000 × g for 5 min) and resuspended in RNAProtect Bacteria Reagent (Qiagen). RNA was extracted from samples following the Quick-Start Protocol for RNAProtect Bacteria (#76506) and the RNeasy Protect Bacteria Kits (#74104) (Qiagen). Potential DNA contamination was removed using TURBO DNase (Invitrogen), and RNA was further purified using RNAClean XP beads (Beckman Coulter). RNA purity was confirmed following protocols for RNA ScreenTape analysis on the 2200 Tapestation Controller Software (Agilent). Reverse transcription and cDNA preparation were performed on all quality RNA samples following the instructions for the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems).
RT-qPCR was performed in duplicate with primers and TaqMan probes listed in Table S6 and synthesized in the Biotechnology Core Facility Branch at CDC. Amplification was performed in 25-µL reactions containing 12.5 µL of Quanta PerfeCTa qPCR ToughMix UNG (Quanta Biosciences, Inc.), 1 µL of forward primer, 1 µL of reverse primer, 1 µL of TaqMan probe, 5.5 µL of PCR-grade water, and 4 µL of cDNA using an Applied Biosystems 7500 Fast Dx Instrument (Life Technologies) by preincubation at 50°C for 2 min, denaturation at 95°C for 10 min, and then 45 cycles of 95°C for 15 s and 58°C for 1 min (61). ΔCt values were calculated by (Ct of vaccine candidate gene)/(Ct of recA housekeeping gene) for each isolate and averaged across two technical and three biological replicates per species.

Epitope prediction
Further epitope prediction was performed on the top vaccine candidate proteins using tools in the IEDB (iedb.org) (62). CD4 T-cell immunogenicity prediction was performed on sequences from each Bordetella species for each candidate protein using the combined 7-allele and immunogenicity prediction methods with a recommended combined threshold of 50. All epitopes with conserved peptide cores across the three species and a combined threshold of ≤50 were considered immunogenic regions in the selected proteins (37). Prediction of naturally processed ligands was performed on protein sequences from each Bordetella species for each candidate protein using the prediction of naturally processed MHC-II ligands (MHCII-NP) tool to determine those epitopes that would likely be presented to T-cell receptors. Peptides with a cleavage probability score of ≥0.2 and a cleavage probability percentile rank of ≤10% were considered likely to be processed ligands (63). Conservation of these epitope sequences across 743 B. pertussis, 82 B. parapertussis, and 64 B. holmesii isolates retrieved from the PDL whole-genome sequence database was determined by BLASTp.
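The combination of cutoffs above can be summarized in a brief Python sketch. It assumes the IEDB scores have already been obtained and simply applies the stated thresholds; the exact-substring conservation check is a simplified stand-in for the BLASTp comparison, and the class, field names, and sequences are hypothetical.

```python
from dataclasses import dataclass

COMBINED_SCORE_MAX = 50.0    # lower combined score = more immunogenic
CLEAVAGE_PROB_MIN = 0.2      # MHCII-NP cleavage probability score
CLEAVAGE_RANK_MAX = 10.0     # cleavage probability percentile rank (%)


@dataclass
class PredictedEpitope:
    """One predicted peptide with its scores (hypothetical container)."""
    peptide: str
    core: str                 # nine-amino acid binding core
    combined_score: float
    cleavage_probability: float
    cleavage_percentile_rank: float


def is_immunogenic_and_processed(e: PredictedEpitope) -> bool:
    """Apply the three score thresholds stated in the text."""
    return (
        e.combined_score <= COMBINED_SCORE_MAX
        and e.cleavage_probability >= CLEAVAGE_PROB_MIN
        and e.cleavage_percentile_rank <= CLEAVAGE_RANK_MAX
    )


def core_is_conserved(core: str, isolate_proteins: list) -> bool:
    """True if the core occurs verbatim in the protein from every isolate.

    Exact substring matching is a simplification of the alignment-based
    conservation check described in the text.
    """
    return all(core in sequence for sequence in isolate_proteins)


def select_epitopes(epitopes: list, isolate_proteins: list) -> list:
    """Epitopes passing the score thresholds with a fully conserved core."""
    return [
        e.peptide
        for e in epitopes
        if is_immunogenic_and_processed(e)
        and core_is_conserved(e.core, isolate_proteins)
    ]
```

A usage example would pass the list of predictions for one protein together with that protein's sequence from every isolate; an epitope whose core is altered in even one isolate is dropped.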

AIM assay and intracellular cytokine detection
Expression of activation markers was evaluated essentially as described previously (64). Whole blood was collected from non-pregnant women who were born before January 1998 and received the wPV as children (under Ohio State University Institutional Review Board #2020H0404). PBMCs were isolated from the whole blood using a Ficoll-Paque gradient and cryopreserved in 90% FBS + 10% DMSO (Sigma-Aldrich, Cat. D8418) at −80°C until downstream analysis. For quantification of AIM+ CD4+ T cells, PBMCs were thawed, washed, and resuspended in T-cell media [RPMI 1640 + 10% human AB sera (Sigma-Aldrich, Cat. H4522) + 10 µg/mL gentamicin (Thermo Fisher Scientific, Cat. 15710064) + 5 × 10^−5 M β-mercaptoethanol]. Cells were plated (2 × 10^6/well) in a sterile U-bottom 96-well plate (Falcon, Ref. 353077) and stimulated with 2 µg/mL of selected peptide or 10 µg/mL of phytohemagglutinin (Sigma-Aldrich, Cat. L1668) as positive control or DMSO as negative control. Samples were incubated for 16 h at 37°C + 5% CO2 and then washed and resuspended in live/dead stain (Zombie NIR, 1:10,000 dilution) (BioLegend, Cat. 423105) and incubated for 30 min in the dark at 4°C. After washing, cells were blocked with PBS + 10% FBS for 5 min at 4°C and then stained with the indicated antibodies (Table S7) for 15 min at 4°C. For detection of cytokines, protein transport inhibitor (eBioscience, Cat. 00-4980) was added to the well at the time of stimulation, and cells were incubated overnight (~16 h) at 37°C + 5% CO2. Following stimulation, cells were washed and resuspended in live/dead stain (Zombie NIR, 1:10,000 dilution) (BioLegend, Cat. 423105) and then washed and blocked with 10% FBS + PBS for 5 min at 4°C followed by cell surface marker staining (Table S8). Samples were washed and fixed with intracellular fixation buffer (eBioscience, Cat. 00-8222-49) for 20 min at room temperature with subsequent permeabilization (eBioscience, Cat. 00-8333-56). Intracellular cytokine antibodies were added for IFNγ, IL-4, and IL-17 and incubated for 30 min at 4°C. Fluorescence minus one samples were prepared using a pool of cells from each patient sample and used for gating controls.
Cells were analyzed using a Cytek Aurora (four-laser) spectral flow cytometer. Data were analyzed with FlowJo software version 10.8.0, and GraphPad Prism software version 9.2 was used for figure preparation.

FIG 2 Flow chart for selecting cross-protective, immunogenic, detectable, Bordetella-specific protein targets and epitopes. Boxes describe selection criteria;

FIG 3 Total RNA expression and protein production of top 64 immunogenic targets in B. pertussis. (A) Heat map of RNA-seq data (gene RPKM normalized to RPKM for the housekeeping gene recA) across a representative set of 15 B. pertussis isolates. Genes with low expression (<30 RPKM in ≥1 isolate) were excluded as potential candidates and are denoted by a red line through the gene name or fall below the blue line. A subset of 35/64 genes are detectable (≥30 RPKM) in all isolates at the mRNA level. (B) Heat map of proteomic data (measured by intensity per protein normalized to intensity for recA) across a representative set of 14 B. pertussis isolates. Proteins with low production (no detectable peaks in ≥1 isolate) were excluded as potential candidates and are denoted by a red line through the protein name or fall below the blue line. A subset of 32/52 proteins are detectable in all isolates at the protein level. Pertactin+ (PRN+) isolates are denoted with a + symbol. (C) Correlation between log-transformed RNA-seq (median RPKM normalized to recA) and proteomic (median intensity normalized to recA) data for the top 64 immunogenic targets with a linear regression curve (solid line) surrounded by 90% prediction intervals (dotted lines). Of the top 64 targets, five genes have a lower median protein intensity than predicted by the mRNA expression for all isolates (red), one additional gene for PRN+ isolates (green), and one additional gene for PRN− isolates (blue). Pearson correlation r = 0.5365, ****P < 0.0001. (D) The 32 targets with detectable mRNA expression and protein production were stratified by predicted functional categories (outer circle) and subcellular localization (inner circle).
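The mRNA detectability rule in panel A (a gene is kept only if it reaches ≥30 RPKM in every isolate) and the normalization to recA can be illustrated with a minimal Python sketch. The function names, the toy gene names, and the toy values are hypothetical and are not taken from the study. Applying the cutoff to raw rather than recA-normalized RPKM is an assumption about how the rule was used.

```python
from typing import Dict, List

RPKM_CUTOFF = 30.0  # detection threshold stated in the FIG 3 legend


def detectable_genes(
    rpkm: Dict[str, List[float]],
    cutoff: float = RPKM_CUTOFF,
    reference: str = "recA",
) -> List[str]:
    """Return genes (excluding the reference) with RPKM >= cutoff in every isolate.

    Assumption: the cutoff is applied to raw RPKM values, one value per isolate.
    """
    return [
        gene
        for gene, values in rpkm.items()
        if gene != reference and all(v >= cutoff for v in values)
    ]


def normalize_to_reference(
    rpkm: Dict[str, List[float]], reference: str = "recA"
) -> Dict[str, List[float]]:
    """Divide each gene's RPKM by the reference gene's RPKM, isolate by isolate."""
    ref = rpkm[reference]
    return {
        gene: [v / r for v, r in zip(values, ref)]
        for gene, values in rpkm.items()
        if gene != reference
    }


# Hypothetical toy data: three isolates, two candidate genes plus recA.
toy = {
    "recA": [200.0, 250.0, 220.0],
    "geneA": [120.0, 90.0, 45.0],  # >= 30 RPKM in all isolates: kept
    "geneB": [35.0, 12.0, 60.0],   # < 30 RPKM in one isolate: excluded
}

kept = detectable_genes(toy)             # ['geneA']
normalized = normalize_to_reference(toy)
```

A single isolate below the cutoff is enough to exclude a gene, which mirrors the "<30 RPKM in ≥1 isolate" wording of the legend.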

FIG 5 RT-qPCR data confirmed the expression of immunogenic targets in all three Bordetella species. (A) Correlation of RNA-seq data and RT-qPCR data for 12 candidate genes. Graph shows mean RPKM levels (normalized to the housekeeping gene, recA, ±SEM) across four select B. pertussis isolates versus mean ΔCt levels (normalized to recA, ±SEM) across the same four isolates. Pearson correlation r = 0.7852, **P = 0.0025. (B) For the same 12 genes, expression levels in B. parapertussis and B. holmesii were determined by RT-qPCR. Eight out of 12 genes had detectable expression in all three species (>10% recA).
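One common way to express a ΔCt value as a percentage of the reference gene is 100 × 2^(−ΔCt), with ΔCt = Ct(gene) − Ct(recA). The sketch below uses that conversion to apply the ">10% recA" detection rule. The legend does not state how the percentage was calculated, so the formula, the assumption of 100% amplification efficiency, and the function names are illustrative assumptions rather than the authors' procedure.

```python
def percent_of_reference(ct_gene: float, ct_ref: float) -> float:
    """Expression of a gene as a percentage of the reference gene.

    Assumes 100% amplification efficiency, so one cycle equals a twofold change.
    """
    delta_ct = ct_gene - ct_ref  # positive when the gene is less abundant than the reference
    return 100.0 * 2.0 ** (-delta_ct)


def is_detectable(ct_gene: float, ct_ref: float, threshold_percent: float = 10.0) -> bool:
    """Apply the '>10% recA' rule from the FIG 5 legend."""
    return percent_of_reference(ct_gene, ct_ref) > threshold_percent


# Hypothetical Ct values: a gene two cycles behind recA is at 25% of recA.
print(percent_of_reference(22.0, 20.0))  # 25.0
print(is_detectable(22.0, 20.0))         # True
print(is_detectable(24.0, 20.0))         # False (6.25% of recA)
```

Under these assumptions the 10% threshold corresponds to a ΔCt of log2(10), roughly 3.3 cycles.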

FIG 6 Predicted immunogenic and naturally processed T-cell epitopes for top protein targets across Bordetella species. The top peptides from selected protein targets (listed in Table 1) with high predicted immunogenicity (IEDB immunogenicity and 7-allele combined score ≥50), high likelihood of being naturally processed MHC-II ligands (cleavage probability score >0.2 and cleavage probability percentile rank <10%), and conserved peptide sequences among three Bordetella species (100% peptide core sequence identity across 889 Bordetella species) are represented here by individual dots. Thirty-six conserved epitopes were identified with high scores for ligand prediction (cleavage probability score) and CD4 T-cell immunogenicity prediction (immunogenicity and 7-allele combined score).

TABLE 1
List of top 8 proteins with 36 immunogenic epitope targets

Protein Length a Epitope sequences b Epitope identifier for AIM assays IEDB immunogenicity score c IEDB cleavage probability score/% rank d
a Length of each protein in number of amino acids (AA). b Amino acid sequences of immunogenic epitopes. c Predicted immunogenicity scores (lower values suggest higher predicted immunogenicity). d Predicted cleavage probability scores (higher values suggest higher probability of cleavage) and cleavage probability percentile ranks (lower percentile ranks suggest greater chance of cleavage).